PhilGoetz comments on The Power of Noise - LessWrong

28 Post author: jsteinhardt 16 June 2014 05:26PM




Comment author: IlyaShpitser 29 October 2015 09:14:51PM * 2 points

We know what property we want, the one that randomization will give you: good balance in the relevant covariates between the two groups. I can use a deterministic algorithm to get this, and in fact people do, e.g. matching algorithms. Another thing people do is enumerate all possible assignments (see: permutation tests for the null).
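To make the "try all possible assignments" idea concrete, here is a minimal sketch of an exact permutation test for a difference in group means. It is an illustration only, not code from the thread: the function name `exact_permutation_p_value` and the eight outcome values are hypothetical. The point is that it enumerates every assignment and never calls a random number generator.

```python
from itertools import combinations


def exact_permutation_p_value(outcomes, treated_idx):
    """Two-sided exact permutation test for a difference in group means.

    Enumerates every way of choosing len(treated_idx) treated units out of
    len(outcomes), so no randomness is involved anywhere.
    """
    n = len(outcomes)
    k = len(treated_idx)
    total = sum(outcomes)

    def mean_diff(idx):
        # Mean of the "treated" units minus mean of the remaining units.
        treated_sum = sum(outcomes[i] for i in idx)
        return treated_sum / k - (total - treated_sum) / (n - k)

    observed = abs(mean_diff(treated_idx))
    n_assignments = 0
    n_as_extreme = 0
    for idx in combinations(range(n), k):
        n_assignments += 1
        # Small tolerance so floating-point noise does not drop exact ties.
        if abs(mean_diff(idx)) >= observed - 1e-12:
            n_as_extreme += 1
    return n_as_extreme / n_assignments


# Hypothetical data: 8 subjects, of whom the first 4 were treated.
outcomes = [5.1, 4.8, 6.0, 5.5, 4.0, 4.2, 3.9, 4.5]
p_value = exact_permutation_p_value(outcomes, (0, 1, 2, 3))
print(p_value)
```

Full enumeration is only practical for small samples, since the number of assignments grows combinatorially; with larger samples people usually fall back on sampling assignments, which brings randomness back in.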

Discussion of AI and omniscience is a complete red herring; you don't need either to show that you don't need randomness for this. We aren't randomizing for the sake of randomizing. We are doing it because we want some property, and we can target that property directly and deterministically.


I don't think EY can possibly know enough math to make his claim go through; I think this is an "intellectual marketing" claim. People do this a lot: if we are talking about your claim, you have won the game.

Comment author: PhilGoetz 31 October 2015 06:44:10PM * 2 points

If you sort all the subjects on one criterion, it may be correlated in an unexpected way with another criterion you're unaware of. Suppose you want to study whether licorice causes left-handedness in a population from Tonawanda, NY. So you get a list of addresses from Tonawanda, sort them by address, and go down the list assigning them alternately to the control and experimental groups. Then you mail the experimental group free licorice for ten years. Voila, after 10 years there are more left-handers in the experimental group.

But even and odd addresses are on opposite sides of the street. And it so happens that in Tonawanda, NY, the screen doors on the front of every house are hinged on the west side, regardless of which way the house faces, because the west wind is so strong it would rip the door off its hinges otherwise. So people on the north side of the street, who are mostly in your experimental group, open the door with their left hand, getting a lot of exercise from this (the wind is very strong), while people on the south side open the screen door with their right hand.

It seems unlikely to me that many hidden correlations would survive alternating picks from a sorted list the way the one in this rigged example does. But if the sample size is large enough, you'd still be better off randomizing than following any deterministic algorithm, because a rule like "every other item from a list sorted on X" has low Kolmogorov complexity, so an unknown correlate of your observable variable can reproduce the same pattern by chance.
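A toy simulation of the rigged example, as a sketch under stated assumptions: house numbers run 1 to 1000, every odd number is on the north side (a simplification of "mostly"), and the names `north_side` and `group_gap` are made up for illustration. It compares how badly the hidden covariate is imbalanced under alternating assignment versus a seeded shuffle.

```python
import random


def group_gap(assignment, hidden):
    """Rate of the hidden covariate among treated minus its rate among controls."""
    treated = [h for a, h in zip(assignment, hidden) if a]
    control = [h for a, h in zip(assignment, hidden) if not a]
    return sum(treated) / len(treated) - sum(control) / len(control)


n = 1000
addresses = list(range(1, n + 1))  # hypothetical house numbers, already sorted

# Hidden covariate the experimenter does not know about:
# 1 if the house is on the north side. Assumption: all odd numbers are north.
north_side = [addr % 2 for addr in addresses]

# Deterministic rule from the comment: alternate down the sorted list,
# so the 1st, 3rd, 5th, ... addresses go to the experimental group.
alternating = [i % 2 == 0 for i in range(n)]

# Randomized rule: a balanced assignment shuffled with a fixed seed.
rng = random.Random(0)
randomized = [True] * (n // 2) + [False] * (n // 2)
rng.shuffle(randomized)

alternating_gap = group_gap(alternating, north_side)
randomized_gap = group_gap(randomized, north_side)
print(alternating_gap, randomized_gap)
```

Under these assumptions the alternating rule lines up exactly with street side, so its gap is 1.0 by construction, while the shuffled assignment should leave only a small chance imbalance. This says nothing about deterministic methods that balance on known covariates; it only illustrates how a simple rule can coincide with a variable nobody measured.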