Unnamed comments on Less Wrong Polls in Comments - Less Wrong

79 Post author: jimrandomh 19 September 2012 04:19PM

Comments (302)

Comment author: Unnamed 20 September 2012 06:49:20AM 6 points [-]

Pick your answer to this poll at random:

Comment author: Unnamed 20 October 2012 10:13:43PM 1 point [-]

After one month and 118 responses, I'm considering this poll closed. The results are:

1) 17%
2) 21%
3) 20%
4) 24%
5) 18%

A chi-squared test says that these results do not differ significantly from uniform random responding, with a p-value of 0.78.
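For readers who want to try this kind of test themselves, here is a minimal sketch of a chi-squared goodness-of-fit test against uniform responding. The raw counts behind the 118 responses are not listed in this comment, so the sketch assumes the 100-response counts that RobinZ posted elsewhere in this thread (15, 22, 21, 24, 18); it therefore will not reproduce the 0.78 figure. The function name `chi_squared_uniform` is made up for illustration, and the closed-form tail probability only applies when the degrees of freedom are even (here, 4).

```python
import math

def chi_squared_uniform(counts):
    """Return (statistic, p_value) for a goodness-of-fit test against uniform."""
    n = sum(counts)
    k = len(counts)
    expected = n / k  # expected count per option under uniform responding
    stat = sum((c - expected) ** 2 / expected for c in counts)
    df = k - 1
    if df % 2 != 0:
        raise ValueError("this closed-form tail needs even degrees of freedom")
    half = stat / 2
    # For even df: P(X >= stat) = exp(-stat/2) * sum_{j < df/2} (stat/2)^j / j!
    p_value = math.exp(-half) * sum(
        half ** j / math.factorial(j) for j in range(df // 2)
    )
    return stat, p_value

# Assumption: RobinZ's 100-response snapshot, not the final 118 responses.
counts = [15, 22, 21, 24, 18]
stat, p_value = chi_squared_uniform(counts)
print(f"chi-squared = {stat:.2f}, p = {p_value:.3f}")
```

On that earlier snapshot the statistic is 2.5 with a p-value of about 0.64, which points the same way as the result above: no significant departure from uniform.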

The main reason I ran this poll was that I thought it might have implications for the trickier poll above. It is interesting that option #4 was the most common response in this poll, that poll, and the gamefaqs poll which that poll was based on. #4 may seem especially random, and some respondents in the other polls may have just been trying to answer at random. But this poll ended up not providing much information about that; to test it we'd need a larger sample size, and preferably a poll where respondents did not use external sources of randomness.

Comment author: mfb 30 September 2012 05:32:46PM 1 point [-]

I think this would be even more interesting as "pick at random, without an external source of randomness". Sure, you can get random numbers from random.org, your computer or the seconds on your watch (a nice idea), but those just blur the effect of mind-generated random numbers.

Comment author: gwern 26 September 2012 04:25:32PM 1 point [-]

For convenience: http://www.random.org/ or in Bash, echo $(($RANDOM % 5 + 1))
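A small caveat on the Bash one-liner, sketched in Python: `$RANDOM` yields integers from 0 to 32767, and 32768 is not a multiple of 5, so `% 5 + 1` favours answers 1 to 3 very slightly. The sketch below counts that imbalance exactly and shows rejection sampling as one way to remove it. The function names are made up for illustration, and Python's `random` module stands in for Bash's generator.

```python
import random

RANGE = 32768  # Bash's $RANDOM yields integers 0..32767

def modulo_counts(sides=5):
    """Count how many raw values map to each answer under `% sides + 1`."""
    counts = [0] * sides
    for raw in range(RANGE):
        counts[raw % sides] += 1
    return counts

def unbiased_pick(sides=5, rng=random):
    """Rejection sampling: redraw raw values that fall in the uneven tail."""
    limit = RANGE - RANGE % sides  # largest multiple of `sides` <= RANGE
    while True:
        raw = rng.randrange(RANGE)
        if raw < limit:
            return raw % sides + 1

print(modulo_counts())   # answers 1-3 get 6554 raw values each, 4-5 get 6553
print(unbiased_pick())
```

The bias is about one part in 6,500, far too small to matter for a poll of this size.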

Comment author: RobinZ 26 September 2012 02:09:05PM 1 point [-]

Question: what's a reasonable prior over the probability distribution of poll answers? Because I downloaded the raw data, and it says:

  1. 15
  2. 22
  3. 21
  4. 24
  5. 18

...and I'm not sure what would constitute reasonable priors for the uniform distribution hypothesis versus the "aversion toward First Answer" hypothesis versus the "aversion toward First Answer and Fifth Answer" hypothesis.

Comment author: othercriteria 30 September 2012 03:34:09PM 3 points [-]

Your question is confused. The uniform distribution hypothesis only requires that the (assumed infinite) population picks the answers independently with equal probability. Under this hypothesis, the observed poll answers (for a fixed number of respondents) will follow a multinomial distribution with parameters (0.2, 0.2, 0.2, 0.2, 0.2). A typical realization will not have an equal number of respondents giving each answer, although asymptotically the empirical frequencies will converge to equality.

Anyways, as a Bayesian, the better question is: what should my posterior belief about the response probabilities be after running the poll and updating on the answers? The canonical way to do this would be to put a Dirichlet prior over the response probabilities. By the miracle of conjugacy, your posterior distribution will itself be a (generally different) Dirichlet distribution.

By taking the expectation of indicator variables like I{"probability of First Answer under 0.2"} under the posterior, you can figure out what degree of belief you must give to statements like "respondents have an aversion toward First Answer".
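A rough sketch of that recipe, under assumptions not stated in the comment: a flat Dirichlet(1, 1, 1, 1, 1) prior, the counts RobinZ posted above, and plain Monte Carlo for the expectation of the indicator. The helper names are hypothetical. With a Dirichlet prior, conjugacy means the posterior parameters are just the prior parameters plus the observed counts.

```python
import random

counts = [15, 22, 21, 24, 18]      # RobinZ's raw data from the parent comment
prior = [1.0] * len(counts)        # assumption: flat Dirichlet prior
posterior = [a + c for a, c in zip(prior, counts)]  # conjugate update

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalized gammas."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def prob_first_below(threshold, alphas, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(p_1 < threshold) under Dirichlet(alphas)."""
    rng = random.Random(seed)
    hits = sum(
        sample_dirichlet(alphas, rng)[0] < threshold for _ in range(n_samples)
    )
    return hits / n_samples

posterior_mean_first = posterior[0] / sum(posterior)
p_aversion = prob_first_below(0.2, posterior)
print(f"posterior mean of p_1: {posterior_mean_first:.3f}")
print(f"estimated P(p_1 < 0.2 | data): {p_aversion:.3f}")
```

A different prior (for example, one concentrated near uniform) would give a different answer, which is the point of RobinZ's original question.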

Comment author: RobinZ 30 September 2012 03:40:20PM 3 points [-]

That makes sense - I had imagined doing something similar, but I had never heard of Dirichlet priors.

Comment author: othercriteria 30 September 2012 04:00:20PM 2 points [-]

Happy this helped. The Dirichlet-multinomial model gets relatively little attention because it adds nothing really new to the beta-binomial model for polls with just two responses. It's easy to find lots of chatty introductions to the beta-binomial like this one or this one if you want to learn more...

Comment author: Kindly 26 September 2012 04:32:30PM 3 points [-]

My own feelings on the matter are that if you don't know what prior to have, compute worst-case bounds.

In this case, the model that maximizes the probability of seeing this data is that each answer is 15% likely to be 1, 22% likely to be 2, 21% likely to be 3, 24% likely to be 4, and 18% likely to be 5. We can compute the probability of seeing this data under this model, and also under the "all answers are equally likely" model, and conclude that our worst-case model makes us only 3.61 times as likely to see this data.

In particular, any other hypothesis you might have can only receive this little evidence, relative to the uniform distribution hypothesis; and I believe in close-to-uniformity enough that I'm not going to be swayed by what is fewer than 2 bits of evidence.
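The arithmetic behind the 3.61 figure can be checked with a short sketch. It compares the likelihood of the data under the best-fitting model (observed frequencies) with the likelihood under the uniform model; the multinomial coefficient is the same in both and cancels, so it is omitted. Variable and function names are illustrative only.

```python
import math

counts = [15, 22, 21, 24, 18]  # raw data quoted in the parent comment
n = sum(counts)
k = len(counts)

def log_likelihood(probs, counts):
    """Log-probability of the counts, up to the shared multinomial coefficient."""
    return sum(c * math.log(p) for c, p in zip(counts, probs))

mle = [c / n for c in counts]   # the model that maximizes the likelihood
uniform = [1 / k] * k           # "all answers are equally likely"

log_ratio = log_likelihood(mle, counts) - log_likelihood(uniform, counts)
ratio = math.exp(log_ratio)
bits = log_ratio / math.log(2)
print(f"likelihood ratio = {ratio:.2f}, evidence = {bits:.2f} bits")
```

This gives a ratio of roughly 3.6, a little under 2 bits, in line with the figures above.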

Comment author: RobinZ 27 September 2012 02:16:11AM 1 point [-]

Thanks! I didn't think of that particular brainhack - I'll be sure to use it in the future.

Comment author: Bugmaster 20 September 2012 08:00:26PM 5 points [-]

I used random.org to generate my answer.

But, when I submitted it, I got the following:

First Answer 0 (0%)
Second Answer 0 (0%)
Third Answer 0 (0%)
Fourth Answer 1 (2%)
Fifth Answer 0 (0%)
Total 58 (100%)

The raw data contained all the 58 rows, however. Seems like there might be a bug in the result-rendering code.

Comment author: royf 20 September 2012 05:03:05PM *  5 points [-]

To anyone thinking this is not random, with 42 votes in:

  • The p-value is 0.895 (this is the probability of seeing at least this much non-randomness, assuming a uniform distribution)

  • The entropy is 2.302 bits instead of log2(5) = 2.322 bits, for 0.02 bits of KL-distance (this is the number of bits you lose for encoding one of these votes as if it were uniformly random)

If you think you see a pattern here, you should either see a doctor or a statistician.
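The entropy and KL figures in the list above can be computed along the following lines. The 42-vote counts are not given in this comment, so this sketch assumes the later 100-response counts posted by RobinZ in this thread and will give slightly different numbers. Function names are made up for illustration.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def kl_from_uniform_bits(probs):
    """KL divergence D(probs || uniform) in bits."""
    k = len(probs)
    return sum(p * math.log2(p * k) for p in probs if p > 0)

# Assumption: RobinZ's 100-response snapshot, not the 42 votes discussed here.
counts = [15, 22, 21, 24, 18]
total = sum(counts)
probs = [c / total for c in counts]

h = entropy_bits(probs)
kl = kl_from_uniform_bits(probs)
print(f"entropy = {h:.3f} bits, max = {math.log2(len(probs)):.3f} bits")
print(f"KL from uniform = {kl:.3f} bits")
```

For any distribution over five options, the entropy and the KL divergence from uniform add up to log2(5), which is why the two numbers in the bullet differ by exactly the KL term.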

Comment author: gwern 26 September 2012 08:00:42PM *  1 point [-]

Well, it's worth noting people seem to be trainable to choose randomly: http://dl.dropbox.com/u/85192141/1986-neuringer.pdf

Apropos of the PRNG discussion in http://blog.yunwilliamyu.net/2011/08/14/mindhack-mental-math-pseudo-random-number-generators/ for which I wrote some flashcards: http://pastebin.com/CKif0fEf

Comment author: DanArmak 20 September 2012 06:19:33PM 3 points [-]

I wish I could see a doctor-statistician. Or at least a doctor who understood statistics.

Comment author: shminux 20 September 2012 06:34:14PM 6 points [-]

Yvain might some day have his own practice.

Comment author: kerspoon 25 September 2012 12:29:23PM 0 points [-]
Comment author: [deleted] 20 September 2012 07:29:59PM 1 point [-]

Looks like we're better at randomness than the rest of the population. If I asked random people for a random number from 1 to 10, I wouldn't be surprised to see substantially less than 3.322 bits of entropy per number (e.g., many more than 10% of the people choosing 7).

Comment author: RobinZ 20 September 2012 06:40:48PM 1 point [-]

I rolled 1d6, intending to reroll any 6s.

Comment author: [deleted] 20 September 2012 10:47:36AM 2 points [-]

Is (the seconds' figure in my watch) mod 5 random enough?

Comment author: Luke_A_Somers 20 September 2012 01:43:12PM 1 point [-]

I used the least significant digit on my time-remaining-to-full-charge. And ended up propping up the most populated entry.

Comment author: BlazeOrangeDeer 25 September 2012 12:59:59AM *  1 point [-]

I needed 3 random bits (and threw out any overflow), which I got by checking whether arbitrary words or phrases I thought of had an even or odd number of letters. That's the most random completely mental (heh) way I know of, I wonder if there are others.

Comment author: Luke_A_Somers 25 September 2012 10:22:27AM 2 points [-]

... you could have done it more-reliably evenly by taking the mod 5 of the phrase/word length.

Comment author: [deleted] 25 September 2012 10:26:36PM 0 points [-]

Considering that the average word length in English is about five letters, I suspect that'd be quite far from being uniformly distributed.

Comment author: Luke_A_Somers 26 September 2012 01:39:01PM 1 point [-]

Average is irrelevant. What's relevant is the standard deviation.

Since standard deviation goes as the square root of the number of items being added, phrase length for any reasonably-sized phrase, so long as it wasn't a line of poetry, should be pretty evenly distributed.

Comment author: [deleted] 25 September 2012 10:23:00PM 0 points [-]

It's not obvious to me that it's unbiased. My gut feeling suspects that if I randomly chose a word it'd be more likely to have an odd than an even number of letters.

Comment author: scav 20 September 2012 07:54:02AM 3 points [-]

Ha. I fail at random. In my defence, the universe is probably deterministic anyway.

Comment author: BlazeOrangeDeer 25 September 2012 12:58:22AM 0 points [-]

it's probably not, but you're still excused ;)