An information cascade is a failure of group rationality. Wikipedia has excellent introductions to and links about the phenomenon, but here is a somewhat meta example using likelihood ratios.
Suppose in some future version of this site, there are several well-known facts:
- All posts come in two kinds, high quality (insightful and relevant) and low quality (old ideas rehashed, long hypotheticals).
- There is a well-known prior: any post has a 60% chance of being high quality rather than low quality. (We're doing well!)
- Readers get a private signal, either "high" or "low", their personal judgement of quality, which is wrong 20% of the time.
- The number of up and down votes is displayed next to each post. (Note the difference from the present system, which only displays up minus down. This assumption makes the math easier.)
- Readers are competent in Bayesian statistics and strive to vote the true quality of the post.
Let's talk about how the very first reader would vote. If they judged the post high quality, they would multiply the prior odds (6:4) by the Bayes factor for a high private signal (4:1), get (6*4:4*1) = (6:1), and vote the post up. If they judged the post low quality, they would instead multiply by the Bayes factor for a low private signal (1:4), get (6*1:4*4) = (3:8), and vote the post down.
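The first reader's update can be checked in a few lines of Python; this is a sketch using exact fractions for the odds, with the 60% prior and 80%-accurate signal from the setup above:

```python
from fractions import Fraction

# Prior odds of high : low quality (60% vs 40%).
prior = Fraction(6, 4)

# The private signal is wrong 20% of the time, so a "high" signal has
# Bayes factor 0.8 : 0.2 = 4 : 1, and a "low" signal has 1 : 4.
bf_high = Fraction(4, 1)
bf_low = Fraction(1, 4)

# First reader, high private signal: posterior odds 6:1, so vote up.
print(prior * bf_high)   # 6

# First reader, low private signal: posterior odds 3:8, so vote down.
print(prior * bf_low)    # 3/8
```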
There were two scenarios for the first reader (private signal high or low). Suppose the first reader did in fact vote up. Then there are two scenarios for the second reader:
- Personal judgement high: (6:4)*(4:1)*(4:1) = (24:1), vote up.
- Personal judgement low: (6:4)*(1:4)*(4:1) = (6:4), vote up against personal judgement.
Note that there are now two explanations for ending up with two up votes. It could be that the second reader actually agreed, or it could be that the second reader was following the first reader and the prior, against their personal judgement. That means the third reader gets zero information from the second reader's vote! The two scenarios for the third reader, and for every future reader, are exactly analogous to the two scenarios for the second reader:
- Personal judgement high: (6:4)*(4:1)*(4:1) = (24:1), vote up.
- Personal judgement low: (6:4)*(1:4)*(4:1) = (6:4), vote up against personal judgement.
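The cascade can be seen numerically in a small simulation. This is a sketch under the reasoning above: the first vote is informative (Bayes factor 4:1 if up, 1:4 if down), and later votes carry no information, so every later reader multiplies prior, first-vote factor, and their own signal:

```python
from fractions import Fraction

PRIOR = Fraction(6, 4)                                 # 60:40 prior odds of high quality
BF = {"high": Fraction(4, 1), "low": Fraction(1, 4)}   # Bayes factors for private signals

def votes(signals):
    """Return each reader's vote (True = up) given their private signals."""
    # First reader: prior times their own signal.
    first_odds = PRIOR * BF[signals[0]]
    result = [first_odds > 1]
    # The first vote reveals the first reader's signal, so it is informative;
    # every vote after that is determined by the first vote and adds nothing.
    first_vote_bf = BF["high"] if result[0] else BF["low"]
    for sig in signals[1:]:
        result.append(PRIOR * first_vote_bf * BF[sig] > 1)
    return result

# Once the first reader votes up, everyone votes up,
# even readers whose private signal said "low":
print(votes(["high", "low", "low", "high", "low"]))
# [True, True, True, True, True]
```

Note that with a low signal a later reader's odds are (6:4)*(4:1)*(1:4) = (6:4), which still favors high quality, so they vote up against their own judgement, exactly as in the bullets above.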
This has been a nightmare scenario of groupthink afflicting even diligent Bayesians. Possible conclusions:
1. Don't strive to vote the true quality of the post; strive to vote your personal judgement.
2. Try to avoid even noticing the score. (Maybe scores could even be occluded, like spoiler-text?)
3. Information cascades are dangerous and interesting. We should develop good cognitive citizenship techniques.
4. Broadcast novel evidence, not conclusions.
Note: Olle found an error that necessitated a rewrite. I apologize.
Points 1, 2 and 4 are more or less equivalent; they are worth repeating, though. There isn't really any worth in a score of votes on the true quality, at least not for Bayesians. A score of votes on individual judgements would contain all useful information.
A thought experiment: you could use a double voting system. You make one vote on your belief before updating on the consensus, and another vote, in a separate count, on your updated belief. The point would be to update on the consensus of the first vote count and use the second vote count for all other purposes, e.g. promoting on the front page. This would allow broadcasting of each person's novel evidence (their individual judgement) while keeping some kind of aggregate score for the site's algorithms to work with. It would probably be easy to create an algorithm that makes full use of the first score alone, though, and as long as one can't think of a good use for the second count, one shouldn't vote one's updated beliefs in a single-vote system, I guess.
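As a sketch of why the first count would contain all the useful information: if every vote reports the voter's private signal, the votes are independent given quality, and anyone can recover the posterior directly from the counts. This hypothetical helper uses the 60% prior and 4:1 signals from the post:

```python
from fractions import Fraction

def posterior_odds(up, down, prior=Fraction(6, 4), bf=Fraction(4, 1)):
    """Posterior odds of high quality from counts of independent judgement votes.

    Each "high" judgement multiplies the odds by 4:1 and each "low"
    judgement by 1:4, so only the difference up - down matters.
    """
    return prior * bf ** (up - down)

print(posterior_odds(3, 1))   # odds 24:1 that the post is high quality
```

Under this model the difference up minus down turns out to be sufficient after all, but only because every vote is an independent private signal rather than an updated belief.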
A minor point about the calculations: an ideal Bayesian wouldn't do the calculation you did. Knowing the voting procedure, they would dismiss any votes not contributing new information. As the order of the votes isn't public, they would have to keep a prior over the different orders and update on that. This is of course a minor quibble, as it would require far too many calculations to be a reasonable model of any real reader.
"An ideal bayesian wouldn't..." I apologize, I'm not following.
I was dismissing votes not contributing new information. The order of the votes is partly deduced. Regarding the part that isn't deduced, there is no evidence to update on, and the prior is included: it's the (6:4) factor.
Would you mind posting what the ideal bayesian's calculations would look like?