Drazen Prelec's Bayesian truth serum 1 is becoming well-known on LW as a means of divining truth from biased opinions. The method exploits the tendency for opinions to correlate with predictions of the proportion of others holding the same opinion, known as the false consensus effect or the typical mind fallacy. 2 Even though BTS seems appealing and I've seen a couple of people hope for an online implementation, it can fail badly when monetary transfers aren't present. I'm going to present a better method that operates on weaker assumptions and doesn't require money to change hands.
Original Bayesian Truth Serum
Suppose n people are asked a question with m possible answers. Each person will answer the question and predict the proportion of other people giving each answer. If the i-th person gives answer k, let xik = 1 and otherwise xik = 0. Let the prediction by person i of the proportion of others answering k be yik. For each answer k, use these to compute the actual proportions and the geometric mean of predictions:

$$\bar{x}_k = \frac{1}{n}\sum_{i=1}^{n} x_{ik}, \qquad \log \bar{y}_k = \frac{1}{n}\sum_{i=1}^{n} \log y_{ik}$$
Then, compute individual payments si as

$$s_i = \sum_{k=1}^{m} x_{ik} \log\frac{\bar{x}_k}{\bar{y}_k} + \alpha \sum_{k=1}^{m} \bar{x}_k \log\frac{y_{ik}}{\bar{x}_k}$$

where α > 0 weights the prediction score relative to the information score.
The first sum is the information score, rewarding a choice of a surprisingly common answer. The second sum is the prediction score, rewarding accurate predictions of others' answers. With a sufficiently large number of participants, honest reporting is a Bayes-Nash equilibrium. The people with the correct opinion will tend to have the highest scores on average, even if they are in the minority, so it's possible to learn the truth even in the face of bias.
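The scoring above is straightforward to compute. Here is a minimal sketch in NumPy (my own code, not the author's implementation; the function name, the `eps` smoothing against zero predictions, and the `alpha` default are illustrative choices):

```python
import numpy as np

def bts_scores(answers, predictions, alpha=1.0, eps=1e-9):
    """Sketch of Prelec's BTS scores.

    answers: length-n array of answer indices in {0, ..., m-1}
    predictions: n x m array; predictions[i, k] is person i's predicted
        proportion of others answering k
    alpha: weight on the prediction score (any alpha > 0)
    """
    n = len(answers)
    m = predictions.shape[1]
    x = np.zeros((n, m))
    x[np.arange(n), answers] = 1.0  # x[i, k] = 1 iff person i answered k
    xbar = x.mean(axis=0)           # actual proportion choosing each answer
    # Geometric mean of predictions, computed as exp of the mean log
    ybar = np.exp(np.log(predictions + eps).mean(axis=0))
    # Information score: reward surprisingly common answers
    info = (x * np.log((xbar + eps) / ybar)).sum(axis=1)
    # Prediction score: reward accurate predictions of others' answers
    pred = alpha * (xbar * np.log((predictions + eps) / (xbar + eps))).sum(axis=1)
    return info + pred
```

For example, `bts_scores(np.array([0, 0, 1]), np.array([[0.6, 0.4], [0.7, 0.3], [0.5, 0.5]]))` returns one score per respondent.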
Some potential issues with this procedure:
- The participants should be Bayesians with a common prior. The common prior assumption is mostly for convenience in proving equilibrium, and it's unclear how necessary it is for truth-telling. Without assuming common priors, tricky issues arise with higher-order beliefs about the priors of others. Setting aside incentives for honesty, a common prior isn't necessary to distinguish the truth in simulations.
- The number of respondents has to be sufficiently large to guarantee truthfulness, but this number depends on the unknown common prior of the participants. I find this the least troubling assumption, since it's not obvious how to extract extra profit on this basis alone. Witkowski and Parkes (2012) construct a similar mechanism that is incentive compatible with as few as three participants, but appears to be more sensitive to the common prior assumption.
- Participants must care only about maximizing their score. In particular, participants must not care which answer ends up being favored by the mechanism. This is the really troubling assumption, particularly when money isn't involved. If payments are trivial or absent, here's a simple manipulation: if you want answer k to win, pick some other answer at random and report a prediction yk close to zero. Because the mechanism aggregates predictions with a geometric mean, a single near-zero prediction drags the consensus estimate for k toward zero, making k look surprisingly common. Your own score will be very negative, but anyone who gave answer k will have a huge score.
- Participants can't be too exposed to the opinions of others. Keeping a public tally of answers, for instance, renders the mechanism useless.
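The manipulation described above can be checked numerically. This hypothetical sketch computes only the information-score term for someone who answered k, before and after a single saboteur reports a near-zero prediction for k (all numbers are illustrative):

```python
import numpy as np

# Ten honest predictions that 50% of others answer k, plus one saboteur
# who reports a prediction of (nearly) zero for k.
honest = np.full(10, 0.5)
sabotaged = np.append(honest, 1e-6)

# The consensus estimate is the geometric mean of predictions.
ybar_honest = np.exp(np.log(honest).mean())
ybar_sabotaged = np.exp(np.log(sabotaged).mean())

xbar = 0.5  # actual proportion answering k

# Information score of a k-answerer: log(xbar / ybar)
info_honest = np.log(xbar / ybar_honest)
info_sabotaged = np.log(xbar / ybar_sabotaged)
```

With honest predictions the information score is zero (the answer is exactly as common as predicted); the single saboteur inflates it to a large positive value, handing every k-answerer a windfall.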
Robust Bayesian truth serum
Suppose participants care about influencing the final result rather than their score. Then the mechanism has to be constructed in such a way that influence is maximized by being honest. For yes/no questions, majority vote has this property. You can't do better than giving your true opinion. As discussed above, Prelec's BTS does not have this property. Instead, I'm going to rely on asymmetric polynomial scoring rules.
Suppose there are two answers, a and b. Ask people for their opinion and the proportion yi of others they expect to answer a. Let na be the number of people answering a and nb be the number of b answers. Then, the scores for a endorsers are:
and the scores for b endorsers are:
where t is some positive integer. Finally, average the scores for each answer. The answer with the highest mean score is most likely to be correct.
Unlike BTS, this works with any number of participants, does not depend on a common prior, and can't be directly manipulated. Of course, it is still susceptible to false-name attacks or inside knowledge of the answers of others. In simulations, this performs about as well as or better than BTS for t = 5 or 6. I've only tested this for binary questions, but I have in mind a generalization to multiple answers.
Why exactly does this work? I'm still trying to figure that out. My results are primarily numerical, not analytical. More details can be found in my working paper. The source for the paper and simulations 3 is here, if you want to dig in even further. An online implementation is in progress, although going slowly with my scanty web development skills.
This effect exists even with perfect Bayesian rationalists, since it comes from an update on a single data point (and hence is only fallacious when overdone). ↩
Written literately in R and LaTeX, it should run easily once you install the knitr, compiler, xtable, and nloptr packages. With the current settings, the computations take ~5 minutes. Accurate results take hours (which means I should probably write in something other than R...). ↩