I was recently disturbed by my perception that, despite years of studying and debating probability problems, the LessWrong community as a whole has not markedly improved its ability to get the right answer on them.
I had expected that people would read posts and comments by other people, and take special note of comments by people who had a prior history of being right, and thereby improve their own accuracy.
But can that possibly work? How can someone who isn't already highly accurate identify other people who are highly accurate?
Aumann's agreement theorem (allegedly) says that Bayesians with the same priors agree. But it doesn't say that doing so helps. Under what circumstances does revising your opinions, by updating in response to people you consider reliable, actually improve your accuracy?
To find out, I built a model of updating in response to the opinions of others. It did, eventually, show that Bayesians improve their collective opinions by updating in response to the opinions of other Bayesians. But this turns out not to depend on them satisfying the conditions of Aumann's theorem, or on doing Bayesian updating. It depends only on a very simple condition, established at the start of the simulation. Can you guess what it is?
I'll write another post describing and explaining the results if this post receives a karma score over 10.
The first part of this post describes a way of interpreting the basic mathematics of Bayesianism. Eliezer already presented one such view at http://lesswrong.com/lw/hk/priors_as_mathematical_objects/, but I want to present another one that has been useful to me, and also show how this view is related to the standard formalism of probability theory and Bayesian updating, namely the probability space.
The second part of this post will build upon the first, and try to explain the math behind Aumann's agreement theorem. Hal Finney had suggested this earlier, and I'm taking on the task now because I recently went through the exercise of learning it, and could use a check of my understanding. The last part will give some of my current thoughts on Aumann agreement.
The majority of people would hold more accurate beliefs if they simply believed the majority. To state this in a way that doesn't risk information cascades, we're talking about averaging impressions and coming up with the same belief.
To the degree that you come up with different averages of the impressions, you acknowledge that your belief was just your impression of the average, and you average those metaimpressions and get closer to belief convergence. You can repeat this until you get bored, but if you're doing it right, your beliefs should get closer and closer to agreement, and you shouldn't be able to predict who is going to fall on which side.
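As a rough illustration of the "repeat until you get bored" loop, here is a minimal sketch. The update rule (each person moves half of the way toward the current group average) and the starting numbers are assumptions made for the example, not something specified in the post.

```python
def averaging_round(beliefs, step=0.5):
    """One round: everyone moves `step` of the way toward the group mean.

    The half-step rule is a hypothetical choice for illustration only.
    """
    mean = sum(beliefs) / len(beliefs)
    return [b + step * (mean - b) for b in beliefs]


def spread(beliefs):
    """Gap between the most extreme beliefs in the group."""
    return max(beliefs) - min(beliefs)


# Made-up starting impressions (probabilities assigned to some claim).
beliefs = [0.2, 0.5, 0.9]
history = [beliefs]
for _ in range(5):
    beliefs = averaging_round(beliefs)
    history.append(beliefs)

spreads = [spread(b) for b in history]
for round_number, s in enumerate(spreads):
    print(f"round {round_number}: spread = {s:.4f}")
```

Under this rule the spread shrinks every round while the group mean stays fixed, so the sketch only shows convergence; it says nothing about whether the value converged upon is accurate.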
Of course, most of us are atypical cases, and as good rationalists, we need to update on this information. Even if our impressions were (on average) no better than the average, there are certain cases where we know that the majority is wrong. If we're going to selectively apply majoritarianism, we need to figure out the rules for when to apply it, to whom, and how the weighting works.
This much I think has been said again and again. I'm gonna attempt to describe how.
Two ideal Bayesians cannot have common knowledge of disagreement; this is a theorem. If two rationalist-wannabes have common knowledge of a disagreement between them, what could be going wrong?
The obvious interpretation of these theorems is that if you know that a cognitive machine is a rational processor of evidence, its beliefs become evidence themselves.
If you design an AI and the AI says "This fair coin came up heads with 80% probability", then you know that the AI has accumulated evidence with a likelihood ratio of 4:1 favoring heads - because the AI only emits that statement under those circumstances.
It's not a matter of charity; it's just that this is how you think the other cognitive machine works.
And if you tell an ideal rationalist, "I think this fair coin came up heads with 80% probability", and they reply, "I now think this fair coin came up heads with 25% probability", and your sources of evidence are independent of each other, then you should accept this verdict, reasoning that (before you spoke) the other mind must have encountered evidence with a likelihood ratio of 12:1 favoring tails: your 4:1 odds for heads, multiplied by their 1:12, give posterior odds of 1:3, which is 25% for heads.
But this assumes that the other mind also thinks that you're processing evidence correctly, so that, by the time it says "I now think this fair coin came up heads, p=.25", it has already taken into account the full impact of all the evidence you know about, before adding more evidence of its own.
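The arithmetic in the coin example can be checked with a short sketch in odds form. It assumes, as the example does, a fair-coin prior and independent sources of evidence; the function and variable names are mine, chosen for illustration.

```python
from fractions import Fraction


def prob_to_odds(p):
    """Convert P(heads) into heads:tails odds, expressed as a single ratio."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert heads:tails odds back into P(heads)."""
    return odds / (1 + odds)


prior_odds = prob_to_odds(Fraction(1, 2))   # fair coin: 1:1
my_posterior = Fraction(4, 5)               # "heads with 80% probability"
their_posterior = Fraction(1, 4)            # "heads with 25% probability"

# Likelihood ratio of my evidence: posterior odds divided by prior odds.
my_ratio = prob_to_odds(my_posterior) / prior_odds

# Their stated posterior is assumed to already include my evidence, so their
# own evidence is whatever ratio is left after dividing mine out.
their_ratio = prob_to_odds(their_posterior) / (prior_odds * my_ratio)

# Multiplying everything back together should recover their stated posterior.
combined = odds_to_prob(prior_odds * my_ratio * their_ratio)

print(my_ratio)     # 4, i.e. 4:1 favoring heads
print(their_ratio)  # 1/12 heads-to-tails, i.e. 12:1 favoring tails
print(combined)     # 1/4
```

Exact fractions are used so the ratios come out as 4 and 1/12 rather than floating-point approximations.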
I've always been annoyed at the notion that the bias-variance decomposition tells us something about modesty or Philosophical Majoritarianism. For example, Scott Page rearranges the equation to get what he calls the Diversity Prediction Theorem:
Collective Error = Average Individual Error - Prediction Diversity
I think I've finally come up with a nice, mathematical way to drive a stake through the heart of that concept and bury it beneath a crossroads at midnight, though I fully expect that it shall someday rise again and shamble forth to eat the brains of the living.
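For reference, the Diversity Prediction Theorem quoted above is an algebraic identity when all three terms are measured as squared errors: collective error is the squared error of the crowd's mean prediction, and prediction diversity is the variance of the predictions around that mean. A minimal numerical check, with made-up predictions and a made-up true value:

```python
def diversity_prediction_terms(predictions, truth):
    """Return (collective error, average individual error, prediction diversity).

    All three are squared-error quantities.
    """
    n = len(predictions)
    crowd = sum(predictions) / n
    collective_error = (crowd - truth) ** 2
    average_individual_error = sum((p - truth) ** 2 for p in predictions) / n
    prediction_diversity = sum((p - crowd) ** 2 for p in predictions) / n
    return collective_error, average_individual_error, prediction_diversity


# Hypothetical data, chosen only so the arithmetic is easy to follow.
predictions = [10.0, 14.0, 21.0, 7.0]
truth = 12.0

collective, individual, diversity = diversity_prediction_terms(predictions, truth)
print(collective, individual, diversity)   # 1.0 28.5 27.5
print(individual - diversity)              # 1.0, equal to the collective error
```

The identity holds for any set of numbers, accurate or not, so the check confirms the algebra only; it does not settle what the equation implies about modesty.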
The Modesty Argument states that when two or more human beings have common knowledge that they disagree about a question of simple fact, they should each adjust their probability estimates in the direction of the others'. (For example, they might adopt the common mean of their probability distributions. If we use the logarithmic scoring rule, then the score of the average of a set of probability distributions is better than the average of the scores of the individual distributions, by Jensen's inequality.)
Put more simply: When you disagree with someone, even after talking over your reasons, the Modesty Argument claims that you should each adjust your probability estimates toward the other's, and keep doing this until you agree. The Modesty Argument is inspired by Aumann's Agreement Theorem, a very famous and oft-generalized result which shows that genuine Bayesians literally cannot agree to disagree; if genuine Bayesians have common knowledge of their individual probability estimates, they must all have the same probability estimate. ("Common knowledge" means that I know you disagree, you know I know you disagree, etc.)
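The parenthetical claim about the logarithmic scoring rule can be illustrated numerically. In this sketch the three forecasters and their probabilities are invented for the example; each number is the probability that forecaster assigned to the outcome that actually occurred.

```python
import math

# Hypothetical probabilities assigned to the true outcome by three forecasters.
probs_on_truth = [0.9, 0.5, 0.1]

individual_scores = [math.log(p) for p in probs_on_truth]
average_of_scores = sum(individual_scores) / len(individual_scores)

# Score earned by someone who adopts the group's mean probability instead.
mean_prob = sum(probs_on_truth) / len(probs_on_truth)
score_of_average = math.log(mean_prob)

print(f"average of individual log scores: {average_of_scores:.3f}")
print(f"log score of the averaged belief: {score_of_average:.3f}")
print(f"best individual log score:        {max(individual_scores):.3f}")
```

Because the logarithm is concave, Jensen's inequality guarantees that the score of the average is at least the average of the scores. It does not guarantee that averaging beats every individual: here the forecaster who said 0.9 still scores better than the averaged belief.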
I've always been suspicious of the Modesty Argument. It's been a long-running debate between myself and Robin Hanson.