Related: Horrible LHC Inconsistency, The Proper Use of Humility
Overconfidence, I've noticed, is a big fear around these parts. Well, it is a known human bias, after all, and therefore something to be guarded against. But I am going to argue that, at least in aspiring-rationalist circles, people are too afraid of overconfidence, to the point of overcorrecting -- which, not surprisingly, causes problems. (Some may detect implications here for the long-standing Inside View vs. Outside View debate.)
Here's Eliezer, voicing the typical worry:
[I]f you asked me whether I could make one million statements of authority equal to "The Large Hadron Collider will not destroy the world", and be wrong, on average, around once, then I would have to say no.
I now suspect that misleading imagery may be at work here. A million statements -- that sounds like a lot, doesn't it? If you made one such pronouncement every ten seconds, a million of them would require you to spend months doing nothing but pontificating, with no eating, sleeping, or bathroom breaks. Boy, that would be tiring, wouldn't it? At some point, surely, your exhausted brain would slip up and make an error. In fact, it would surely make more than one -- in which case, poof!, there goes your calibration.
No wonder, then, that people claim that we humans can't possibly hope to attain such levels of certainty. Look, they say, at all those times in the past when people -- even famous scientists! -- said they were 99.999% sure of something, and they turned out to be wrong. My own adolescent self would have assigned high confidence to the truth of Christianity; so where do I get the temerity, now, to say that the probability of this is 1-over-oogles-and-googols?
[EDIT: Unnecessary material removed.]
A probability estimate is not a measure of "confidence" in some psychological sense. Rather, it is a measure of the strength of the evidence: how much information you believe you have about reality. So, when judging calibration, it is not really appropriate to imagine oneself, say, judging thousands of criminal trials, and getting more than a few wrong here and there (because, after all, one is human and tends to make mistakes). Let me instead propose a less misleading image: picture yourself programming your model of the world (in technical terms, your prior probability distribution) into a computer, and then feeding all that data from those thousands of cases into the computer -- which then, when you run the program, rapidly spits out the corresponding thousands of posterior probability estimates. That is, visualize a few seconds or minutes of staring at a rapidly-scrolling computer screen, rather than a lifetime of exhausting judicial labor. When the program finishes, how many of those numerical verdicts on the screen are wrong?
I don't know about you, but modesty seems less tempting to me when I think about it in this way. I have a model of the world, and it makes predictions. For some reason, when it's just me in a room looking at a screen, I don't feel the need to tone down the strength of those predictions for fear of unpleasant social consequences. Nor do I need to worry about the computer getting tired from running all those numbers.
In the vanishingly unlikely event that Omega were to appear and tell me that, say, Amanda Knox was guilty, it wouldn't mean that I had been too arrogant, and that I had better not trust my estimates in the future. What it would mean is that my model of the world was severely stupid with respect to predicting reality. In which case, the thing to do would not be to humbly promise to be more modest henceforth, but rather, to find the problem and fix it. (I believe computer programmers call this "debugging".)
A "confidence level" is a numerical measure of how stupid your model is, if you turn out to be wrong.
The fundamental question of rationality is: why do you believe what you believe? As a rationalist, you can't just pull probabilities out of your rear end. And now here's the kicker: that includes the probability of your model being wrong. The latter must, paradoxically but necessarily, be part of your model itself. If you're uncertain, there has to be a reason you're uncertain; if you expect to change your mind later, you should go ahead and change your mind now.
This is the first thing to remember in setting out to dispose of what I call "quantitative Cartesian skepticism": the view that even though science tells us the probability of such-and-such is 10-50, well, that's just too high of a confidence for mere mortals like us to assert; our model of the world could be wrong, after all -- conceivably, we might even be brains in vats.
Now, it could be the case that 10-50 is too low of a probability for that event, despite the calculations; and it may even be that that particular level of certainty (about almost anything) is in fact beyond our current epistemic reach. But if we believe this, there have to be reasons we believe it, and those reasons have to be better than the reasons for believing the opposite.
I can't speak for Eliezer in particular, but I expect that if you probe the intuitions of people who worry about 10-6 being too low of a probability that the Large Hadron Collider will destroy the world -- that is, if you ask them why they think they couldn't make a million statements of equal authority and be wrong on average once -- they will cite statistics about the previous track record of human predictions: their own youthful failures and/or things like Lord Kelvin calculating that evolution by natural selection was impossible.
To which my reply is: hindsight is 20/20 -- so how about taking advantage of this fact?
Previously, I used the phrase "epistemic technology" in reference to our ability to achieve greater certainty through some recently-invented methods of investigation than through others that are native unto us. This, I confess, was an almost deliberate foreshadowing of my thesis here: we are not stuck with the inferential powers of our ancestors. One implication of the Bayesian-Jaynesian-Yudkowskian view, which marries epistemology to physics, is that our knowledge-gathering ability is as subject to "technological" improvement as any other physical process. With effort applied over time, we should be able to increase not only our domain knowledge, but also our meta-knowledge. As we acquire more and more information about the world, our Bayesian probabilities should become more and more confident.
If we're smart, we will look back at Lord Kelvin's reasoning, find the mistakes, and avoid making those mistakes in the future. We will, so to speak, debug the code. Perhaps we couldn't have spotted the flaws at the time; but we can spot them now. Whatever other flaws may still be plaguing us, our score has improved.
In the face of precise scientific calculations, it doesn't do to say, "Well, science has been wrong before". If science was wrong before, it is our duty to understand why science was wrong, and remove known sources of stupidity from our model. Once we've done this, "past scientific predictions" is no longer an appropriate reference class for second-guessing the prediction at hand, because the science is now superior. (Or anyway, the strength of the evidence of previous failures is diminished.)
That is why, with respect to Eliezer's LHC dilemma -- which amounts to a conflict between avoiding overconfidence and avoiding hypothesis-privileging -- I come down squarely on the side of hypothesis-privileging as the greater danger. Psychologically, you may not "feel up to" making a million predictions, of which no more than one can be wrong; but if that's what your model instructs you to do, then that's what you have to do -- unless you think your model is wrong, for some better reason than a vague sense of uneasiness. Without, ultimately, trusting science more than intuition, there's no hope of making epistemic progress. At the end of the day, you have to shut up and multiply -- epistemically as well as instrumentally.
I'd like to recast the problem this way: we know we're running on error-prone hardware, but standard probability theory assumes that we're running on errorless hardware, and seems to fail, at least in some situations, when running on error-prone hardware. What is the right probability theory and/or decision theory for running on error-prone hardware?
ETA: Consider ciphergoth's example:
This kind of reasoning can be derived from standard probability theory and would work fine on someone running errorless hardware. But it doesn't work for us.
We need to investigate this problem systematically, and not just make arguments about whether we're too confident or not confident enough, trying to push the public consensus back and forth. The right answer might be completely different, like perhaps we need different kinds or multiple levels of confidence, or upper and lower bounds on probability estimates.
I think that standard probability theory assumes a known ontology and infinite computing power. We should ideally also be able to produce a probability theory for agents with realistically necessary constraints but without the special constraints that we have.