In 1919, Sir Arthur Eddington led expeditions to Brazil and to the island of Principe, aiming to observe solar eclipses and thereby test an experimental prediction of Einstein’s novel theory of General Relativity. A journalist asked Einstein what he would do if Eddington’s observations failed to match his theory. Einstein famously replied: “Then I would feel sorry for the good Lord. The theory is correct.”
It seems like a rather foolhardy statement, defying the trope of Traditional Rationality that experiment above all is sovereign. Einstein seems possessed of an arrogance so great that he would refuse to bend his neck and submit to Nature’s answer, as scientists must do. Who can know that the theory is correct, in advance of experimental test?
Of course, Einstein did turn out to be right. I try to avoid criticizing people when they are right. If they genuinely deserve criticism, I will not need to wait long for an occasion where they are wrong.
And Einstein may not have been quite so foolhardy as he sounded . . .
To assign more than 50% probability to the correct candidate from a pool of 100,000,000 possible hypotheses, you need at least 27 bits of evidence (or thereabouts). You cannot expect to find the correct candidate without tests that are this strong, because lesser tests will yield more than one candidate that passes all the tests. If you try to apply a test that only has a million-to-one chance of a false positive (~ 20 bits), you’ll end up with a hundred candidates. Just finding the right answer, within a large space of possibilities, requires a large amount of evidence.
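The arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration, assuming a uniform prior over the candidates and counting evidence as the base-2 logarithm of a likelihood ratio; the function names `bits_needed` and `expected_survivors` are made up for this sketch.

```python
import math


def bits_needed(num_hypotheses: int, target_probability: float) -> float:
    """Bits of evidence needed to raise one hypothesis to target_probability.

    Assumes a uniform prior over num_hypotheses candidates (an assumption
    of this sketch) and measures evidence as log2 of a likelihood ratio.
    """
    prior_odds = 1 / (num_hypotheses - 1)
    target_odds = target_probability / (1 - target_probability)
    return math.log2(target_odds / prior_odds)


def expected_survivors(num_hypotheses: int, false_positive_rate: float) -> float:
    """Expected number of candidates passing a test that the true one always passes."""
    return 1 + (num_hypotheses - 1) * false_positive_rate


if __name__ == "__main__":
    pool = 100_000_000
    # Even odds on the right answer: about 26.6 bits, so 27 whole bits.
    print(round(bits_needed(pool, 0.5), 1))
    # A million-to-one test is worth about 19.9 bits ...
    print(round(math.log2(1_000_000), 1))
    # ... and leaves roughly a hundred candidates standing.
    print(round(expected_survivors(pool, 1e-6)))
```

Asking for 99% rather than 50% adds log2(99), about 6.6 bits, which is where a figure like 34 bits comes from.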
Traditional Rationality emphasizes justification: “If you want to convince me of X, you’ve got to present me with Y amount of evidence.” I myself often slip into this phrasing, whenever I say something like, “To justify believing in this proposition, at more than 99% probability, requires 34 bits of evidence.” Or, “In order to assign more than 50% probability to your hypothesis, you need 27 bits of evidence.” The Traditional phrasing implies that you start out with a hunch, or some private line of reasoning that leads you to a suggested hypothesis, and then you have to gather “evidence” to confirm it—to convince the scientific community, or justify saying that you believe in your hunch.
But from a Bayesian perspective, you need an amount of evidence roughly equivalent to the complexity of the hypothesis just to locate the hypothesis in theory-space. It’s not a question of justifying anything to anyone. If there are a hundred million alternatives, you need at least 27 bits of evidence just to focus your attention uniquely on the correct answer.
This is true even if you call your guess a “hunch” or “intuition.” Hunchings and intuitings are real processes in a real brain. If your brain doesn’t have at least 10 bits of genuinely entangled valid Bayesian evidence to chew on, your brain cannot single out a correct 10-bit hypothesis for your attention—consciously, subconsciously, whatever. Subconscious processes can’t find one out of a million targets using only 19 bits of entanglement any more than conscious processes can. Hunches can be mysterious to the huncher, but they can’t violate the laws of physics.
You see where this is going: At the time of first formulating the hypothesis—the very first time the equations popped into his head—Einstein must have had, already in his possession, sufficient observational evidence to single out the complex equations of General Relativity for his unique attention. Or he couldn’t have gotten them right.
Now, how likely is it that Einstein would have exactly enough observational evidence to raise General Relativity to the level of his attention, but only justify assigning it a 55% probability? Suppose General Relativity is a 29.3-bit hypothesis. How likely is it that Einstein would stumble across exactly 29.5 bits of evidence in the course of his physics reading?
Not likely! If Einstein had enough observational evidence to single out the correct equations of General Relativity in the first place, then he probably had enough evidence to be damn sure that General Relativity was true.
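To see how narrow that window is, here is a rough sketch that converts a surplus of evidence into a posterior probability. It assumes, as the 29.3-bit example does, that a hypothesis's size in bits sets its prior odds at roughly 2^-bits to one; `posterior_probability` is a name invented for this illustration.

```python
def posterior_probability(hypothesis_bits: float, evidence_bits: float) -> float:
    """Posterior probability after evidence_bits of evidence.

    Assumption of this sketch: prior odds in favor are 2 ** -hypothesis_bits,
    and each bit of evidence doubles the odds.
    """
    odds = 2.0 ** (evidence_bits - hypothesis_bits)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Sweep the surplus of evidence over a 29.3-bit hypothesis.
    for surplus in (0, 1, 2, 5, 10):
        p = posterior_probability(29.3, 29.3 + surplus)
        print(f"{surplus:>2} surplus bits -> {p:.4f}")
```

Under these assumptions, a surplus of zero bits gives even odds and a surplus of ten bits gives better than 99.9%, so middling confidence corresponds to only a few bits' worth of evidence.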
In fact, since the human brain is not a perfectly efficient processor of information, Einstein probably had overwhelmingly more evidence than would, in principle, be required for a perfect Bayesian to assign massive confidence to General Relativity.
“Then I would feel sorry for the good Lord; the theory is correct.” It doesn’t sound nearly as appalling when you look at it from that perspective. And remember that General Relativity was correct, from all that vast space of possibilities.
Einstein didn't come up with General Relativity that way. He didn't even do the hard math himself. He came up with some little truths (e.g., the equivalence principle, the constancy of the speed of light, covariance, and the requirement that the theory reduce to Newtonian gravity in unexceptional cases) from a handful of results that didn't seem to fit classical theory, and then he found a set of equations that fit.
Newtonian gravity provided heaps of data points and a handful of non-fits. Einstein bootstrapped on prior achievements like Newtonian gravity and special relativity and tweaked them to fit a handful of additional data points better. His confidence came from fitting 100% of the available data set (something that wasn't clear in the case of the cosmological constant), however small that data set may have been. The minimum-bit hypothesis assumes that all bits are created equal. But they aren't. Some bits advance the cause not at all; others advance it a great deal.
Similarly, the 27-bit rule for 100,000,000 people assumes that each bit splits the remaining people evenly between yes and no on a question. In fact, some bits are more discriminating than others. "Have you ever been elected to an office that requires a statewide vote, or been a Vice President?" (perhaps two bits of information) is going to eliminate 99.9999%+ of potential candidates for President, and so works nearly perfectly to narrow the field dramatically from the 100,000,000 eligible candidates. "Do you want to run for President?" cuts another 90%+ of potential candidates.
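One way to put numbers on "discriminating" is Shannon's measure of information. That measure is an assumption of this sketch; the paragraph above counts one bit per yes/no answer. The helper names `surprisal_bits` and `expected_bits` are hypothetical.

```python
import math


def surprisal_bits(fraction_passing: float) -> float:
    """Bits conveyed by learning that a candidate is in the passing fraction."""
    return -math.log2(fraction_passing)


def expected_bits(fraction_yes: float) -> float:
    """Average bits per answer (binary entropy) for a yes/no question."""
    p = fraction_yes
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


if __name__ == "__main__":
    # A "yes" that only one person in a million can give: about 19.9 bits.
    print(round(surprisal_bits(1e-6), 1))
    # An evenly split question: exactly 1 bit, whichever answer comes back.
    print(surprisal_bits(0.5))
    # Averaged over everyone asked, the lopsided question yields far less than 1 bit.
    print(expected_bits(1e-6) < expected_bits(0.5))
```

On this accounting the field shrinks from 100,000,000 to about 100 because the rare "yes" is worth close to 20 bits on its own, which fits the observation that some answers narrow the field far more than others.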
Einstein was confident because his bits had greater discriminatory power than other bits of information. There are only so many ways it is logically possible to fit the data he had.