I like to think Einstein's confidence came instead from his belief that Relativity adequately accounted for the KL divergence between the experiments of 1905 and the physics theory of 1905. He was not necessarily in full possession of whatever evidence was required to narrow the hypothesis space down to Relativity (which is a bit of a misformulation, I feel, since that space still contains a number of other theories both equally and more powerful than Physics+Relativity). Instead, he possessed enough that, in his own mental Metropolis jumping, he stumbled across Relativity (possibly the nearest convenient point on the climb from the prior of Physics to the posterior that included the new evidence of the time) and sat there.
His comment just reflected a belief that new experiments were unlikely to carry any information beyond what he had already used. In some sense, their resolution was not yet fine enough to pinpoint anything more precise than Relativity.
Not to knock Einstein, of course. Just because you have new evidence drawing you to a different posterior hypothesis doesn't mean that the update is going to be easy. That's perhaps where the philosophy of Bayes runs into the computational limitations of today.
In 1919, Sir Arthur Eddington led expeditions to Brazil and to the island of Principe, aiming to observe a solar eclipse and thereby test an experimental prediction of Einstein’s novel theory of General Relativity. A journalist asked Einstein what he would do if Eddington’s observations failed to match his theory. Einstein famously replied: “Then I would feel sorry for the good Lord. The theory is correct.”
It seems like a rather foolhardy statement, defying the trope of Traditional Rationality that experiment above all is sovereign. Einstein seems possessed of an arrogance so great that he would refuse to bend his neck and submit to Nature’s answer, as scientists must do. Who can know that the theory is correct, in advance of experimental test?
Of course, Einstein did turn out to be right. I try to avoid criticizing people when they are right. If they genuinely deserve criticism, I will not need to wait long for an occasion where they are wrong.
And Einstein may not have been quite so foolhardy as he sounded . . .
To assign more than 50% probability to the correct candidate from a pool of 100,000,000 possible hypotheses, you need at least 27 bits of evidence (or thereabouts). You cannot expect to find the correct candidate without tests that are this strong, because lesser tests will yield more than one candidate that passes all the tests. If you try to apply a test that only has a million-to-one chance of a false positive (~ 20 bits), you’ll end up with a hundred candidates. Just finding the right answer, within a large space of possibilities, requires a large amount of evidence.
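The arithmetic behind these figures can be checked with a short sketch. It assumes that all candidates start out equally likely, that the correct candidate always passes a test, and that each wrong candidate passes independently at the test's false-positive rate. The function names are illustrative and not part of the original argument.

```python
import math

def bits_to_reach_even_odds(n_hypotheses: int) -> float:
    """Bits of evidence that lift one of n equally likely hypotheses to 50%."""
    # Prior odds for the correct candidate are 1 : (n - 1); each bit doubles them.
    return math.log2(n_hypotheses - 1)

def strength_in_bits(false_positive_rate: float) -> float:
    """Strength of a test, in bits, that wrong candidates pass at the given rate."""
    return -math.log2(false_positive_rate)

def expected_survivors(n_hypotheses: int, false_positive_rate: float) -> float:
    """Expected number of wrong candidates that still pass the test."""
    return (n_hypotheses - 1) * false_positive_rate

pool = 100_000_000
print(f"bits to reach 50%: {bits_to_reach_even_odds(pool):.2f}")
print(f"million-to-one test: {strength_in_bits(1e-6):.2f} bits")
print(f"wrong candidates left: {expected_survivors(pool, 1e-6):.0f}")
```

Under these assumptions the first figure comes out just under 27, the second just under 20, and the third at about 100, in line with the rounded numbers above.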
Traditional Rationality emphasizes justification: “If you want to convince me of X, you’ve got to present me with Y amount of evidence.” I myself often slip into this phrasing, whenever I say something like, “To justify believing in this proposition, at more than 99% probability, requires 34 bits of evidence.” Or, “In order to assign more than 50% probability to your hypothesis, you need 27 bits of evidence.” The Traditional phrasing implies that you start out with a hunch, or some private line of reasoning that leads you to a suggested hypothesis, and then you have to gather “evidence” to confirm it—to convince the scientific community, or justify saying that you believe in your hunch.
But from a Bayesian perspective, you need an amount of evidence roughly equivalent to the complexity of the hypothesis just to locate the hypothesis in theory-space. It’s not a question of justifying anything to anyone. If there’s a hundred million alternatives, you need at least 27 bits of evidence just to focus your attention uniquely on the correct answer.
This is true even if you call your guess a “hunch” or “intuition.” Hunchings and intuitings are real processes in a real brain. If your brain doesn’t have at least 10 bits of genuinely entangled valid Bayesian evidence to chew on, your brain cannot single out a correct 10-bit hypothesis for your attention—consciously, subconsciously, whatever. Subconscious processes can’t find one out of a million targets using only 19 bits of entanglement any more than conscious processes can. Hunches can be mysterious to the huncher, but they can’t violate the laws of physics.
You see where this is going: At the time of first formulating the hypothesis—the very first time the equations popped into his head—Einstein must have had, already in his possession, sufficient observational evidence to single out the complex equations of General Relativity for his unique attention. Or he couldn’t have gotten them right.
Now, how likely is it that Einstein would have exactly enough observational evidence to raise General Relativity to the level of his attention, but only justify assigning it a 55% probability? Suppose General Relativity is a 29.3-bit hypothesis. How likely is it that Einstein would stumble across exactly 29.5 bits of evidence in the course of his physics reading?
Not likely! If Einstein had enough observational evidence to single out the correct equations of General Relativity in the first place, then he probably had enough evidence to be damn sure that General Relativity was true.
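To see how narrow that window is, here is a rough sketch. It assumes a hypothesis of complexity c bits starts with prior odds of 2^-c to 1, and that each bit of evidence doubles the odds; both the model and the function names are simplifications introduced here for illustration. The 29.3-bit figure is the essay's hypothetical, not a measured quantity.

```python
import math

def posterior(complexity_bits: float, evidence_bits: float) -> float:
    """Posterior probability when prior odds are 2**-complexity_bits : 1
    and the evidence multiplies those odds by 2**evidence_bits."""
    odds = 2.0 ** (evidence_bits - complexity_bits)
    return odds / (1.0 + odds)

def evidence_for(complexity_bits: float, probability: float) -> float:
    """Bits of evidence needed to reach the given posterior probability."""
    return complexity_bits + math.log2(probability / (1.0 - probability))

c = 29.3  # illustrative complexity taken from the essay's hypothetical
for e in (29.3, 29.5, 32.0, 36.0, 40.0):
    print(f"{e:5.1f} bits of evidence -> posterior {posterior(c, e):.4f}")

window = evidence_for(c, 0.99) - evidence_for(c, 0.50)
print(f"window from 50% to 99%: {window:.2f} bits")
```

In this toy model, 29.5 bits leaves the posterior only slightly above one half, while the whole stretch between "barely located" and "99% sure" is log2(99), or about 6.6 bits, wide. Any evidence beyond that narrow band pushes the probability close to certainty.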
In fact, since the human brain is not a perfectly efficient processor of information, Einstein probably had overwhelmingly more evidence than would, in principle, be required for a perfect Bayesian to assign massive confidence to General Relativity.
“Then I would feel sorry for the good Lord; the theory is correct.” It doesn’t sound nearly as appalling when you look at it from that perspective. And remember that General Relativity was correct, from all that vast space of possibilities.