Today's post, Einstein's Arrogance, was originally published on 25 September 2007. A summary (taken from the LW wiki):

Albert Einstein, when asked what he would do if an experiment disproved his theory of general relativity, responded with "I would feel sorry for [the experimenter]. The theory is correct." While this may sound like arrogance, Einstein doesn't look nearly as bad from a Bayesian perspective. In order to even consider the hypothesis of general relativity in the first place, he would have needed a large amount of Bayesian evidence.


Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was How Much Evidence Does It Take?, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.

15 comments
gjm:

The wiki summary, quoted here, turns Einstein's "I would feel sorry for the good Lord" into "I would feel sorry for [the experimenter]". Surely this is just a mistake? Einstein was saying: if God didn't have the wit to make the universe work according to this theory, then so much the worse for God. (The usual disclaimers about Einstein's use of the word "God" apply here, of course.)

Shmi:

As EY mentioned, in Einstein's mind the a priori probability of this prediction being correct was very high, probably over 99%.

The reason is that GR was being tested in the "weak-field" static limit, the same one that was used to derive it from Special Relativity and the Equivalence Principle in the first place. Nearly every theory compatible with these would produce the same predictions.

He probably would not have said anything remotely as flippant had the experiment been looking for, say, gravitational radiation, an effect he was not at all sure about at the time. On the other hand, his confidence that such a rather small effect would be successfully measured was probably much lower than 99%. Hence the apparent "arrogance": he simply rejected the low-probability outcome.

It almost worked out that way. The experimental results were indeed barely passable from one expedition, poor from the other, and did not unequivocally confirm the expected effect. In fact, had Eddington not been convinced that Einstein was right to begin with, his publication on the matter might well have been much less sympathetic.

This is one of my least favorite posts from Eliezer. One of my objections was already stated by Robin Hanson, but I haven't seen the other one discussed.

Now, how likely is it that Einstein would have exactly enough observational evidence to raise General Relativity to the level of his attention, but only justify assigning it a 55% probability? Suppose General Relativity is a 29.3-bit hypothesis. How likely is it that Einstein would stumble across exactly 29.5 bits of evidence in the course of his physics reading?
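For concreteness, here is the arithmetic behind those figures, a quick sketch using the odds bookkeeping from How Much Evidence Does It Take? (the 29.3 and 29.5 are Eliezer's illustrative numbers). A "29.3-bit hypothesis" starts at prior odds of about \(1 : 2^{29.3}\), and 29.5 bits of evidence multiply those odds by \(2^{29.5}\), giving

\[
\text{posterior odds} \;=\; 2^{\,29.5 - 29.3} : 1 \;=\; 2^{0.2} : 1 \;\approx\; 1.15 : 1,
\qquad
P \;\approx\; \frac{1.15}{1 + 1.15} \;\approx\; 0.53,
\]

i.e. right in the neighborhood of the quoted 55%.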

The quoted argument violates one of Eliezer's other points: the law of conservation of expected evidence. If you have enough evidence to assign a theory a probability of 0.55, your expectation that it is true must be 0.55.
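For reference, conservation of expected evidence is the identity

\[
\mathbb{E}_{e}\!\left[P(H \mid e)\right] \;=\; \sum_{e} P(e)\,P(H \mid e) \;=\; P(H),
\]

i.e. the prior must already equal the probability-weighted average of the possible posteriors.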

I don't think this addresses what Eliezer's point actually is. The point is that the general region of evidence that causes one to locate a hypothesis and not have a hideously large amount of evidence for it is a pretty small region.

I think Eliezer is incorrect here, but the problem with the argument is subtle and requires careful examination of different historical discoveries.

Let's look at three historical examples and one example unfolding now.

First, let's look at Kepler's discovery that the orbits of the planets were ellipses. Given the very accurate data Kepler had access to thanks to Tycho Brahe's careful observations, Kepler could see as soon as he looked at the data that ellipses fit really well. Moreover, if one accepted the idea and looked at the orbits, one saw a really amazing pattern, Kepler's third law, which related the orbits of different planets to each other: an apparent relationship between the planets the like of which had never been seen in any other model. We all love to talk about Galileo, but the truth is that it was Kepler's work that was the death blow for the other theories. It was mathematically simple: yes, it used the most advanced math of the time, but it didn't have the hundreds of ad hoc parameters of the competing models, such as the various Ptolemaic models, the Copernican model, Tycho's model, or the many hybrid models floating around at the time. (Some people wanted Venus or Mercury to orbit the sun. This is rarely discussed now as part of the history even though it was one of the most common views, partly because Galileo decided not to address this approach at all, so talking about it doesn't fit the amazing-Galileo narrative.) So once one saw the data, one immediately knew it worked.
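(For reference, the pattern in question: Kepler's third law says that a planet's orbital period \(T\) and the semi-major axis \(a\) of its ellipse satisfy

\[
T^{2} \propto a^{3},
\]

with the same proportionality constant for every planet orbiting the sun, exactly the kind of cross-planet relationship the competing models never produced.)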

Next, consider Newton's discovery of gravity. Again, this required the most advanced math of the time. But Newton was able to show that if one believed his law of gravity, one got Kepler's results and a lot of other known results (such as an approximate but incomplete description of the tides), so that once one had the hypothesis in the right form it became obvious. Oresme, as early as the 1300s, had discussed a law of gravity that diminished as objects got farther apart, but he didn't really do much with it (he never specified the rate of decline with distance), so it really was hitting on the right function that did it. Moreover, Newton's law explained very minor observed discrepancies in Kepler's laws (planets sped up or slowed down in their orbits when they were very near each other), and further observation and calculation quickly confirmed it. Still, this was only suggestive, due in part to the severe difficulty of measuring such minute differences. So it wasn't until the discovery of a whole new planet based on Newton's laws that Newtonian gravity became really accepted.
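(The "right function," for reference: Newton's law has the force between masses \(m_1\) and \(m_2\) at distance \(r\) fall off as the inverse square,

\[
F = \frac{G\, m_1 m_2}{r^{2}},
\]

and it is this specific \(1/r^{2}\) dependence, which Oresme never pinned down, that yields Kepler's ellipses.)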

Now let's go to another field: chemistry. The idea that combustion took something from the air was around in some form at about the same time that phlogiston was proposed. It is common here to describe phlogiston as an example of an unfalsifiable theory. But this is unfair. The more extreme, unfalsifiable versions of phlogiston arose at the very end of the period, when Joseph Priestley, actually credited by many as the discoverer of oxygen, refused to give up the phlogiston theory to his dying day and continued to add ad hoc hypotheses to defend it even as the rest of chemistry moved on. (This is one of the very few examples where the whole old-scientists-need-to-just-croak thing actually happened. That pseudo-Kuhnian meme is very common, but it isn't a very good model for reality.) (An excellent book on Priestley is Steven Johnson's "The Invention of Air.") But, when phlogiston was actually proposed, it and the idea that fire took something from the air really did have close to the same amount of evidence. It was only over time that the careful work of measuring mass in reactions (which took a lot of careful, technical work when the reactions involved gases, or even worse, sometimes multiple different gases) gradually produced a new theory involving oxygen and other elements. Yet unlike the Newton and Kepler examples, this not only happened gradually; the observed patterns also didn't fit into some nice theory. The number of elements seemed to be massive, and their exact behavior seemed to have little rhyme or reason. It wouldn't be until the late 19th century that anyone had more than an inkling of what was happening. So the understanding of the basics of combustion arose as evidence gradually painted the picture in, and it gradually became more and more clear that the oxygen theory was essentially correct.

Now, let's look at one more example: dark matter. This is an interesting one because there are many different hypotheses to explain what we don't see. A lot of the work has been simply eliminating the more easily falsifiable hypotheses, or doing extremely careful work eliminating difficult-to-falsify hypotheses (for example, it has taken a lot of extremely sensitive experiments to determine that the three known neutrino flavors can't contribute more than 1-5% or so of the observed dark matter discrepancy). We have a lot of different ideas, some of which are mathematically very elegant (here I'm counting different versions of supersymmetry or other proposed extensions of the Standard Model as different theories, and also including some of the stranger ideas, like the hypothesis that part of the problem is that we are using Newtonian or nearly Newtonian approximations of GR that don't quite do what we want in some contexts). Will someone come up with a sudden idea that makes everything work nicely and explains dark matter and lots of other stuff besides just as SR explained multiple apparently unconnected problems at once? Maybe, but I doubt it. I suspect that what is more likely is that eventually one of the theories, or even a collection of theories (say, some version of supersymmetry plus a lot more MACHOs hanging around plus a few sterile neutrinos), will gradually emerge as dealing with the discrepancy. This doesn't fit Eliezer's model.

So what is the upshot? Sometimes things go the way Eliezer depicts, and for much of the history of physics and astronomy this was a really good approximation. But it doesn't always go that way and it especially doesn't go that way if one is looking at fields other than physics.

Ah. I was misinterpreting what Eliezer meant. Thank you.

some people wanted Venus or Mercury to orbit the sun.

Something other than "sun" meant here I assume? Or did you mean "just Venus and Mercury"?

No. This is correct. Some people wanted Venus to orbit the sun, with the sun and the other planets orbiting the Earth. Others wanted the same arrangement for Mercury but not Venus. Others wanted it for both Venus and Mercury (this was much more common than just a single one). There were a lot of hybrid models floating around. This piece gives a good summary of just how much stuff was floating around, and it doesn't even bring up some of the more obscure hypotheses that were also batted around. (There are some minor issues with that piece; I think he's slightly unfair to Kuhn, but that is a side point.)

ETA: Actually, thinking about this more, this is probably slightly wrong: I don't see why anyone would have wanted just Mercury to be in solar orbit. Just Venus makes sense (due to the phases), but there's no reason anyone would have said just Mercury, so I'm probably remembering incorrectly and the two versions were Venus and Mercury together or just Venus. Will try to find sources and track this down.

Ok. I can't find any sources that wanted just Venus rather than both Venus and Mercury. So my earlier claim may be incorrect. I'll try to dig through this a bit more and see what turns up.

But, when phlogiston was actually proposed, it and the idea that fire took something from the air really did have close to the same amount of evidence.

How do you figure?

The point is that the general region of evidence that causes one to locate a hypothesis and not have a hideously large amount of evidence for it is a pretty small region.

The point is that the general region of evidence that causes one to locate a hypothesis and yet not have a hideously large amount of evidence that someone moderately smarter than you could use to decide decisively whether or not the hypothesis is true is a pretty small region.

So what is the upshot? Sometimes things go the way Eliezer depicts, and for much of the history of physics and astronomy this was a really good approximation. But it doesn't always go that way and it especially doesn't go that way if one is looking at fields other than physics.

It's not a post about how things usually go. It's a post about the minimum requirements to know something with near certainty for an intelligent agent.

Now, how likely is it that Einstein would have exactly enough observational evidence to raise General Relativity to the level of his attention, but only justify assigning it a 55% probability? Suppose General Relativity is a 29.3-bit hypothesis. How likely is it that Einstein would stumble across exactly 29.5 bits of evidence in the course of his physics reading?

Not likely! If Einstein had enough observational evidence to single out the correct equations of General Relativity in the first place, then he probably had enough evidence to be damn sure that General Relativity was true.

Einstein already had plenty of observational evidence, so he used it properly. You say: "Will someone come up with a sudden idea that makes everything work nicely and explains dark matter and lots of other stuff besides just as SR explained multiple apparently unconnected problems at once? Maybe, but I doubt it." In physics, if they've truly narrowed it down like that, the conclusion is that they ought not need more evidence, not that the social forces of science will deterministically overturn every confusion dividing a professional field with a single "Eureka!", even those resulting from ignorance, politics, or stupidity.

If physicists have inefficiently parsed the data and gotten it down to the last three or so bits, then improving their thought would get them down to one theory as surely as providing mountains of additional data and analysis would. We pretty much know that, as humans, they are inefficient thinkers with room to improve, and even if they are at their ceiling, other possible intelligent systems could vastly improve upon individual or collective human thought even with less than all of our resources.
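(To put "three or so bits" in concrete terms, using the same bookkeeping as above: three bits of residual uncertainty corresponds to

\[
2^{3} = 8
\]

roughly equiprobable candidate theories still standing, a gap that either somewhat better inference or a few more good observations could close.)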

How do you figure?

They both fit the same basic evidence: burning a candle or similar object in a small enclosure made it go out. Similar remarks applied to small living animals and to combinations of candles and animals. Moreover, many forms of combustion visibly gave off something into the air. Indeed the theory "combustion occurs when something from the substance goes into the air" is simpler than "combustion occurs when something from the air combines with the substance and sometimes but not always something else is added back into the air." It was only careful measurements of the mass of objects before and after reactions (weighing gases is really tough!), combined with the observation that some metals gained weight when burned, that really created a problem. A good Bayesian in 1680 who heard of both ideas arguably should have favored phlogiston.
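To spell out the Bayesian point (a sketch, where \(D\) stands for the shared qualitative observations and the near-unity likelihood ratio is the claim above):

\[
\frac{P(\text{phlogiston} \mid D)}{P(\text{oxygen} \mid D)}
\;=\;
\frac{P(D \mid \text{phlogiston})}{P(D \mid \text{oxygen})}
\cdot
\frac{P(\text{phlogiston})}{P(\text{oxygen})}
\;\approx\;
\frac{P(\text{phlogiston})}{P(\text{oxygen})},
\]

so with the data unable to discriminate, the posterior odds reduce to the prior odds, and the simpler hypothesis wins.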

It's not a post about how things usually go. It's a post about the minimum requirements to know something with near certainty for an intelligent agent.

It is possible that I'm reading too much into this, but it does seem that Eliezer is using Einstein's success as an actual example of his argument about how brains should work. But there's a problem: if brains are less than perfect Bayesians (and it seems that the minds possible in this part of the Tegmark ensemble fit in that collection), they won't bring one hypothesis to the front, they will often have a fair number of hypotheses to explain based on incomplete data. In some cases, like Einstein's, sheer mathematical simplicity (in his case hitting on the simplest hypothesis that hit a large set of nice conditions, which is not to say that this is at all easy; far from it) will make one hypothesis look, under some framework, like it takes less data than the others. But often the actual process will be that they need more data.

A mind, when investigating things, will likely not just keep getting more and more clever insights. Things take effort. Say you have a really smart, strongly Bayesian mind with control of the resources of a planet but minimal prior knowledge. It can likely figure some things out pretty quickly, like the orbits of the planets and some other stuff. But somewhere between that and trying to detect the fundamental particles of the universe, it will probably need to collect more data. The mind isn't going to have any way to detect that neutrinos have mass (even if it suspects that) until it sees evidence that they oscillate. Etc. I suspect that no mind could deduce the existence of quarks from the simple data humans get from our naive senses.

In physics, if they've truly narrowed it down like that, the conclusion is that they ought not need more evidence, not that the social forces of science will deterministically overturn every confusion dividing a professional field.

So this seems like a more valid point: There are problems of human cognitive biases that go in the other direction (that is, making theories that overfit our data, and our general tendency to be overconfident in our beliefs). But an actual good Bayesian should not need to specially test a hypothesis once the pre-existing evidence has singled it out as extremely likely. This feeling that we need to do so is an artifact of having to deal with the problems of human cognition and social biases.

If that's what Eliezer meant I don't think he said it very well.

Indeed the theory "combustion occurs when something from the substance goes into the air" is simpler...

Seems like a simpler theory. Is a shorter sentence.

The mind isn't going to have any way to detect that neutrinos have mass (even if it suspects that) until it sees evidence that they oscillate. Etc.

Sure, knowledge increases far more than arithmetically with additions of either smarts or data.

But an actual good Bayesian should not need to specially test a hypothesis once the pre-existing evidence has singled it out as extremely likely.

This is true only if he or she was suboptimal when gathering the data. If, when doddering about, you narrowed things down to a hypothesis, then smarten up and you could probably determine whether it's true; if you were doing your best, someone smarter in the relevant way probably can. A Bayesian of any quality must improve beyond the level of smarts that led him or her to merely single out a hypothesis in order to judge it well.

they won't bring one hypothesis to the front, they will often have a fair number of hypotheses to explain based on incomplete data.

I think this feels like having no idea at all, with no conscious hypothesis.

in his case hitting on the simplest hypothesis that hit a large set of nice conditions

Doing this feels like reasoning, or concentrating, or even zoning out, I suspect. It is varyingly subconscious pruning of a tree of hypotheses, not consciously searching through them one by one and getting lucky by stumbling on a good one.

But often the actual process will be that they need more data.

Usually not, once one consciously notices a finite set of hypotheses. Data can substitute for better thinking (there's no law against that), but it's not a "need" in many senses of the term.

So this seems like a more valid point: There are problems of human cognitive biases that go in the other direction

The point is that human minds just aren't efficient processors, biases aside. "The" point (there are multiple ones) isn't about overcoming bias; it's that a well-designed AI brain would need less computing power than our brains have in order to be smarter than any human.

And they wouldn't be limited to that.

Seems like a simpler theory. Is a shorter sentence.

Yes, simplicity in the English language is not at all a good metric of the actual simplicity that matters for a decent prior. However, in this particular case both hypotheses account for the same qualitative observations, and I strongly suspect that if one did try to formalize them, one would find that the second hypothesis is actually more complicated, since it contains a conjunction.
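(Whatever the formalization, conjunctions can only cost prior probability: for any claims \(A\) and \(B\),

\[
P(A \wedge B) \le P(A),
\]

so a hypothesis built from two parts starts out no more probable than its weaker part alone.)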

I need to think about the rest of your remarks more before responding. I think I agree with most of them.

(And right now I'm really tempted to pretend to be an internet crank and start going around the internet preaching that phlogiston is correct.)

FAWS:

I don't think I agree. To be equivalent, the summary of the phlogiston hypothesis would also have to include that air has a definite, limited capacity for absorbing burned phlogiston, that no other known substance has such a capacity, and that neither does vacuum.

I do not understand the objection from the law of conservation of evidence.

I misinterpreted what Eliezer meant in that sentence. See here.