This post is great! Thanks for writing it.
When I read Ptolemy in college, one thing that struck me was how clear he was that he was advancing a probable hypothesis that needed to be supported with arguments and evidence. Some points he makes that stand up well in hindsight:
The main argument for heliocentrism was that under geocentrism, there's an unexplained coincidence in the way all the planets' motions track the motion of the sun, and unexplained coincidences are mathematically unaesthetic. This unexplained coincidence is very clear in the Almagest; no effort was made to hide it.
As far as I can tell, the correct position at the time - and at Copernicus's time - was that Ptolemy's theory seemed more physically plausible, but that there were strong mathematical-elegance reasons to favor some sort of heliocentric theory if you could get the details worked out, so one should keep one's mind open to both hypotheses until more evidence became available.
I think that the Copernican revolution looks less like people had some sort of mysterious epistemic insight, and more like just another plausible fad that happened to be especially epistemically lucky, when you take it in the context of the history of intellectual fashion. (Mesmerism does not hold up so well, for instance, and Cartesian physics got crucial questions badly wrong, even though it laid some of the conceptual groundwork for Newton.)
Thank you! I was quite nervous about posting but am very happy with the reception, and strongly update towards how remarkable a community LW2.0 might become (in terms of how welcoming it is of truth-seeking discussion and how constructively it forwards it).
Reading your comment, I'd update towards the relative importance of mathematical aesthetic compared to physical plausibility in finding true theories. I only want to believe in luck as a last resort. You seem to be making the "opposite" update. Is this correct? And, if it is, why do you update that way?
I think this is unambiguous evidence for the mathematical aesthetic heuristic, over the heuristic of support by existing physical theories. But in general we should expect that any good heuristic is occasionally going to outperform other good heuristics. Before you have an account of which heuristic to apply when, the best you can do is model this as a stochastic process with multiple imperfect predictors.
Along those lines, it's worth noting that the germ theory of disease prior to epidemiology is not mathematically elegant at all, since it adds a new causal factor on top of ones that we already had extremely good arguments and evidence for (environmental and behavioral causes).
By contrast, it was not luck that Newton was persuasive, because he gave a mathematically elegant account that simultaneously provided an alternative model of physics to explain not only astronomical data, but much other empirical data.
When it comes to the germ theory of disease it's worth noting that it isn't that good at predicting when people will get a cold.
It doesn't help you to predict that a person who's outside in the cold is more likely to get a cold. It doesn't help you to predict that people will be more likely to get a cold in winter. It doesn't help you to predict that a person who's vitamin D3 deficient is more likely to get a cold.
You would expect bacteria to have a harder time when it's cold outside.
If, on the other hand, you assume that there's a life force that gets stronger or weaker and makes you more or less susceptible to disease, you can find new ways to reason about those phenomena.
We could build a mathematical model that measures what kinds of actions correlate with changes in the life force and how changes in the life force correlate with events such as getting a cold.
As a result of the germ theory of disease, children who are good at math enjoy their physics classes with nice mathematical models, and the biology classes don't include formulas for calculating the life force.
Quite obviously germs are a causal factor, but I think the way the germ theory of disease tries to monopolise the discussion of disease has also caused a lot of harm. Maybe this post should make us more skeptical of the strong version of the germ theory.
This post is strong in the virtue of scholarship: you study history looking for general rationality lessons. It's also really clearly written. For these reasons I've promoted it to Featured.
These theories only differ in their prediction of whether we should be able to observe stellar parallaxes.
This surprised me, but I think I figured out what it means. Basically, we know that we can predict the movement of things inside the solar system equally well whether we put the Earth or the Sun at the center of our calculations page. But if the stars are rotating in a circle around the center of our calculations page, then we'd see parallax from the Earth if the Sun were at the center and we'd see parallax from the Sun if the Earth were at the center. But neither theory predicts that we'd see parallax from both the Sun and the Earth!
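To get a feel for how small the predicted parallax is, here is a minimal sketch. It assumes a baseline of 1 AU (the radius of the Earth's orbit in the heliocentric picture) and a hypothetical star at about 1.3 parsecs, roughly the distance of the nearest known stars; the function name is made up for illustration.

```python
import math

AU_PER_PARSEC = 206264.806  # astronomical units in one parsec

def parallax_arcsec(distance_au: float, baseline_au: float = 1.0) -> float:
    """Angle, in arcseconds, that the baseline subtends as seen from the star."""
    return math.degrees(math.atan2(baseline_au, distance_au)) * 3600.0

# Assumed distance: about 1.3 parsecs (illustrative, not a measured value).
nearest = parallax_arcsec(1.3 * AU_PER_PARSEC)
print(f"parallax at 1.3 pc: {nearest:.2f} arcsec")
```

Under these assumptions the angle is well under one arcsecond. Since the star distance is a free parameter, any failure to detect parallax can be absorbed by pushing the stars farther away.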
That is, maybe I'm too much in the Newtonian-Einsteinian paradigm, but it seems to me like heliocentrism vs. geocentrism is fundamentally confused / asking the wrong question; it assumes there's a privileged "origin point" to the map of the world when that's actually a model parameter not related to any predictions. (The computationally simplest option for predicting the dynamics of the solar system, putting it at the center of mass of the solar system, isn't really "orbiting around the sun" because the sun also has an orbit around the center of mass.) The deep conceptual breakthrough here isn't that the sun is in the middle, it's that the underlying laws are position-invariant, and thus you can abstract away the position.
That is, I think I don't agree with your central claim that they were right to be heliocentric, because heliocentrism is only slightly less mistaken than geocentrism on the next level of clarity, and doesn't appear to be meaningfully so. The actual advances were in mathematical modeling, and I'm not convinced that it was more than a coincidence that Kepler was a heliocentrist. He was converted to that theory for theological reasons as a teenager, well over a decade before he had Brahe's superior data. Note that Kepler didn't try the elliptical orbit for years because he thought it was so simple that someone else must have tried it before--if Brahe had tried it, it would have been seen as a significant success for the Tychonic system, as Mars's orbit around the Sun would be well-explained, and the Sun's orbit around the Earth could be explained similarly.
It's a very interesting and controversial claim that heliocentrists were not really any more justified, epistemically, than the geocentrists. I will have to think more about that.
I feel like it might benefit from some additional clarification, because of the trap Asimov points at here:
John, when people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together.
It seems to me like there are about six important advances relevant to astronomical phenomena:
Most of those are approximations to later principles--2 is an approximation to 3, which is an exact solution to 4, but 4 is only an approximation to 6. (This also isn't the end of knowledge; I haven't included things here that a working cosmologist would, let alone a future physicist.)
My view is that the geocentrists and heliocentrists (as factions) are both stuck in having just 1+2, and openly contradicting 5. That says the interesting questions are: how did Kepler come up with 3, how did Newton come up with 4, and how did Einstein come up with 5 and 6?
Even there the answer might sometimes be luck, as opposed to good thinking skill. One could imagine a version of Kepler who thought ellipses were theologically significant, and tried to apply them to everything, and discovered that they happened to work for astronomy. This doesn't seem like a strategy worth stealing, whereas the strategy of "get good data, and try lots of functions on the data" does seem like a good strategy worth stealing. (And we can see Kepler's mistake of not doing the Occamian thing and recognize that it was a mistake.)
This view looks like it has two weaknesses. First, the list of six things that I picked. The scale of the universe didn't make the list, but might seem comparably important to 5. The position of the center of mass of the solar system also didn't make the list, because 'center of mass' isn't an interesting concept until you have conservation of momentum. (Since Aristarchus people have known the sun was bigger than the Earth, but until you understood gravitation, why would that be inconsistent with the sun moving around the Earth?) But I don't think there's a thing that could be added to the list such that the heliocentrists have clearly made an advance that the others haven't.
Second, the claim that heliocentrists and geocentrists both openly contradict 5. In retrospect, many present heliocentrism as giving up on Earth's privileged position, which is one of the inferential steps towards thinking that there's no privileged position. But it's not clear to me that this is the right way to view things; in particular, it seems important to distinguish the statements "the Earth isn't special" from "the Sun is more special than the Earth." My second-hand understanding is that Copernicus wouldn't endorse the Copernican Principle. (One could point to Bruno or Galileo or so on as pointing towards this advance, but they don't get credit for empirically discriminating between possibilities, much like Democritus doesn't get credit for proving that matter is made of atoms.)
It's worth noting that Kepler had the intuition that the motions of the planets ought to be produced by distinct lines of force adding up to regular curves - much as Galileo points out that constant acceleration is enough to cause mundane objects to fall in parabolic curves. But he didn't quite have enough math to formalize this persuasively - that had to wait until Newton.
Likewise, ellipses are more difficult to work with than circles, until something like Cartesian analytic geometry lets you formalize them simply without any direct reference to conic sections.
What do you mean by "Newtonian-Einsteinian paradigm"? Galileo invented relativity and Newton rejected it.
I feel like you're leaving out some arguments against the Ptolemaic model. As I understand it, Galileo wrote his dialogue at the suggestion of the pope, who wanted to have a nice pros and cons list. The fact that the pope was even considering heliocentrism tells me that there must have been big problems with the geocentric view. Why would the head of a very conservative organisation (even if he was more on the open-minded end of the spectrum) entertain a new theory if the old theory is perfectly fine? And indeed Wikipedia tells me that the Ptolemaic model could not explain the observed phases of Venus and the motion of sunspots.
The motion of sunspots brings me to another topic. I think this paragraph is a bit misleading:
Moreover, Galileo’s observations of sunspots and moon craters weren’t unproblematic. In both cases there is evidence to indicate that he was fooled by optical illusions. And though he was also right about the existence of moons orbiting Jupiter, which contradicted the uniqueness of the earth as the only planet with a moon, what he actually observed rather seems to have been Saturn’s rings (Ladyman, 2001) [8].
The sunspots were at least known since 300 BC. And I can't imagine how you can mistake Saturn's rings for Jupiter's moons. I think what your source is saying is that he mistook Saturn's rings for moons of Saturn which is an entirely understandable mistake.
So, the Ptolemaic model definitely had problems and if I learned anything about humans it's that those problems were probably being ignored for too long. Wikipedia also tells me that at the time of Galileo the Tychonic model was actually quite popular because it solved so many problems of the Ptolemaic model. So, the question is, was it irrational of Galileo to prefer the Copernican model over the Tychonic model (given the data that he had)?
I wouldn't say so. Galileo rightly saw the Tychonic model as a weak compromise that didn't dare to go all the way. Sure, the parallax was a problem but you can Defy the Data if you have a strong prior. If we steelman Galileo just a bit then his accomplishment was realizing that it's quite possible that the Earth is moving (you ordinarily wouldn't notice the difference) and thus, you should prefer a simple theory with a moving Earth over a more complicated theory with a stationary Earth.
A modern day example of sticking to the prior in the face of contrary evidence is this article by Bryan Caplan.
With regard to "How should you develop intellectually, in order to become the kind of person who would have accepted heliocentrism during the Copernican revolution?"
I think a possibly better question might be "How should you develop intellectually, in order to become the kind of person who would have considered both geocentrism and heliocentrism plausible with probability less than 0.5 and greater than 0.1 during the Copernican revolution?"
edit: May have caused confusion, alternative phrasing of same idea:
who would have considered geocentrism plausible with probability less than 0.5 and greater than 0.1 and would have considered heliocentrism plausible with probability less than 0.5 and greater than 0.1
I disagree. The point of the post is not that these theories were on balance equally plausible during the Renaissance. It's written so as to overemphasize the evidence for geocentrism, but that's mostly to counterbalance standard science education.
In fact, one of my key motivations for writing it -- and a point where I strongly disagree with people like Kuhn and Feyerabend -- is that I think heliocentrism was more plausible during that time. It's not that Copernicus, Kepler, Descartes and Galileo were lucky enough to be overconfident in the right direction, and really should just have remained undecided. Rather, I think they did something very right (and very Bayesian). And I want to know what that was.
I feel like you may have gone too far in the other direction then, since what I got out of this was definitely "there wasn't any evidence for heliocentrism and people just liked it better for philosophical reasons". As far as I know, the standard science education explanation for heliocentrism involves Newtonian physics, observations that people weren't able to make at this time (like you said, Tycho tried), and hindsight.
Can you expand on what the evidence that should have convinced people was? I feel like this article is a puzzle that's missing key information.
what I got out of this was definitely "there wasn't any evidence for heliocentrism and people just liked it better for philosophical reasons"
+1
They had arguments about physics that the OP weirdly downplays. Like I said below: Copernicus disliked the equant because it contradicted the most straightforward reading of Ptolemy's own physics; Kepler unambiguously disproved scholastic physics. Also, Galileo discovered Galilean relativity. He definitely made enough observations to show this last idea had something to it, unlike the scholastic explanation of heavenly bodies.
Galileo's observations indicating that the earth might not be uniquely different from other planets, and the mathematical aesthetic of heliocentrism that Benquo points to above.
But as mentioned in the post, I'm mostly trying to point to a confusion and ask questions, not provide answers. There have been many great comments, and I think the fact that you perceived the post that way is important. I might rewrite it to reflect those things.
First -- great post.
In fact, one of my key motivations for writing it -- and a point where I strongly disagree with people like Kuhn and Feyerabend -- is that I think heliocentrism was more plausible during that time.
I think this could be made clearer in the post itself, because whether or not there were good reasons around at the time is prior to whether or not we should try to be like the heliocentrists.
Rather, I think they did something very right (and very Bayesian). And I want to know what that was.
This reasoning is itself quite non-Bayesian: exploring possibility-space, rather than updating a probability distribution over known unknowns. And maybe it's part of what the heliocentrists were doing right.
Do you think selection bias might play a role? Maybe the biggest breakthroughs tend to come from headstrong and philosophically-inclined scientists, but most such scientists we never hear about, and being headstrong and philosophical isn't epistemically hygienic in general.
Nitpick: as I understand, Feyerabend would agree. His main argument seems to be "any simple methodology for deciding whether a scientific theory is true or false (such as falsificationism) would have missed important advances such as heliocentrism, Newton's theory of gravity, and relativity, therefore philosophers of science should stop trying to formulate simple accept/reject methodologies."
I think he argues that any methodology -- not just any simple methodology -- will fail in some cases. The reason is that there is something "irrational", that is, irreducibly sociological, about scientific progress. I disagree because I think there is an optimal methodology for intellectual progress (Bayesian inference), and successful inference is ultimately reducible to approximations of it.
Bayesian inference only functions within known solution-space. Spotting things outside of known solution space, while rare, is essential for the progression of science – and can't be modelled simply as Bayesian inference.
I agree with your conclusions, but I have some complaints about this essay.
Do you really need 5000 words to present this? I’m not sure I’m reading you correctly, but you seem to boil it down to 3 points. Why not cut out most of the other material?
The 3 points that you present in the dialogue seem to me to be a pretty good summary of the debate in 1632. They are 3 issues that we recognize today as the most important and which I think the people in 1632 also recognized as most important. They probably didn’t see them as distinct from other important issues, just the top 3 on a list. But if you want to present this to people today you have to stop somewhere, and that seems like a pretty good place.
—
However, your treatments of 2 of those 3 points are in serious error. You leave out Galileo’s two greatest contributions to science. Lack of parallax was a real strike against heliocentrism. As for the size of Venus, it was not only possible to observe, but Galileo did observe it changing in 1610.*
As for the Tower Argument, that was an argument made by geocentrists against heliocentrism (actually, stationary Earth vs rotation, not about revolution). The argument is that if the Earth were rotating, a ball dropped from the Leaning Tower of Pisa should land a kilometer to the west. Or, to put it another way, there should be winds of 300 meters per second. The rebuttal to that argument is not “something, something, Newton,” but “something, something, Galileo.” Not everyone was convinced by Galileo, but the fact that he addressed the argument was a very big deal.
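The numbers in the tower argument can be checked with a short sketch. The latitude of Pisa (about 43.7° N), the drop height (about 56 m), and the function names are assumptions introduced here for illustration; air resistance is ignored.

```python
import math

SIDEREAL_DAY_S = 86164.1   # one full rotation of the Earth, in seconds
EARTH_RADIUS_M = 6.371e6   # mean radius of the Earth
G = 9.81                   # gravitational acceleration, m/s^2

def surface_speed(latitude_deg: float) -> float:
    """Eastward speed of the ground at a given latitude, in m/s."""
    omega = 2 * math.pi / SIDEREAL_DAY_S
    return omega * EARTH_RADIUS_M * math.cos(math.radians(latitude_deg))

def fall_time(height_m: float) -> float:
    """Time to fall from rest through height_m, ignoring air resistance."""
    return math.sqrt(2 * height_m / G)

def naive_westward_miss(height_m: float, latitude_deg: float) -> float:
    """Miss distance IF the ball lost the ground's motion on release,
    which is what the tower argument assumes."""
    return surface_speed(latitude_deg) * fall_time(height_m)

# Assumed values: latitude of Pisa and a 56 m drop.
speed = surface_speed(43.7)
miss = naive_westward_miss(56.0, 43.7)
print(f"ground speed: {speed:.0f} m/s, naive miss: {miss:.0f} m")
```

With these assumed inputs the ground speed comes out at a few hundred meters per second and the naive miss at roughly a kilometer, in line with the figures quoted in the argument.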
Why bring up the Coriolis effect? To say that the situation is complicated? Sure, it’s complicated, and a lot of people weren’t convinced by Galileo. But his simple argument is correct, to first approximation. And the relevant part of the Coriolis effect is not complicated. He could have computed, without Newton, that a ball dropped from the Leaning Tower would land 3 millimeters east. I don’t think he did this calculation and if he had, he would have discarded it as irrelevant to the debate. It is true that it would have provided a way to measure the rotation of the Earth, to distinguish the two hypotheses. This makes it very different from parallax, which has an additional free parameter, the distance to the fixed stars. A measurement of small parallax could be explained away as a large distance to the fixed stars, but the Coriolis effect makes a precise prediction. Heliocentrism predicts nonzero parallax, but the Coriolis effect predicts not just nonzero drift, but 3mm of drift. If you prove that the drift is more than 1mm, you disprove a stationary Earth. If you prove that the drift is less than 2mm, you disprove a stationary sphere of fixed stars.
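The size of the eastward drift can be sketched with the standard first-order formula for a body dropped from rest on a rotating sphere, d = (1/3) ω g t³ cos(latitude). The latitude, the two drop heights, and the function name below are assumptions for illustration; air resistance and higher-order terms are ignored.

```python
import math

SIDEREAL_DAY_S = 86164.1   # one full rotation of the Earth, in seconds
G = 9.81                   # gravitational acceleration, m/s^2

def eastward_drift(height_m: float, latitude_deg: float) -> float:
    """First-order eastward deflection (meters) of a body dropped from rest."""
    omega = 2 * math.pi / SIDEREAL_DAY_S
    t = math.sqrt(2 * height_m / G)  # fall time without air resistance
    return omega * G * t**3 * math.cos(math.radians(latitude_deg)) / 3

# Assumed drop heights: a lower gallery (~33 m) and the full tower (~56 m).
for h in (33.0, 56.0):
    print(f"h = {h:.0f} m: drift = {eastward_drift(h, 43.7) * 1000:.1f} mm")
```

Under these assumptions the drift is a few millimeters. The exact figure is sensitive to the drop height, since it scales as the height to the power 3/2, but unlike parallax there is no free parameter left to explain away a null result.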
—
* Galileo also observed the phases of Venus, and most people treat this as more important. It appears to me that Ptolemy’s model also predicts the changing size of Venus. If the reason that Venus goes between 45° east and west of the Sun is that it is traveling on an epicycle, then that circle should bring it close and far from the Earth. My impression is that very few people understood what Ptolemy actually said and a large part of Copernicus’s contribution was bothering to read Ptolemy.
Whereas the phases of Venus show that the Venusian realm intersects the Solar realm, rather than being much farther away. They show that Venus goes around the Sun, rather than around a point that is on the radius through the Sun, but farther away.
An interesting blog post from Sabine Hossenfelder points to a similar situation with the Michelson-Morley experiment in 1887, which set the stage for special relativity.
Thanks for this nice post!
One conception I have of this debate was that it was largely about the metaphysical intuition of Do the heavens and the earth obey the same laws, or not? The heliocentrists argued that e.g. the planets seemed similar to earth, and so we should expect them all to be going around the sun (Descartes thought that they were borne through space as if on a current of clear liquid). The geocentrists argued that the heavens stayed up there, and we stayed down here, so wasn't it more sensible that the heavens were made of separate heaven-stuff that naturally floated around in space?
EDIT: Words.
Seems sensible. That is of course why the telescope and Galileo's observations were so important, as they revealed unexpected similarities between the earth and the heavens (other planets having moons and not being perfect spheres).
Paul Graham (there's always a reason to quote him) makes the claim that much of intellectual history is just about discarding the notion that humans are special, in some kind of teleological sense. Earth is a planet among planets, homo sapiens a species among species. Both have remarkable and unique properties, but only because the universe just so happened to be that way. http://www.paulgraham.com/randomness.html
Note: I wrote most of this, and then sat on it for a couple days. I'm commenting here just to get it out there, because I think the approach is a good one, but I haven't proofread it or tweaked the phrasing to make it clearer. Hopefully I'll come back to it soon, though.
1. If you lived in the time of the Copernican revolution, would you have accepted heliocentrism?
No, absolutely not. I think this is roughly how we should have reasoned:
The best models of physics say that earthly objects are inherently center-seeking. It’s the nature of rocks and people and such to move toward the center. That’s the simplest explanation.
Now, celestial objects don’t have this property, which is why they are found so far from the center. What mechanisms govern their motion are a mystery, but the models which best fit the data are not heliocentric.
Sure, you could separate the ideas of “center” and “what attracts objects”. There’s no *a priori* reason they should coincide. And, Tycho Brahe’s combined geoheliocentric theory does just this. It places the sun at the center of the rotations of the planets, and the earth at the center of the rotation of the moon and the sun.
But, this only changes our interpretation of the celestial world, not the earthly world. And, our knowledge there is much less intimate than our practical, day-to-day knowledge of the physical laws that govern earthly objects. So, rocks are still drawn to their respective puller when thrown, and the sorts of objects that don’t fall and aren’t bound by this pull rotate around whatever it is they rotate around, sun or earth.
But, we know the moon orbits earth, so it is just a question of whether it’s simpler to have everything else also orbit us, but with complex epicycles, or to say that everything but the moon orbits the sun.
But, this second approach still requires the introduction of epicycles, and so is strictly more complex. So, in all likelihood, the earth is the center of all things.
I think this logic is correct and sound, at least until Newton. We should have noticed we were confused after Galileo. He shattered the illusion that celestial objects were of a fundamentally different nature than earthly objects. Before that, earthly objects were rough and oddly shaped, while celestial objects were all perfectly round, or infinitely small points of light.
Celestial objects glowed, for god’s sake, and nonstop, in a way that we could only reproduce temporarily with fire. Conservation of energy clearly didn’t apply to them, especially because they moved constantly in mysterious unceasing patterns. Earthly objects are subject to friction, and even the fastest moving bullet eventually succumbs to gravity. The proper and natural thing to do is to classify them as fundamentally different.
2. How should you develop intellectually, in order to become the kind of person who would have accepted heliocentrism during the Copernican revolution?
I think the proper lesson here is NOT epistemic humility. We shouldn’t retain high degrees of model uncertainty forever, and agonize over whether we’re missing some fuzzy, spiritual, mystical insight.
Mysticism happened to get the right answer in this case, but not because of anything intrinsic to mysticism. Instead, I think we can pull out the property that made it work, and leave the rest. (Pull out the baby, pitch the bathwater.)
But first, let’s look at our model for model uncertainty. Bayes’ Theorem would have us update from our priors to some ideal probability estimate, hopefully >99.9%, or <0.1%, if we can dig up enough data. Usually, we only pay attention to the p, but the amount of total evidence collected is also a decent measure of the progress from priors to truth.
Another measure I like even better is how large you expect future updates to be. Maybe I’m 20% sure of my best hypothesis, and I expect to update by +/- about 5% based on some experiment which I can’t do yet. The relative ratio of these 2 percentages is telling, because it tells you how much variability is left in your model. (Or, more precisely but using different units, maybe you give odds 10:1 in favor of something, but still expect to update by a factor of 100 after the next experiment, in one direction or another.)
By conservation of expected evidence, you can’t know in which *direction* that update will be. (Or if you can, then you should already have updated on *that* knowledge.) But, you can at least get a feel for the size of the update, and compare it to the probability of your current model.
So, you start out with uncountably many priors, all of which have only a tiny chance of being true. Then, as more and more evidence comes in, some hypotheses go past the 1% threshold, and you have a humanly manageable number, some of which are more probable than others. But, these shouldn't add up to 100%. Most of your probability mass should still be on unknown unknowns. And really, most of your models should only be thought of as rough outlines, rather than formal definitions.
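Conservation of expected evidence can be illustrated with a small sketch. It assumes a 20% prior and two made-up likelihoods for a yes/no experiment; all the numbers and names are hypothetical, chosen only to show that the expected posterior equals the prior while the expected size of the update does not vanish.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float):
    """Bayes' theorem for a binary hypothesis H and a binary observation E.
    Returns (P(H | E), P(E))."""
    p_e = prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    return prior * p_e_given_h / p_e, p_e

prior = 0.20                 # current credence in the best hypothesis
p_e_h, p_e_nh = 0.9, 0.3     # hypothetical likelihoods of a positive result

post_if_e, p_e = posterior(prior, p_e_h, p_e_nh)
post_if_not_e, p_not_e = posterior(prior, 1 - p_e_h, 1 - p_e_nh)

# Averaged over both outcomes, the posterior lands back on the prior...
expected_posterior = p_e * post_if_e + p_not_e * post_if_not_e
# ...but the expected SIZE of the update is not zero.
expected_update_size = (p_e * abs(post_if_e - prior)
                        + p_not_e * abs(post_if_not_e - prior))

print(expected_posterior, expected_update_size)
```

Comparing `expected_update_size` to `prior` is one way to put a number on how unsettled the model still is.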
I think this is where Copernicus should have considered himself to be. He had bad reasons for trying to come up with variants of the current best models. But, that’s exactly what he should have been doing, regardless of the reasons. And, note that, despite getting quite close, he was still wrong. The sun is not the center of the universe, or even the galaxy. It’s just the center of the solar system. Ish. Really, there’s some point that’s the center of mass of everything in the solar system, and if I recall it’s actually technically outside the sun. The sun and everything else just orbit that point.
So, you can only really expect to put non-negligible probability on models in this state of understanding when you include a bunch of weasel words, and phrase things as broadly as possible. Instead of “The earth and all the celestial objects but the moon rotate around the sun”, append to this “or the majority of them do, or they approximately do but some second-order correction terms are needed.”
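A rough check of the center-of-mass remark, as a sketch: it treats the Sun and Jupiter as an isolated two-body system and ignores every other planet, which is an assumption made here for simplicity. The constants are rounded textbook values and the function name is hypothetical.

```python
SUN_MASS_KG = 1.989e30
JUPITER_MASS_KG = 1.898e27
SUN_JUPITER_DISTANCE_M = 7.785e11   # mean Sun-Jupiter separation
SUN_RADIUS_M = 6.957e8

def barycenter_offset(m_primary: float, m_secondary: float, separation: float) -> float:
    """Distance from the primary's center to the two-body center of mass."""
    return separation * m_secondary / (m_primary + m_secondary)

offset = barycenter_offset(SUN_MASS_KG, JUPITER_MASS_KG, SUN_JUPITER_DISTANCE_M)
ratio = offset / SUN_RADIUS_M
print(f"offset = {offset:.3e} m, or {ratio:.2f} solar radii")
```

With only Jupiter included, the center of mass sits slightly outside the Sun's surface. With all the planets included the point wanders, and it can fall inside or outside the Sun depending on how the planets are arranged.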
And even then, it’s probably still not quite right. In this sense, we’re probably wrong about just about everything science claims to know, with probability nearly 1. But, I think we’re homing in on the truth asymptotically. Even if we never quite get to anything that’s 100% right, we can get arbitrarily close. So, is everything we know a lie, then? Should we put near-zero probability on everything, since we probably haven’t added enough weasel words to capture every possible subtlety we may have missed?
Isaac Asimov wrote a fantastic description of this problem, which he summed up this way:
John, when people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together.
It would be nice to have some absolute measure of this similarity in terms of Kolmogorov complexity or something. Like, a circle and an ellipse are quite similar mathematically, and there are probably plenty of ways to quantify how far a circle is from an ellipse. So, it seems like it should be possible to quantify how similar any 2 arbitrary mathematical models are. Maybe in terms of how different their predictions are, or how similar their mathematical structure is? I dono.
But, I’m not aware of any generalizations of circle/ellipse differences to all possible computations. How far is 1+1 from the Pythagorean theorem? I dono, but I think modeling something as a circle when it’s really closer to an ellipse (ignoring orbital perturbations from nearby planets) is a lot closer than 1+1 is to the Pythagorean theorem. And, I think that modeling everything as revolving around the sun is significantly closer to reality than modeling everything as orbiting the earth. It’d be interesting to be able to quantify exactly how much closer, though.
Meta comment: It'd be nice if this post had the nice, linked footnotes, like Eliezer's sequence posts.
Now if at the end of thinking you convinced yourself of yadda yadda straight line physics yadda yadda you were unfortunately mistaken. The tower argument is correct.
No, it isn't, not really.
If the motion of the point on the Earth at the tower has two components, one which is a straight line and one which isn't, and the straight line component is orders of magnitude larger than the other one (as it is over the course of a tower experiment), then it's fair to say that "straight line physics" is the answer. It's not literally 100% of the answer, of course, because of that small second component, but it's almost 100% of the answer. It isn't "mistaken" except to the same kind of pedant who insists that "humans have two legs" is mistaken because you really need to say that they average 1.99987 legs.
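The "orders of magnitude" claim can be sketched numerically by splitting the ground's motion during the fall into a part along the initial straight line and a part that departs from it. The latitude and drop height are assumed values for Pisa, and the function name is made up for illustration.

```python
import math

SIDEREAL_DAY_S = 86164.1   # one full rotation of the Earth, in seconds
EARTH_RADIUS_M = 6.371e6   # mean radius of the Earth
G = 9.81                   # gravitational acceleration, m/s^2

def straight_and_curved(height_m: float, latitude_deg: float):
    """Ground displacement during the fall, split into the part along the
    initial tangent line and the departure from that line (both in meters)."""
    omega = 2 * math.pi / SIDEREAL_DAY_S
    r = EARTH_RADIUS_M * math.cos(math.radians(latitude_deg))  # distance from the axis
    angle = omega * math.sqrt(2 * height_m / G)  # rotation angle during the fall
    along = r * math.sin(angle)          # straight-line component
    across = r * (1 - math.cos(angle))   # curvature component
    return along, across

along, across = straight_and_curved(56.0, 43.7)
print(f"along: {along:.0f} m, across: {across:.2f} m, ratio: {along / across:.0f}")
```

Under these assumptions the straight-line component is thousands of times larger than the curved one over the duration of the drop.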
I disagree, because I think the intuition that leads people to accept the tower argument is not that if there's a drift component, it's negligible. In fact, I think people would accept the argument even for a planet sufficiently small to make the component non-negligible. The point is that the people who formulated the tower argument had the right intuition but used it to defend the wrong view.
The intuition that leads them to accept the tower argument doesn't include an explicit step "I am going to think about the drift component. Okay, I decided to ignore it", but people don't think out all steps that way. At some point they will implicitly assume that the drift component is negligible (and they will be correct).
My view of the Copernican revolution used to be that when people finally switched to the heliocentric model, something clicked. The data was suddenly predictable and understandable.
It sounds like this is what happened though, but the click was at Kepler.
The surprising corollary is that Galileo just happened to be right, and I don't really want to imitate him. I don't want to be the kind of person who would have been a Copernican without knowing Kepler's theory.
But on the other hand, to invent Kepler's theory, Kepler had to be a Copernican.
I don't think people should try to emulate heliocentrists because I think that acting like they did would generally lead people to failure, not success. The lesson I take from this is that stubborn holdout populations who refuse to accept the obvious are important to the health of science as an ecosystem of ideas. But I don't think stubbornness should be seen as a general purpose virtue. I think Aristotle and co just experienced epistemic luck.
We often speak about the system of 4 elements but there were really 5 elements in their model. They thought that the stars were made up of the fifth element of "spirit".
Water, Fire, Earth, Air and Spirit. Or in symbols 🜄, 🜂, 🜃, 🜁 and ◯. If you start to think of the elements as the stuff that moves in those directions, it's neat. Air and Fire both move up and that's why they point up. Earth and Water go down and that's why they point down. Spirit doesn't go up or down but goes in circles.
As long as you think about the elements in those terms it makes sense to also stick with the astrology where the stars move in circles because that's just the direction in which the spirit stuff moves.
This system makes it harder to just change something about your astronomical beliefs without also changing something about your view of the elements.
Amazing post!
Would be useful to mention examples of contemporary ideas that could be analogues of heliocentrism in its time. I would suggest String Theory as one possible candidate. The part where Geocentrist is challenging Heliocentrist to provide some proof while Heliocentrist is desperately trying to explain away the lack of experimental evidence kinda reminds me of debates between string theorists and their sceptics. (It doesn't mean String Theory is true, just that there seems to be a similar state of uncertainty.)
When I paused to think about the tower argument, I noticed a sudden shift in my thinking when I wondered "wait, how much did they know about inertia at the time?"
If you know about inertia, then you should be able to calculate (even before Newton...I think?) that a falling object will drift, but only negligibly so, for plausible tower heights. In this case, the one arguing for a stationary earth has correctly identified a real effect, but failed to consider its scale.
If you don't know about inertia, and you imagine a dropped ball will instantly lose all velocity (in some privileged reference frame) and fall "straight down" while the world rotates beneath it, then you'd predict a far larger drift. In this case, they're wrong about the qualitative rules, not just the effect size.
By completely ignoring physics until Galileo, you paint a deceptive picture.
In Aristotle's physics, each god inspired a different but equally regular and circular motion in the heavens.
Copernicus objected to the equant because it was not a regular circular motion. It just modified another circle, which seems like an obvious contradiction. If we treat it as a motion added to the system, it would be something like motion along a (rotating) radius. The planet would go back and forth in a straight line that happens to produce a modified circle. Now, we could imagine that all of these circles are conceptual rather than being actual motions added together. We could say that the deities involved compel the actual motion of the planet in its (single) crystal sphere to act as if influenced by other, imaginary circles. But that would seem to require a more active role for the deities, leading to awkward questions. That seems like the major reason why people called Copernicus more coherent and elegant.
Kepler - as you point out and then ignore - showed that all previous systems gave false predictions, and you could get true ones (according to the observations of the time) by using ellipses. That was the end of the Church's Aristotelian physics. At that point, their model of the heavens and physics in general was provably wrong.
Notice what Tycho Brahe's system doesn't have? Guess what was also missing from the chief attempt to defend Brahe against Galileo. Abandoning Aristotle's physics of perfect circles would have removed most of the actual reason for thinking the heavens and the Earth followed different rules to begin with.
The Copernican revolution was a pivotal event in the history of science. Yet I believe that the lessons most often taught from this period are largely historically inaccurate and that the most important lessons are basically not taught at all [1]. As it turns out, the history of the Copernican revolution carries important and surprising lessons about rationality -- about what it is and is not like to figure out how the world actually works. Also, it’s relevant to deep learning, but it’ll take me about 5000 words on renaissance astronomy to make that point.
I used to view the Copernican revolution as an epic triumph of reason over superstition, of open science over closed dogma. Basically, things went as follows: Copernicus figured out that the sun rather than the earth is at the center of our planetary system. This theory immediately made sense of the available data, undermining its contorted predecessors with dazzling elegance. Yet its adoption was delayed by the Catholic Church fighting tooth and claw to keep the truth at bay. Eventually, with the emergence of Newton’s work and the dawn of the Enlightenment, heliocentrism became undeniable and its adoption inevitable [2].
This view is inaccurate. Copernicus’ system was not immediately superior. It was rejected by many people who were not puppets of the Church. And among those who did accept it, better fit to the data was not a main reason. What did in fact happen will become clear in a moment. But in reading that, I’d like to prompt you to consider the events from a very particular vantage point: namely what they would be like from the inside. Ask yourself not what these events seem like for a millennial with the overpowered benefit of historical hindsight, but for a Prussian astronomer, an English nobleman or a Dominican priest.
More precisely, there are two key questions here.
First, if you lived in the time of the Copernican revolution, would you have accepted heliocentrism? I don’t mean this as a social question, regarding whether you would have had the courage and resources to stand up to the immensely powerful Catholic Church. Rather, this is an epistemic question: based on the evidence and arguments available to you, would you have accepted heliocentrism? For most of us, I think the answer is unfortunately, emphatically, and surprisingly, no. The more I’ve read about the Copernican revolution, the less I’ve viewed it as a key insight followed by a social struggle. Instead I now view it as a complete mess: of inconsistent data, idiosyncratic mysticism, correct arguments, equally convincing arguments that were wrong, and various social and religious struggles thrown in as well. It seems to me an incredibly valuable exercise to try and feel this mess from the inside, in order to gain a sense of what intellectual progress, historically, has actually been like. Hence a key reason for writing this post is not to provide any clear answers -- although I will make some tentative suggestions -- but to provoke a legitimate sense of confusion.
If things were that chaotic, then this raises the second question. How should you develop intellectually, in order to become the kind of person who would have accepted heliocentrism during the Copernican revolution? Which intellectual habits, if any, unite heliocentric thinkers like Copernicus, Kepler, Galileo and Descartes, and separate them from thinkers like Ptolemy and Tycho? Once again, my answer will be tentative and limited. But my questions, on the other hand, are arguably the right ones.
What happened
My view of the Copernican revolution used to be that when people finally switched to the heliocentric model, something clicked. The data was suddenly predictable and understandable. Something like how Andrew Wiles describes his experience of doing mathematics:
“[...] in terms of entering a dark mansion. You go into the first room and it's dark, completely dark. You stumble around, bumping into the furniture. Gradually, you learn where each piece of furniture is. And finally, after six months or so, you find the light switch and turn it on. Suddenly, it's all illuminated and you can see exactly where you were.”
However, this is most certainly not how things appeared at the time. Let’s start at the beginning.
1. Scholasticism
The dominant medieval theory of physics, and by extension astronomy, was Scholasticism, a combination of Aristotelian physics and Christian theology. Scholasticism was a geocentric view. It placed the earth firmly at the center of the universe, and surrounded it with a series of concentric, rotating “crystalline spheres”, to which the celestial bodies were attached.
2. Ptolemy
Ptolemy of Alexandria provided the mathematical foundation for geocentrism, around 150 AD. He wanted to explain two problematic observations. First, the planets appear to move at different speeds at different times, contrary to the Aristotelian thesis that they should move with a constant motion. Second, some planets, like Mars, occasionally seem to briefly move backwards in their paths before returning to their regular orbit. Like this:
In order to explain these phenomena, Ptolemy introduced the geometric tools of equants and epicycles. He placed the earth slightly off the center of the planetary orbits, had the planets themselves orbit in little mini-cycles -- so-called “epicycles” -- along their original orbit, and introduced another off-center point, called the equant, in relation to which the motions of the planets are uniform, and which Ptolemy also claimed “controlled” the speed of the planets along their larger orbits. Like this:
[3]
Here’s how these additions make sense of retrograde motion [4]:
The ability of the Ptolemaic system to account for these phenomena, predicting planetary positions to within a few degrees (Brown, 2016), was a key contributor to its widespread popularity. In fact, the Ptolemaic model is so good that it’s still being used to generate celestial motions in planetariums (Wilson, 2000).
3. Copernicus
Copernicus published his heliocentric theory while on his deathbed, in 1543. It retained the circular orbits. More importantly, it of course placed the sun at the centre of the universe and proposed that the earth rotates around its own axis. Copernicus was keen to get rid of Ptolemy’s equants, which he abhorred, and instead introduced the notion of an epicyclet (which, to be fair, is kind of just like an equant with its own mini-orbit) [5]. Ptolemy’s system had required huge epicycles, and Copernicus was able to substantially reduce their size.
Retrograde motion falls out of his theory like this:
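A minimal sketch of the same point in code, assuming circular, coplanar orbits and round modern values for the orbital radii and periods (these numbers are illustrative assumptions, not Copernicus’ own parameters). It computes the direction from Earth to Mars over time and checks when that apparent direction runs backwards.

```python
import math

# Assumed round modern values (not Copernicus' own): (radius in AU, period in years).
EARTH = (1.0, 1.0)
MARS = (1.524, 1.881)

def position(planet, t):
    """Heliocentric (x, y) at time t in years, on a circular coplanar orbit."""
    radius, period = planet
    angle = 2 * math.pi * t / period
    return radius * math.cos(angle), radius * math.sin(angle)

def apparent_longitude(t):
    """Direction of Mars as seen from Earth, in radians."""
    ex, ey = position(EARTH, t)
    mx, my = position(MARS, t)
    return math.atan2(my - ey, mx - ex)

def longitude_rate(t, dt=1e-4):
    """Finite-difference rate of change of the apparent longitude (rad/year)."""
    delta = apparent_longitude(t + dt) - apparent_longitude(t - dt)
    # Unwrap a possible jump at +/- pi so the rate stays continuous.
    delta = (delta + math.pi) % (2 * math.pi) - math.pi
    return delta / (2 * dt)

# Both planets start aligned at t = 0, i.e. Earth is overtaking Mars (opposition).
retrograde_at_opposition = longitude_rate(0.0) < 0
prograde_half_year_later = longitude_rate(0.5) > 0
```

Under these assumptions the apparent longitude decreases while the faster-moving Earth overtakes Mars and increases again half a year later, with no epicycle involved. The extra circles mentioned next are only needed to match the observed positions in detail.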
In order to get the actual motion of the planets correct, both Ptolemy and Copernicus had to bolster their models with many more epicycles, and epicycles upon epicycles, than shown in the above figure and video. Copernicus even considered introducing an epicyclepicyclet -- “an epicyclet whose center was carried round by an epicycle, whose center in turn revolved on the circumference of a deferent concentric with the sun as the center of the universe”... (Complete Dictionary of Scientific Biography, 2008).
Pondering his creation, Copernicus concluded an early manuscript outlining his theory thus: “Mercury runs on seven circles in all, Venus on five, the earth on three with the moon around it on four, and finally Mars, Jupiter, and Saturn on five each. Thus 34 circles are enough to explain the whole structure of the universe and the entire ballet of the planets” (MacLachlan & Gingerich, 2005).
These inventions might appear like remarkably awkward -- if not ingenious -- ways of making a flawed system fit the observational data. There is however quite an elegant reason why they worked so well: they form a primitive version of Fourier analysis, a modern technique for function approximation. Thus, in the constantly expanding machinery of epicycles and epicyclets, Ptolemy and Copernicus had gotten their hands on a powerful computational tool, which would in fact have allowed them to approximate orbits of a very large number of shapes, including squares and triangles (Hanson, 1960)!
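To make the Fourier connection concrete, here is a rough sketch that treats a closed path as a sequence of complex numbers and rebuilds it from a handful of uniformly rotating circles. It uses a plain discrete Fourier transform, which is a modern stand-in and not how Ptolemy or Copernicus computed anything; the function names and the choice of a square with corners at ±1 ± i are my own assumptions for illustration.

```python
import cmath

def square_path(n_samples):
    """Sample points (as complex numbers) evenly along a square with corners at +/-1 +/- 1j."""
    corners = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    per_side = n_samples // 4
    points = []
    for i, start in enumerate(corners):
        end = corners[(i + 1) % 4]
        for k in range(per_side):
            points.append(start + (end - start) * k / per_side)
    return points

def epicycle_coefficients(points, max_freq):
    """Discrete Fourier coefficients: one rotating circle per frequency in -max_freq..max_freq."""
    n = len(points)
    return {
        f: sum(p * cmath.exp(-2j * cmath.pi * f * k / n) for k, p in enumerate(points)) / n
        for f in range(-max_freq, max_freq + 1)
    }

def trace(coeffs, n_samples):
    """Add up the circles: each term has radius |c| and turns at the integer rate f."""
    return [
        sum(c * cmath.exp(2j * cmath.pi * f * k / n_samples) for f, c in coeffs.items())
        for k in range(n_samples)
    ]

def max_error(points, max_freq):
    """Largest distance between the target path and its circle-sum approximation."""
    approx = trace(epicycle_coefficients(points, max_freq), len(points))
    return max(abs(a - p) for a, p in zip(approx, points))

target = square_path(400)
error_few = max_error(target, 1)    # essentially a single circle
error_many = max_error(target, 20)  # many stacked epicycles
```

With one circle the "orbit" misses the square's corners by a wide margin; with twenty frequencies the worst-case error shrinks to a small fraction of the side length. That is the sense in which stacking epicycles can fit almost any closed curve, whether or not the underlying picture is right.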
Despite these geometric acrobatics, Copernicus’ theory did not fit the available data better than Ptolemy’s. In the second half of the 16th century, renowned imperial astronomer Tycho Brahe produced the most rigorous astronomical observations to date -- and found that in some places they even fit Copernicus’ predictions worse than Ptolemy’s (Gingerich, 1973, 1975).
This point seems to have been recognized clearly by enlightenment scholars, many of whom instead chose to praise the increased simplicity and coherence of the Copernican system. However, as just described, it is unclear whether it even offered any such improvements. As Kuhn put it, Copernicus’s changes seem “great, yet strangely small”, when considering the complexity of the final system (Kuhn, 1957). The mathematician and historian Otto Neugebauer writes:
“Modern historians, making ample use of the advantage of hindsight, stress the revolutionary significance of the heliocentric system and the simplifications it had introduced. In fact, the actual computation of planetary positions follows exactly the ancient pattern and the results are the same. [...] Had it not been for Tycho Brahe and Kepler, the Copernican system would have contributed to the perpetuation of the Ptolemaic system in a slightly more complicated form but more pleasing to philosophical minds.” (Neugebauer, 1968)
4. Kepler and Galileo
At the turn of the 17th century Kepler, armed with Tycho Brahe’s unprecedentedly rigorous data, revised Copernicus’ theory and introduced elliptical orbits [6]. He also stopped insisting that the planets follow uniform motions, allowing him to discard the cumbersome epicyclical machinery.
Around the same time Galileo built his own improved telescope and turned it on the heavens. Upon examining the celestial bodies, he found irregularities that seemed to contradict the Scholastic view of the heavens as a perfect, unchanging realm. There were spots on the sun...
...craters and mountains on the moon…
...and four new moons orbiting Jupiter.
Spurred on by his observations, Galileo would soon begin his ardent defense of heliocentrism. Despite the innovations of Galileo and Kepler, the path ahead wasn’t straightforward.
Galileo focused his arguments on Copernicus’ system, not Kepler’s. And in doing so he faced not only the problems with fitting positional planet data, which Kepler had solved, but also theoretical objections, to which Kepler was still vulnerable.
Consider the tower argument. This is a simple thought experiment: if you drop an object from a tower, it lands right below where you dropped it. But if the earth were moving, shouldn’t it instead land some distance away from where you dropped it?
You might feel shocked upon reading the argument, in the same way you might feel shocked by your grandpa making bigoted remarks at the Christmas table, or by a friend trying to recruit you to a pyramid scheme. Just writing it, I feel like I’m penning some kind of crackpot, flat-earth polemic. But if the reason is “well obviously it doesn’t fall like that… something something Newton…” then remind yourself of the fact that Isaac Newton had not yet been born. The dominant physical and cosmological theory of the day was still Aristotle’s. If your answer to the tower argument in any way has to invoke Newton, then you likely wouldn’t have been able to answer it in 1632.
Did you manage to find some other way of accounting for objects falling down in a straight line from the tower? You might want to take a few minutes to think about it.
[...time for thinking…]
Now if at the end of thinking you convinced yourself of yadda yadda straight line physics yadda yadda you were unfortunately mistaken. The tower argument is correct. Objects do drift when falling, due to the earth’s rotation -- but at a rate which is imperceptible for most plausible tower heights. This is known as the “Coriolis effect”, and wasn’t properly understood mathematically until the 19th century.
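For a sense of scale, here is a back-of-the-envelope sketch, assuming modern values for the earth’s rotation rate, radius and surface gravity, no air resistance, and the standard leading-order formula for the eastward deflection of a body dropped from rest, d = (1/3) ω g t³ cos(latitude). The 50 m tower is a hypothetical example. For contrast, it also computes the drift you would expect if the object lost all its sideways motion the moment it was released.

```python
import math

OMEGA = 7.292e-5        # Earth's rotation rate in rad/s (modern value)
G = 9.81                # gravitational acceleration in m/s^2
EARTH_RADIUS = 6.371e6  # mean radius in m

def fall_time(height):
    """Seconds to fall `height` metres from rest, ignoring air resistance."""
    return math.sqrt(2 * height / G)

def eastward_drift(height, latitude_deg=0.0):
    """Leading-order Coriolis deflection: (1/3) * omega * g * t^3 * cos(latitude)."""
    t = fall_time(height)
    return OMEGA * G * t**3 * math.cos(math.radians(latitude_deg)) / 3

def no_inertia_drift(height, latitude_deg=0.0):
    """Drift if the object lost all sideways motion on release while the ground moved on."""
    ground_speed = OMEGA * EARTH_RADIUS * math.cos(math.radians(latitude_deg))
    return ground_speed * fall_time(height)

tower = 50.0                      # hypothetical tower height in metres
real = eastward_drift(tower)      # a little under 8 mm at the equator
naive = no_inertia_drift(tower)   # roughly 1.5 km at the equator
```

The two numbers differ by more than five orders of magnitude. A drift of a few millimetres over a 50 m fall is far below what anyone could have noticed by eye, which is why the absence of visible drift told observers nothing either way.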
In addition a fair number of astronomical observations seemed to qualitatively contradict heliocentrism -- by leaving out predicted phenomena -- as opposed to just providing quantitative discrepancies in planetary positions. Consider the stellar parallax. A “parallax” is the effect you might have noticed while looking out of a car window, and seeing how things that are closer to you seem to fly by at a faster pace than things farther away. Like this:
If the earth orbits the sun, something similar should be visible on the night sky, with nearby stars changing their position substantially in relation to more distant stars. Like this:
No one successfully detected a stellar parallax during the renaissance. This included Tycho, who as mentioned above had gathered the most accurate and exhaustive observations to date. His conclusion was that either the distant stars were so distant that a parallax wasn’t detectable using his instruments -- which would entail that space was mostly an unfathomably vast void -- or there simply was no stellar parallax to be detected.
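The first horn of Tycho’s dilemma can be put in numbers. The sketch below assumes an instrument precision of about one arcminute (a commonly quoted figure for Tycho’s observations, used here as an assumption) and a 1 AU baseline, and asks how far away a star must be for its annual parallax to stay below that threshold.

```python
import math

ARCSEC_PER_RADIAN = 180 * 3600 / math.pi  # about 206265

def parallax_arcsec(distance_au):
    """Annual parallax angle (arcseconds) of a star at `distance_au`, for a 1 AU baseline."""
    return math.atan(1.0 / distance_au) * ARCSEC_PER_RADIAN

def min_distance_au(precision_arcsec):
    """Closest a star can be while its parallax stays at or below the given precision."""
    return 1.0 / math.tan(precision_arcsec / ARCSEC_PER_RADIAN)

TYCHO_PRECISION = 60.0  # assumed: about one arcminute
SATURN_ORBIT_AU = 9.5   # approximate orbit radius of the outermost planet known at the time

limit = min_distance_au(TYCHO_PRECISION)   # about 3,400 AU
ratio_to_saturn = limit / SATURN_ORBIT_AU  # several hundred times Saturn's distance
```

Under these assumptions, even the nearest stars would have to sit several hundred times farther out than Saturn, leaving an enormous empty gap beyond the known planets. Actual stellar parallaxes are all below one arcsecond, so the true gap is larger still.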
Once again, with the benefit of hindsight it is easy to arbitrate this debate. Space just is really, really, really vast. But it is worth noticing here the similarity to Russell’s teapot-style arguments [7]. On two points in a row, the defenders of heliocentrism have been pushed into unfalsifiable territory:
Heliocentrist: “There is drift when objects fall from towers -- we just can’t measure it!”
Geocentrist: “But provide a phenomenon we can measure, then.”
Heliocentrist: “Well, according to my recent calculations, a stellar parallax should be observable under these conditions…”
Geocentrist: “But Tycho’s data -- the best astronomical data we’ve ever had -- fails to find any semblance of a parallax. Even Tycho himself thinks the idea is crazy.”
Heliocentrist: “The fact that Tycho couldn’t detect it doesn’t mean it’s not there! The stars could be too far away for it to be detected. And things aren’t absurd just because prominent scientists say they’re absurd.”
Geocentrist: “Hold on… not only does your new theory contradict all of established physics, but whenever you’re asked for a way to verify it you propose a phenomenon that’s barely testable… and when the tests come out negative you blame the tests and not the theory!”
Heliocentrist: “Okay okay, I’ll give you something else… heliocentrism predicts that Venus will sometimes be on the same side of the sun as Earth, and sometimes on the opposite side...”
Geocentrist: “Yes?”
Heliocentrist: “This means that Venus should appear to vary in size… by...” and the heliocentrist scribbles in his notebook “… as much as… six times.”
And this prediction of the change in size of Venus was indeed made by proponents of heliocentrism.
And, once again, although today we know this phenomenon does in fact appear, the available observations of the 17th century failed to detect it.
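The factor of six in the dialogue is easy to reconstruct. A small sketch, assuming circular orbits and a modern mean value for Venus’ orbital radius (the heliocentrist of the period would have used his own figures):

```python
EARTH_ORBIT_AU = 1.0
VENUS_ORBIT_AU = 0.723  # assumed modern mean value

nearest = EARTH_ORBIT_AU - VENUS_ORBIT_AU   # Venus on the same side of the sun as Earth
farthest = EARTH_ORBIT_AU + VENUS_ORBIT_AU  # Venus on the opposite side of the sun

# Small-angle approximation: apparent diameter scales as 1 / distance.
diameter_ratio = farthest / nearest
# Apparent disc area scales as 1 / distance^2.
area_ratio = diameter_ratio ** 2
```

This gives a ratio of a bit over six in apparent diameter, and close to forty in apparent disc area, which is what makes the failure to observe any such change look so damaging.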
This might all seem messy, complicated, disappointing. If this is what the history of intellectual progress actually looks like, how can we ever hope to make deliberate progress in the direction of truth?
It might be helpful to examine a few thinkers -- Copernicus, Kepler, Descartes, Galileo -- who actually accepted heliocentrism, and try to better understand their reasons for doing so.
Little is known about the intellectual development and motivations of Copernicus, as the biography written about him by his sole pupil has been lost. Nonetheless, a tentative suggestion is that he developed rigorous technical knowledge across many fields and found himself in environments which were, if not iconoclastic, at least exceptionally open-minded. According to historian Paul Knoll:
“[The arts faculty at the University of Cracow, where Copernicus studied] held the threefold promise of mathematics and astronomy which were abreast of any developments elsewhere in Europe, of philosophical questioning which undermined the foundations of much that had been characteristically medieval, and of a critical humanistic attitude which was transforming older cultural and educational values” (Knoll, 1975)
Later, when studying law at the University of Bologna, Copernicus stayed with the astronomy professor Domenico Maria Novara, described as “a mind that dared to challenge the authority of [Ptolemy], the most eminent ancient writer in his chosen fields of study” (Sheila, 2015). Copernicus was also a polymath, who studied law in addition to mathematics and astronomy, and developed an early theory of inflation. His pupil Rheticus was an excellent mathematician, and provided crucial support in helping Copernicus complete his final, major work.
Beyond that some authors claim that Copernicus was influenced by a kind of neoplatonism that regarded the sun as a semi-divine entity, being the source of life and energy -- which made him more content to place it at the centre of the universe (Kuhn, 1957). These claims are however disputed (Sheila, 2015).
These conditions -- technical skill, interdisciplinary knowledge and open-mindedness -- seem necessary for Copernicus’ development, but they also feel glaringly insufficient.
As for Kepler and Descartes, their acceptance of heliocentrism was not motivated by careful consideration of the available data, but commitments to larger philosophical projects. Kepler is known as a mathematician and astronomer, but in his own day he insisted that he be regarded as a philosopher, concerned with understanding the ultimate nature of the cosmos (Di Liscia, 2017). He did have access to better data -- Tycho’s observations -- than most people before him, and he pored over it with tremendous care. Nonetheless, his preference for elliptical over circular orbits was equally influenced by mystical views regarding the basic geometric harmony of the universe, in which the sun provided the primary source of motive force (Ladyman, 2001; Di Liscia, 2017; Westman, 2001).
Something similar was true of Descartes, although his underlying philosophical agenda is quite different. A striking example of these commitments is that both Kepler and Descartes argued that a heliocentric world-view was self-evident, in the sense of being derivable from first principles without recourse to empirical observation (Frankfurt, 1999).
Beyond that, I know too little about their respective views to be able to offer any more detailed, mechanistic account of why they preferred heliocentrism.
Galileo -- Copernicus’ bulldog -- is a confusing figure as well. Just like Copernicus, Kepler and Descartes, Galileo was not purely guided by careful experiment and analysis of the data -- despite the weight popular history often places upon these characteristics of his. As Einstein writes in his foreword to a modern edition of Galileo’s Dialogue:
“It has often been maintained that Galileo became the father of modern science by replacing the speculative, deductive method with the empirical, experimental method. I believe, however, that this interpretation would not stand close scrutiny. There is no empirical method without speculative concepts and systems; and there is no speculative thinking whose concepts do not reveal, on closer investigation, the empirical material from which they stem.” (Einstein, 2001)
For Galileo, this speculative system consisted in replacing the four Aristotelian elements with a single, unified theory of matter, and replacing the view of nature as a teleological process with a view of it as a deterministic, mechanistically intelligible process. Einstein later points out that in some respects this approach was inevitable given the limited experimental methods available to Galileo (for example, he could only measure time intervals longer than a second).
Galileo was also a man of courage and belligerence. One of his strengths was an absolute refusal to accept arguments from authority without experimental evidence or careful reasoning. It appears as though his belligerence aided him several times in a quite ironic way. Many of the arguments he marshalled against his opponents were either incorrect, or correct but based on incorrect observations. One example is his attempt to derive a theory of tides from the motions of the earth, a project to which he devotes about a fourth of his famous Dialogue. Einstein, again, writes “it was Galileo’s longing for a mechanical proof of the motion of the earth which misled him into formulating a wrong theory of the tides. [These] fascinating arguments [...] would hardly have been accepted as proofs by Galileo, had his temperament not got the better of him”.
Moreover, Galileo’s observations of sunspots and moon craters weren’t unproblematic. In both cases there is evidence to indicate that he was fooled by optical illusions. And though he was also right about the existence of moons orbiting Jupiter, which contradicted the uniqueness of the earth as the only planet with a moon, what he actually observed rather seems to have been Saturn’s rings (Ladyman, 2001) [8].
Nonetheless, at this point you might be aching to object that, disregarding inconsistent data, theoretical flaws, failed predictions and incorrect formulation of a theory of the tides... surely Galileo’s Dialogue provided other convincing arguments that finally tipped the balance in favour of heliocentrism?
Alas, history is messy.
Recall that Galileo defended Copernicus’ system, not Kepler’s, and hence had to deal with its flaws. More strikingly, in the above I still haven’t mentioned the existence of a third major theory, rivalling both Ptolemy and Copernicus: Tycho Brahe’s combined geoheliocentric theory. This theory retained the moon and sun in orbit around the earth but placed all the other planets in orbit around the sun.
Galileo’s Dialogue does not engage with Tycho’s theory at all. One suggested explanation (given by an unknown Wikipedia contributor) is that, assuming Galileo’s theory of the tides, the Ptolemaic and Tychonic systems are identical, and hence it would suffice to rebut the former. But the theory of the tides was wrong.
These theories only differ in their prediction of whether we should be able to observe stellar parallaxes. And as mentioned above, Tycho’s data had failed to detect one, which he saw as key evidence for his view.
Eventually though, this historical mess was straightened out, and a crucial experiment arbitrated between Galileo, Tycho and Ptolemy. German astronomer Friedrich Bessel finally managed to observe a stellar parallax in 1838, about 200 years later. By that point, the Copernican revolution was surely already over -- even the Catholic church had removed Copernicus’ De revolutionibus from its index of banned books, as it was simply accepted as true (Lakatos & Zahar, 1975).
5. Newton
At one point Newton also came along, but Galileo died about a year before he was born. Newton’s marriage of physics and mathematics, which implied Kepler’s laws as a special case, was crucial in demonstrating the viability of heliocentrism. But nonetheless some thinkers did something very right decades before the arrival of the Cambridge genius, which he was very well aware of. For the Copernican revolution might have been completed by Newton, but in the end he still stood on the shoulders of giants.
Now what?
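One purpose of this essay has been to portray an important historical era in a more realistic way than other popular portrayals. I asked two questions at the beginning:
First, if you lived in the time of the Copernican revolution, would you have accepted heliocentrism?
Second, how should you develop intellectually, in order to become the kind of person who would have accepted heliocentrism during the Copernican revolution?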
The preceding section argued that the answer to the first question might quite likely have been no. This section takes a closer look at the second question. I do however want to preface these suggestions by saying that I don’t have a good answer to this myself, and suggest you take some time to think of your own answers to these questions. I’d love to hear your thoughts in the comments.
What about Ibn ash-Shãtir?
There seem to be some Islamic scholars who beat Copernicus at his own game by a few hundred years. I’d be keen to learn more about their story and intellectual habits.
Careful with appearances
Geocentrists liked to claim that it certainly seems like the sun orbits the earth, and not vice versa. There is something odd about this. Consider the following Wittgenstein anecdote:
“He [Wittgenstein] once asked me [Anscombe]: ‘Why do people say it is more logical to think that the sun turns around the Earth than Earth rotating around its own axis?’ I answered: ‘I think because it seems as if the sun turns around the Earth.’ ‘Good,’ he said, ‘but how would it have been if it had seemed as if the Earth rotates around its own axis then?’” (Anscombe, 1959)
This quote hopefully inspired in you a lovely sense of confusion. If it didn’t, try reading it again.
When I said above that it certainly seems like the sun orbits the earth and not vice versa, what I meant to say was that it certainly seems like it seems like the sun orbits the earth and not vice versa [9].
There’s a tendency to use the word “seems” in quite a careless fashion. For example, most people might agree that it seems like, if an astronaut were to push a bowling ball into space, it would eventually slow down and stop, because that’s what objects do. At least most people living prior to the 20th century. However, we, and they, already know that this cannot be true. It suffices to think about the difference between pushing a bowling ball over a carpet, or over a cleaned surface like polished wood, or over ice -- there’s a slippery slope here which, if taken to its logical extreme, should make it seem reasonable that a bowling ball wouldn’t stop in space. A prompt I find useful is to try to understand why the behaviour of the bowling ball in space could not have been any other way, given how it behaves on earth. That is, trying to understand why, if we genuinely thought a bowling ball would slow down in space, this would entail that the universe was impossibly different from the way it actually is.
Something similar seems true of the feeling that the sun orbits the earth, and this is brought out in the Wittgenstein anecdote. What we think of as “it seems as though the sun orbits the earth” is actually just us carelessly imposing a mechanism upon a completely different sensation, namely the sensation of “celestial objects seeming to move exactly as they would move if the earth orbited the sun and not vice versa”. Whatever it would look like to live in a world where the opposite was true, it certainly wouldn’t look like this.
Careful with your reductios
One of the major mistakes made by opponents of heliocentrism was to use reductio ad absurdum arguments without really considering whether the conclusion was absurd enough to actually overturn the original argument. Tycho correctly noted that either there wasn’t a stellar parallax or he couldn’t measure it, but incorrectly took the former as more plausible. Proponents of the tower argument assumed that objects fall down in straight lines without drift, and that anything else would be perceptible by the naked eye. In both cases, people would just have been better off biting the bullet and accepting the implications of heliocentrism. That, of course, raises the question of which bullets one should bite -- and that question is beyond the scope of this essay.
The data is not enough
There’s a naïve view of science according to which the scientist first observes all the available data, then formulates a hypothesis that fits it, and finally tries to falsify this new hypothesis by making a new experiment. The Copernican revolution teaches us that the relation between data and theory is in fact much more subtle than this.
A true theory does not have to immediately explain all the data better than its predecessors, and can remain inconsistent with parts of the data for a long time.
The relation between data and theory is not a one-way shooting range, but an intricate two-way interplay. The data indicates which of our theories are more or less plausible. But our theories also indicate which data is more or less trustworthy [10]. This might seem like a sacrilegious claim to proponents of the naïve view described above: “ignore the data!? That’s just irrational cherry-picking!” Sure, dishonest cherry-picking is bad. Nonetheless, as the Copernican revolution shows, the act of disregarding some data in a principled manner as it doesn’t conform to strong prior expectations has been critical to the progress of science [11].
When Einstein famously remarked “God doesn’t play dice”, he arguably adopted the same kind of mindset. He had built a complex worldview characterised by a certain mathematical law-likeness, and was confident in it to the extent that if quantum mechanics threatened its core principles, then quantum mechanics was wrong -- not him.
Sometimes, scientists have to be bold -- or arrogant -- enough to trust their priors over the data.
...and, finally, deep learning
It seems apt to notice some similarities between the state of astronomy during the Copernican revolution and the current state of deep learning research.
Both are nascent fields, without a unifying theory that can account for the phenomena from first principles, like Newtonian physics eventually did for astronomy.
Both have seen researchers cling to their models for decades without encouraging data: many of the most successful current deep learning techniques (conv nets, recurrent nets and LSTMs, gradient descent, ...) were invented in the 20th century, but didn’t produce spectacular results until decades later when sufficient computing power became available. It would be interesting to find out if people like Geoffrey Hinton and Yann LeCun share intellectual habits with people like Copernicus and Galileo.
Finally, I’m particularly struck by the superficial similarities between the way Ptolemy and Copernicus happened upon a general, overpowered tool for function approximation (Fourier analysis) that enabled them to misleadingly gerrymander false theories around the data, and the way modern ML has been criticized as an inscrutable heap of linear algebra and super-efficient GPUs. I haven’t explored whether these similarities go any deeper, but one implication seems to be that the power and versatility of deep learning might allow suboptimal architectures to perform deceptively well (just like the power of epicycle-multiplication kept geocentrism alive) and hence distract us from uncovering the actual architectures underlying cognition and intelligence.
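To give a feel for how overpowered epicycles are as a function approximator, here is a minimal sketch, assuming the standard identification of stacked epicycles with the terms of a complex Fourier series. The square "orbit", the sample count and all function names are my own illustrative choices, not anything taken from Ptolemy, Copernicus or the references.

```python
import cmath
import math


def square_path(n_samples):
    """Sample one lap around a square's perimeter as complex numbers (a deliberately non-circular 'orbit')."""
    corners = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    points = []
    for i in range(n_samples):
        t = 4 * i / n_samples              # position along the perimeter, in [0, 4)
        side = int(t)                      # which of the four sides we are on
        frac = t - side                    # how far along that side
        start, end = corners[side], corners[(side + 1) % 4]
        points.append(start + frac * (end - start))
    return points


def epicycle_coefficients(points):
    """Discrete Fourier transform of the path.

    Coefficient c_k describes one circle: |c_k| is its radius, and it turns
    k times per lap (negative k turns the other way).
    """
    n = len(points)
    coeffs = {}
    for k in range(-n // 2, n // 2):
        coeffs[k] = sum(
            p * cmath.exp(-2j * math.pi * k * i / n) for i, p in enumerate(points)
        ) / n
    return coeffs


def reconstruct(coeffs, n_circles, n_samples):
    """Rebuild the path using only the n_circles largest circles."""
    largest = sorted(coeffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n_circles]
    return [
        sum(c * cmath.exp(2j * math.pi * k * i / n_samples) for k, c in largest)
        for i in range(n_samples)
    ]


def rms_error(points, approx):
    """Root-mean-square distance between the true path and the approximation."""
    return math.sqrt(sum(abs(p - q) ** 2 for p, q in zip(points, approx)) / len(points))


if __name__ == "__main__":
    path = square_path(64)
    coeffs = epicycle_coefficients(path)
    for n_circles in (1, 2, 4, 8, 64):
        err = rms_error(path, reconstruct(coeffs, n_circles, 64))
        print(f"{n_circles:2d} circles: RMS error {err:.6f}")
```

Adding circles can only shrink the RMS error, and with as many circles as sample points the fit is exact up to rounding. That is the worrying part: a good fit tells you very little about whether circles-on-circles is the true mechanism.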
Crossposted to my blog here.
Footnotes
[1] They of course are taught, because that is how I learnt about them. But this was in a university course on the philosophy of science. The story of Galileo is probably taught in most middle schools [no source, my own hunch]. But only about 0.5% of US college students major in philosophy [source], and I’d guesstimate something like a third of them to take classes in philosophy of science.
[2] This last step is kind of a black box. My model was something like “a true theory was around for long enough, and gained enough support, that it was eventually adopted”. This sounds quite romantic, if not magical. It’s unclear exactly how this happened, and in particular what strategic mistakes the Church made that allowed it to.
[3] Figure credit of the Polaris Institute of Iowa State University, which provides a great tutorial on medieval and renaissance astronomy here.
[4] I spent way too long trying to understand this, but this animation was helpful.
[5] I spent two hours trying to understand the geometry of this and I won’t drag you down that rabbit-hole, but if you’re keen to explore yourself, check out these links: [1], [2].
[6] It is however a common mistake to imagine these as clearly elongated ellipses: their eccentricity is very small. For most practical purposes apart from measurement and prediction they look like circles (Price, 1957).
[7] Russell's teapot is a skeptical thought-experiment intended to reveal the absurdity of unfalsifiable views, by postulating that there’s a teapot orbiting the sun somewhere between Earth and Mars and that it’s too small to be detectable, but nonetheless insisting that it really is there.
[8] And to think my philosopher friends think Gettier problems are nonsense!
[9] There’s of course a sense of mysticism in this, which -- like the rest of Wittgenstein’s mysticism -- I don’t like. Mysticism is mostly just a clever way of scoring social understanding-the-world-points without actually understanding the world. It might be that heliocentrism and geocentrism are genuinely indistinguishable from our vantage point, in which case the confusion here is just a linguistic sleight-of-hand, rather than an actual oddity in how we perceive the world. But this doesn’t seem correct. After all, we were able to figure out heliocentrism from our vantage point, indicating that heliocentrism is distinguishable from geocentrism from our vantage point.
[10] In Bayesian terms, your posterior is determined by both your likelihoods and your priors.
[11] And is core to rationality itself, on the Bayesian view.
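To make footnotes [10] and [11] slightly more concrete, here is the odds form of Bayes’ theorem with some made-up numbers. The hypotheses $H_1$, $H_2$, the datum $D$ and the specific odds are purely illustrative assumptions, not estimates of anything historical.

$$\frac{P(H_1 \mid D)}{P(H_2 \mid D)} = \frac{P(D \mid H_1)}{P(D \mid H_2)} \cdot \frac{P(H_1)}{P(H_2)}$$

Suppose your prior odds favour $H_1$ by 100:1, and you then observe a datum $D$ that is ten times more likely under $H_2$ than under $H_1$. The posterior odds are $\frac{1}{10} \cdot 100 = 10$, so you should still favour $H_1$ by 10:1. A strong enough prior can legitimately survive an awkward observation, and it also makes “something is off with this measurement” a more plausible explanation than “the theory is false”.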
References
Anscombe, E. (1959). An Introduction to Wittgenstein’s Tractatus. pp. 151.
Brown, M. (2016) “Copernicus’ revolution and Galileo’s vision: our changing view of the universe in pictures”. The Conversation. Available online here.
"Copernicus, Nicholas." Complete Dictionary of Scientific Biography. Retrieved October 26, 2017 from Encyclopedia.com, here.
Di Liscia, D. A. "Johannes Kepler". The Stanford Encyclopedia of Philosophy (Fall 2017 Edition), Zalta, E. N. (ed.).
Einstein, A. (2001). (Foreword) Dialogue Concerning the Two Chief World Systems: Ptolemaic and Copernican.
Frankfurt, H. (1999), Necessity, Volition and Love. pp. 40.
Gingerich, O. J. (1973) “The Copernican Celebration”. Science Year, pp. 266-267.
Gingerich, O. J. (1975) “‘Crisis’ versus aesthetic in the Copernican revolution”. Vistas in Astronomy 17(1), pp. 85-95.
Hanson, N. R. (1960) “The Mathematical Power of Epicyclical Astronomy” Isis, 51(2), pp. 150-158.
Knoll, P. (1975) “The Arts Faculty at the University of Cracow at the end of the Fifteenth Century”. The Copernican Achievement, Westman, R. S. (ed.).
Kuhn, T. (1957) The Copernican Revolution. pp. 133.
Ladyman, J. (2001) Understanding Philosophy of Science. Chapter 4: Revolutions and Rationality.
Lakatos, I. & Zahar, E. (1975) “Why did Copernicus’s Research Programme Supersede Ptolemy’s?”. The Copernican Achievement, Westman, R. S. (ed.).
MacLachlan, J. & Gingerich, O. J. (2005) Nicolaus Copernicus: Making the Earth a Planet, pp. 76.
Neugebauer, O. (1968) “On the Planetary Theory of Copernicus”. Vistas in Astronomy, 10, pp. 103.
Price, D. J. (1957) “Contra Copernicus: a critical re-estimation of the mathematical Planetary Theory of Copernicus, Ptolemy and Kepler”. Critical Problems in the History of Science, Clagett, M. (ed.).
Rabin, S. "Nicolaus Copernicus". The Stanford Encyclopedia of Philosophy (Fall 2015 Edition), Zalta, E. N. (ed.), available online.
Westman, R. S. (2001) "Kepler's early physical-astrological problematic." Journal for the History of Astronomy, 32, pp. 227-236.
Wilson, L. A. (2000) “The Ptolemaic Model” in the Polaris Project, Iowa State University. Available online here.