This post discusses two books by Michael Strevens: The Knowledge Machine: How Irrationality Created Modern Science and Thinking Off Your Feet: How Empirical Psychology Vindicates Armchair Philosophy. I loved the former, which tries to answer the core question in philosophy of science: why does science work so well? It’s a masterful synthesis of previous ideas and important new arguments, and it’ll be my go-to recommendation from now on for people interested in the field. The latter was… slightly less persuasive. But let’s start with the good stuff.

The Knowledge Machine begins with a review of two of the key figures in philosophy of science: Popper and Quine. Historically, philosophers of science focused on identifying a “scientific method”: a specific way of generating theories, designing experiments, and evaluating evidence which, when followed, led scientists to the truth. Popper’s influential account of the scientific method focused on scientists trying to refute their hypotheses. He claimed that only severe tests which attempt to falsify a hypothesis can give us reason to provisionally accept it. Like other early philosophers of science, Popper promoted (according to Strevens) “an ideal of the scientist as a paragon of intellectual honesty, standing up for truth in the face of stifling opposition from the prevailing politics, culture, and ideology”.


However, over the last half-century or so a range of criticisms of this flattering view of science has emerged. Most prominent is Kuhn, who in his book The Structure of Scientific Revolutions characterises scientists as constrained within a specific paradigm of thinking, unable to rationally decide between different paradigms. Around the same time, in his book The Sleepwalkers, Koestler described the emergence of early science not as the triumph of a superior method, but rather as “a history of collective obsessions and controlled schizophrenias”. More recently, Feyerabend’s book Against Method espoused a position he called “epistemological anarchism”, arguing that “anything goes” in the pursuit of truth.


Strevens focuses on Kuhn, but his arguments are in line with the positions of the others. He summarises a range of case studies of scientists ignoring inconvenient data, deciding questions via political maneuvering, and generally behaving in ways no “scientific method” would endorse. These case studies also reiterate a point made by Quine: that a theory can never be fully falsified, since it’s always possible to argue that it’s consistent with new evidence - e.g. by tacking on new parameters, or appealing to experimental mistakes. This line of thinking provides a useful counterbalance to earlier idolisation of the scientific method, but in doing so it reopens the core question of philosophy of science: if the scientific method isn’t what makes science work so well, then what does?


Strevens’ core idea is to strip down scientific methodology to the bare basics. Instead of trying to understand the success of science as being due to a shared methodology for generating theories, or for designing experiments, or for interpreting evidence, or for rejecting theories, we should understand it as being due to a shared norm about what types of evidence to accept. He calls it the Iron Rule of Explanation: resolve disagreements via empirical tests. The rule results in a strict separation between “private” and “public” science: scientists in private proceed on the basis of hunches, aesthetic intuitions, rivalries, visions from god, or whatever other motivations they like. But in public, these are all sterilised: only empirical tests count. This forces scientists to search more and more deeply for key empirical data, rather than trying to build castles of arguments which aren’t ever tested - and which can therefore be washed away even after millennia of work, as the example of theology shows. In a particularly evocative passage, Strevens portrays science as a communal process for producing solid facts:


Science, then, is built up like a coral reef. Individual scientists are the polyps, secreting a shelly carapace that they bequeath to the reef upon their departure. That carapace is the sterilized public record of their research, a compilation of observation or experimentation and the explanatory derivation, where possible, of the data from known theories and auxiliary assumptions. The scientist, like a polyp, is a complete living thing, all too human in just the ways that the historians and sociologists of science have described. When the organism goes, however, its humanity goes with it. What is left is the evidential exoskeleton of a scientific career.


Strevens dates the birth of the iron rule to Newton, and in particular the famous passage where he says that he will “feign no hypotheses” about why the laws of gravity work the way they do. Newton thereby accepts a shallower and more instrumental conception of “explanation” than previous scientists, who searched for theories built on mechanisms that made intuitive sense. Strevens claims that the counterintuitive nature of the iron rule is why it took so long for science to get started. Shouldn’t the lack of a mechanism which implements Newtonian gravity be a significant strike against it? Shouldn’t the intuitions which led us to a theory be a key part of our arguments for it? And why can’t our other beliefs - say, about the existence of god - help inform our scientific theories?


Strevens agrees that excluding information which we believe is relevant is, in a sense, irrational (hence the book’s subtitle: “How Irrationality Created Modern Science”). But he argues that it’s necessary for the success of science, because it pushes scientists towards doing the difficult work required to find the truth:


We live in a Tychonic world - a world in which great competing stories about the underlying nature of things can be distinguished by, and only by, scrutinizing subtle intricacies and minute differences. Humans in their natural state are not much disposed to attend to such trifles. But they love to win. The procedural consensus imposed by the iron rule creates a dramatic contest within which the trifles acquire an unnatural luster, becoming, for their tactical worth, objects of fierce desire. The rule in this way redirects great quantities of energy that might have gone toward philosophical or other forms of argument into empirical testing. Modern science’s human raw material is molded into a strike force of unnervingly single-minded observers, measurers, and experimenters, generating a vast, detailed, varied, discriminating stock of evidence.


I think this explanation points in the right direction - but it’s incomplete. Why do we need the iron rule to create a dramatic contest, rather than just competing to find any type of compelling evidence? It’s true that early thinkers didn’t understand that we lived in a Tychonic world, and so underrated empirical evidence. But after seeing many examples of the power of empirical evidence (and more specifically, the power of advance predictions), why wouldn’t they update towards empirical evidence being a powerful way to identify the truth, without enshrining it as the only way to identify the truth? In other words, Strevens’ proposal of science-as-competition works almost as well without the iron rule, as long as scientists reward progress towards truth in an unbiased way.


So a complete version of Strevens’ explanation needs to identify why scientists will predictably overrate non-empirical evidence for theories, and reward that evidence more than it deserves. There may be a range of sociological considerations in play - for example, if observers tend to underestimate how much work has gone into finding evidence, then the reputational payoff for doing the hardest empirical work might be disproportionately low, meaning that scientists will focus on other ways to win the game. But for now I want to focus on the hypothesis that we find non-empirical arguments more persuasive than we should because of a range of cognitive biases. To illustrate this point, let’s dig into Strevens’ previous book - a more philosophical work named Thinking Off Your Feet: How Empirical Psychology Vindicates Armchair Philosophy.

The perils of philosophy


Having been so impressed by his latest book, I was surprised by how much I disagreed with this one. I ended up only skimming through most chapters, so take this summary with a pinch of salt, but my impression of Strevens’ case was as follows. The standard mode of inquiry in philosophy, known as conceptual analysis, aims to discover the “essential natures” of things by consulting our intuitions, especially intuitions about complex edge cases. Conceptual analysis has come under fire over the last few decades from skeptics who point out that almost no concepts can be characterised in terms of necessary and sufficient conditions - most of them are inherently vague or imprecise. Strevens agrees with this point. Nevertheless, he claims, conceptual analysis is still useful because the process of trying to identify essential natures helps us understand even entities which lack them.


What’s bizarre is that Strevens sees so clearly the difficulty of science - which forces us to adopt the strict restriction of the iron rule - yet still thinks that philosophy can make progress basically by accident, by aiming at the wrong target entirely. Perhaps this would be compelling if there were historical examples of this working well, but the ones Strevens identifies are underwhelming, to say the least. Consider, for example, the thought experiment of a swan which spontaneously appears due to random particle fluctuations. Strevens claims that arguing about whether this is “really” a swan helps us understand the “causal-explanatory structure” of normal swans - e.g. the ways in which their properties are explained by their ancestry. To be honest, my main response here is an incredulous stare. I have no idea what valuable knowledge about swans biologists lack, which this type of philosophising has provided, or could ever provide. And I don’t think that’s a coincidence - these types of thought experiments are usually designed to steer as far clear of any empirical uncertainties as possible (and sometimes further), to make the conceptual debate clearer.


Or consider Strevens’ analysis of the concepts of belief and desire in everyday psychology. He argues that conceptual analysis is valuable in this case because this approach to psychology is “all or nothing”: in the face of empirical investigation, concepts like belief and desire “either stand firm or suffer a catastrophic collapse”. To me this seems totally wrongheaded - our understanding of belief and desire has undergone extensive shifts as we’ve gradually learned more about things like the subconscious, behavioural reinforcement, addiction, prospect theory, dual process theory, signalling theory, evolutionary psychology, and so on. By contrast, conceptual analysis of belief has been stuck in an unproductive merry-go-round of definitions and counterexamples for decades.


This is not to say that there has been no progress in philosophy - see, for instance, Tom Adamczewski’s list of philosophy success stories. But it seems like Strevens, and many other philosophers, dramatically overestimate how useful philosophy has been. I claim that this is because common cognitive biases (like the bias towards essentialism, confirmation bias, and hindsight bias) make philosophical arguments seem more insightful than they actually are. And if these biases are common even amongst the brightest thinkers, that answers the question I posed above about why the iron rule is still necessary. By ruling out these types of arguments, the iron rule doesn’t just steer us towards useful research; it also protects us from cognitive biases which make conceptual arguments seem disproportionately valuable.


I don’t want to point the finger only at philosophy; I think many other humanities and social sciences have similar problems. But as one of the fields which makes the least use of empirical data, philosophy provides a particularly clear illustration of my core claim: science succeeds because the iron rule of explanation (“resolve disagreements via empirical tests”) mitigates cognitive and sociological biases in our judgments of the strengths of different types of evidence.


There’s much more room to elaborate on the specific types of biases involved; I do that to some extent in this blog post, and it’s also a core theme of Yudkowsky’s writings (along with how to do better than the standard scientific process). But one point to note is that this formulation assumes that human reasoning is actually pretty good in general - that, if we get rid of these biases, we’re capable of thinking in a broadly reliable way about domains that are very far removed from our everyday experience. So in some sense, an explanation for why science succeeds needs to also be a story about human intelligence, and the mental models which we can build using it. But I’ll save that for another post.
Comments
But after seeing many examples of the power of empirical evidence (and more specifically, the power of advance predictions), why wouldn’t they update towards empirical evidence being a powerful way to identify the truth, without enshrining it as the only way to identify the truth?

There is one non-empirical way of identifying truth that reliably works: formal proof. Mathematics is built on this, and it is similarly impressive to the empirical sciences.

Why do we need the iron rule to create a dramatic contest, rather than just competing to find any type of compelling evidence? (...) In other words, Strevens’ proposal of science-as-competition works almost as well without the iron rule, as long as scientists reward progress towards truth in an unbiased way.

If we're basing a competition on this, then asking scientists to "reward progress towards truth in an unbiased way" is a really big ask, because everyone who's working in a field is going to have interests that will make them biased (regardless of whether this takes the shape of Machiavellian reasoning or self-serving cognitive biases).

So here's one take on this: Most sources of evidence are so fuzzy and intuition-based that you can't trust people to apply them fairly. If someone rewards/punishes a certain fuzzy evidential move, you can't tell if they're doing it for truth-seeking or self-serving purposes. But advance predictions and formal proofs are so clear and unambiguous that you can enforce community norms around applying them pretty fairly.

Contrasting this with your explanation: The issue isn't that people overestimate the value of non-empirical evidence. It's that evaluation of non-empirical (and unformalized) evidence is so ambiguous that you can't hope to rely on it in a community setting with many non-epistemic interests. (Or so goes the hypothesis.)

By contrast, conceptual analysis of belief has been stuck in an unproductive merry-go-round of definitions and counterexamples for decades.

Thousands of years, given the story of Diogenes' plucked chicken. "Behold Plato's man!" I believe one can find similar examples throughout the centuries since. I wonder if "conceptual analysis" only became a named thing when, under the march of science, it fell from being the default way of enquiring into truth.

TAG:

By contrast, conceptual analysis of belief has been stuck in an unproductive merry-go-round of definitions and counterexamples for decades.

Who's escaped from it?

If everything philosophy considers is a concrete, tangible thing, then you can use empiricism to solve everything philosophy considers. But that is not the case: mainstream philosophy considers abstract topics like Truth and Goodness. So does rationalism, and rationalism doesn't have any empirical (or mathematical) methods for investigating them. So it's using conceptual analysis, or something like conceptual analysis.

Which? Well, if we had a really clear exposition of conceptual analysis, with no fuzzy edge cases - a conceptual analysis of conceptual analysis - we could tell. But if we did, then conceptual analysis would have worked in at least that case. On the other hand, if it never works, then we can't apply it to itself, and we therefore can't make clear claims that it fails in every case.

TAG:

Let's say you start with 10 hypotheses, test them empirically, and 5 fail. So you now have five not-wrong hypotheses. But which is the least wrong, assuming that they are equally predictive?

At this point, many people would appeal to parsimony. But theoretical parsimony is not empirical, and does not have an empirical justification. The various justifications for the various simplicity criteria typically use philosophical reasoning.

What were we hoping to get out of parsimony that we don't already have on the basis of empirical testing? Correspondence to the territory, some kind of statement about the world. That creates a dilemma: either we accept that purely empirical science is instrumentalist and incapable of informing us about reality, or we need to use philosophical reasoning in science.

Of course, scientists will object that they are not self-consciously and explicitly engaging in philosophical reasoning... and they are not! Scientists typically accept claims like

* science tells you about the world
* simplicity is important
* simplicity has something to do with truth

...as givens. But that doesn't help at all.

TAG:

Strevens claims that arguing about whether this is “really” a swan helps us understand the “causal-explanatory structure” of normal swans—e.g. the ways in which their properties are explained by their ancestry. To be honest, my main response here is an incredulous stare. I have no idea what valuable knowledge about swans biologists lack, which this type of philosophising has provided, or could ever provide.

I think the point was to explicate "really", not "swan".