Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
You can build a machine to draw [deductive] conclusions for you, but I think you can never build a machine that will draw [probabilistic] inferences.
George Polya, 34 years before Pearl (1988) launched the probabilistic revolution in AI
The energy produced by the breaking down of the atom is a very poor kind of thing. Anyone who expects a source of power from the transformation of these atoms is talking moonshine.
Ernest Rutherford in 1933, 18 years before the first nuclear reactor went online
I confess that in 1901 I said to my brother Orville that man would not fly for fifty years. Two years later we ourselves made flights. This demonstration of my impotence as a prophet gave me such a shock that ever since I have distrusted myself...
Wilbur Wright, in a 1908 speech
Startling insights are hard to predict.1 Polya and Rutherford couldn't have predicted when computational probabilistic reasoning and nuclear power would arrive. Their training in scientific skepticism probably prevented them from making confident predictions about what would be developed in the next few decades.
What's odd, then, is that their scientific skepticism didn't prevent them from making confident predictions about what wouldn't be developed in the next few decades.
I am blessed to occasionally chat with some of the smartest scientists in the world, especially in computer science. They generally don't make confident predictions that certain specific, difficult, insight-based technologies will be developed soon. And yet, immediately after agreeing with me that "the future is very hard to predict," they will confidently state that a specific, difficult technology is more than 50 years away!
Error. Does not compute.
Related to: Infinite Certainty
Suppose the people at FiveThirtyEight have created a model to predict the results of an important election. After crunching poll data, area demographics, and all the usual things one crunches in such a situation, their model returns a greater than 999,999,999 in a billion chance that the incumbent wins the election. Suppose further that the results of this model are your only data and you know nothing else about the election. What is your confidence level that the incumbent wins the election?
Mine would be significantly less than 999,999,999 in a billion.
When an argument gives a probability of 999,999,999 in a billion for an event, then probably the majority of the probability of the event is no longer in "But that still leaves a one in a billion chance, right?". The majority of the probability is in "That argument is flawed". Even if you have no particular reason to believe the argument is flawed, the background chance of an argument being flawed is still greater than one in a billion.
More than one in a billion times a political scientist writes a model, ey will get completely confused and write something with no relation to reality. More than one in a billion times a programmer writes a program to crunch political statistics, there will be a bug that completely invalidates the results. More than one in a billion times a staffer at a website publishes the results of a political calculation online, ey will accidentally switch which candidate goes with which chance of winning.
So one must distinguish between levels of confidence internal and external to a specific model or argument. Here the model's internal level of confidence is 999,999,999/billion. But my external level of confidence should be lower, even if the model is my only evidence, by an amount proportional to my trust in the model.
Edited to add:
I'm surprised by negative/neutral reviews. This means that either I'm simply wrong about what counts as interesting, or I haven't expressed my point very well. Based on commenter response, I think the problem is the latter. In the next week or so, expect a much more concise version of this post that expresses my point about epistemology without the detour through a criticism of economics.
At the beginning of my last post, I was rather uncharitable to neoclassical economics:
If I had to choose a single piece of evidence off of which to argue that the rationality assumption of neoclassical economics is totally, irretrievably incorrect, it's this article about financial incentives and medication compliance.... [to maintain that this theory is correct] is to crush reality into a theory that cannot hold it.
Some mistook this to mean that I believe neoclassical economists honestly, explicitly believe that all people are always totally rational. But, to quote Rick Moranis, "It's not what you think. It's far, far worse." The problem is that they often take the complex framework of neoclassical economics and believe that a valid deduction within this framework is a valid deduction about the real world. However, deductions within any given framework are entirely uninformative unless the framework corresponds to reality. But, because such deductions are internally valid, we often give them far more weight than they are due. Testing the fit of a theoretical framework to reality is hard, but a valid deduction within a framework feels so very satisfying. But even if you have a fantastically engineered hammer, you cannot go around assuming everything you want to use it on is a nail. It is all too common for experts to assume that their framework applies cleanly to the real world simply because it works so well in its own world.
If this concept doesn't make perfect sense, that's what the rest of this post is about: spelling out exactly how we go wrong when we misuse the essentially circular models of many sciences, and how this matters. We will begin with the one discipline in which this problem does not occur. The one discipline which appears immune to this type of problem is mathematics, the paragon of "pure" academic disciplines. This is principally because mathematics appears to have perfect conformity with reality, with no research or experimentation needed to ensure said conformity. The entire system of mathematics exists, in a sense, in its own world. You could sit in windowless room (perhaps one with a supercomputer) and, theoretically, derive every major theorem of mathematics, given the proper axioms. The answer to the most difficult unsolved problems in mathematics was determined the moment the terms and operators within them were defined - once you say a "circle" is "a convex polygon with every point equidistant from a center," you have already determined every single digit of pi. The problem is finding out exactly how this model works - making calculations and deductions within this model. In the case of mathematics, for whatever reason, the model conforms perfectly to the real world, so any valid mathematical deduction is a valid deduction in the real world.
This is not the case in any true science, which by necessity must rely on experiment and observation. Every science operates off of some simplified model of the world, at least with our current state of knowledge. This creates two avenues of progress: discoveries within the model, which allow one to make predictions about the world, and refinements of the model, which make such predictions more accurate. If we have an internally consistent framework, theoretical manipulation within our model will never show us our error, because our model is circular and functions outside the real world. It would be like trying to predict a stock market crash by analyzing the rules of Monopoly, except that it doesn't feel absurd. There's nothing wrong with the model qua the model, the problem is with the model qua reality, and we have to look at both of them to figure that out.
Economics is one of the fields that most suffers from this problem. Our mathematician in his windowless room could generate models of international exchange rates without ever having seen currency, once we gave him the appropriate definitions and assumptions. However, when we try using these models to forecast the future, life gets complicated. No amount of experimenting within our original model will fix this without looking at the real world. At best, we come up with some equations that appear to conform to what we observe, but we run the risk that the correspondence is incidental or that there were some (temporarily) constant variables we left out that will suddenly cease to be constant and break the whole model. It is all too easy to forget that the tremendous rigor and certainty we feel when we solve the equations of our model does not translate into the real world. Getting the "right" answer within the model is not the same thing as getting the real answer.
As an obvious practical example, an individual with a serious excess of free time could develop a model of economics which assumes that agents are rational paper-clip maximizers - that agents are rational and their ultimate concern is maximizing the number of existing paper-clips. Given even more free time and a certain amount of genius, you could even model the behaviour of irrational paper-clip maximizers, so long as you had a definition of irrational. But however refined these models are, they models will remain entirely useless unless you actually have some paper-clip maximizers whose behaviour you want to predict. And even then, you would need to evaluate your predictions after they succeed or fail. Developing a great hammer is relatively useless if the thing you need to make must be put together with screws.
There is an obvious difference in the magnitude of this problem between the sciences, and it seems to be based on the difficulty of experimenting within them. In harder sciences where experiments are fairly straightforwards, like physics and chemistry, it is not terribly difficult to make models that conform well with reality. The bleeding edge of, say, physics, tends to like in areas that are either extremely hard to observe, like the subatomic, or extremely computation-intensive. In softer sciences, experiments are very difficult, and our models rely much more on powerful assumptions, social values, and armchair reasoning.
As humans, we are both bound and compelled to use the tools we have at our disposal. The problem here is one of uncertainty. We know that most of our assumptions in economics are empirically off, but we don't know how wrong or how much that matters when we make predictions. But the model nevertheless seeps into the very core of our model of reality itself. We cannot feel this disconnect when we try to make predictions; a well-designed model feels so complete that there is no feeling of error when we try to apply it. This is likely because we are applying it correctly, but it just doesn't apply to reality. This leads people to have high degrees of certainty and yet frequently be wrong. It would not surprise me if the failure of many experts to appreciate the model-reality gap is responsible for a large proportion of incorrect predictions.
This, unfortunately, is not the end of the problem. It gets much worse when you add a normative element into your model, when you get to call some things, "efficient" or "healthful," or "normal," or "insane." There is also a serious question as to whether this false certainty is preferable to the vague unfalsifiability of even softer social sciences. But I shall save these subjects for future posts.
Response to Man-with-a-hammer syndrome.
It's been claimed that there is no way to spot Affective Death Spirals, or cultish obsession with the One Big Idea of Everything. I'd like to posit a simple way to spot such error, with the caveat that it may not work for every case.
There's an old game called Two Truths and a Lie. I'd bet almost everyone's heard of it, but I'll summarize it just in case. A person makes three statements, and the other players must guess which of those statements is false. The statement-maker gets points for fooling people, people get points for not being fooled. That's it. I'd like to propose a rationalist's version of this game that should serve as a nifty check on certain Affective Death Spirals, runaway Theory-Of-Everythings, and Perfectly General Explanations. It's almost as simple.
Say you have a theory about human behaviour. Get a friend to do a little research and assert three factual claims about how people behave that your theory would realistically apply to. At least one of these claims must be false. See if you can explain every claim using your theory before learning which one's false.
If you can come up with a convincing explanation for all three statements, you must be very cautious when using your One Theory. If it can explain falsehoods, there's a very high risk you're going to use it to justify whatever prior beliefs you have. Even worse, you may use it to infer facts about the world, even though it is clearly not consistent enough to do so reliably. You must exercise the utmost caution in applying your One Theory, if not abandon reliance on it altogether. If, on the other hand, you can't come up with a convincing way to explain some of the statements, and those turn out to be the false ones, then there's at least a chance you're on to something.
Come to think of it, this is an excellent challenge to any proponent of a Big Idea. Give them three facts, some of which are false, and see if their Idea can discriminate. Just remember to be ruthless when they get it wrong; it doesn't prove their idea is totally wrong, only that reliance upon it would be.
Edited to clarify: My argument is not that one should simply abandon a theory altogether. In some cases, this may be justified, if all the theory has going for it is its predictive power, and you show it lacks that, toss it. But in the case of broad, complex theories that actually can explain many divergent outcomes, this exercise should teach you not to rely on that theory as a means of inference. Yes, you should believe in evolution. No, you shouldn't make broad inferences about human behaviour without any data because they are consistent with evolution, unless your application of the theory of evolution is so precise and well-informed that you can consistently pass the Two-Truths-and-a-Lie Test.
See also: The Importance of Saying "Oops"
I'm posting this to call attention to the fact that I've now reconsidered the highly confident probability estimates in my post from yesterday on the Knox/Sollecito case. I haven't retracted my arguments; I just now think the level of confidence in them that I specified was too high. I've added the following paragraph to the concluding section:
[EDIT: After reading comments on this post, I have done some updating of my own. I now think I failed to adequately consider the possibility of my own overconfidence. This was pretty stupid of me, since it meant that the focus was taken away from the actual arguments in this post, and basically toward the issue of whether 0.001 can possibly be a rational estimate for anything you read about on the Internet. The qualitative reasoning of this post, of course, stands. Also, the focus of my accusations of irrationality was not primarily the LW community as reflected in my previous post; I actually think we did a pretty good job of coming to the right conclusion given the information provided -- and as others have noted, the levelheadedness with which we did so was impressive.]
While object-level comments on the case and on my reasoning about it should probably continue to be confined to that thread, I'd be interested in hearing in comments here what people think about the following:
- How much of a distraction did you find my extremely confident probabilities to be from the substance of my arguments?
- How much did those confident estimates make it seem like I was disagreeing, rather than agreeing, with the LW survey consensus? (It seemed to me that I had provoked people into trumpeting pro-guilt arguments more than they otherwise would have if I had initally given more "reasonable" numbers.)
- To what sorts of propositions, if any, do you yourself assign probabilities on the order of 0.999 or 0.001?
In my journeys across the land, I have, to date, encountered four sets of probability calibration tests. (If you just want to make bets on your predictions, you can use Intrade or another prediction market, but these generally don't record calibration data, only which of your bets paid out.) If anyone knows of other tests, please do mention them in the comments, and I'll add them to this post. To avoid spoilers, please do not post what you guessed for the calibration questions, or what the answers are.
The first, to boast shamelessly, is my own, at http://www.acceleratingfuture.com/tom/?p=129. My tests use fairly standard trivia questions (samples: "George Washington actually fathered how many children?", "Who was Woody Allen's first wife?", "What was Paul Revere's occupation?"), with an emphasis towards history and pop culture. The quizzes are scored automatically (by computer) and you choose whether to assign a probability of 96%, 90%, 75%, 50%, or 25% to your answer. There are five quizzes with fifty questions each: Quiz #1, Quiz #2, Quiz #3, Quiz #4 and Quiz #5.
Previously in series: Mandatory Secret Identities
I've noticed a serious problem in which aspiring rationalists vastly overestimate their ability to optimize other people's lives. And I think I have some idea of how the problem arises.
You read nineteen different webpages advising you about personal improvement—productivity, dieting, saving money. And the writers all sound bright and enthusiastic about Their Method, they tell tales of how it worked for them and promise amazing results...
But most of the advice rings so false as to not even seem worth considering. So you sigh, mournfully pondering the wild, childish enthusiasm that people can seem to work up for just about anything, no matter how silly. Pieces of advice #4 and #15 sound interesting, and you try them, but... they don't... quite... well, it fails miserably. The advice was wrong, or you couldn't do it, and either way you're not any better off.
And then you read the twentieth piece of advice—or even more, you discover a twentieth method that wasn't in any of the pages—and STARS ABOVE IT ACTUALLY WORKS THIS TIME.
At long, long last you have discovered the real way, the right way, the way that actually works. And when someone else gets into the sort of trouble you used to have—well, this time you know how to help them. You can save them all the trouble of reading through nineteen useless pieces of advice and skip directly to the correct answer. As an aspiring rationalist you've already learned that most people don't listen, and you usually don't bother—but this person is a friend, someone you know, someone you trust and respect to listen.
And so you put a comradely hand on their shoulder, look them straight in the eyes, and tell them how to do it.
Consider the problem of an agent who is offered a chance to improve their epistemic rationality for a price. What is such an agent's optimal strategy?
A complete answer to this problem would involve a mathematical model to estimate the expected increase in utility associated with having more correct beliefs. I don't have a complete answer, but I'm pretty sure about one thing: From an instrumental rationalist's point of view, to always accept or always refuse such offers is downright irrational.
And now for the kicker: You might be such an agent.
One of the classic demonstrations of the Fundamental Attribution Error is the 'quiz study' of Ross, Amabile, and Steinmetz (1977). In the study, subjects were randomly assigned to either ask or answer questions in quiz show style, and were observed by other subjects who were asked to rate them for competence/knowledge. Even knowing that the assignments were random did not prevent the raters from rating the questioners higher than the answerers. Of course, when we rate individuals highly the affect heuristic comes into play, and if we're not careful that can lead to a super-happy death spiral of reverence. Students can revere teachers or science popularizers (even devotion to Richard Dawkins can get a bit extreme at his busy web forum) simply because the former only interact with the latter in domains where the students know less. This is certainly a problem with blogging, where the blogger chooses to post in domains of expertise.
Specifically, Eliezer's writing at Overcoming Bias has provided nice introductions to many standard concepts and arguments from philosophy, economics, and psychology: the philosophical compatibilist account of free will, utility functions, standard biases, and much more. These are great concepts, and many commenters report that they have been greatly influenced by their introductions to them at Overcoming Bias, but the psychological default will be to overrate the messenger. This danger is particularly great in light of his writing style, and when the fact that a point is already extant in the literature, and is either being relayed or reinvented, isn't noted. To address a few cases of the latter: Gary Drescher covered much of the content of Eliezer's Overcoming Bias posts (mostly very well), from timeless physics to Newcomb's problems to quantum mechanics, in a book back in May 2006, while Eliezer's irrealist meta-ethics would be very familiar to modern philosophers like Don Loeb or Josh Greene, and isn't so far from the 18th century philosopher David Hume.
If you're feeling a tendency to cultish hero-worship, reading such independent prior analyses is a noncultish way to diffuse it, and the history of science suggests that this procedure will be applicable to almost anyone you're tempted to revere. Wallace invented the idea of evolution through natural selection independently of Darwin, and Leibniz and Newton independently developed calculus. With respect to our other host, Hans Moravec came up with the probabilistic Simulation Argument long before Nick Bostrom became known for reinventing it (possibly with forgotten influence from reading the book, or its influence on interlocutors). When we post here we can make an effort to find and explicitly acknowledge such influences or independent discoveries, to recognize the contributions of Rational We, as well as Me.
Previously in series: Epistemic Viciousness
Robyn Dawes, author of one of the original papers from Judgment Under Uncertainty and of the book Rational Choice in an Uncertain World—one of the few who tries really hard to import the results to real life—is also the author of House of Cards: Psychology and Psychotherapy Built on Myth.
From House of Cards, chapter 1:
The ability of these professionals has been subjected to empirical scrutiny—for example, their effectiveness as therapists (Chapter 2), their insight about people (Chapter 3), and the relationship between how well they function and the amount of experience they have had in their field (Chapter 4). Virtually all the research—and this book will reference more than three hundred empirical investigations and summaries of investigations—has found that these professionals' claims to superior intuitive insight, understanding, and skill as therapists are simply invalid...
Remember Rorschach ink-blot tests? It's such an appealing argument: the patient looks at the ink-blot and says what he sees, the psychotherapist interprets their psychological state based on this. There've been hundreds of experiments looking for some evidence that it actually works. Since you're reading this, you can guess the answer is simply "No." Yet the Rorschach is still in use. It's just such a good story that psychotherapists just can't bring themselves to believe the vast mounds of experimental evidence saying it doesn't work—
—which tells you what sort of field we're dealing with here.
And the experimental results on the field as a whole are commensurate. Yes, patients who see psychotherapists have been known to get better faster than patients who simply do nothing. But there is no statistically discernible difference between the many schools of psychotherapy. There is no discernible gain from years of expertise.
And there's also no discernible difference between seeing a psychotherapist and spending the same amount of time talking to a randomly selected college professor from another field. It's just talking to anyone that helps you get better, apparently.
In the entire absence of the slightest experimental evidence for their effectiveness, psychotherapists became licensed by states, their testimony accepted in court, their teaching schools accredited, and their bills paid by health insurance.
And there was also a huge proliferation of "schools", of traditions of practice, in psychotherapy; despite—or perhaps because of—the lack of any experiments showing that one school was better than another...
View more: Next