This is a well-recognized issue in the philosophy of science, known as the Duhem thesis (after Pierre Duhem): any test of a theory is actually a test of a conjunction of the theory and the theory of the experiment, so any disconfirming data might have disconfirmed the theory, the theory of the experiment, or both. The picture turns out to be even more complex, because more than the theory and the theory of the experiment are tested: so are background assumptions, often nonconscious ones, and seemingly irrelevant assumptions about statistics. The most detailed technical analysis of theory testing was developed by P. Suppes (1962), "Models of data," in Logic, Methodology, and Philosophy of Science (E. Nagel et al., eds.), Stanford UP. The proud rejection of data in favor of theory is well attested in the history of science, and formed a key part of the neo-Popperian philosophy of Imre Lakatos, e.g. (1970), "Criticism and the methodology of research programs," in Lakatos & Musgrave (eds.), Criticism and the Growth of Knowledge, Cambridge UP.
Leahey, since I'm not familiar with that line of philosophy, I would ask whether they had the concept of defying data as a bold and insolent act, one which calls for further attention, and which explicitly confesses that the theory and the experimental result are not compatible (so that replication will indeed destroy the theory). This is what I had thought was my innovation. I would not characterize the rejection of data in favor of theory as proud, but rather as sticking your head way out so that someone can easily chop it off.
The post reminds me of one of my favorite scenes from a Hitchcock movie (which I transcribed a while back, knowing it would be useful at some point in the future).
In Hitchcock’s 1938 movie The Lady Vanishes, the heroine Iris Henderson is traveling on a train in the same compartment as an old lady. When the old lady disappears (it later turns out to be connected with a spy ring), Iris scours the train in search of her. She meets a German doctor named Dr. Hartz, who accompanies her on her search for the old lady. But when everyone denies having seen the old lady, Dr. Hartz theorizes that some psychological hallucination must have caused Iris to imagine the old lady’s existence.
But then, finally, Iris finds one woman who admits to having seen the old lady. Iris then confronts Dr. Hartz with this new witness:
“You’ll have to think of a fresh theory now, Doctor.”
“It is not necessary,” Dr. Hartz responds. “My theory was a perfectly good one. The facts were misleading.”
I love this idea. It reminds me of a bit of management advice I once heard: knives in the chest, not in the back. For better results, get debate out in the open.
I've read that physics journals use a 0.001% standard for statistical significance. Maybe, in the case of "alternative medicine" or "parapsychology" experiments that must rely on unknown physics to work, we should insist on the same standard? ;)
Eliezer, I think you're mistaken on the facts -- most theories take a lot of experimental anomaly before they get thrown out. Kuhn, for example, in The Structure of Scientific Revolutions (which I think is a much better work of history than philosophy, but anyway...), gives a marvelous description of "normal science" as just that -- tinkering with the dominant paradigm, fitting new results into it bit by bit, etc.
My experience in engineering fields is quite a bit different. Strange results happened frequently, and data was ignored when the circumstances warranted it. Occasionally strange data led the researcher to an "A-Ha!" moment, but most often the cause was likely "operator error" or poor experimental design.
The engineers I've worked with were quite savvy about the limitations of their experimental methods and these issues were debated vigorously and openly.
Apparently, things are a bit different in pure science. Or maybe I was just oblivious to the 'character assassination' that was happening.
There's a particular field in which you have to do this all the time: parapsychology, in which people with PhD credentials somehow manage to coax absurd results from apparently impeccable experimental procedures. Daryl Bem is a good example, as is Diane Hennessy Powell.
I can even understand why they think we are the ones being unscientific: I just showed you data! What more do you want? Well, to be honest, we want data that doesn't contradict 200 years of past research in every other scientific field. We know there must be something wrong with your experiment, even if we can't pin down what it is (often this comes to light later!), because the result that people can see the future or communicate with the dead just contradicts way too much of what we already knew for us to take it seriously. It's a lot like publishing a proof that 2=1.
Well, here's an example from my own experience. It's fairly dramatic, actually, to have this happen. Three or four of us, in 1957, put forward a partially complete theory of one of these forces, this weak force. And it was in disagreement with seven -- seven, count them, seven experiments. Experiments were all wrong. And we published before knowing that, because we figured it was so beautiful, it's gotta be right! The experiments had to be wrong, and they were. Now our friend over there, Albert Einstein, used to pay very little attention when people said, "You know, there's a man with an experiment that seems to disagree with special relativity. DC Miller. What about that?" And he would say, "Aw, that'll go away." (Laughter)
(Also, the people in the superluminal neutrino fiasco at OPERA did all but defy their own data in the last paragraph of their original arXiv paper.)
One of the great weaknesses of Science is this mistaken idea that if an experiment contradicts the dominant theory, we should throw out the theory instead of the experiment.
Experiments can go awry. They can contain design flaws. They can be deliberately corrupted. They can be unconsciously corrupted. They can be selectively reported. Most of all, 1 time in 20 they can be "statistically significant" by sheer coincidence, and there are a lot of experiments out there.
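The one-in-twenty arithmetic can be checked directly. A minimal simulation sketch in Python (the group sizes, trial count, and z-test are illustrative assumptions, not from the original post): experiments where the null hypothesis is true by construction still come out "significant" at p < 0.05 about 5% of the time.

```python
import random

random.seed(0)

def null_experiment(n=100):
    """One null experiment: compare two groups drawn from the SAME
    distribution, and report True if the difference in means looks
    'significant' at the 5% level (two-sided z-test, known sigma=1)."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    se = (2 / n) ** 0.5            # standard error of the difference
    z = abs(mean_a - mean_b) / se
    return z > 1.96                 # |z| > 1.96 corresponds to p < 0.05

trials = 10_000
false_positives = sum(null_experiment() for _ in range(trials))
print(false_positives / trials)     # roughly 0.05: about 1 in 20
```

Every flagged result here is a pure coincidence, since both groups come from the identical distribution; multiply that 5% by the number of experiments run worldwide and a steady supply of spurious "significant" findings is guaranteed.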
Unfortunately, Science has this notion that you can never go against an honestly obtained experimental result. So, when someone obtains an experimental result that contradicts the standard model, researchers are faced with a dilemma for resolving their cognitive dissonance: they either have to immediately throw away the standard model, or else attack the experiment - accuse the researchers of dishonesty, or flawed design, or conflict of interest...
Someone once presented me with a new study on the effects of intercessory prayer (that is, people praying for patients who are not told about the prayer), which showed 50% of the prayed-for patients achieving success at in-vitro fertilization, versus 25% of the control group. I liked this claim. It had a nice large effect size. Claims of blatant impossible effects are much more pleasant to deal with than claims of small impossible effects that are "statistically significant".
So I cheerfully said: "I defy the data."
My original phrasing was actually "I deny the data". Nonetheless I said it outright, without apology, and with deliberate insolence. I am keeping my theory; your experiment is wrong.
If an experimental result contradicts the Standard Model, this is an important fact. It needs to be openly acknowledged. An experiment that makes traditionalists want to discard the data - or even an experiment that makes traditionalists very skeptical of the data - should be a high priority for replication. An experiment worth defying should command attention!
But it is not socially acceptable to say, "The hell with your experimental falsification, I'm keeping my theory." So the data has to be defied covertly - by character assassination of the researchers, by sly innuendos, by dire hints of controversy. The data has to be dismissed, excused away, swept under a rug, silently into the dark, because you can't admit you're defying the data. This is not a good way of focusing attention on an anomalous result. This is not a good way to ensure funding for replication attempts.
It would be much better if science had a standard procedure for saying, "I defy the data!" It would be clearly understood that this was a bold act, and someone else in the audience might stand up and say, "Wait a minute, is that data really worth defying?" If a major figure in the field said "I defy the data!", this would be sufficient justification on grant proposals for why the result urgently needed replication. Scientists could say, "I'm holding my breath, waiting for replication," rather than having to take sides immediately in the character-assassination controversy.
Maybe you could even get the media to report that the experiment has been "published but defied". Then the replication, or failure to replicate, would be news. The replicators could get their names in the newspaper, and the negative result could be published in a major journal. If you want replications done, you'll have to offer some incentive.
I would also suggest that when an experiment is defied, the replication must pre-declare a minimum effect size, and attain significance of p<0.01. In extreme cases where claims have been made and shot down before, p<0.001.
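A replication gate along these lines could be sketched as follows. This is a hypothetical check, not anyone's actual protocol: the counts, the 0.15 minimum effect, and the pooled two-proportion z-test are all illustrative assumptions.

```python
from math import erfc, sqrt

# Pre-declared replication criteria (illustrative values).
MIN_EFFECT = 0.15   # minimum difference in success rates worth claiming
ALPHA = 0.01        # significance threshold for a defied result

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value for a two-proportion z-test (pooled variance)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    return erfc(z / sqrt(2))

# Hypothetical replication outcome: 25/50 successes vs 13/50 controls.
x_treat, n_treat, x_ctrl, n_ctrl = 25, 50, 13, 50
effect = x_treat / n_treat - x_ctrl / n_ctrl
p_value = two_proportion_p(x_treat, n_treat, x_ctrl, n_ctrl)

passed = effect >= MIN_EFFECT and p_value < ALPHA
print(f"effect={effect:.2f}, p={p_value:.4f}, passed={passed}")
```

With these made-up numbers the observed effect clears the pre-declared minimum, but the p-value lands near 0.013, so the replication fails the stricter p < 0.01 bar: exactly the kind of verdict that pre-declaring the criteria makes unambiguous.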
Oh, and the prayer study? Soon enough we heard that it had been retracted and was probably fraudulent. But I didn't say fraud. I didn't speculate on how the results might have been obtained. That would have been dismissive. I just stuck my neck out, and nakedly, boldly, without excuses, defied the data.
Addendum: I should have spelled this out explicitly: You can defy the data on one experiment. You can't defy the data on multiple experiments. At that point you either have to relinquish the theory or dismiss the data - point to a design flaw, or refer to an even larger body of experiments that failed to replicate the result, or accuse the researchers of a deliberate hoax, et cetera. But you should not turn around and argue that the theory and the experiment are actually compatible. Why didn't you think of that before you defied the data? Defying the data admits that the data is not compatible with your theory; it sticks your neck way out, so your head can be easily chopped off.