The amusing thing is that Mitchell's argument proves much more than he wants it to prove.
Because experiments can be undermined by a vast number of practical mistakes, the likeliest explanation for any failed replication will always be that the replicator bungled something along the way. Unless direct replications are conducted by flawless experimenters, nothing interesting can be learned from them.
Notice that the above argument applies just as well to the original experiment being replicated.
Yes, noticed.
Has anyone read his entire article? Does he attempt any justification for why this particular argument doesn't equally apply to the original experiment?
One principle I try to keep in mind is "The other guy is probably not a total moron. If it seems that way, you're probably missing something."
I read it. He has a section titled "The asymmetry between positive and negative evidence".
His argument is that a positive result is like seeing a black swan, and a null result is like seeing a white swan, and once you see a black swan, then no matter how many white swans you see it doesn't prove that all swans are white.
He addresses the objection that this leaves us unable to ever reject a spurious claim. His answer is that, since negative evidence is always meaningless, we should instead look for positive evidence that the original experimenter was wrong.
I think this is a fair summary of the section. It's not long, so you can check for yourself. I am... not impressed.
There's a lot wrong with the argument; he has no actual justification for assuming that social science is anything like swan-spotting.
But even within his unjustified analogy... apparently if someone reports a new color of swan in Australia, he might give polygraphs and vision tests to the reporter, but sending an expedition to Australia to check it out would be of no scientific value.
I don't think that means you are smarter than that Harvard professor. He is a very successful person and has reached heights coveted by many very smart people. It just means that the game he is playing is not one where you get ahead by saying things that make sense.
For example, if you listen to a successful politician and spot a false statement he utters, that does not mean that you are smarter than that politician.
You should have looked at his vita for a more accurate description of his activities. Some of his paper titles indicate he's no stranger to social-justice-style theorizing and investigation, and likewise his funding sources, on top of Harvard's well-earned reputation: e.g. "What’s in a forename?: Cue familiarity and stereotypical thinking", "Gender differences in implicit weight identity", "Deflecting negative self-relevant stereotype activation: The effects of individuation", "Me and my group: Cultural status can disrupt cognitive consistency", and the funding:
June 2007 – May 2010: National Science Foundation (BCS 0642448), "The neural basis of stereotyping", $609,800 (co-PI: Mahzarin Banaji) ... September 2010 – August 2012: Templeton Foundation for Positive Neuroscience, "Vicarious Neural Response to Others as a Basis for Altruistic Behavior", $180,000 (co-PI: Jamil Zaki)
And now you find a man saying that it is an irrelevant demand to expect a repeatable experiment. This is science?
-- Richard Feynman, "Cargo Cult Science"
(Yes, I am aware of the irony of appealing to authority to mock someone who says we need to defer more to established authorities.)
I sort of side with Mitchell on this.
A mentor of mine once told me that replication is useful, but not the most useful thing you could be doing because it's often better to do a followup experiment that rests on the premises established by the initial experiment. If the first experiment was wrong, the second experiment will end up wrong too. Science should not go even slower than it already does - just update and move on, don't obsess.
It's kind of like how some of the landmark studies on priming failed to replicate, but there are so many followup studies that priming explains really well that it seems a bit silly to throw out the notion of priming just because of that.
Keep in mind, while you are unlikely to hit statistical significance where there is no real result, it's not statistically unlikely to have a real result that doesn't hit significance the next time you do it. Significance tests are tuned to produce false negatives more often than false positives.
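To see the asymmetry concretely, here's a quick simulation (the effect size, sample size, and alpha are purely illustrative assumptions, not numbers from any of the studies being discussed):

```python
# Minimal sketch: false positives are capped at alpha, but a real, modest effect
# in an underpowered study often fails to reach significance on any given run.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n, d, trials = 0.05, 30, 0.4, 10_000   # assumed: n per group, true effect size d

def significant(effect):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(effect, 1.0, n)
    return stats.ttest_ind(a, b).pvalue < alpha

false_pos = sum(significant(0.0) for _ in range(trials)) / trials  # ~0.05 by construction
power     = sum(significant(d)   for _ in range(trials)) / trials  # only ~0.3 at these numbers

print(f"False-positive rate with no real effect: {false_pos:.2f}")
print(f"Chance a real effect (d={d}) reaches significance: {power:.2f}")
```

At numbers like these, a real effect misses significance more often than not, while a nonexistent one "succeeds" only about one time in twenty.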
Emotionally though... when you get a positive result in breast cancer screening even when you're not at risk, you don't just shrug and say "probably a false positive" even though it is. Instead, you irrationally d...
If the first experiment was wrong, the second experiment will end up wrong too.
I guess the context is important here. If the first experiment was wrong, and the second experiment therefore fails, will you publish the failure of the second experiment? Will you also publish your suspicion that the first experiment was wrong? And how likely are people to believe that your results prove the first experiment wrong, when what you actually ran was a different experiment?
Here is what the selection bias will do otherwise:
20 people will try 20 "second experiments" at p = 0.05. If the effect isn't real, roughly 19 of them will fail, and one will succeed by chance and publish the results of their successful second experiment. Then, using the same strategy, 20 people will try 20 "third experiments", and again one of them will succeed... Ten years later, you can have a dozen experiments examining and confirming the theory from a dozen different angles, so the theory seems completely solid.
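Here's a rough sketch of that process (the number of labs, number of rounds, sample sizes, and the "only significant results get written up" rule are all assumptions made for illustration):

```python
# Publication-bias sketch: a nonexistent effect, 20 labs per round, only hits get published.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
labs_per_round, rounds, n, alpha = 20, 12, 30, 0.05

published = 0
for r in range(rounds):
    for lab in range(labs_per_round):
        # Each "follow-up" rests on a premise that is actually false:
        # both groups are drawn from the same distribution.
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(0.0, 1.0, n)
        if stats.ttest_ind(a, b).pvalue < alpha:
            published += 1   # the one lucky lab publishes its "confirmation"
            break            # the ~19 failures are never written up

print(f"Published confirmations of a nonexistent effect: {published} out of {rounds} rounds")
```

With 20 attempts per round at alpha = 0.05, most rounds produce at least one spurious "hit", and since the misses never appear in print, the literature ends up looking like a string of independent confirmations.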
It's kind of like how some of the landmark studies on priming failed to replicate, but there are so many followup studies that priming explains really well that it seems a bit silly to throw out the notion of priming just because of that.
Is there a chance that the process I described was responsible for this?
A mentor of mine once told me that replication is useful, but not the most useful thing you could be doing because it's often better to do a followup experiment that rests on the premises established by the initial experiment. If the first experiment was wrong, the second experiment will end up wrong too. Science should not go even slower than it already does - just update and move on, don't obsess.
Tell me, does anyone actually do what you think they should do? That is, based on a long chain of ideas A->B->C->D, none of which have been replicated, upon experimenting and learning ~Z, do they ever reject the bogus theory D? (Or wait, was it C that should be rejected? Or maybe the ~Z should be rejected, since the experiment just wasn't powered enough to be meaningful, as almost all studies are underpowered; or can you really say that A->B->C->D logically entailed Z? Maybe some other factor interfered with Z and so we can 'save the appearances' of A..D! Yes, that's definitely it!) "Theory-testing in psychology and physics: a methodological paradox", Meehl 1967, puts it nicely (and this is as true as the day he wrote it half a century ago):
...This last methodological si
You should probably have read part of the second sentence: "active vs passive control groups criticism: found, and it accounts for most of the net effect size".
When natural scientists attempt to replicate famous experiments where the original result was clearly correct, with what probability do they tend to succeed? Is it closer to 1 than, say, .7?
I'd think that "famous experiments where the original result was clearly correct" are exactly those whose results have already been replicated repeatedly. If they haven't been replicated they may well be famous -- Stanford prison experiment, I'm looking at you -- but they aren't clearly correct.
Rather, the problem is that at least one celebrated authority in the field hates that, and would prefer much, much more deference to authority.
I don't think this is true at all. His points against replicability are very valid and match my experience as a researcher. In particular:
Because experiments can be undermined by a vast number of practical mistakes, the likeliest explanation for any failed replication will always be that the replicator bungled something along the way.
This is a very real issue and I think that if we want to solve the current issues with science we need to be honest about this, rather than close our eyes and repeat the mantra that replication will solve everything. And it's not like he's arguing against accountability. Even in your quoted passage he says:
The field of social psychology can be improved, but not by the publication of negative findings. Experimenters should be encouraged to restrict their “degrees of freedom,” for example, by specifying designs in advance.
Now, I think he goes too far by saying that no negative findings should be published; but I think they need to be held to a high standard for the very reason he gives. On the other han...
What is the purpose of an experiment in science? For instance, in the field of social psychology? For instance, what is the current value of the Milgram experiment? A few people in Connecticut did something in a room at Yale in 1961. Who cares? Maybe it's just gossip from half a century ago.
However, some people would have us believe that this experiment has broader significance, beyond the strict parameters of the original experiment, and has implications for (for example) the military in Texas and corporations in California.
Maybe these people are wrong. Maybe the Milgram experiment was a one-off fluke. If so, then let's stop mentioning it in every intro to psych textbook. While we're at it, why the hell was that experiment funded, anyway? Why should we bother funding any further social psychology experiments?
I would have thought, though, that most social psychologists would believe that the Milgram experiment has predictive significance for the real world. A Bayesian who knows about the results of the Milgram experiment should be better able to anticipate what happens in the real world. This is what an experiment is for. It changes your expectations.
However, if a supp...
Because experiments can be undermined by a vast number of practical mistakes, the likeliest explanation for any failed replication will always be that the replicator bungled something along the way.
This is a very real issue and I think that if we want to solve the current issues with science we need to be honest about this, rather than close our eyes and repeat the mantra that replication will solve everything.
Why is it more likely that the followup experiment was flawed, rather than the original? Are we giving a prior of > 50% to every hypothesis that a social scientist comes up with?
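For a concrete sense of what the prior would have to be, here's a back-of-the-envelope Bayes calculation (every number below is an illustrative assumption, not a figure from Mitchell or anyone else in this thread):

```python
# How much should a single failed replication move us? A minimal sketch.
prior_H = 0.5              # assumed prior that the hypothesized effect is real
p_fail_given_H = 0.4       # assumed chance a competent replication misses a real effect
p_fail_given_not_H = 0.95  # assumed chance it correctly finds nothing when there is nothing

posterior_H = (p_fail_given_H * prior_H) / (
    p_fail_given_H * prior_H + p_fail_given_not_H * (1 - prior_H)
)
print(f"P(effect real | failed replication) = {posterior_H:.2f}")  # ~0.30
```

Even starting from a 50% prior and granting the replicator a 40% chance of missing a real effect, the failure still pushes us well below even odds; to keep favoring the original result you need a prior substantially above 50%, or near-certainty that the replicator bungled it.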
Either way, I think you are being quite uncharitable to Mitchell.
I disagree. Let's look at this section again:
Whether they mean to or not, authors and editors of failed replications are publicly impugning the scientific integrity of their colleagues. Targets of failed replications are justifiably upset, particularly given the inadequate basis for replicators’ extraordinary claims.
Contrast this to:
“This has been difficult for me personally because it’s an area that’s important for my research,” he says. “But I choose the red pill. That’s what doing science is.”
From here, linked before on LW here.
The first view seems to have the implied assumption that false positives don't happen to good researchers, whereas the second view has the implied assumption that theories and people are separate, and people should follow the facts, rather than the other way around.
But perhaps it is the case that, in social psychology, the majority of false positives are not innocent, and thus when a researcher's results do not replicate it is a sign that they're dishonest rather than that they're unlucky. In such a case, he is declaring that researchers should not try to expose dishonesty, which should bring down opprobrium from all decent people.
This is why we can't have social science. Not because the subject is not amenable to the scientific method -- it obviously is. People are conducting controlled experiments and other people are attempting to replicate the results. So far, so good.
So, you say people are trying the scientific approach. My guess is, the nature of the problem is such that nothing much came out of these attempts. No great insights were gained, no theories were discovered. Real scientists had nothing to show for their efforts, and this is why these fields are now not owned by...
I think someone should mention Harry Collins and Trevor Pinch's book The Golem here. It's a collection of episodes from the history of science. The general theme is that in practice, new discoveries do not involve a clear-cut observation followed by theorizing; instead there is a lot of squabbling over whether the researchers involved carried out their experiments correctly, and these kinds of feuds can persist for a scientific generation.
My view is that this makes replication attempts all the more important. But it also shows that some resistance and recri...
In the second paragraph of the quote the author ignores the whole point of replication efforts. We know that scientific studies may suffer from methodological errors. The whole point of replication studies is to identify methodological errors. If the studies disagree, then you know there is an uncontrolled variable or a methodological mistake in one or both of them; further studies and the credibility of the experimenters are then used to determine which result is more likely to be true. If the independent studies agree, then that is evidence that they are both correct...
While I agree that this guy needs to hand in his "Scientist" card, he is one individual, and he no more reflects on his field than any other individual does on theirs.
There was a notable climate scientist whose response to people asking for his data was literally "no, you'll just try to use it to prove me wrong".
Edit: exact quote: "Even if WMO agrees, I will still not pass on the data. We have 25 or so years invested in the work. Why should I make the data available to you, when your aim is to try and find something wrong with it."
Jason Mitchell is [edit: has been] the John L. Loeb Associate Professor of the Social Sciences at Harvard. He has won the National Academy of Science's Troland Award as well as the Association for Psychological Science's Janet Taylor Spence Award for Transformative Early Career Contribution.
Here, he argues against the principle of replicability of experiments in science. Apparently, it's disrespectful, and presumptively wrong.
This is why we can't have social science. Not because the subject is not amenable to the scientific method -- it obviously is. People are conducting controlled experiments and other people are attempting to replicate the results. So far, so good. Rather, the problem is that at least one celebrated authority in the field hates that, and would prefer much, much more deference to authority.