HT reddit/r/science: http://www.ruudwetzels.com//articles/Wagenmakersetal_subm.pdf

Probably nobody here is surprised, but I thought it might be of interest.


That article is actually a really good introduction to the advantages of Bayesian statistics over the standard p-value approach in experimental work.

And it is an even better introduction to proper experiment design. In particular, it argues eloquently for a very clear distinction between exploratory and confirmatory experiments. This distinction should be drilled into every junior undergrad, but somehow even world-famous experimental psychologists can miss it. Now, if we asked Daryl Bem about this, he would probably say that all his experiments were exploratory, and now it is the task of the research community to confirm or reject them. But the problem is, he used significance tests that were only suitable for confirmatory experiments.
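To make that concrete with a toy simulation (my own sketch, not from the paper): if an exploratory analysis measures, say, ten independent null outcomes on the same sample and reports whichever one looks best, the chance of finding at least one nominally "significant" result is far above the advertised 5%.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_sims, n_outcomes, n = 2000, 10, 50
hits = 0
for _ in range(n_sims):
    # Ten independent outcomes, all true nulls, measured on one sample;
    # the "exploratory" write-up reports whichever p-value is smallest.
    pvals = [stats.ttest_1samp(rng.normal(size=n), 0).pvalue
             for _ in range(n_outcomes)]
    hits += min(pvals) < 0.05
print(hits / n_sims)  # roughly 1 - 0.95**10, i.e. about 0.40, not 0.05
```

Which is exactly why an exploratory hit only becomes evidence once it survives a separate confirmatory test.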

That response boils down to "abusing statistics is okay because other fields are doing it too". But the fact that other scientific fields are also abusing statistics does not make it okay, because it does not make the conclusions that result from statistical abuses true. The choice of which statistical test to use is not arbitrary, and using the wrong one is as bad as writing down the wrong value for a low-order digit; you can get away with it when the effect size is large, but not here.
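On the specific issue of one-sided tests (which Wagenmakers et al. flag in the paper): when the observed effect points in the predicted direction, a one-sided test simply halves the two-sided p-value, which is enough to push a marginal result over the conventional .05 line. A minimal SciPy illustration with made-up numbers:

```python
from scipy import stats

# A t statistic of 1.98 with 99 degrees of freedom (100 participants):
t, df = 1.98, 99
two_sided = 2 * stats.t.sf(abs(t), df)   # ~0.050 -- marginal
one_sided = stats.t.sf(t, df)            # ~0.025 -- "significant"
print(two_sided, one_sided)
```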

(Paraphrased from my reply on that article's comments section)

From the blog post:

Using a different sort of statistical test than Bem used, they re-analyze Bem's data and they find that, while the results are positive, they are not positive enough to pass the level of "statistical significance." They conclude that a somewhat larger sample size would be needed to conclude statistical significance using the test they used.

Err, that's not what they found. Over half the data was not merely "not positive enough", but literally negative.


An important reply cited in the other threads:
http://www.ruudwetzels.com//articles/Wagenmakersetal_subm.pdf

Does psi exist? In a recent article, Dr. Bem conducted nine studies with over a thousand participants in an attempt to demonstrate that future events retroactively affect people’s responses. Here we discuss several limitations of Bem’s experiments on psi; in particular, we show that the data analysis was partly exploratory, and that one-sided p-values may overstate the statistical evidence against the null hypothesis. We reanalyze Bem’s data using a default Bayesian t-test and show that the evidence for psi is weak to nonexistent.
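For anyone who wants to try the reanalysis themselves, here is a rough sketch of the default Bayesian t-test for a one-sample design, i.e. the JZS Bayes factor of Rouder et al. (2009) that Wagenmakers et al. build on. The t value and sample size below are placeholders, not Bem's actual numbers, and the Cauchy prior scale r=1 is my assumption about the default.

```python
import numpy as np
from scipy import integrate

def jzs_bf10(t, n, r=1.0):
    """One-sample JZS Bayes factor BF10 (alternative over null) from a t statistic,
    following Rouder et al. (2009); r is the scale of the Cauchy prior on effect size."""
    nu = n - 1
    # Marginal likelihood under H0 (up to a constant shared with H1)
    null = (1 + t**2 / nu) ** (-(nu + 1) / 2)
    # Under H1, integrate over g ~ Inverse-Gamma(1/2, r^2/2)
    def integrand(g):
        return ((1 + n * g) ** -0.5
                * (1 + t**2 / ((1 + n * g) * nu)) ** (-(nu + 1) / 2)
                * r / np.sqrt(2 * np.pi) * g ** -1.5 * np.exp(-r**2 / (2 * g)))
    alt, _ = integrate.quad(integrand, 0, np.inf)
    return alt / null

# Placeholder numbers: a t just past the one-sided .05 threshold with 100 participants.
# A Bayes factor near 1 means the data barely discriminate between H0 and H1.
print(jzs_bf10(t=1.7, n=100))
```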
