ChristianKl comments on Outside the Laboratory - Less Wrong

63 Post author: Eliezer_Yudkowsky 21 January 2007 03:46AM


Comments (336)



Comment author: Eliezer_Yudkowsky 22 January 2007 08:50:26PM 6 points [-]

Sorry, ambiguous wording. 0.05 is too weak, and should be replaced with, say, 0.005. It would be a better scientific investment to do fewer studies with twice as many subjects and have nearly all the reported results be replicable. Unfortunately, this change has to be standardized within a field, because otherwise you're deliberately handicapping yourself in an arms race. This probably deserves its own post.

In my head, I always translate so-called "statistically significant" results into (an often poorly-computed approximation to) a likelihood ratio of 0.05 over the null hypothesis. I believe that experiments should report likelihood ratios.

I am an infinite set atheist - have you ever actually *seen* an infinite set?

I am a "subjective/objective" Bayesian. If we are ignorant about a phenomenon, this is a fact about our state of mind, not a fact about the phenomenon. Probabilities are in the mind, not in the environment. Nonetheless I follow a correspondence, rather than a coherentist, theory of truth: we are trying to concentrate as much subjective probability mass as possible into (the mental representation that corresponds to) the real state of affairs. See my "The Simple Truth" and "A Technical Explanation of Technical Explanation".

Comment author: ChristianKl 26 December 2012 03:24:28PM 0 points [-]

Sorry, ambiguous wording. 0.05 is too weak, and should be replaced with, say, 0.005. It would be a better scientific investment to do fewer studies with twice as many subjects and have nearly all the reported results be replicable.

I'd prefer two studies at p < 0.05 on the same claim by different scientists to one study at p < 0.005. Demonstrating that scientific studies are replicable by actually replicating them is better than going for an even lower p-value.

Comment author: gwern 26 December 2012 08:06:49PM *  3 points [-]

I'd prefer two studies at p < 0.05 on the same claim by different scientists to one study at p < 0.005.

I wouldn't. Two studies opens the door to publication bias concerns and muddles the 'replication': rarely do people do a straight replication.

From Nickerson in http://lesswrong.com/lw/g13/against_nhst/

Experiments that are literal replications of previously published experiments are very seldom published - I do not believe I have ever seen one. Others who have done systematic searches for examples of them confirm that they are rare (Mahoney, 1976; Sterling, 1959). ... PhD committees generally expect more from dissertations than the replication of someone else's findings. Evidence suggests that manuscripts that report only replication experiments are likely to get negative reactions from journal reviewers and editors alike (Neuliep & Crandall, 1990, 1993).

Comment author: Eliezer_Yudkowsky 27 December 2012 02:20:00AM 2 points [-]

I wouldn't. Two studies opens the door to publication bias concerns

Agreed. It's much easier for a false effect to garner two 'statistically significant' studies with p < .05 than to gain one statistically significant study with p < .005 (though you really want p < .0001).
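The comparison above can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions, not anything the commenters wrote: the effect is truly null, each attempted study independently has probability alpha of reaching p < alpha, and publication bias is modelled crudely as "run n attempts and report only the significant ones". The function names `prob_at_least` and `compare` are made up for this example.

```python
from math import comb


def prob_at_least(k, n, alpha):
    """P(at least k of n independent studies of a null effect reach p < alpha).

    Assumes p-values are uniform under the null, so each study
    crosses the threshold with probability alpha.
    """
    return sum(
        comb(n, i) * alpha**i * (1 - alpha) ** (n - i)
        for i in range(k, n + 1)
    )


def compare(n_attempts):
    """Return (P(two or more studies at p < .05), P(one or more at p < .005))."""
    two_weak = prob_at_least(2, n_attempts, 0.05)
    one_strong = prob_at_least(1, n_attempts, 0.005)
    return two_weak, one_strong


if __name__ == "__main__":
    for n in (2, 5, 6, 10, 20):
        two_weak, one_strong = compare(n)
        print(
            f"{n:>2} attempts: two at p<.05 = {two_weak:.4f}, "
            f"one at p<.005 = {one_strong:.4f}"
        )
```

With exactly two attempts and no file drawer, the pair of weak results is the rarer event (0.0025 versus roughly 0.01), so the claim leans on the publication-bias concern: under these assumptions the two-study outcome becomes the more likely one once about six attempts are made and only the hits are reported.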

Comment author: ChristianKl 27 December 2012 04:23:34PM 1 point [-]

I wouldn't. Two studies opens the door to publication bias concerns and muddles the 'replication': rarely do people do a straight replication.

If you put the general significance standard at P<0.005, you will decrease the number of straight replications even further. We need more straight replications, not fewer.

A single study can be wrong due to systematic bias. One researcher could engage in fraud and thereby get a P<0.005 result. He could also simply be bad at blinding his subjects properly. There are many possible ways to get a P<0.005 result by messing up the underlying science in a way that you can't see by reading the paper.

Having a second researcher reproduce the effect is vital for knowing that the first result is not due to some error in the experimental setup of the first study.
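A rough sketch of the systematic-bias point, under assumptions that are not in the thread: there is no true effect, a flaw such as imperfect blinding shifts the measured mean by a fixed 0.2 standard deviations, the variance is known, and the observed shift equals its expected value (no sampling noise). The helper `two_sided_p` is hypothetical and simply evaluates a two-sided z-test.

```python
from math import erfc, sqrt


def two_sided_p(bias_sd, n):
    """Two-sided z-test p-value for a mean shift of bias_sd standard
    deviations measured on n subjects, assuming known unit variance."""
    z = abs(bias_sd) * sqrt(n)
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return erfc(z / sqrt(2))


if __name__ == "__main__":
    for n in (50, 100, 200, 400):
        print(f"n = {n:>3}: p = {two_sided_p(0.2, n):.2g}")
```

In this toy setup the bias alone clears P<0.005 at around 200 subjects, and adding subjects only makes the spurious result look stronger. A stricter threshold therefore does not guard against this kind of error, whereas a second lab with a different setup might.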