gwern comments on Against NHST - Less Wrong Discussion

57 Post author: gwern 21 December 2012 04:45AM

Comment author: gwern 31 December 2012 11:51:26PM 3 points

http://library.mpib-berlin.mpg.de/ft/gg/GG_Null_2004.pdf

In 1962, Jacob Cohen reported that the experiments published in a major psychology journal had, on average, only a 50 : 50 chance of detecting a medium-sized effect if there was one. That is, the statistical power was as low as 50%. This result was widely cited, but did it change researchers’ practice? Sedlmeier and Gigerenzer (1989) checked the studies in the same journal, 24 years later, a time period that should allow for change. Yet only 2 out of 64 researchers mentioned power, and it was never estimated. Unnoticed, the average power had decreased (researchers now used alpha adjustment, which shrinks power). Thus, if there had been an effect of a medium size, the researchers would have had a better chance of finding it by throwing a coin rather than conducting their experiments. When we checked the years 2000 to 2002, with some 220 empirical articles, we finally found 9 researchers who computed the power of their tests. Forty years after Cohen, there is a first sign of change.
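To make the power figures in the quoted passage concrete (this is an illustration, not part of the quoted text), here is a minimal sketch of how power for a medium-sized effect can be approximated, and how lowering alpha shrinks it. Everything specific is an assumption chosen for illustration: a two-sided, two-sample comparison, Cohen's conventional d = 0.5 for "medium", a hypothetical 32 subjects per group, a Bonferroni-style adjustment across 5 hypothetical tests, and a normal approximation in place of the exact noncentral t calculation. The function name `two_sample_power` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample test.

    Uses a normal (z-test) approximation, not the exact noncentral t,
    so treat the result as a rough estimate.
    d           : standardized effect size (Cohen's d)
    n_per_group : subjects in each of the two equal-sized groups
    alpha       : two-sided significance level
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect is d.
    shift = d * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Hypothetical study: medium effect, 32 subjects per group.
power_plain = two_sample_power(d=0.5, n_per_group=32)

# Same study with alpha divided across 5 tests (assumed adjustment).
power_adjusted = two_sample_power(d=0.5, n_per_group=32, alpha=0.05 / 5)

print(f"power at alpha = 0.05: {power_plain:.2f}")
print(f"power at alpha = 0.01: {power_adjusted:.2f}")
```

Under these assumptions the unadjusted design has roughly coin-flip power, and the adjusted one falls well below that, which is the pattern the passage describes.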

Comment author: gwern 14 September 2013 10:40:10PM 0 points

Oakes (1986) tested 70 academic psychologists and reported that 96% held the erroneous opinion that the level of significance specified the probability that either H0 or H1 was true.

  • Oakes, M. (1986). Statistical inference: A commentary for the social and behavioral sciences. New York: Wiley.

...Gosset, who developed the t-test in 1908, anticipated this overconcern with significance at the expense of other methodological concerns:

"Obviously the important thing. . . is to have a low real error, not to have a 'significant' result at a particular station. The latter seems to me to be nearly valueless in itself" (quoted in Pearson, 1939, p. 247).

--"Do Studies of Statistical Power Have an Effect on the Power of Studies?", Sedlmeier & Gigerenzer 1989