Cyan comments on Case study: abuse of frequentist statistics - Less Wrong

25 Post author: Cyan 21 February 2010 06:35AM


Comments (96)


Comment author: Psy-Kosh 21 February 2010 11:45:58PM 4 points

What the OP was saying was that this test only depends on the rankings. So to check for sanity, he calculated what the p values would have been for all possible rankings and found that none of those p values would be below .05.

In other words, it was a mathematical impossibility for this test, when treated this way, to result in a rejection of the null hypothesis. No possible outcome, given this many data points and analyzed using this method, could have produced a rejection.

(in other words, it was a "heads I win, tails you lose" situation)
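The sanity check described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses an exact two-sided rank-sum test and made-up group sizes, neither of which is taken from the original post, and the names `attainable_p_values` and `min_p` are hypothetical. The idea is the same, though: enumerate every possible ranking, compute the p-value each would give, and look at the smallest one.

```python
from fractions import Fraction
from itertools import combinations


def attainable_p_values(n1, n2):
    """Exact two-sided rank-sum p-values over all possible rankings (no ties).

    Assumption: an exact rank-sum test stands in for whatever rank-based
    test the paper under discussion actually used.
    """
    ranks = range(1, n1 + n2 + 1)
    # Under the null, every choice of n1 ranks for the first group is equally likely.
    sums = [sum(chosen) for chosen in combinations(ranks, n1)]
    mean = Fraction(n1 * (n1 + n2 + 1), 2)
    p_values = set()
    for s in set(sums):
        deviation = abs(s - mean)
        # Count rankings at least as far from the null mean as this one.
        as_extreme = sum(1 for t in sums if abs(t - mean) >= deviation)
        p_values.add(Fraction(as_extreme, len(sums)))
    return sorted(p_values)


def min_p(n1, n2):
    """Smallest p-value any ranking can produce for these group sizes."""
    return attainable_p_values(n1, n2)[0]


# Group sizes here are illustrative, not the ones from the study.
for n1, n2 in [(3, 3), (4, 4)]:
    p = min_p(n1, n2)
    verdict = "can reject" if p < Fraction(1, 20) else "can never reject"
    print(f"n1={n1}, n2={n2}: smallest attainable p = {float(p):.4f} ({verdict} at 0.05)")
```

With three points per group, even the most extreme ranking gives p = 0.1, so no data whatsoever could lead to rejection at the .05 level; with four per group the smallest attainable p drops below .05.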

Comment author: Cyan 22 February 2010 12:09:28AM 1 point

More of a double-headed coin situation, actually.

Comment author: Psy-Kosh 22 February 2010 12:35:31AM *  0 points

Well... different ranking outcomes (different sides of the coin) are possible. Just that the interpretation will always be "don't reject the null hypothesis" but yeah. :)

Either way, my overall reaction to your post is "yuck" (not at your post itself, which I upvoted, but at the whole situation: that a relatively standard statistical test could allow this sort of madness. I know frequentist stats isn't the Bayesian way, but that relatively standard methods in it can be this pathological does not at all give me warm fuzzies).

Comment author: soreff 23 February 2010 10:49:02PM *  1 point

I concur with your "yuck", but would phrase it slightly differently. The specific type of statistical test applied, plus the number of samples taken, has the effect, as Cyan said, of guaranteeing the result that the authors wanted. Note that, more generally, the fact that the authors phrased their analysis so that accepting the null hypothesis was the result they wanted, and also chose a nonparametric statistical test, which is generally less powerful than a parametric one when the parametric test's assumptions hold, is in and of itself suspicious. Even if they had taken enough samples that rejecting the null hypothesis was theoretically possible, I would still be suspicious if they had wanted the null result and had chosen a nonparametric test. As Cyan said, the nonparametric tests throw away most of the information.
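To make the "throws away information" point concrete, here is a small sketch with made-up numbers. The helper `rank_sum` is hypothetical and assumes no ties. A rank-based statistic sees only the ordering of the pooled data, so two data sets with very different gaps between the groups look identical to it.

```python
def rank_sum(group_a, group_b):
    """Sum of the ranks of group_a within the pooled sample (assumes no ties)."""
    pooled = sorted(group_a + group_b)
    return sum(pooled.index(x) + 1 for x in group_a)


# Illustrative data only: same ordering, wildly different magnitudes.
small_gap = ([1.0, 2.0, 3.0], [3.1, 3.2, 3.3])
large_gap = ([1.0, 2.0, 3.0], [310.0, 320.0, 330.0])

print(rank_sum(*small_gap), rank_sum(*large_gap))
```

Both calls return the same rank sum, so any test built on it reaches the same conclusion for both data sets, however far apart the second group actually sits.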

Comment author: PhilGoetz 25 February 2010 02:12:54PM 0 points

It's not the fault of the method if someone abuses it.

Comment author: wnoise 25 February 2010 06:15:23PM 2 points

In general, no. However, if a method is more easily abused than others, then that is something worth pointing out.