
VincentYu comments on Open thread, 24-30 March 2014 - Less Wrong Discussion

6 Post author: Metus 25 March 2014 07:42AM




Comment author: pianoforte611 30 March 2014 06:57:19PM * 2 points

Am I confused about frequentism?

I'm currently learning about hypothesis testing in my statistics class. The idea is that you perform some test and you use the results of that test to calculate:

P(data at least as extreme as your data | Null hypothesis)

This is the p-value. If the p-value is below a certain threshold then you can reject the null hypothesis (which is the complement of the hypothesis that you are trying to test).
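To make the definition concrete, here is a minimal sketch with invented numbers: a hypothetical coin-flip experiment where the null hypothesis is that the coin is fair, and we observe 60 heads in 100 flips. The one-sided p-value is the probability, under the null, of a result at least as extreme as the one observed.

```python
# Hedged sketch with invented numbers: test whether a coin is fair.
# Null hypothesis: p = 0.5. Observed: 60 heads in 100 flips.
# One-sided p-value = P(60 or more heads | fair coin).
from math import comb

n, k = 100, 60
p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
print(round(p_value, 4))  # ≈ 0.028, below the conventional 0.05 threshold
```

With a threshold of 0.05 we would reject the null here; note that the computation conditions on the null throughout and never assigns a probability to any hypothesis.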

Put another way:

P(data | hypothesis) = 1 - p-value

and if 1 - p-value is high enough then you accept the hypothesis. (My use of "data" is handwaving and not quite correct but it doesn't matter.)

But it seems more useful to me to calculate P(hypothesis | data). And that's not quite the same thing.

So what I'm wondering is whether, under frequentism, P(hypothesis | data) is actually meaningless. The hypothesis is either true or false, and depending on whether it's true or not, the data has a certain propensity of turning out one way or the other. It's meaningless to ask what the probability of the hypothesis is; you can only ask what the probability of obtaining your data is under certain assumptions.
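For contrast, computing P(hypothesis | data) requires Bayes' rule and hence a prior over hypotheses — which is exactly the ingredient frequentism declines to supply. A hedged sketch with an invented setup: the coin is either fair (p = 0.5) or biased (p = 0.6), with an assumed 50/50 prior, and we again observe 60 heads in 100 flips.

```python
# Hedged sketch: P(hypothesis | data) via Bayes' rule, invented numbers.
# Assumed setup: two discrete hypotheses (p = 0.5 vs. p = 0.6) with a
# 50/50 prior. Data: 60 heads in 100 flips.
from math import comb

n, k = 100, 60

def likelihood(p):
    """P(data | hypothesis) = Binomial(n, p) pmf at k."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

prior_fair = prior_biased = 0.5
num = prior_fair * likelihood(0.5)
posterior_fair = num / (num + prior_biased * likelihood(0.6))
print(round(posterior_fair, 2))  # roughly 0.12 — the data favor the biased coin
```

The posterior depends on the assumed prior and on which alternatives are in the hypothesis space; change either and the number changes, which is precisely the arbitrariness frequentists object to.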

Comment author: VincentYu 30 March 2014 09:06:44PM * 4 points

I'm currently learning about hypothesis testing in my statistics class. The idea is that you perform some test and you use the results of that test to calculate:

P(data at least as extreme as your data | Null hypothesis)

This is the p-value. If the p-value is below a certain threshold then you can reject the null hypothesis.

This is correct.

Put another way:

P(data | hypothesis) = 1 - p-value

and if 1 - p-value is high enough then you accept the hypothesis. (My use of "data" is handwaving and not quite correct but it doesn't matter.)

This is not correct. You seem to be under the impression that

P(data | null hypothesis) + P(data | complement(null hypothesis)) = 1,

but this is not true because

  1. complement(null hypothesis) may not have a well-defined distribution (frequentists might especially object to defining a prior here), and
  2. even if complement(null hypothesis) were well defined, the sum could fall anywhere in the closed interval [0, 2].
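Point 2 is easy to see with invented numbers: for a fixed observed event, the two conditional probabilities are each free to lie anywhere in [0, 1], so their sum can land near 0 or near 2. A sketch using two hypothetical coin-bias hypotheses over 100 flips:

```python
# Hedged sketch with invented numbers: the same event, conditioned on
# two different hypotheses about a coin's bias p, over 100 flips.
from math import comb

def prob_event(p, lo, hi, n=100):
    """P(lo <= heads <= hi) under Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(lo, hi + 1))

# Event "exactly 60 heads": small under both p = 0.5 and p = 0.1 — sum near 0.
low_sum = prob_event(0.5, 60, 60) + prob_event(0.1, 60, 60)

# Event "between 40 and 60 heads": large under both p = 0.5 and p = 0.52 — sum near 2.
high_sum = prob_event(0.5, 40, 60) + prob_event(0.52, 40, 60)
```

Nothing forces the two conditional probabilities to be complementary, because they condition on different hypotheses rather than partitioning a single sample space.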

More generally, most people (both frequentists and Bayesians) would object to "accepting the hypothesis" based on rejecting the null, because rejecting the null means exactly what it says, and no more. You cannot conclude that an alternative hypothesis (such as the complement of the null) has higher likelihood or probability.
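One way to see why rejecting the null licenses nothing about an alternative, again with invented numbers: data can be unlikely under the null yet far *more* unlikely under a particular alternative.

```python
# Hedged sketch with invented numbers: 60 heads in 100 flips is unlikely
# under the null (fair coin), but astronomically more unlikely under a
# particular alternative (p = 0.9) — so rejecting the null does not by
# itself endorse any alternative.
from math import comb

n, k = 100, 60

def pmf(p):
    """Binomial(n, p) pmf at k, i.e. P(data | coin bias p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(pmf(0.5) > pmf(0.9))  # True: the null explains this data better
```

A small p-value tells you the data are surprising under the null; it says nothing about whether any given alternative would make them less surprising.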