Eliezer_Yudkowsky comments on Original Research on Less Wrong - Less Wrong

21 Post author: lukeprog 29 October 2012 10:50PM

Comments (47)

Comment author: Eliezer_Yudkowsky 31 October 2012 02:39:46PM 2 points [-]

"the significance of some evidence E is actually the probability that the null hypothesis is true, given E"

No frequentist says this. They don't believe in P(H|E). That's the explicit basis of the whole philosophy. People who talk about the probability of a hypothesis given the evidence are Bayesians, full stop.

Statistical significance is, albeit in a strange and distorted way, supposed to be about P(E|null hypothesis), and so, yes, two experiments with a p-value of 0.05 should add up to somewhere in the vicinity of p < 0.0025, because it's about likelihoods, which do multiply, and not posteriors.

Comment author: jsteinhardt 31 October 2012 04:31:10PM 0 points [-]

While some frequentist methods do use likelihoods, the mapping from likelihood to p-value is non-linear, so multiplying them would still be a mistake, at least as far as I can tell.
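To put rough numbers on the disagreement above, here is a minimal sketch using Fisher's method, one standard frequentist way to combine p-values. Nobody in the thread names Fisher's method, so treat it as a swapped-in illustration; it also assumes the two experiments are independent and that each p-value is uniformly distributed under the null. The function name `fisher_combined_p` is made up for this example.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method (illustrative sketch)."""
    k = len(p_values)
    # Fisher's statistic is -2 * sum(ln p_i); under the null it is
    # chi-squared distributed with 2k degrees of freedom.
    half_stat = -sum(math.log(p) for p in p_values)  # statistic / 2
    # For even degrees of freedom 2k the chi-squared survival function
    # has a closed form: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    return math.exp(-half_stat) * sum(
        half_stat**i / math.factorial(i) for i in range(k)
    )

naive = 0.05 * 0.05
combined = fisher_combined_p([0.05, 0.05])
print(f"naive product:           {naive:.4f}")     # 0.0025
print(f"Fisher combined p-value: {combined:.4f}")  # 0.0175
```

Under these assumptions the combined p-value comes out near 0.0175, well above the raw product of 0.0025. For two tests the closed form reduces to p1*p2*(1 - ln(p1*p2)), so the product does appear, but only after a non-linear correction.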

Comment author: Kindly 31 October 2012 02:50:39PM *  0 points [-]

I'm not saying that frequentists believe this. I'm saying that the frequentist math (which computes Pr[E|H0]) is equivalent to computing Pr[H0|E] with respect to a prior distribution under which Pr[H0]=Pr[E]. Furthermore, this is a reasonable thing to look at, because from that point of view the way statistical significances combine actually makes sense.

Comment author: Eliezer_Yudkowsky 01 November 2012 04:41:59AM 1 point [-]

"Pr[H0]=Pr[E]"

Whaa?

Comment author: Kindly 01 November 2012 04:55:36AM 0 points [-]

Well, we have, in general, Pr[H0|E] = Pr[E|H0] * Pr[H0]/Pr[E]. Frequentists compute Pr[E|H0] instead of Pr[H0|E], but this turns out not to matter if Pr[H0]/Pr[E] cancels, which happens when the above equality holds.

From a certain point of view, this is just mathematical sleight of hand, of course. Also, the "E" is actually some class of outcomes that are grouped together (e.g. all outcomes in which 8 or more coins, out of 10, came up heads). But if we combine sequences of experimental results in the correct way, then this means that the frequentist and Bayesian results differ only by a constant factor (precisely the factor which we assumed, above, to be 1).
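As a concrete check on the Bayes' theorem identity and the coin example, here is a small sketch. The null hypothesis is a fair coin and E is "8 or more heads out of 10", as in the comment. The alternative hypothesis (a coin with heads probability 0.8) and the 50/50 prior are assumptions invented purely for illustration; the thread specifies neither.

```python
from math import comb

def tail_prob(n, k_min, p_heads):
    """Probability of at least k_min heads in n independent flips."""
    return sum(
        comb(n, k) * p_heads**k * (1 - p_heads) ** (n - k)
        for k in range(k_min, n + 1)
    )

N_FLIPS, MIN_HEADS = 10, 8
ALT_BIAS = 0.8   # hypothetical alternative hypothesis, not from the thread
PRIOR_H0 = 0.5   # hypothetical prior on the null, not from the thread

pr_e_given_h0 = tail_prob(N_FLIPS, MIN_HEADS, 0.5)       # what the frequentist computes
pr_e_given_h1 = tail_prob(N_FLIPS, MIN_HEADS, ALT_BIAS)
pr_e = PRIOR_H0 * pr_e_given_h0 + (1 - PRIOR_H0) * pr_e_given_h1

# Bayes' theorem: Pr[H0|E] = Pr[E|H0] * Pr[H0] / Pr[E]
ratio = PRIOR_H0 / pr_e
posterior_h0 = pr_e_given_h0 * ratio

print(f"Pr[E|H0]     = {pr_e_given_h0:.4f}")
print(f"Pr[H0]/Pr[E] = {ratio:.4f}")
print(f"Pr[H0|E]     = {posterior_h0:.4f}")
```

With these made-up inputs the factor Pr[H0]/Pr[E] is not 1, so Pr[E|H0] and Pr[H0|E] differ. The two quantities coincide only when the prior and the mix of hypotheses happen to make that ratio equal to 1.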

Comment author: Eliezer_Yudkowsky 01 November 2012 07:09:31AM 4 points [-]

Why the heck would the probability of seeing the evidence, conditional on the mix of all hypotheses being considered, exactly equal the prior probability of the null hypothesis?

Comment author: Kindly 01 November 2012 01:42:59PM *  1 point [-]

It wouldn't. Probably a better way to explain it would have been to factor their ratio out as a constant.

Anyway, I've totally messed up explaining this, so I will fold for now and direct you to a completely different argument made elsewhere in the comments which is more worthy of being considered.