buybuydandavis comments on The Logic of the Hypothesis Test: A Steel Man - Less Wrong

Post author: Matt_Simpson 21 February 2013 06:19AM (5 points)


Comment author: buybuydandavis 21 February 2013 06:37:45AM *  7 points
  1. Either the null hypothesis or the alternative hypothesis is true.

Nope. This was a good point by Jaynes. The truth may not exist in your hypothesis space. It may be (and often is) something you haven't conceived of.

  4. Therefore under the null hypothesis, our sample is extremely unlikely.
  5. Therefore the null hypothesis is false.

Low likelihood of data under a hypothesis in no way implies rejection of that hypothesis.

  6. Therefore the alternative hypothesis is true.

Without also calculating the likelihood under the alternative hypothesis (it may be less), this is unjustified as well.
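The gap between "unlikely under the null" and "more likely under the alternative" can be sketched with hypothetical numbers (the normal distributions and the observed value here are illustrative assumptions, not anything from the post):

```python
import math

def normal_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

x = 3.0                        # hypothetical observed value
l_null = normal_pdf(x, mu=0)   # likelihood under the null: N(0, 1)
l_alt = normal_pdf(x, mu=10)   # likelihood under the alternative: N(10, 1)

# x is "extremely unlikely" under the null (density ~ 0.004)...
print(l_null)
# ...yet vastly MORE likely than under this alternative, so rejecting
# the null in favor of the alternative would get things backwards.
print(l_null / l_alt)
```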

Comment author: Matt_Simpson 21 February 2013 06:48:37AM *  3 points

Nope. This was a good point by Jaynes. The truth may not exist in your hypothesis space. It may be (and often is) something you haven't conceived of.

Yes, the implicit assumption here is that the model is true.

Low likelihood of data under a hypothesis in no way implies rejection of that hypothesis.

  6. Therefore the alternative hypothesis is true.

Without also calculating the likelihood under the alternative hypothesis (it may be less), this is unjustified as well.

I don't think you understood my point. I'm avoiding claiming any inductive theory is correct - including Bayes' - and trying to show how hypothesis testing may be a way to do induction while simultaneously being agnostic about the correct theory. That Bayesian theory rejects certain steps of the hypothesis testing process is irrelevant to my point (and if you read closely, you'll see that I acknowledge it anyway).

Comment author: buybuydandavis 21 February 2013 07:30:21AM 1 point

Yes, the implicit assumption here is that the model is true.

I think that's a bad assumption, and if you're trying to steelman, you should avoid relying on bad assumptions.

I'm avoiding claiming any inductive theory is correct

Going from 4 to 5 looks dependent on an inductive theory to me.

Comment author: Matt_Simpson 21 February 2013 07:45:28AM 4 points

I think that's a bad assumption, and if you're trying to steelman, you should avoid relying on bad assumptions.

In any given problem the model is almost certainly false, but whether you use frequentist or Bayesian inference you have to implicitly assume that it's (approximately) true in order to actually conduct inference. Saying "don't assume the model is true because it isn't" is unhelpful and a nonstarter. If you actually want to get an answer, you have to assume something even if you know it isn't quite right.

Going from 4 to 5 looks dependent on an inductive theory to me.

Why yes it does. Did you read what I wrote about that?

Comment author: buybuydandavis 26 February 2013 12:27:13AM 0 points

Saying "don't assume the model is true because it isn't" is unhelpful and a nonstarter.

It starts fine for me.

Testing just the null hypothesis is the least one can do. Then one can test the alternative; that way you at least get a likelihood ratio. You can add priors or not. Then one can build in terms modeling your ignorance.

See previous comment: http://lesswrong.com/lw/gqt/the_logic_of_the_hypothesis_test_a_steel_man/8ioc

One could keep going and going on modeling ignorance, but few even get that far, and I suspect it isn't helpful to go further.

Why yes it does. Did you read what I wrote about that?

Yes. It conflicted with what you subsequently wrote:

I'm avoiding claiming any inductive theory is correct

Comment author: Matt_Simpson 26 February 2013 09:01:49PM 1 point

Testing just the Null hypothesis is the least one can do. Then one can test the alternative, That way you at least get a likelihood ratio. You can add priors or not. Then one can build in terms modeling your ignorance.

This doesn't address the problem that the truth isn't in your hypothesis space (which is what I thought you were criticizing me for). If your model assumes constant variance, for example, when in truth there's nonconstant variance, the truth is outside your hypothesis space. You're not even considering it as a possibility. What does considering likelihood ratios of the hypotheses in your hypothesis space do to help you out here?

See previous comment: http://lesswrong.com/lw/gqt/the_logic_of_the_hypothesis_test_a_steel_man/8ioc

Reading that thread, I think jsteinhardt is right - if the truth is outside of your hypothesis space, you're screwed whether you're a Bayesian or a frequentist (which is a much more succinct way of putting my response to you). Setting up an "everything else" hypothesis doesn't really help, because you can't compute a likelihood without some assumptions that, in all probability, expose you to the very problem you're trying to avoid.

Yes. It conflicted with what you subsequently wrote:

Are you happier if I say that Bayes is a "thick" inductive theory and that NHST can be viewed as induction with a "thin" theory which therefore keeps you from committing yourself to as much? (I do acknowledge that others treat NHST as a "thick" theory and that this difference seems like it should result in differences in the details of actually doing hypothesis tests.)

Comment author: buybuydandavis 26 February 2013 10:08:23PM 0 points

What does considering likelihood ratios of the hypotheses in your hypothesis space do to help you out here?

The likelihood ratio was for comparing the hypotheses under consideration, the Null and the alternative. My point is that the likelihood of the alternative isn't taken into consideration at all. Prior to anything Bayesian, hypothesis testing moved from only modeling the likelihood of the null to also modeling the likelihood of a specified alternative, and comparing the two.

if the truth is outside of your hypothesis space, you're screwed no matter if you're a Bayesian or a frequentist

Therefore, you put a placeholder of appropriate probability on "it's out of my hypothesis space", so that unreasonable results have some systematic check.

And the difference between Bayesian inference and NHST isn't primarily how many assumptions you've committed to (that number is enormous either way), but how many of those assumptions you've identified, and how you've specified them.

Comment author: Viliam_Bur 21 February 2013 08:23:17AM 1 point

Going from 4 to 5 seems to me like silently changing "if A then B" to "if B then A", which is a logical mistake that many people make.

More precisely, it is a silent change from "if NULL, then DATA with very low probability" to "if DATA, then NULL with very low probability".

Specific example: Imagine a box containing 1 green circle, 10 red circles, and 100 red squares; you choose a random item. It is true that "if you choose a red item, it is unlikely to be a circle". But it is not true that "if you choose a circle, it is unlikely to be red".
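The box example can be checked directly (same numbers as in the comment):

```python
from fractions import Fraction

# Box contents: 1 green circle, 10 red circles, 100 red squares
green_circles, red_circles, red_squares = 1, 10, 100

# P(circle | red): of the 110 red items, only 10 are circles
p_circle_given_red = Fraction(red_circles, red_circles + red_squares)
# P(red | circle): of the 11 circles, 10 are red
p_red_given_circle = Fraction(red_circles, red_circles + green_circles)

print(p_circle_given_red)   # 1/11  -- "if red, then unlikely to be a circle"
print(p_red_given_circle)   # 10/11 -- but "if a circle, then LIKELY to be red"
```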

Comment author: jsteinhardt 25 February 2013 03:11:53AM 1 point

Nope. This was a good point by Jaynes. The truth may not exist in your hypothesis space. It may be (and often is) something you haven't conceived of.

If the truth doesn't exist in your hypothesis space then Bayesian methods are just as screwed as frequentist methods. In fact, Bayesian methods can grow increasingly confident that an incorrect hypothesis is true in this case. I don't see how this is a weakness of Matt's argument.

Comment author: buybuydandavis 25 February 2013 04:52:18AM 0 points

The details are hazy at this point, but by assigning a realistic probability to the "something else" hypothesis, you avoid making overconfident estimates of your other hypotheses in a multiple hypothesis testing scenario.

See "Multiple Hypothesis Testing" in Jaynes's PTTLOS, starting pg. 98, and the punchline on pg. 105:

In summary, the role of our new hypothesis C was only to be held in abeyance until needed, like a fire extinguisher. In a normal testing situation, it is "dead", playing no part in the inference because its probability remains far below that of the other hypotheses. But a dead hypothesis can be brought back to life by very unexpected data.

I think this is especially relevant to standard "null hypothesis" hypothesis testing because the likelihood of the data under the alternative hypothesis is never calculated, so you don't even get a hint that your model might just suck, and instead conclude that the null hypothesis should be rejected.
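Jaynes's "fire extinguisher" behavior can be sketched with a toy coin model (the specific hypotheses, priors, and data below are illustrative assumptions, not Jaynes's actual example):

```python
# Toy version of the "dead hypothesis" effect: three hypotheses about a
# coin's heads probability; the catch-all "something else" (crudely modeled
# as a near-always-heads process) is held at a tiny prior.
hypotheses = {"fair": 0.5, "biased": 0.6, "something else": 0.999}
priors = {"fair": 0.7, "biased": 0.3 - 1e-6, "something else": 1e-6}

def posterior(n_heads, n_flips):
    # Unnormalized posterior = prior * binomial likelihood
    # (the binomial coefficient is common to all hypotheses and cancels).
    post = {h: priors[h] * p ** n_heads * (1 - p) ** (n_flips - n_heads)
            for h, p in hypotheses.items()}
    total = sum(post.values())
    return {h: v / total for h, v in post.items()}

# Unsurprising data: the catch-all stays "dead", far below the others.
print(posterior(6, 10))
# Very unexpected data (50 heads in a row) brings it back to life.
print(posterior(50, 50))
```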

Comment author: jsteinhardt 25 February 2013 06:05:04AM *  0 points

What is the likelihood of the "something else" hypothesis? I don't think this is really a general remedy.

Also, you can get the same thing in the hypothesis testing framework by doing two hypothesis tests, one of which is a comparison to the "something else" hypothesis and one of which is a comparison to the original null hypothesis.

Finally, while I forgot to mention this above, in most cases where hypothesis testing is applied, you actually are considering all possibilities, because you are doing something like P0 = "X <= 0", P1 = "X > 0" and these really are logically the only possibilities =) [although I guess often you need to make some assumptions on the probabilistic dependencies among your samples to get good bounds].
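The exhaustive-pair setup ("X <= 0" versus "X > 0") can be sketched as a one-sided test; the z-test with known variance is a simplifying assumption to keep the example self-contained:

```python
import math

def one_sided_z_test(sample, sigma=1.0):
    """Test H0: mu <= 0 against H1: mu > 0, assuming X_i ~ N(mu, sigma^2)
    with sigma known. The two hypotheses exhaust the possibilities for mu,
    but the normality and independence assumptions still sit outside them."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n) / sigma
    # One-sided p-value computed at the boundary case mu = 0,
    # which is the worst case over all of H0.
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p

z, p = one_sided_z_test([0.8, 1.2, 0.9, 1.1, 1.0])
print(z, p)  # z about 2.24, p about 0.013: reject "mu <= 0"
```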

Comment author: buybuydandavis 26 February 2013 12:07:43AM 0 points

Yes, you can say it in that framework. And you should. That's part of the steelmanning exercise - putting in the things that are missing. If you steelman enough, you get to be a good Bayesian.

P0 = "X <= 0" and {All My other assumptions}
NOT(P0) = NOT("X <= 0") or NOT({All My other assumptions})