
gwern comments on Open thread, August 19-25, 2013 - Less Wrong Discussion

2 Post author: David_Gerard 19 August 2013 06:58AM




Comment author: gwern 03 September 2013 09:39:24PM 0 points

So you're trying to talk about overall probability distributions in a Bayesian framework? I haven't ever done power analysis with that approach, so I don't know what would be analogous to Type I and II errors and whether one can trade them off; in fact, the only paper I can recall discussing how one does it is Kruschke's paper (starting on pg. 11) - maybe he will be helpful?
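
(A minimal sketch of the simulation-based "Bayesian power" idea along the lines Kruschke describes, assuming a toy beta-binomial model; the true effect, sample size, prior, and decision goal below are all illustrative choices, not anything from the thread:)

```python
# Sketch of simulation-based "Bayesian power": repeatedly simulate data under
# an assumed true effect, compute the posterior, and count how often the
# analysis reaches a stated goal (here: the 95% credible interval for a coin's
# bias excludes 0.5). All specifics are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def bayesian_power(true_p=0.65, n=100, sims=2000, a0=1, b0=1):
    hits = 0
    for _ in range(sims):
        k = rng.binomial(n, true_p)                 # one simulated experiment
        posterior = stats.beta(a0 + k, b0 + n - k)  # conjugate beta update
        lo, hi = posterior.ppf([0.025, 0.975])      # central 95% credible interval
        if lo > 0.5 or hi < 0.5:                    # goal: interval excludes 0.5
            hits += 1
    return hits / sims

print(bayesian_power())  # fraction of simulated experiments achieving the goal
```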

Comment author: Lumifer 04 September 2013 01:10:28AM 0 points

Not necessarily in the Bayesian framework, though it's kinda natural there. You can think in terms of complete distributions within the frequentist framework perfectly well, too.

The issue we started with was statistical power, right? While it's technically defined in terms of the usual significance testing (i.e. rejecting the null hypothesis), you can think about it in broader terms. Essentially it's the capability to detect a signal (of a certain effect size) in the presence of noise (in certain amounts) with a given level of confidence.
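
(A rough illustration of that reading of power; the effect size, noise level, sample sizes, and test below are arbitrary choices for the sketch:)

```python
# How often does a two-sample t-test detect a true mean difference (the
# "signal") of a given effect size, at a given noise level and significance
# threshold? The numbers here are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def power(effect=0.5, sigma=1.0, n=50, alpha=0.05, sims=5000):
    rejections = 0
    for _ in range(sims):
        control = rng.normal(0.0, sigma, n)
        treated = rng.normal(effect, sigma, n)
        _, p = stats.ttest_ind(control, treated)
        rejections += (p < alpha)
    return rejections / sims

print(power())        # roughly 0.70 for d = 0.5, n = 50 per group
print(power(n=100))   # larger samples -> higher power
```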

Thanks for the paper, I've seen it before but didn't have a handy link to it.

Comment author: gwern 04 September 2013 05:13:44PM 0 points

You can think in terms of complete distributions within the frequentist framework perfectly well, too.

Does anyone do that, though?

Essentially it's the capability to detect a signal (of certain effect size) in the presence of noise (in certain amounts) with a given level of confidence.

Well, if you want to think of it like that, you could probably formulate all of this in information-theoretic terms and speak of needing a certain number of bits; then the sample size & effect size interact to say how many bits each observation contains. So a binary variable contains a lot less than a continuous one, a shift in a rare observation like 90/10 is going to be harder to detect than a shift in a 50/50 split, etc. That's not stuff I know a lot about.
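
(One hedged way to cash out the "bits per observation" intuition, assuming the "shift" means a fixed odds-ratio change; both the entropy calculation and the simulated comparison below use made-up numbers:)

```python
# Entropy per Bernoulli draw, plus a simulated comparison of detecting the
# same odds-ratio shift at a 50/50 vs a 90/10 base rate. All numbers are
# illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def entropy_bits(p):
    """Shannon entropy of a single Bernoulli(p) observation, in bits."""
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

print(entropy_bits(0.5), entropy_bits(0.9))   # 1.0 vs ~0.47 bits per draw

def power_two_prop(p0, odds_ratio=1.5, n=500, alpha=0.05, sims=2000):
    """Empirical power of a chi-square test for a fixed odds-ratio shift from p0."""
    odds1 = p0 / (1 - p0) * odds_ratio
    p1 = odds1 / (1 + odds1)
    hits = 0
    for _ in range(sims):
        k0, k1 = rng.binomial(n, p0), rng.binomial(n, p1)
        table = np.array([[k0, n - k0], [k1, n - k1]])
        _, pval, _, _ = stats.chi2_contingency(table)
        hits += (pval < alpha)
    return hits / sims

print(power_two_prop(0.5), power_two_prop(0.9))  # the shift from 50/50 is detected more often
```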

Comment author: Lumifer 04 September 2013 05:44:30PM * 0 points

Does anyone do that, though?

Well, sure. The frequentist approach, aka mainstream statistics, deals with distributions all the time, and the arguments about particular tests or predictions being optimal, or unbiased, or asymptotically true, etc. are all explicitly conditional on the characteristics of the underlying distributions.

Well, if you want to think of it like that, you could probably formulate all of this in information-theoretic terms and speak of needing a certain number of bits;

Yes, something like that. Take a look at Fisher information, e.g. "The Fisher information is a way of measuring the amount of information that an observable random variable X carries about an unknown parameter θ upon which the probability of X depends."
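
(A small numerical check of that definition, using a normal model with known sigma as an illustrative assumption: the Fisher information per observation is 1/sigma^2, and the Cramér-Rao bound 1/(n·I) is attained by the variance of the sample mean:)

```python
# For X ~ Normal(theta, sigma^2) with known sigma, the Fisher information per
# observation is 1/sigma^2, so the Cramer-Rao bound on an unbiased estimator
# of theta from n observations is sigma^2 / n; the sample mean attains it.
# Model and numbers are illustrative.
import numpy as np

rng = np.random.default_rng(0)
theta, sigma, n = 2.0, 3.0, 50

fisher_info_per_obs = 1 / sigma**2
cramer_rao_bound = 1 / (n * fisher_info_per_obs)    # = sigma^2 / n = 0.18

estimates = [rng.normal(theta, sigma, n).mean() for _ in range(20000)]
print(cramer_rao_bound, np.var(estimates))          # both roughly 0.18
```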