
gwern comments on Open thread, August 19-25, 2013 - Less Wrong Discussion

Post author: David_Gerard 19 August 2013 06:58AM




Comment author: gwern 03 September 2013 04:38:47PM 0 points

Eh, I'm not sure the idea of 'double-spending' really applies here. In the multiple comparisons case, you're spending all your budget on detecting the observed effect size and getting high-power/reducing-Type-II-errors (if there's an effect lurking there, you'll find it!), but you then can't buy as much Type I error reduction as you want.

This could be fine in some applications. For example, when I'm A/B testing visual changes to gwern.net, I don't care if I commit a Type I error, because if I replace one doohickey with another doohickey and they work equally well (the null hypothesis), all I've lost is a little time. I'm worried about coming up with an improvement, testing the improvement, and mistakenly believing it isn't an improvement when actually it is.

The problem with multiple comparisons comes when people don't realize they've used up their budget and believe they really have controlled alpha errors at 5% or whatever - when they think they've had their cake and eaten it too.

I guess a better financial analogy would be more like "you spend all your money on the new laptop you need for work, but not having checked your bank account balance, promise to take your friends out for dinner tomorrow"?
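To make the used-up-budget point concrete, here's a quick Python sketch (not from the original comment; it assumes independent tests of true null hypotheses, under which each p-value is uniform on [0, 1]) showing how the family-wise error rate blows past the nominal 5%:

```python
import random

random.seed(0)

def familywise_error_rate(k, alpha=0.05, sims=20000):
    """Estimate the chance of at least one false positive when
    running k independent tests of true null hypotheses.
    Under the null, each p-value is uniform on [0, 1]."""
    hits = 0
    for _ in range(sims):
        if any(random.random() < alpha for _ in range(k)):
            hits += 1
    return hits / sims

# Nominal alpha is 5% per test, but across 10 tests the
# family-wise rate is roughly 1 - 0.95**10, i.e. about 40%.
print(familywise_error_rate(1))   # ~0.05
print(familywise_error_rate(10))  # ~0.40
```

The analytic answer for 10 tests is 1 - 0.95^10 ≈ 0.40, which is what the simulation recovers.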

Comment author: Lumifer 03 September 2013 05:27:53PM 0 points

I am a bit confused -- is the framework for this thread observation (where the number of samples is pretty much the only thing you can affect pre-analysis) or experiment design (where you can greatly affect which data you collect)?

I ask because I'm intrigued by the idea of trading off Type I errors against Type II errors, but I'm not sure it's possible in the observation context without introducing bias.

Comment author: gwern 03 September 2013 06:57:26PM 0 points

I'm not sure about this observation vs experiment design dichotomy you're thinking of. I think of power analysis as something which can be done both before an experiment to design it and understand what the data could tell one, and post hoc, to understand why you did or did not get a result and to estimate things for designing the next experiment.

Comment author: Lumifer 03 September 2013 07:20:53PM 0 points

Well, I think of statistical power as the ability to distinguish signal from noise. If you expect signal of a particular strength you need to find ways to reduce the noise floor to below that strength (typically through increasing sample size).
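The "reduce the noise floor through sample size" point can be checked numerically -- a small Python sketch (not in the original comment) showing that the standard deviation of a sample mean falls as 1/sqrt(n):

```python
import math
import random
import statistics

random.seed(1)

def sd_of_sample_mean(n, sigma=1.0, sims=4000):
    """Empirical standard deviation of the mean of n Gaussian draws."""
    means = [statistics.fmean(random.gauss(0, sigma) for _ in range(n))
             for _ in range(sims)]
    return statistics.stdev(means)

# The noise floor drops as 1/sqrt(n): quadrupling n halves it.
for n in (10, 40, 160):
    print(n, round(sd_of_sample_mean(n), 3), round(1 / math.sqrt(n), 3))
```

Each empirical value lands close to the theoretical sigma/sqrt(n), so halving the noise floor costs four times the samples.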

However my standard way of thinking about this is: we have data, we build a model, we evaluate how good the model output is. Building a model, say, via some sort of maximum likelihood, gives you "the" fitted model with specific chances to commit a Type I or a Type II error. But can you trade off chances of Type I errors against chances of Type II errors other than through crudely adding bias to the model output?

Comment author: gwern 03 September 2013 07:28:38PM 0 points

But can you trade off chances of Type I errors against chances of Type II errors other than through crudely adding bias to the model output?

Model-building seems like a separate topic. Power analysis is for particular approaches, where I certainly can trade off Type I against Type II. Here's a simple example for a two-group t-test, where I accept a higher Type I error rate and immediately see my Type II go down (power go up):

R> power.t.test(n=40, delta=0.5, sig.level=0.05)

     Two-sample t test power calculation

              n = 40
          delta = 0.5
             sd = 1
      sig.level = 0.05
          power = 0.5981
    alternative = two.sided

NOTE: n is number in *each* group

R> power.t.test(n=40, delta=0.5, sig.level=0.10)

     Two-sample t test power calculation

              n = 40
          delta = 0.5
             sd = 1
      sig.level = 0.1
          power = 0.7163
    alternative = two.sided

NOTE: n is number in *each* group

In exchange for accepting 10% Type I rather than 5%, I see my Type II fall from 1-0.60=40% to 1-0.72=28%. Tada, I have traded off errors and as far as I know, the t-test remains exactly as unbiased as it ever was.
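For anyone without R handy, the same trade-off can be reproduced in Python using the normal approximation (a rough stand-in for power.t.test, not the exact noncentral-t calculation, so it overstates the power slightly):

```python
import math
from statistics import NormalDist

def approx_power(n, delta, sd=1.0, sig_level=0.05):
    """Power of a two-sided two-sample test with n per group,
    via the normal approximation to the noncentral t."""
    nd = NormalDist()
    ncp = (delta / sd) * math.sqrt(n / 2)   # noncentrality parameter
    z = nd.inv_cdf(1 - sig_level / 2)       # two-sided critical value
    # chance the test statistic lands in either rejection region
    return nd.cdf(ncp - z) + nd.cdf(-ncp - z)

print(round(approx_power(40, 0.5, sig_level=0.05), 3))  # 0.609 (exact t: 0.5981)
print(round(approx_power(40, 0.5, sig_level=0.10), 3))  # 0.723 (exact t: 0.7163)
```

Loosening sig_level from 0.05 to 0.10 raises the power here too, just as in the R output.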

Comment author: Lumifer 03 September 2013 08:10:21PM 0 points

I am not explaining myself well. Let me try again.

To even talk about Type I / II errors you need two things -- a hypothesis or a prediction (generally, output of a model, possibly implicit) and reality (unobserved at prediction time). Let's keep things very simple and deal with binary variables: say we have an object foo and we want to know whether it belongs to class bar (or does not belong to it). We have a model, maybe simple and even trivial, which, when fed the object foo, outputs the probability of it belonging to class bar. Let's say this probability is 92%.

Now, at this point we are still in the probability land. Saying that "foo belongs to class bar with a probability of 92%" does not subject us to Type I / II errors. It's only when we commit to the binary outcome and say "foo belongs to class bar, full stop" that they appear.

The point is that in probability land you can't trade off Type I error against Type II -- you just have the probability (or a full distribution in the more general case). It's the commitment to a certain outcome on the basis of an arbitrarily picked threshold that gives rise to them. And so it is that threshold (e.g. traditionally 5%) that determines the trade-off between errors. Changing the threshold changes the trade-off, but this doesn't affect the model and its output -- it's all post-prediction interpretation.
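A toy Python version of this (my own illustration, with made-up Gaussian scores standing in for model output): the model and its scores never change, only the cutoff moves, and the two error rates trade off against each other.

```python
import random

random.seed(2)

# Simulated model scores: objects in class bar score higher on average.
bar     = [random.gauss(1.0, 1.0) for _ in range(5000)]  # truly in bar
not_bar = [random.gauss(0.0, 1.0) for _ in range(5000)]  # truly not in bar

def error_rates(threshold):
    """Commit to 'foo is in bar' whenever its score >= threshold."""
    type1 = sum(s >= threshold for s in not_bar) / len(not_bar)  # false positives
    type2 = sum(s < threshold for s in bar) / len(bar)           # false negatives
    return type1, type2

# Same scores, same model: raising the cutoff lowers Type I
# errors while raising Type II errors, and vice versa.
for t in (0.0, 0.5, 1.0):
    print(t, error_rates(t))
```

Nothing about the fitted model is biased or re-estimated here; the trade-off lives entirely in the post-prediction threshold, which is the point being made above.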

Comment author: gwern 03 September 2013 09:39:24PM 0 points

So you're trying to talk about overall probability distributions in a Bayesian framework? I haven't ever done power analysis with that approach, so I don't know what would be analogous to Type I and II errors and whether one can trade them off; in fact, the only paper I can recall discussing how one does it is Kruschke's paper (starting on pg11) - maybe he will be helpful?

Comment author: Lumifer 04 September 2013 01:10:28AM 0 points

Not necessarily in the Bayesian framework, though it's kinda natural there. You can think in terms of complete distributions within the frequentist framework perfectly well, too.

The issue that we started with was of statistical power, right? While it's technically defined in terms of the usual significance (=rejecting the null hypothesis), you can think about it in broader terms. Essentially it's the capability to detect a signal (of certain effect size) in the presence of noise (in certain amounts) with a given level of confidence.
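That broader reading of power -- the rate at which you actually detect a real signal -- can be checked by brute force. A Python sketch (my own, using a hand-rolled pooled t statistic and an approximate two-sided 5% cutoff of ~1.99 for df = 78):

```python
import math
import random
import statistics

random.seed(3)

def two_sample_t(x, y):
    """Pooled two-sample t statistic (equal variances assumed)."""
    nx, ny = len(x), len(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    sp2 = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    return (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(sp2 * (1 / nx + 1 / ny))

def empirical_power(n=40, delta=0.5, sims=2000, t_crit=1.99):
    """Fraction of simulated experiments whose t statistic clears the
    cutoff; t_crit ~1.99 approximates the two-sided 5% value for df = 78."""
    rejects = 0
    for _ in range(sims):
        x = [random.gauss(delta, 1) for _ in range(n)]
        y = [random.gauss(0, 1) for _ in range(n)]
        rejects += abs(two_sample_t(x, y)) > t_crit
    return rejects / sims

print(empirical_power())  # close to power.t.test's 0.5981
```

The detection rate over repeated experiments comes out near the 0.60 power that power.t.test reported earlier in the thread.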

Thanks for the paper, I've seen it before but didn't have a handy link to it.

Comment author: gwern 04 September 2013 05:13:44PM 0 points

You can think in terms of complete distributions within the frequentist framework perfectly well, too.

Does anyone do that, though?

Essentially it's the capability to detect a signal (of certain effect size) in the presence of noise (in certain amounts) with a given level of confidence.

Well, if you want to think of it like that, you could probably formulate all of this in information-theoretic terms and speak of needing a certain number of bits; then the sample size & effect size interact to say how many bits each n contains. So a binary variable contains a lot less than a continuous variable, a shift in a rare observation like 90/10 is going to be harder to detect than a shift in a 50/50 split, etc. That's not stuff I know a lot about.
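One crude way to put a number on the "bits per observation" intuition (my own illustration, and entropy is only a proxy here, not a full account of detection power) is Shannon entropy:

```python
import math

def entropy_bits(p):
    """Shannon entropy of a binary variable with P(success) = p."""
    if p in (0, 1):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# A 50/50 binary observation carries a full bit; a 90/10 one
# carries less than half a bit, so each sample tells you less.
print(round(entropy_bits(0.5), 3))  # 1.0
print(round(entropy_bits(0.9), 3))  # 0.469
```

This at least matches the 50/50-vs-90/10 asymmetry: the rarer the event, the fewer bits each observation contributes on average.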

Comment author: Lumifer 04 September 2013 05:44:30PM 0 points

Does anyone do that, though?

Well, sure. The frequentist approach, aka mainstream statistics, deals with distributions all the time and the arguments about particular tests or predictions being optimal, or unbiased, or asymptotically true, etc. are all explicitly conditional on characteristics of underlying distributions.

Well, if you want to think of it like that, you could probably formulate all of this in information-theoretic terms and speak of needing a certain number of bits;

Yes, something like that. Take a look at Fisher information, e.g. "The Fisher information is a way of measuring the amount of information that an observable random variable X carries about an unknown parameter θ upon which the probability of X depends."
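For the Bernoulli case, the quoted definition works out to a simple closed form, I(p) = 1/(p(1-p)) -- a quick sketch (my own, not from the quoted source):

```python
def fisher_information(p):
    """Fisher information of one Bernoulli(p) observation about p:
    I(p) = 1 / (p * (1 - p))."""
    return 1 / (p * (1 - p))

# By the Cramer-Rao bound, Var(p_hat) >= 1 / (n * I(p)), which for
# the sample proportion is exactly p * (1 - p) / n.
n = 100
for p in (0.5, 0.9):
    print(p, round(fisher_information(p), 3), p * (1 - p) / n)
```

So the information per observation, and with it the attainable estimation precision for a fixed n, follows directly from the underlying distribution's parameter.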