paper-machine comments on Why the tails come apart - LessWrong

114 Post author: Thrasymachus 01 August 2014 10:41PM

Comments (90)

Comment author: [deleted] 31 July 2014 09:20:28PM 2 points [-]

How do I account for how many models I've tested? No, really, I don't know what that'd even be called in the statistics literature, and it seems like if a general technique for doing this were known, the big data people would be all over it.

Comment author: Stuart_Armstrong 08 August 2014 11:11:06AM 2 points [-]

What we're doing at the FHI is treating it like a machine learning problem: splitting the data into a training set and a testing set, checking as much as we want on the training set, formulating the hypotheses, then testing them on the testing set.
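A minimal sketch of why the split helps, assuming pure-noise data so that any "finding" is spurious by construction. The function names (`pearson`, `holdout_check`) and all parameter values are made up for illustration; this is not the FHI's actual procedure.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def holdout_check(n_rows=200, n_predictors=50, seed=0):
    """Pick the best-looking predictor on the training half, then re-test it.

    Outcome and predictors are independent noise (an assumption of this
    sketch), so the true correlation of every predictor is zero.
    """
    rng = random.Random(seed)
    y = [rng.gauss(0, 1) for _ in range(n_rows)]
    columns = [[rng.gauss(0, 1) for _ in range(n_rows)]
               for _ in range(n_predictors)]

    rows = list(range(n_rows))
    rng.shuffle(rows)
    train, test = rows[: n_rows // 2], rows[n_rows // 2:]

    def corr_on(col, subset):
        return pearson([col[i] for i in subset], [y[i] for i in subset])

    # Explore freely on the training half: keep the strongest correlation.
    best = max(range(n_predictors),
               key=lambda j: abs(corr_on(columns[j], train)))
    # The held-out half is touched exactly once, for the chosen hypothesis.
    return best, corr_on(columns[best], train), corr_on(columns[best], test)


if __name__ == "__main__":
    best, r_train, r_test = holdout_check()
    print(f"predictor {best}: train r = {r_train:+.3f}, test r = {r_test:+.3f}")
```

The training correlation is inflated by the search over 50 candidates; the test correlation is an honest estimate because the testing half played no part in choosing the hypothesis.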

Comment author: Stuart_Armstrong 01 August 2014 03:11:41PM 2 points [-]

The Bayesian approach with multiple models seems to be exactly what we need, e.g. http://www.stat.washington.edu/raftery/Research/PDF/socmeth1995.pdf
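One common way to make this concrete is the BIC approximation to posterior model probabilities: with equal prior weight on each candidate model, the weight of model k is proportional to exp(-BIC_k / 2). The sketch below assumes you already have a BIC value per model; the function name `bic_model_weights` and the example numbers are illustrative, not taken from the linked paper.

```python
import math


def bic_model_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes equal prior probability for every model. Subtracting the
    smallest BIC first keeps the exponentials numerically stable.
    """
    best = min(bics)
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    # Hypothetical BIC values for three candidate models.
    for bic, w in zip([100.0, 102.0, 110.0],
                      bic_model_weights([100.0, 102.0, 110.0])):
        print(f"BIC {bic:6.1f} -> weight {w:.3f}")
```

Every model that was tried stays in the denominator, so trying more models spreads the weight out rather than silently inflating confidence in the winner.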

Comment author: Stuart_Armstrong 07 August 2014 04:25:21PM 1 point [-]

Another approach seems to be stepwise regression: http://en.wikipedia.org/wiki/Stepwise_regression

Comment author: EHeller 07 August 2014 05:14:24PM *  4 points [-]

I see a lot of stepwise regression being used by non-statisticians, but I think statisticians themselves consider it something of a joke. If you have more predictors than you can fit coefficients for, and want an understandable linear model, you are better off with something like LASSO.

Edit: Don't just take my word for it, Google found this blog post for me: http://andrewgelman.com/2014/06/02/hate-stepwise-regression/
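For anyone who hasn't seen LASSO, here is a rough sketch using plain coordinate descent on the objective (1/2n)·||y - Xb||² + alpha·||b||₁. It assumes roughly centered predictors and no intercept, and the names (`soft_threshold`, `lasso_coordinate_descent`, `demo_data`) and the choice alpha = 0.3 are illustrative; in practice you would use a library implementation and choose alpha by cross-validation.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold, clipping to exactly zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimize (1/2n)*||y - X b||^2 + alpha*||b||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual, excluding j's own fit.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_beta = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


def demo_data(n_rows=200, n_predictors=10, seed=0):
    """Synthetic data: only the first two predictors matter (an assumption)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_predictors)] for _ in range(n_rows)]
    y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = demo_data()
    for j, b in enumerate(lasso_coordinate_descent(X, y, alpha=0.3)):
        print(f"beta[{j}] = {b:+.3f}")
```

The L1 penalty does the variable selection and the fitting in one convex problem: irrelevant coefficients land on exactly zero, and the surviving ones are shrunk, instead of being chosen by a greedy sequence of significance tests.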

Comment author: Lumifer 07 August 2014 05:38:45PM 1 point [-]

I concur. Stepwise regression is a very crude technique.

I find it useful as an initial filter if I have to dig through a LOT of potential predictors, but you can't rely on it to produce a decent model.

Comment author: [deleted] 07 August 2014 04:30:25PM 1 point [-]

So it wasn't as clear with the previous link, but it seems to me that the nth step of this method doesn't condition on the fact that the last n-1 steps failed.
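A toy illustration of the concern, under the simplifying assumption that the candidate tests are independent and every null hypothesis is true, so each p-value is uniform on [0, 1]. Real stepwise steps are not independent, so treat this as a sketch of the direction of the problem, not its size. The function names are made up for this example.

```python
import random


def familywise_false_positive_rate(n_tests, alpha=0.05,
                                   n_trials=20000, seed=0):
    """Monte Carlo estimate of P(at least one of n_tests p-values < alpha).

    Under the null, each p-value is drawn as Uniform(0, 1); independence
    between tests is an assumption of this sketch.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        if min(rng.random() for _ in range(n_tests)) < alpha:
            hits += 1
    return hits / n_trials


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the family-wise rate at or below alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 20):
        naive = familywise_false_positive_rate(k)
        corrected = familywise_false_positive_rate(
            k, alpha=bonferroni_threshold(k))
        print(f"{k:2d} tests: naive {naive:.3f}, Bonferroni {corrected:.3f}")
```

With independent tests the naive rate follows 1 - (1 - alpha)^k, so ten looks at a nominal 5% level give a false positive about 40% of the time. A per-step threshold that ignores the earlier failed steps has the same flavor of problem.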