ChristianKl comments on [QUESTION]: Academic social science and machine learning - Less Wrong Discussion

11 Post author: VipulNaik 19 July 2014 03:13PM

Comment author: ChristianKl 20 July 2014 07:09:09AM 2 points

I don't know much about random forests in particular, but I did learn a little about machine learning in bioinformatics classes.

As far as I understand, you can train your machine learning algorithm on one set of data and then see how well it predicts the values in a different set of data. That gives you values for the sensitivity and specificity of your model. By varying the classification threshold, you can build a receiver operating characteristic (ROC) plot from them. You can also check whether you get a different model if you build it on a different subset of your data. That can tell you whether your model is robust.
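To make that concrete, here is a minimal sketch of the workflow, not anything specific to random forests. Everything in it is an assumption for illustration: the data is synthetic, the "model" is a toy one-dimensional cutoff, and the names make_data, fit_threshold, sensitivity_specificity, roc_points and auc are hypothetical helpers rather than any library's API.

```python
import random

def make_data(n, rng):
    """Synthetic 1-D data (assumption): positives centred at 1.0, negatives at 0.0."""
    data = []
    for _ in range(n):
        label = rng.random() < 0.5
        x = rng.gauss(1.0 if label else 0.0, 0.7)
        data.append((x, label))
    return data

def fit_threshold(train):
    """Toy 'training': the cutoff is the midpoint between the two class means."""
    pos = [x for x, y in train if y]
    neg = [x for x, y in train if not y]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def sensitivity_specificity(data, threshold):
    """Predict positive when x >= threshold; return (sensitivity, specificity)."""
    tp = sum(1 for x, y in data if y and x >= threshold)
    fn = sum(1 for x, y in data if y and x < threshold)
    tn = sum(1 for x, y in data if not y and x < threshold)
    fp = sum(1 for x, y in data if not y and x >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def roc_points(data):
    """Sweep the cutoff over every observed value; return (FPR, TPR) pairs."""
    points = [(0.0, 0.0)]
    for t in sorted({x for x, _ in data}, reverse=True):
        sens, spec = sensitivity_specificity(data, t)
        points.append((1.0 - spec, sens))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

rng = random.Random(0)
data = make_data(400, rng)
train, test = data[:200], data[200:]  # fit on one set, evaluate on the other

threshold = fit_threshold(train)
sens, spec = sensitivity_specificity(test, threshold)
roc = roc_points(test)
area = auc(roc)

# Robustness check: refit on two disjoint halves of the training data
# and compare the resulting models (here, just their cutoffs).
threshold_a = fit_threshold(train[:100])
threshold_b = fit_threshold(train[100:])

print("sensitivity:", sens, "specificity:", spec)
print("area under ROC:", area)
print("cutoffs from two halves:", threshold_a, threshold_b)
```

If the two cutoffs come out close to each other, the toy model is stable under resampling. With a real learner you would compare predictions or held-out performance rather than a single parameter.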

The idea of p values is to decide whether or not your model is true. In general, that's not what machine learning folks are concerned with. They know that their model is a model and not reality, and they care about the receiver operating characteristic.