Daniel_Burfoot comments on The prior of a hypothesis does not depend on its complexity - Less Wrong

Post author: cousin_it, 26 August 2010 01:20PM


Comment author: Daniel_Burfoot 28 August 2010 04:00:16AM 1 point

> why "predictive power" should be related to the prior probability of a hypothesis.

To solve the learning problem as described, you must ask: what is the prior probability that a given rule/hypothesis will correctly predict income? It is trivially true that the OR-hypothesis selects a larger number of people, but there is no reason to believe it will more accurately predict income.
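A toy numeric sketch of that point, with entirely hypothetical incomes (the groups and figures are illustrative assumptions, not from the original post): a broader OR-rule selects more people, yet predicting each group's mean income is less accurate for the broader group.

```python
import numpy as np

# Hypothetical incomes in $k. A narrow rule ("is a programmer") vs. a
# broader OR-rule ("is a programmer OR is a barista"). All numbers are
# made up for illustration.
programmers = np.array([95.0, 100.0, 105.0, 110.0])
baristas = np.array([25.0, 30.0, 35.0])

narrow = programmers
broad = np.concatenate([programmers, baristas])

def mse(group):
    # Error of the rule's natural prediction: the group's mean income.
    return np.mean((group - group.mean()) ** 2)

print(len(broad) > len(narrow))   # the OR-rule selects more people
print(mse(broad) > mse(narrow))   # ...but predicts income less accurately
```

Selecting more people only widens the spread of incomes the rule must account for, so coverage alone says nothing about predictive accuracy.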

Since you don't buy the KC (Kolmogorov complexity) idea, do you also refuse to accept the more general idea of capacity control/regularization/MDL as a (the) way to prevent overfitting and achieve generalization? In the standard setting of the learning problem, it seems inevitable that some method of penalizing complexity is necessary for generalization.
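A minimal sketch of the complexity-penalization idea the comment appeals to, using the BIC score (a standard MDL-style approximation, not anything specific to this thread); the data, model family, and constants are all illustrative assumptions:

```python
import numpy as np

# Fit polynomials of increasing degree to noisy linear data and score
# each fit with BIC: n*log(RSS/n) + k*log(n), where k counts parameters.
# The raw fit error always shrinks as degree grows, but the complexity
# penalty makes the score favor a simple model.
rng = np.random.default_rng(0)
n = 200
x = np.linspace(-1.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, n)  # true model is linear

def bic(degree):
    coeffs = np.polyfit(x, y, degree)
    rss = np.sum((np.polyval(coeffs, x) - y) ** 2)
    k = degree + 1  # number of fitted coefficients
    return n * np.log(rss / n) + k * np.log(n)

scores = {d: bic(d) for d in range(9)}
best_degree = min(scores, key=scores.get)
print(best_degree)  # a low degree, not the most complex model
```

Without the `k*log(n)` term the degree-8 polynomial would always score best, which is exactly the overfitting that capacity control is meant to prevent.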

Comment author: cousin_it 30 August 2010 11:46:00AM * 0 points

I thought about it some more and it seems you're right. In learning problems we need some weighting of hypotheses to prevent overfitting; description length has no obvious downsides, so we can just use it and be happy. Now I just need to shoehorn Islam into the "learning problem" framework, to understand why our prior for it should be low...

Comment author: Vladimir_Nesov 30 August 2010 02:44:48PM 0 points

This isn't about prior though.