Johnicholas comments on Development of Compression Rate Method - Less Wrong

11 Post author: Daniel_Burfoot 20 May 2010 05:11PM


Comments (19)


Comment author: Johnicholas 23 May 2010 06:00:25PM 2 points

I think Kevin T. Kelly has proposed a slight adjustment to this "the scientific method is compression" paradigm: http://www.andrew.cmu.edu/user/kk3n/ockham/Ockham.htm

As far as I understand, the basic idea is this: to have any chance of eventually converging on the correct theory, you must switch among theories as you acquire more evidence, moving from simpler to more complex. The reverse order is unavailable, because it is impossible to list all theories from more complex to simpler (there are no strictly decreasing functions from the natural numbers to the natural numbers, and theories can be Gödel-coded).
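
One way to spell out the parenthetical, assuming the function is meant to be strictly decreasing and that f(n) stands for the complexity (say, the Gödel number) of the n-th theory in the list. The symbol f is introduced here only for illustration:

```latex
\begin{align*}
&\text{Suppose } f:\mathbb{N}\to\mathbb{N} \text{ satisfies } f(n+1) < f(n) \text{ for all } n. \\
&\text{Since the values are integers, } f(n+1) \le f(n) - 1, \text{ so by induction } f(n) \le f(0) - n. \\
&\text{Taking } n = f(0) + 1 \text{ gives } f(n) \le -1, \text{ which contradicts } f(n) \in \mathbb{N}.
\end{align*}
```

So any list that starts at some finite complexity and only ever gets simpler must stop after finitely many entries, while a list ordered from simpler to more complex can go on forever.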

The claim "the simplest theory that fits the data is also most likely to be correct" (or a variation of that claim regarding compression performance) is a factual claim about the world - a claim that may not be true (aggregate methods, boosting and bagging, can yield better predictors than predicting with the simplest theory).

Kevin Kelly is providing an alternative reason why we should prefer simplicity in the scientific method, one not based on these dubious factual claims.

Comment author: Daniel_Burfoot 25 May 2010 06:34:30PM 1 point

The claim "the simplest theory that fits the data is also most likely to be correct" (or a variation of that claim regarding compression performance) is a factual claim about the world - a claim that may not be true (aggregate methods, boosting and bagging, can yield better predictors than predicting with the simplest theory).

I think the majority of research in machine learning indicates that this claim IS true. Certainly all methods of preventing overfitting that I am aware of involve some form of capacity control, regularization, or model complexity penalty. If you can cite a generalization theorem that does not depend on some such scheme, I would be very interested to hear about it.
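
To make "model complexity penalty" concrete, here is a minimal sketch of a two-part code in the spirit of the compression framing. It is an illustration under stated assumptions, not a method taken from the post: the model family (a Bernoulli coin whose bias is quantized to a grid) and the names `data_bits`, `two_part_length`, and `best_precision` are all hypothetical. The total description length is the bits spent stating the parameter plus the bits spent coding the data given that parameter, and the preferred model is the one that minimizes the sum.

```python
import math


def data_bits(bits, p):
    """Code length, in bits, of the 0/1 sequence `bits` under a Bernoulli(p) model."""
    ones = sum(bits)
    zeros = len(bits) - ones
    return -(ones * math.log2(p) + zeros * math.log2(1 - p))


def two_part_length(bits, precision_bits):
    """Parameter cost plus data cost for a bias quantized to 2**precision_bits cells.

    Assumption for this sketch: the bias is restricted to cell midpoints, so it
    is never exactly 0 or 1 and the logarithms stay finite.
    """
    cells = 2 ** precision_bits
    frequency = sum(bits) / len(bits)
    # Pick the cell that contains the empirical frequency.
    index = min(cells - 1, int(frequency * cells))
    p = (index + 0.5) / cells
    return precision_bits + data_bits(bits, p)


def best_precision(bits, max_precision=8):
    """Return the parameter precision that minimizes total description length."""
    return min(range(max_precision + 1),
               key=lambda k: two_part_length(bits, k))


if __name__ == "__main__":
    balanced = [0, 1] * 50
    skewed = [1] * 90 + [0] * 10
    for name, sample in (("balanced", balanced), ("skewed", skewed)):
        k = best_precision(sample)
        print(name, "precision:", k, "total bits:", round(two_part_length(sample, k), 2))
```

With zero precision bits the only available model is the fair coin, which costs one bit per observation. Extra precision pays for itself only when the data are skewed enough that the savings in data bits exceed the cost of stating the parameter, which is the capacity-control trade-off in miniature.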