orthonormal comments on Significance of Compression Rate Method - Less Wrong

Post author: Daniel_Burfoot 30 May 2010 03:50AM




Comment author: orthonormal 30 May 2010 06:08:19PM 2 points

The latter. Until I see a real-world case where CRM has been very effective compared to other methods, I'm not going to give much credit to claims that it will achieve greatness in this, that and the other field.

And in particular, I find it extremely unlikely that current major theories of linguistics could in practice be coded into compressors, in a way that satisfies their proponents.

Comment author: marks 06 June 2010 04:28:37AM 1 point

This isn't precisely what Daniel_Burfoot was talking about, but it's a related idea based on "sparse coding", and it has recently obtained good results in classification:

http://www.di.ens.fr/~fbach/icml2010a.pdf

Here the "theories" are hierarchical dictionaries (a discrete hierarchy index set plus a set of vectors) which perform a compression by creating reconstructions of the data. Although they weren't developed with this in mind, support vector machines do this as well: one finds a small number of "support vectors" that essentially allow you to compress the information about decision boundaries in classification problems. (Support vector machines are one of the very few things from machine learning since neural networks to have had significant and successful impact elsewhere.)

The learned hierarchical dictionaries do contain a "theory" of the visual world in a sense, though an important point is that they do so in a way that is sensitive to the application at hand. Daniel_Burfoot leaves out much about how people actually go about implementing this line of thought.