Tordmor comments on Significance of Compression Rate Method - Less Wrong

5 Post author: Daniel_Burfoot 30 May 2010 03:50AM


Comments (60)


Comment author: [deleted] 31 May 2010 07:10:00AM 0 points

Let's take the stock market as an example. Stock market prices are predictable in principle, though not from the price data itself, only from additional data taken from newspapers or other sources. How does the CRM apply if the data does not itself contain the necessary information?

Let's say I have a theory that cutting production costs will increase stock prices, with the size of the increase depending on the amount of cost cut, the prominence of the company, the level of fear of a crash on the stock market, and the level of a "bad news indicator" that is a weighted sum of past bad press for the company. How would I test my theory with CRM?

Comment author: Jonathan_Lee 31 May 2010 07:48:26AM *  0 points

In the wider sense, MML still works on the dataset {stock prices, newspapers, market fear}. Regardless of what work has presently been done to compress newspapers and market fear, if your hypothesis is efficient then you can produce the stock price data for a very low marginal message length cost.

You'd write up the hypothesis as a compressor of data; the simplest way is to produce a distribution over stock prices and apply arithmetic coding, though in practice you'd tweak whatever state-of-the-art compressors for stock prices exist.
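As a rough illustration of "hypothesis as a compressor", here is a minimal sketch that scores a predictive distribution by its ideal arithmetic-coding cost, the sum of -log2 p(symbol) over the data. Everything in it is assumed for illustration: the toy `moves` and `cost_cut_news` series, the two-symbol alphabet, and the probabilities in `baseline` and `hypothesis`. It measures only the marginal cost of the prices given the side information, not the cost of encoding the newspapers or the model itself.

```python
import math

def code_length_bits(symbols, predict):
    """Ideal arithmetic-coding cost in bits: sum of -log2 p(symbol | context)."""
    total = 0.0
    for i, symbol in enumerate(symbols):
        p = predict(i)[symbol]  # probability the model assigned to what happened
        total += -math.log2(p)
    return total

# Hypothetical toy data: daily price moves and a side-information flag
# (1 = a cost-cutting story ran that day). Not real market data.
moves = ["up", "up", "down", "up", "down", "down", "up", "up"]
cost_cut_news = [1, 1, 0, 1, 0, 0, 1, 1]

def baseline(i):
    # Ignores the side information: every move is a coin flip.
    return {"up": 0.5, "down": 0.5}

def hypothesis(i):
    # Assumed theory: cost-cut news makes an up move much more likely.
    if cost_cut_news[i]:
        return {"up": 0.9, "down": 0.1}
    return {"up": 0.1, "down": 0.9}

baseline_bits = code_length_bits(moves, baseline)
hypothesis_bits = code_length_bits(moves, hypothesis)

print(f"baseline:   {baseline_bits:.2f} bits")
print(f"hypothesis: {hypothesis_bits:.2f} bits")
print(f"saved:      {baseline_bits - hypothesis_bits:.2f} bits")
```

On this toy series the coin-flip baseline costs 8 bits and the hypothesis about 1.2 bits. A theory that predicted badly would pay more than the baseline, which is what makes the comparison a test.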

Of course, a side effect of this is that your code references more data and will likely need longer internal identifiers for it, so if you just split the cost of the code across the datasets being compressed, you'd punish the compressors of newspapers and market fear. I would suggest that the solution is to use the Shapley value, with the value being the number of bits saved overall by a single compressor working on all the data sets in a given pool of cooperation.
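To make the Shapley suggestion concrete, here is a small sketch that splits the bits saved among three datasets by averaging each one's marginal contribution over every order in which they could join the pool. The `bits_saved` table is entirely made up for illustration; in practice each entry would come from running one compressor on that subset of datasets.

```python
from itertools import permutations
from math import factorial

def shapley_values(players, value):
    """Average marginal contribution of each player over all join orders."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}

# Hypothetical numbers: bits saved by a single compressor working on each
# subset of the pooled datasets, relative to compressing nothing jointly.
bits_saved = {
    frozenset(): 0,
    frozenset({"prices"}): 10,
    frozenset({"news"}): 20,
    frozenset({"fear"}): 5,
    frozenset({"prices", "news"}): 60,
    frozenset({"prices", "fear"}): 25,
    frozenset({"news", "fear"}): 30,
    frozenset({"prices", "news", "fear"}): 90,
}

shares = shapley_values(["prices", "news", "fear"], bits_saved.__getitem__)
for name, share in shares.items():
    print(f"{name}: {share:.2f} bits")
```

The shares always add up to the total saved by the full pool (90 bits here), so no dataset is charged for code that exists only because another dataset benefits from it. Enumerating all orders is only practical for a handful of datasets.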

Comment author: ocr-fork 31 May 2010 07:36:05AM *  0 points

First, infer the existence of people, emotions, stock traders, the press, factories, production costs, and companies. When that's done your theory should follow trivially from the source code of your compression algorithm. Just make sure your computer doesn't decay into dust before it gets that far.