DanielLC comments on Frugality and working from finite data - Less Wrong

27 Post author: Snowyowl 03 September 2010 09:37AM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (46)

You are viewing a single comment's thread. Show more comments above.

Comment author: DanielLC 05 September 2010 05:10:10PM 0 points [-]

First off, when you have that much data, over-fitting won't make a big difference. For example, you'll get a prediction that something happens between 999,000 and 1,001,000 times, instead of 1,000,000. Second, the correct answer would take 2,000,000 bits. The incorrect one would take 1001000-ln(0.5005)/ln(2)+999000-ln(0.4995)/ln(2) = 1,999,998.56 bits. The difference in data will always be how unlikely it is to be that far from the mean.

Third, and most importantly, no matter how much your intuition says otherwise, this actually is the correct way to do it. The more bits you have to use, the less likely it is. The coincidence might not seem interesting, but that exact sequence of data is unlikely. What normally makes it seem like a coincidence is that there seems to be a way to explain it with smaller data.

Can someone else explain this better?