timtyler comments on Open Thread: December 2009 - Less Wrong

3 Post author: CannibalSmith 01 December 2009 04:25PM


Comment author: Matt_Simpson 03 December 2009 08:44:10PM 1 point [-]

Is there a proof anywhere that Occam's razor is correct? More specifically, that Occam priors are the correct priors. Going from the conjunction rule to P(A) >= P(B & C) when A and B&C are equally favored by the evidence seems simple enough (where A, B, and C are atomic propositions), but I don't (immediately) see how to get from there to an actual number that you can plug into Bayes' rule. Is this just something that is buried in a textbook on information theory?
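[Editor's note: one way the jump from "simpler is more probable" to an actual number is usually made is a description-length prior, as in Solomonoff induction. A minimal sketch, assuming hypotheses are identified with (hypothetical) binary description strings — not anything from the thread itself:]

```python
# Toy Occam prior: weight each hypothesis by 2^(-description length),
# then normalize over the finite set at hand. In Solomonoff induction
# the "descriptions" would be programs for a universal machine.
def occam_prior(hypotheses):
    weights = {h: 2.0 ** -len(h) for h in hypotheses}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

priors = occam_prior(["0", "10", "110"])
# The shortest description ("0") receives the largest prior mass.
assert priors["0"] > priors["10"] > priors["110"]
```

These are the numbers one would then plug into Bayes' rule as P(H); the real construction needs a prefix-free universal code, which this toy version glosses over.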

On that note, assuming someone had a strong background in statistics (PhD level) and little to no background in computer science beyond a statistical computing course or two, how much computer science (or other material) would they need to pick up before they could learn information theory?

Thanks to anyone who bites.

Comment author: timtyler 09 December 2009 06:15:51AM 0 points [-]

Occam's razor is dependent on a descriptive language / complexity metric (so there are multiple flavours of the razor).

Unless a complexity metric is specified, the first question seems rather vague.
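[Editor's note: a concrete toy illustration of why the metric matters — two hypothetical complexity measures, raw character count versus compressed size, can rank the same pair of descriptions in opposite orders, giving two different "flavours" of the razor:]

```python
import zlib

# Metric 1: raw description length in bytes.
def raw_len(s):
    return len(s.encode())

# Metric 2: compressed size, a crude proxy for algorithmic complexity.
def gzip_len(s):
    return len(zlib.compress(s.encode()))

a = "ab" * 50            # long but highly regular
b = "qwe rty uiop zxc"   # short but irregular

assert raw_len(a) > raw_len(b)    # metric 1: a is "more complex"
assert gzip_len(a) < gzip_len(b)  # metric 2: a is "simpler"
```

So "prefer the simpler hypothesis" is underdetermined until you fix the metric.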

Comment author: Jayson_Virissimo 08 January 2010 10:40:40AM 0 points [-]

> Occam's razor is dependent on a descriptive language / complexity metric (so there are multiple flavours of the razor).

I think you might be making this sound easier than it is. If there are an infinite number of possible descriptive languages (or ways of measuring complexity), aren't there an infinite number of "flavours of the razor"?

Comment author: timtyler 09 January 2010 12:14:09AM 0 points [-]

Yes, but not all languages are equal; some are much better than others, so people use the "good" ones for applications that are sensitive to this issue.

Comment author: ciphergoth 08 January 2010 11:01:52AM 0 points [-]

There's a proof (the invariance theorem) that the complexity measures defined by any two Turing-complete languages differ by at most an additive constant: the length it takes to encode an interpreter for one language in the other.
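[Editor's note: a toy sketch of why the gap is a constant. If language B can run any language-A description by prefixing a fixed interpreter, B's description of x is never longer than A's plus that fixed prefix. The names here are hypothetical, purely for illustration:]

```python
INTERPRETER = "exec_A:"  # hypothetical fixed translator prefix

def k_a(program):
    # Description length of x in language A (the program itself).
    return len(program)

def k_b(program):
    # Language B runs the same program via the fixed interpreter prefix,
    # so its description length exceeds A's by exactly len(INTERPRETER).
    return len(INTERPRETER + program)

for prog in ["print 1", "x" * 1000]:
    assert k_b(prog) - k_a(prog) == len(INTERPRETER)  # constant gap
```

The gap does not grow with the program, which is the substance of the theorem — though, as the reply below notes, the constant itself can be arbitrarily large.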

Comment author: timtyler 09 January 2010 12:12:02AM 0 points [-]

Of course, the constant can be arbitrarily large.

However, there are a number of domains for which this issue is no big deal.