Quick question: Does anyone know of a formal, from-first-principles justification for Occam's Razor (assigning prior probabilities in inverse proportion to the length of the model in a universal description language)?
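To make the formulation concrete, here is a minimal sketch of a length-based prior. It assumes the common 2^-length weighting; the wording above only says that priors shrink with length, so the exact weighting is an assumption. The function name `length_prior` and the hypothesis lengths are hypothetical, chosen purely for illustration.

```python
from fractions import Fraction


def length_prior(lengths):
    """Map {hypothesis: description length in bits} to normalized priors.

    Assumption: each hypothesis gets raw weight 2**-length, and we
    normalize over the finite set given (not over all programs).
    """
    raw = {h: Fraction(1, 2 ** n) for h, n in lengths.items()}
    total = sum(raw.values())
    return {h: w / total for h, w in raw.items()}


# Hypothetical description lengths in bits, for illustration only.
prior = length_prior({"A": 2, "B": 2, "C": 4})
for name, p in prior.items():
    print(name, p)
```

Under this weighting, each extra bit of description length halves a hypothesis's raw weight before normalization.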
http://wiki.lesswrong.com/wiki/Occam's_razor Not sure if that's in-depth enough, but I think it does a pretty good job. Edit: the apostrophe seems to break the link, but the URL is right.
Thanks, but that proof doesn't work for the formulation of Occam's Razor that I was talking about.
For example, if I have a boolean-output function, there are three "simplest possible" (2-bit-long) minimal hypotheses as to what it is, before I see the evidence: [return 0], [return 1], and [return randomBit()]. But a "more complex" (longer than 2 bits) hypothesis, like [on call #i to function, return i mod 2], can't be represented as being equivalent to [[one of the previous hypotheses] AND [something else]], so the conjunction rule doesn't ...
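As a rough sketch of the example above, the four hypotheses can be written as separate predictive models and updated on data. This makes it visible that the alternating hypothesis is its own program, not one of the simple ones plus an extra condition. Assumptions: the 2^-length prior weighting, call indices starting at 0, a fair coin for randomBit(), and a length of 5 bits for the alternating hypothesis (the post only says "longer than 2 bits", so that number is made up). All function and variable names are hypothetical.

```python
from fractions import Fraction


# Each model returns P(observed bit | hypothesis) for call index i.
def always_zero(i, bit):
    return Fraction(1 if bit == 0 else 0)


def always_one(i, bit):
    return Fraction(1 if bit == 1 else 0)


def random_bit(i, bit):
    return Fraction(1, 2)  # assumed fair coin


def alternating(i, bit):
    return Fraction(1 if bit == i % 2 else 0)


# (model, description length in bits); the 5 is a made-up placeholder.
HYPOTHESES = {
    "return 0": (always_zero, 2),
    "return 1": (always_one, 2),
    "return randomBit()": (random_bit, 2),
    "return i mod 2": (alternating, 5),
}


def posterior(observations):
    """Bayesian update starting from weights 2**-length."""
    weights = {}
    for name, (likelihood, length) in HYPOTHESES.items():
        w = Fraction(1, 2 ** length)
        for i, bit in enumerate(observations):
            w *= likelihood(i, bit)
        weights[name] = w
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


post = posterior([0, 1, 0, 1])
for name, p in post.items():
    print(name, p)
```

With these made-up lengths, four alternating observations are enough for the longer hypothesis to overtake randomBit(), despite starting with a smaller prior weight.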