27chaos comments on Is simplicity truth indicative? - Less Wrong Discussion
Comments (45)
It turns out there's an extremely straightforward mathematical reason why simplicity is to some extent an indicator of high probability.
Consider the list of all possible hypotheses with finite length. We might imagine there being a labeling of this list, starting with hypothesis 1, then hypothesis 2, and continuing on for an infinite number of hypotheses. This list contains the hypotheses capable of being distinguished by a human brain, input into a computer, having their predictions checked against the others, and other nice properties like that. In order to make predictions about which hypothesis is true, all we have to do is assign a probability to each one.
The obvious answer is just to give every hypothesis equal probability. But since there's an infinite number of these hypotheses, that can't work, because we'd end up giving every hypothesis probability zero! So (and here's where it starts getting Occamian) it turns out that any valid probability assignment has to get smaller and smaller as we go to very high numbers in the list (so that the probabilities can all add up to 1). At low numbers in the list the probability is, in general, allowed to go up and down, but hypotheses with very high numbers always have to be low probability.
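To make the point above concrete, here is a minimal sketch. The function names and the particular prior 2^-k are my own illustrative assumptions, not anything special: any assignment that sums to 1 would do. It shows that an equal split shrinks toward zero as the list grows, while a decaying assignment can sum to 1, and that only finitely many hypotheses can ever sit above a given probability threshold.

```python
from fractions import Fraction

def uniform_prior(n_hypotheses):
    """Probability each hypothesis gets if the first n share the mass equally."""
    return Fraction(1, n_hypotheses)

def geometric_prior(k):
    """One illustrative valid prior (an assumption): hypothesis k = 1, 2, ... gets 2**-k."""
    return Fraction(1, 2 ** k)

def partial_sum(prior, n_terms):
    """Total probability assigned to hypotheses 1..n_terms."""
    return sum(prior(k) for k in range(1, n_terms + 1))

def max_count_above(threshold):
    """Upper bound on how many hypotheses can have probability >= threshold.

    If more than 1/threshold hypotheses had that much probability,
    the total would exceed 1.
    """
    return int(1 / Fraction(threshold))

# Equal shares shrink toward zero as the list gets longer.
shares = [uniform_prior(n) for n in (10, 1000, 10**6)]

# The geometric prior's partial sums approach 1 from below.
mass_after_20 = partial_sum(geometric_prior, 20)
```

Whatever prior you pick, `max_count_above` caps how many hypotheses can stay above a given level, so everything far enough down the list has to fall below it.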
There's a caveat, though - the position in the list can be arbitrary, and doesn't have to be based on simplicity. But it turns out that it is impossible to make any ordering of hypotheses at all, without having more complicated hypotheses have higher numbers than simpler hypotheses on average.
There's a general argument for this (there's a more specific argument based on universal Turing machines that you can find in a good textbook) that's basically a reflection of the fact that there's a most simple hypothesis, but no "most complex" hypothesis, just like how there's no biggest positive integer. Even if you tried to shuffle up the hypotheses really well, you have to have each simple hypothesis end up at some finite place in the list (otherwise they end up at no place in the list and it's not a valid shuffling). And if the simple hypotheses are all at finite places in the list, that means there's still an infinite number of complex hypotheses with higher numbers, so complexity still increases for large enough places in the list.
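The shuffling argument can be illustrated with a toy sketch. The assumptions here are mine: hypotheses are non-empty binary strings, "complexity" is just string length, and the list is truncated at length 10 so it fits in memory, which a real infinite list would not be. However the strings are shuffled, the handful of short ones all land at finite positions, and everything after the last of them is longer.

```python
import itertools
import random

def all_hypotheses(max_len):
    """Every non-empty binary string up to max_len, shortest first."""
    return ["".join(bits)
            for n in range(1, max_len + 1)
            for bits in itertools.product("01", repeat=n)]

def last_simple_position(ordering, c):
    """1-based position of the last hypothesis with length <= c."""
    return max(i for i, h in enumerate(ordering, start=1) if len(h) <= c)

# Shuffle with a fixed seed so the run is repeatable.
rng = random.Random(0)
ordering = all_hypotheses(10)
rng.shuffle(ordering)

# There are only 2 + 4 + 8 = 14 strings of length <= 3, so they
# occupy at most 14 positions, wherever the shuffle puts them.
n_simple = sum(1 for h in ordering if len(h) <= 3)
cutoff = last_simple_position(ordering, 3)

# Everything past the cutoff is more complex than length 3.
tail = ordering[cutoff:]
```

In the truncated toy the tail is finite, but in the real infinite list the same counting applies for every complexity bound: only finitely many hypotheses sit at or below it, so past some position every hypothesis exceeds it.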
Thanks for this! Judging from the conversations I've had on Reddit recently, many economists apparently view Occam's Razor as just a modelling trick. I'd felt that perspective was incorrect for a while, but after encountering it so many times, and then later on being directed to this paper, I'd begun to fear my epistemology was built on shaky foundations. It's a relief to see that's not the case.
Is there anything ruling out a bias towards simplicity that is extremely small, or are there good reasons to think the bias would be rather large? Figuring out how much predictive accuracy to exchange for theory conciseness seems like a tough problem, possibly requiring some arbitrariness.