Non-trivial probability distributions for priors and Occam's razor

JoshuaZ

Assume we have a countable set of hypotheses described in some formal way with some prior distribution such that 1) our prior for each hypothesis is non-zero 2) our formal description system has only a finite number of hypotheses of any fixed length. Then, I claim that that under just this set of weak constraints, our hypotheses are under a condition that informally acts a lot like Occam's razor. In particular, let h(n) be the the probability mass assigned to "a hypothesis with description at least exactly n is correct." (ETA: fixed from earlier statement) Then, as n goes to infinity, h(n) goes to zero. So, when one looks in the large-scale complicated hypotheses must have low probability. This suggests that one doesn't need any appeal to computability or anything similar to accept some form of Occam's razor. One only needs that one has a countable hypothesis space, no hypothesis has probability zero or one, and that has a non-stupid way of writing down hypotheses.

A few questions: 1) Am I correct in seeing this as Occam-like or is this just an indication that I'm using too weak a notion of Occam's razor?

2) Is this point novel? I'm not as familiar with the Bayesian literature as other people here so I'm hoping that someone can point out if this point has been made before.

ETA: This was apparently a point made by Unknowns in an earlier thread which I totally forgot but probably read at the time. Thanks also for the other pointers.

A few questions: 1) Am I correct in seeing this as Occam-like or is this just an indication that I'm using too weak a notion of Occam's razor?

2) Is this point novel? I'm not as familiar with the Bayesian literature as other people here so I'm hoping that someone can point out if this point has been made before.

ETA: This was apparently a point made by Unknowns in an earlier thread which I totally forgot but probably read at the time. Thanks also for the other pointers.

Huh? There are only countably many statements in any string-based language, so this includes decidedly non-Occamian scenarios like every statement being true, or every statement being true with the same probability.

I'm confused. What is the countable set of hypotheses you are considering? My claim is merely that if you have hypotheses H1, H2, ..., then p(Hi) > 1/n for at most n-1 values of i. This can be thought of as a weak form of Occam's razor.

In what sense is "every statement being true" a choice of a countable set of hypotheses?

I think maybe the issue is that we are using hypothesis in a different sense. In my case a hypothesis is a complete model of the world, so it is not possible for multiple hypotheses to be true. You can marginalize out / observe a bunch of variables to talk about a subset of the world, but your hypotheses should still be mutually exclusive.

5

Non-trivial probability distributions for priors and Occam's razor

5

5

5

Non-trivial probability distributions for priors and Occam's razor

5

5