RichardKennaway comments on A Proof of Occam's Razor - Less Wrong

Post author: Unknowns 10 August 2010 02:20PM




Comment author: RichardKennaway 10 August 2010 08:13:50PM * 10 points

Another way of putting this is that all probability distributions over an infinite enumerated set are biased towards elements that occur earlier, regardless of the principle of enumeration.

Comment author: RichardKennaway 18 August 2010 01:58:55PM 0 points

And yet another way is this: every computable enumeration of all programs is equivalent to a UTM, and therefore generates the same measure of Kolmogorov complexity, up to the usual additive constant.

Proof: Computability of the enumeration means that there is a program which takes any integer N and returns the Nth program in that enumeration. Represent this program as a Turing machine and cascade it with a second Turing machine that executes the program returned by the first TM.
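The cascade construction can be illustrated with a toy sketch (the enumerator and the "programs" here are hypothetical stand-ins, with Python source strings playing the role of programs):

```python
def nth_program(n):
    """Toy computable enumeration: the nth 'program' is a piece of
    Python source that computes n squared. This stands in for an
    arbitrary computable enumeration of all programs."""
    return f"result = {n} * {n}"

def cascade(n):
    """The second machine: given the integer n, obtain the nth program
    from the enumerator and execute it. The composite is itself a
    single machine whose only input is n."""
    env = {}
    exec(nth_program(n), env)
    return env["result"]

print(cascade(5))  # 25
```

The point is only that the composite of the two machines is again a single machine taking an integer as input, which is what makes the enumeration behave like a description language.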

Is this a novel observation? Does it go any distance towards answering Shalizi's question at the end of this note?

Comment author: utilitymonster 12 August 2010 04:55:23PM 0 points

This assumes countable additivity. Otherwise, each basic event could have probability zero, though the whole has probability 1.

I'm inclined to use countable additivity, but I couldn't give you arguments for it, other than by sketching models that violate it and pointing out that they are weird.
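One standard example of such a weird model is the finitely additive "uniform distribution" on the positive integers given by natural density (defined on those sets where the limit exists):

$$\mu(A) = \lim_{n\to\infty} \frac{|A \cap \{1,\dots,n\}|}{n}$$

Every singleton gets $\mu(\{k\}) = 0$ while $\mu(\mathbb{N}) = 1$, so the countable sum of the pieces fails to add up to the whole.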

Comment author: Unknowns 11 August 2010 12:59:17AM 0 points

Yes, exactly.

Comment author: Oscar_Cunningham 10 August 2010 08:30:19PM 0 points

This is the best way of putting it.

Comment author: thomblake 10 August 2010 08:16:48PM 0 points

Certainly not regardless of the probability distribution though. Do you have a particular probability distribution in mind?

Comment author: RichardKennaway 10 August 2010 08:28:49PM * 5 points

Regardless of the probability distribution.

If one has any assignment of probabilities to an infinite series of mutually exclusive hypotheses H1, H2, ..., then for every epsilon > 0 there is an N such that every hypothesis after the Nth has probability less than epsilon. In fact, there is an N such that the sum of the probabilities of all the hypotheses after the Nth is less than epsilon.
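The claim can be checked numerically for any concrete distribution. A minimal sketch, using p(n) proportional to 1/n^2 purely as an illustrative choice:

```python
import math

# A concrete distribution over hypotheses H1, H2, ...:
# p(n) = 6 / (pi^2 * n^2), which sums to 1 over n = 1, 2, ...
def p(n):
    return 6.0 / (math.pi ** 2 * n ** 2)

def tail_index(epsilon):
    """Return the smallest N such that the total probability of all
    hypotheses after the Nth is less than epsilon."""
    head = 0.0  # probability mass of H1 .. Hn so far
    n = 0
    while 1.0 - head >= epsilon:
        n += 1
        head += p(n)
    return n

# Every hypothesis after the Nth has, in total, probability < 0.01.
print(tail_index(0.01))
```

The same loop terminates for any distribution whose probabilities sum to 1, which is exactly the content of the claim.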

Comment author: whpearson 11 August 2010 08:49:47AM * 1 point

But N could be 3^^^3, which does an injury to the term "early" in my book. E.g., I could swap the probabilities p(x_3^^^3) and p(x_1).

Comment author: RichardKennaway 11 August 2010 09:06:12AM 5 points

Indeed you could, but that problem is already present in the definition of Kolmogorov complexity. It's only defined up to an arbitrary additive constant determined by (in one formulation) the choice of a universal Turing machine. The Kolmogorov complexity of a string is the size of the shortest input for that UTM that produces that string as output, but there's nothing in the definition to prevent the UTM from having any finite set of arbitrary strings on speed-dial.
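A toy illustration of the speed-dial point (these "machines" are hypothetical stand-ins, not real UTMs): machine A describes a string by spelling it out; machine B is identical except that one arbitrary string is on speed-dial.

```python
FAVORITE = "some arbitrary string this machine keeps on speed-dial"

def K_A(s):
    """Machine A: the shortest description of s is s itself."""
    return len(s)

def K_B(s):
    """Machine B: the one-symbol description '0' expands to FAVORITE;
    any other string s is described as '1' + s, costing one extra symbol."""
    return 1 if s == FAVORITE else len(s) + 1

# The two complexity measures never differ by more than a constant
# (here, len(FAVORITE)), as in the usual invariance theorem.
for s in ["x", "hello", FAVORITE]:
    assert abs(K_A(s) - K_B(s)) <= len(FAVORITE)
```

Machine B assigns its favorite string an arbitrarily low complexity, yet agrees with machine A up to an additive constant everywhere, which is why the swap above is already tolerated by the definition.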

Comment author: cousin_it 11 August 2010 09:36:43AM * 0 points

Kelly deals with this by looking at complexity from other angles. For example, a complex world can give you a long sequence of observations persuading you that it's a simple world and then suddenly "change its mind", but a simple world cannot pretend that it's complex.

Comment author: DanielLC 16 December 2010 04:51:55AM 0 points

Why not? It would look almost exactly like the complex worlds imitating it, wouldn't it?

Comment author: thomblake 11 August 2010 03:12:30AM 0 points

Hmm... maybe I was reading your claim as stronger than you intended. I was imagining you were claiming that property would hold for any finite enumerated subset, which clearly isn't what you meant.

Comment author: apophenia 11 August 2010 03:32:27AM 1 point

If the sum of every term in a sequence after the Nth one is less than epsilon, then the sum of every term in any subsequence after the Nth one is also less than epsilon.

Comment author: thomblake 11 August 2010 03:42:16AM * 0 points

Right, but that isn't what I meant: it is not necessarily the case that for every n, every hypothesis after the nth has probability less than that of the nth hypothesis. Obviously, which is why I should have noticed my confusion and not misread in the first place.

Comment author: Oscar_Cunningham 10 August 2010 08:31:37PM 0 points

Yes, regardless of the distribution.