V_V comments on An Intuitive Explanation of Solomonoff Induction - Less Wrong

53 points · Post author: Alex_Altair 11 July 2012 08:05AM

Comment author: V_V 20 November 2013 09:35:15PM · 2 points

> Second, given that the probability is reduced by half with every bit of added algorithm length, wouldn't that imply that algorithms of 1 bit were the most likely, with probability 1/2? In fact, I doubt that any algorithm at all is describable with 1 bit.

Keep in mind that Solomonoff induction is defined only up to the choice of a prefix-free universal Turing machine (UTM). Some such UTMs can indeed represent a valid program with a single bit, but this isn't particularly relevant: each UTM leads to a different Solomonoff prior. Solomonoff induction is universal only in an asymptotic sense: as you accumulate more and more bits of evidence, the posterior distributions converge with high probability.
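To make the "prefix-free" part concrete, here is a small sketch (my addition, with a hypothetical set of program encodings, not any real UTM): because the valid programs of a prefix-free machine form a prefix-free set of bitstrings, the raw weights 2^(-length) obey the Kraft inequality and sum to at most 1. Note that this toy set even contains a 1-bit program, "0", carrying raw weight 1/2, as discussed in the parent comment.

```python
def is_prefix_free(codewords):
    """Check that no codeword is a proper prefix of another."""
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

def kraft_sum(codewords):
    """Sum of 2^(-|p|) over the codewords (the raw prior weight)."""
    return sum(2.0 ** -len(p) for p in codewords)

# A hypothetical prefix-free set of "program" encodings.
programs = ["0", "10", "110", "1110", "1111"]

assert is_prefix_free(programs)
print(kraft_sum(programs))  # 1.0 for this complete code; <= 1 in general
```

Dropping any codeword from the set leaves the sum strictly below 1, which is exactly the slack that the normalization discussed below has to absorb.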

Probability estimates dominated by short programs are biased by the choice of the UTM. As more evidence accumulates, the likelihood of a short program being consistent with all of it decreases (intuitively, by the pigeonhole principle), and the probability estimate will be dominated by longer and longer programs, "washing out" the initial bias.
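The washing-out effect can be sketched numerically (my addition; the two priors below are hypothetical stand-ins for two different UTM choices, and the hypotheses are coin biases rather than programs): two different priors over the same hypotheses, updated on the same evidence stream, yield posteriors that are far closer to each other than the priors were.

```python
def coin_likelihood(bias, data):
    """Likelihood of a bitstring under an i.i.d. coin with the given bias."""
    p = 1.0
    for bit in data:
        p *= bias if bit == "1" else 1.0 - bias
    return p

def posterior(prior, likes):
    """Bayes update of a prior dict given per-hypothesis likelihoods."""
    unnorm = {h: prior[h] * likes[h] for h in prior}
    z = sum(unnorm.values())
    return {h: v / z for h, v in unnorm.items()}

def tv_distance(p, q):
    """Total variation distance between two distributions on the same support."""
    return 0.5 * sum(abs(p[h] - q[h]) for h in p)

hypotheses = {"p=0.2": 0.2, "p=0.5": 0.5, "p=0.8": 0.8}
prior_a = {"p=0.2": 0.70, "p=0.5": 0.20, "p=0.8": 0.10}  # one "machine's" bias
prior_b = {"p=0.2": 0.10, "p=0.5": 0.20, "p=0.8": 0.70}  # a different one

data = "110111011110111101"  # a fixed evidence stream, mostly ones

likes = {name: coin_likelihood(b, data) for name, b in hypotheses.items()}
post_a = posterior(prior_a, likes)
post_b = posterior(prior_b, likes)

# The posterior distance is far smaller than the prior distance of 0.6.
print(tv_distance(prior_a, prior_b), tv_distance(post_a, post_b))
```

Both posteriors end up concentrated on the same hypothesis despite starting from opposed priors, which is the asymptotic-agreement point above in miniature.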

> Why do they need to be ranked? Why not assign them all probability 1/N, where N = 2^(n+1) − 2 is the number of algorithms of length up to and including n?

In order to normalize properly, you have to divide the raw Solomonoff measure by Chaitin's omega for that particular UTM, that is, the probability that a randomly generated program halts.
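A toy sketch of that normalization (my addition; the program set and its halting behavior are made up, since real halting behavior is uncomputable): among a hypothetical prefix-free set of programs, only some halt. Omega is the total 2^(-length) weight of the halting ones, and dividing by it turns the raw measure into a proper probability distribution.

```python
# Hypothetical prefix-free program encodings with stipulated halting behavior.
halting = {"0": True, "10": True, "110": False, "1110": True, "1111": False}

# Chaitin's omega for this toy machine: total weight of halting programs.
omega = sum(2.0 ** -len(p) for p, halts in halting.items() if halts)

# Normalized measure: raw weight divided by omega, over halting programs only.
normalized = {p: (2.0 ** -len(p)) / omega
              for p, halts in halting.items() if halts}

print(omega)                                        # 0.5 + 0.25 + 0.0625 = 0.8125
print(abs(sum(normalized.values()) - 1.0) < 1e-12)  # True: it now sums to 1
```

For a real UTM, omega is uncomputable (computing it would solve the halting problem), which is why the normalized version of the prior is a theoretical object rather than something you can evaluate.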