cousin_it comments on Solomonoff induction on a random string - Less Wrong

Post author: christopherj 08 April 2014 05:30AM




Comment author: cousin_it 11 April 2014 07:08:47AM 2 points

I'm not sure why we need to make that distinction. The Solomonoff and Levin constructions are equivalent: the prior built from all deterministic programs that output bit strings and the prior built from all computable probability distributions turn out to be the same prior. See e.g. here for proofs and references.

Comment author: pivo 11 April 2014 08:16:07AM 1 point

They're equivalent from the point of view of a consumer of the predictions, but not from the point of view of an implementation. And since this is a discussion about how it works, the distinction is useful.

Comment author: MrMind 11 April 2014 08:56:06AM 0 points

Then I'm confused, because the two would seem to produce very different answers on the same string.
Since a string with very high Kolmogorov complexity can clearly be produced by a uniform distribution, the Solomonoff prior would converge to a very-high-complexity hypothesis, while the Levin mixture would just assign 0.5 to 0 and 0.5 to 1.
What am I missing here?

Comment author: cousin_it 11 April 2014 12:23:15PM 3 points

The Solomonoff prior would have many surviving hypotheses at each step, and the total weight of those that predict a 0 for the next bit would be about equal to the total weight of those that predict a 1. If the input distribution is biased, e.g. 0 with probability 5/6 and 1 with probability 1/6, then the Solomonoff prior will converge on that as well. That works for any computable input distribution, with probability 1 according to the input distribution.
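This convergence is easy to see in a toy sketch. The following is a minimal illustration, not the Solomonoff prior itself: it replaces the countably infinite class of programs with a small grid of Bernoulli(p) hypotheses under a uniform prior, then does the standard Bayesian update on bits drawn from the biased input distribution (0 with probability 5/6, 1 with probability 1/6). The posterior predictive probability of the next bit being 1 converges toward 1/6, as the comment describes.

```python
import random

random.seed(0)

# A small grid of Bernoulli(p) "hypotheses", standing in for the
# (countably infinite) class of computable distributions.
hypotheses = [i / 12 for i in range(1, 12)]  # p = P(bit == 1)
weights = [1.0 / len(hypotheses)] * len(hypotheses)  # uniform toy prior

# Input distribution from the comment: 0 w.p. 5/6, 1 w.p. 1/6.
data = [1 if random.random() < 1 / 6 else 0 for _ in range(2000)]

for bit in data:
    # Bayesian update: multiply each weight by that hypothesis's
    # likelihood for the observed bit, then renormalise.
    weights = [w * (p if bit == 1 else 1 - p)
               for w, p in zip(weights, hypotheses)]
    total = sum(weights)
    weights = [w / total for w in weights]

# Posterior predictive probability that the next bit is 1:
# a weighted vote among the surviving hypotheses.
pred_one = sum(w * p for w, p in zip(weights, hypotheses))
print(pred_one)  # should end up close to 1/6
```

On a uniform input (p = 1/2) the same code assigns about 0.5 to each next bit, matching MrMind's Levin-mixture intuition: the surviving hypotheses that vote 0 and those that vote 1 carry roughly equal total weight.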

Comment author: V_V 11 April 2014 01:10:44PM 1 point

Nitpick: the prior does not converge. The prior is what you have before observing any data; once you update on observations it is a posterior.

Comment author: MrMind 11 April 2014 12:39:51PM 1 point

Many thanks, I get it now.

Comment author: V_V 11 April 2014 10:06:21AM 2 points

What matters is the probability that they assign to the next bit being equal to 1.