cousin_it comments on Solomonoff Induction, by Shane Legg - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (7)
The tricky part for me was proving the equivalence of two views of the Solomonoff prior: as a probability distribution over programs that print bitstrings and never make mistakes, and as a mixture of all "lower semi-computable" probability distributions over finite prefixes of bitstrings that are allowed to make mistakes. Once you derive the second view from the first, the conclusion is indeed trivial.
I started from the view of multiplicative weights over computable predictors, and then when I tried to describe it as an improvement on Solomonoff induction realized it was the same thing. If you write down what the two algorithms actually do, the equivalence to within a constant is clear.
I mentioned to you a few days ago (in chat) that Scholarpedia stated:
When you say "The tricky part for me was proving the equivalence of two views of the Solomonoff prior" do you mean you tried to re-derive this result?
For the discrete version, doesn't this boil down to converting a program that semi-computes a semi-measure into a program that "semi-samples from" a semi-measure?
Which is pretty straightforward:
We interpret the remainder of our (infinite) input as a real number y in [0,1]. Every time the measure that we're semi-computing adds an amount p of probability to one of its possible values v, we increment a variable x (which was initialised to 0) by p. If and when x finally exceeds y, we return the value v whose probability was just incremented.
(The continuous version, which is the one we actually need, looks harder, but I guess the same general idea should work. Perhaps we just repeat the algorithm above once for every bit of the output?)
Yes. I eventually succeeded, but not without a tiny bit of help from the texts.