shminux comments on An Intuitive Explanation of Solomonoff Induction - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (210)
After reading it fully, I have another, deeper problem with this (still great) article: Bayes' theorem totally disappears at the end. Hypotheses that exactly match the observations bitwise have a fixed probability (2^-n, where n is their length), and those that are off by even one bit are discarded. There is no updating of probabilities, because hypotheses are always right or wrong. There is no concept left of a hypothesis that predicts the position of an electron using a probability distribution, and there is no room for a hypothesis like "the coin will fall heads 75% of the time, tails 25% of the time".
That's a problem both for the article itself (why even speak of Bayes' theorem if, in the end, it doesn't matter?) and for finding the "truth" about the universe in a quantum world (even if you accept MWI, what you actually observe in the world you end up in will still be random).
I understand that going into the full details of how to handle fuzzy hypotheses (such as algorithms that output not just one result, but several results and the probability of each) would make the article even longer, but it would still be a good thing to address those issues somewhere, IMHO.
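To make the objection concrete, here is a minimal sketch of the deterministic picture described above: each hypothesis gets weight 2^-n, and any hypothesis that mismatches the data by even one bit is dropped. The hypothesis names, their lengths in bits, and the function `posterior_deterministic` are made up for illustration; real Solomonoff induction ranges over all programs and is not computable.

```python
from fractions import Fraction

def posterior_deterministic(hypotheses, observed):
    """Weight each surviving hypothesis by 2^-length, then normalize.

    hypotheses: list of (name, length_in_bits, predict), where predict(n)
    returns the first n bits the hypothesis outputs. The lengths are
    illustrative assumptions, not real program lengths.
    """
    weights = {}
    for name, length, predict in hypotheses:
        # A hypothesis survives only if it reproduces the data exactly.
        if predict(len(observed)) == observed:
            weights[name] = Fraction(1, 2 ** length)
    total = sum(weights.values())
    if total == 0:
        return {}  # every hypothesis in this toy set was discarded
    return {name: w / total for name, w in weights.items()}

# Toy hypothesis set (hypothetical names and lengths).
hypotheses = [
    ("all_ones", 3, lambda n: "1" * n),
    ("all_zeros", 3, lambda n: "0" * n),
    ("alternating", 5, lambda n: ("10" * n)[:n]),
]

print(posterior_deterministic(hypotheses, "1"))
print(posterior_deterministic(hypotheses, "1010"))
```

Note that there is no gradual shift of belief here: a hypothesis either keeps its full 2^-n weight or loses all of it, which is the behavior the comment is pointing at.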
Kilobug wrote "There is no updating of probabilities, because hypotheses are always right or wrong". Do not forget that any wrong hypothesis can be made right by adding some error-correcting instructions to it. That only makes the program longer and thus lowers its probability. But it seems intuitive that the more error correction a theory needs, the less probable it is.
Kilobug wrote "there is no room for a hypothesis like 'the coin will fall heads 75% of the time, tails 25% of the time'". There is room both for a hypothesis that predicts the coin will always fall heads, and this hypothesis has a probability of 75%, and for a hypothesis that predicts the coin will always fall tails, and this hypothesis has a probability of 25%.
Say the coin falls HHTHTHH. Then the hypothesis that it will always fall heads has probability 0%, because they weren't all heads. Same for the hypothesis that it will always fall tails.
It seems like it would be pretty easy, though, to extend Solomonoff induction to have each hypothesis be an algorithm that outputs a probability distribution, and then update for each bit of evidence with Bayes's theorem. If we did that for this example, I think the hypothesis that generated the 25%:75% probability distribution would eventually win out.
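A rough sketch of that extension, under stated assumptions: each hypothesis is reduced to a single number, its probability of heads, and its prior is proportional to 2^-length. The hypothesis names, the lengths in bits, and the helper `posterior` are invented for illustration and are not taken from the article.

```python
from fractions import Fraction

# Hypothetical hypotheses: name -> (length in bits, P(heads)).
# The lengths are made up; a biased coin is assumed to take more bits
# to describe than a constant or fair one.
HYPOTHESES = {
    "always_heads": (3, Fraction(1)),
    "always_tails": (3, Fraction(0)),
    "fair": (4, Fraction(1, 2)),
    "heads_75": (6, Fraction(3, 4)),
}

def posterior(hypotheses, flips):
    """Apply Bayes' theorem once per observed flip ('H' or 'T')."""
    # Prior proportional to 2^-length, normalized over this toy set.
    raw = {h: Fraction(1, 2 ** length) for h, (length, _) in hypotheses.items()}
    total = sum(raw.values())
    belief = {h: w / total for h, w in raw.items()}
    for flip in flips:
        # Likelihood of this flip under each hypothesis.
        unnorm = {}
        for h, (_, p_heads) in hypotheses.items():
            likelihood = p_heads if flip == "H" else 1 - p_heads
            unnorm[h] = belief[h] * likelihood
        total = sum(unnorm.values())
        belief = {h: w / total for h, w in unnorm.items()}
    return belief

short = posterior(HYPOTHESES, "HHTHTHH")
longer = posterior(HYPOTHESES, "HHHT" * 10)
print({h: float(p) for h, p in short.items()})
print(max(longer, key=longer.get))
```

With these made-up lengths, the two "always" hypotheses drop to zero on HHTHTHH, while the shorter fair-coin hypothesis is still ahead of the 75% one after only seven flips. On the longer run of 30 heads and 10 tails the 75% hypothesis overtakes it, which matches the "eventually win out" intuition: the length penalty is paid once, while the likelihood advantage compounds with every flip.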