Qiaochu_Yuan comments on Open thread, July 29-August 4, 2013 - Less Wrong Discussion
I believe I've encountered a problem with either Solomonoff induction or my understanding of Solomonoff induction. I can't post about it in Discussion, as I have less than 20 karma, and the stupid questions thread is very full (I'm not even sure if it would belong there).
I've read about SI repeatedly over the last year or so, and I think I have a fairly good understanding of it, good enough at least to follow informal reasoning about it. Recently I was reading Rathmanner and Hutter's paper, and Legg's paper, due to renewed interest in AIXI as the theoretical "best intelligence," and the Arcade Learning Environment used to test the computable Monte Carlo AIXI approximation. Then this problem came to me.
Solomonoff Induction uses the size of the description of the smallest Turing machine that outputs a given bitstring. I saw this as a problem. Say AIXI was reasoning about a fair coin. Before each flip, it would guess whether the coin would come up heads or tails. Because Turing machines are deterministic, AIXI cannot make hypotheses involving randomness. To model the fair coin, AIXI would come up with increasingly convoluted Turing machines, attempting to compress a bitstring that approaches Kolmogorov randomness as its length approaches infinity. Meanwhile, AIXI would be punished and rewarded randomly. This is not a satisfactory conclusion for a theoretical "best intelligence." So is the italicized statement a valid issue? An AI that can't delay reasoning about a problem by at least labeling it "sufficiently random, solve later" doesn't seem like a good AI, particularly in the real world where chance plays a significant part.
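To make the "incompressible bitstring" point concrete, here is a rough sketch in Python. It uses zlib's compressed size as a crude, computable stand-in for description length; that substitution is my assumption, since Kolmogorov complexity itself is uncomputable. Note also that the "coin flips" come from a seeded pseudo-random generator, so strictly speaking they do have a short description (the generator plus its seed). zlib just can't find it, and that is all the sketch is meant to show.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Bytes needed by zlib at max compression; a crude proxy for description length."""
    return len(zlib.compress(data, 9))


rng = random.Random(0)  # fixed seed so the run is repeatable
n_bytes = 10_000

# 80,000 pseudo-random "coin flips", packed eight to a byte.
coin_flips = bytes(rng.getrandbits(8) for _ in range(n_bytes))

# 80,000 strictly alternating flips (01010101...), packed the same way.
pattern = bytes([0b01010101]) * n_bytes

random_size = compressed_size(coin_flips)
pattern_size = compressed_size(pattern)

print("random flips :", random_size, "bytes after compression")
print("alternating  :", pattern_size, "bytes after compression")
```

The alternating sequence shrinks to a handful of bytes, while the random-looking one stays at roughly its original size. A deterministic hypothesis for the latter can't be much shorter than the data itself.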
Naturally, Eliezer has already thought of this, and wrote about it in Occam's Razor:
Does this warrant further discussion, if only to validate or refute this claim? I don't think Eliezer's proposal for a version of SI that assigns probabilities to strings is strong enough, since it doesn't describe what form the hypotheses would take. Would hypotheses in this new description be universal nondeterministic Turing machines, with the aforementioned probability distribution summed over the nondeterministic outputs?
Hypotheses in this description are probabilistic Turing machines. These can be cashed out to programs in a probabilistic programming language.
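As a toy illustration of what a mixture over probabilistic hypotheses looks like, here is a minimal sketch in Python. Everything specific in it is made up for the example: the four hypotheses, their names, and their "description lengths" in bits are hypothetical, and a real Solomonoff-style mixture ranges over all programs rather than a hand-picked list. The only thing it is meant to show is that once "outputs 1 with probability 1/2" is an allowed hypothesis, a 2^-length prior plus Bayes' rule settles on it quickly.

```python
# Each hypothetical hypothesis: (probability of emitting a 1, made-up description length in bits).
HYPOTHESES = {
    "always_heads": (1.0, 2),
    "always_tails": (0.0, 2),
    "fair_coin": (0.5, 3),
    "biased_0.9": (0.9, 5),
}


def posterior(bits, hypotheses):
    """Posterior over hypotheses given observed bits, using a 2^-length prior."""
    weights = {}
    for name, (p_one, length) in hypotheses.items():
        weight = 2.0 ** -length  # prior: shorter descriptions get more mass
        for b in bits:
            weight *= p_one if b == 1 else 1.0 - p_one  # likelihood of each bit
        weights[name] = weight
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def predict_one(bits, hypotheses):
    """Mixture probability that the next bit is 1."""
    post = posterior(bits, hypotheses)
    return sum(post[name] * hypotheses[name][0] for name in hypotheses)


# A fixed 20-flip sequence with ten heads (1) and ten tails (0).
flips = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1]

post = posterior(flips, HYPOTHESES)
next_p = predict_one(flips, HYPOTHESES)

for name, prob in post.items():
    print(f"{name:13s} {prob:.6f}")
print("P(next flip is heads) =", round(next_p, 4))
```

The two deterministic hypotheses are ruled out as soon as both outcomes have been seen, and nearly all the remaining mass lands on the fair coin, so the mixture predicts heads with probability close to 1/2 instead of hunting for a pattern.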
I think it's going too far to call this a "problem with Solomonoff induction." Solomonoff induction makes no claims; it's just a tool that you can use or not. Solomonoff induction as a mathematical construct should be cleanly separated from the claim that AIXI is the "best intelligence," which is wrong for several reasons.
Can probabilistic Turing machines be considered a generalization of deterministic Turing machines, so that DTMs can be described in terms of PTMs?
Editing in reply to your edit: I thought Solomonoff Induction was made for a purpose. Quoting from Legg's paper:
I'm just pointing out what I see as a limitation in the domain of problems classical Solomonoff Induction can successfully model.
Yes.
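One way to see it, sketched in Python under a simplifying assumption: treat a probabilistic machine as an ordinary deterministic program that also gets to read from a tape of fair random bits. A deterministic machine is then just the special case that never looks at that tape. The two programs below and the helper `output_distribution` are hypothetical names for this example, and the programs are plain Python functions rather than actual Turing machines.

```python
from fractions import Fraction
from itertools import product


def deterministic_program(random_bits):
    """Ignores the random tape entirely, so it behaves like a deterministic machine."""
    return "0101"


def coin_program(random_bits):
    """Copies four bits from the random tape to its output."""
    return "".join(str(next(random_bits)) for _ in range(4))


def output_distribution(program, n_random_bits):
    """Exact output distribution, enumerating every random tape of the given length."""
    dist = {}
    tape_prob = Fraction(1, 2 ** n_random_bits)  # each tape is equally likely
    for tape in product([0, 1], repeat=n_random_bits):
        out = program(iter(tape))
        dist[out] = dist.get(out, Fraction(0)) + tape_prob
    return dist


det_dist = output_distribution(deterministic_program, 4)
coin_dist = output_distribution(coin_program, 4)

print("deterministic:", det_dist)
print("coin: ", len(coin_dist), "outputs, each with probability", coin_dist["0000"])
```

The deterministic program puts probability 1 on a single string, while the coin program spreads its mass evenly over all sixteen 4-bit strings. Both fit in the same framework.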
I don't think anyone claims that this limitation doesn't exist (and anyone who claims this is wrong). But if your concern is with actual coins in the real world, I suppose the hope is that AIXI would eventually learn enough about physics to just correctly predict the outcome of coin flips.
The steelman is to replace coin flips with radioactive decay and then go through with the argument.
Yes.