
Qiaochu_Yuan comments on Open thread, July 29-August 4, 2013 - Less Wrong Discussion

3 Post author: David_Gerard 29 July 2013 10:26PM




Comment author: Transfuturist 30 July 2013 02:35:17AM *  5 points [-]

I believe I've encountered a problem with either Solomonoff induction or my understanding of Solomonoff induction. I can't post about it in Discussion, as I have less than 20 karma, and the stupid questions thread is very full (I'm not even sure if it would belong there).

I've read about SI repeatedly over the last year or so, and I think I have a fairly good understanding of it. Good enough, at least, to follow along with informal reasoning about it. Recently I was reading Rathmanner and Hutter's paper, and Legg's paper, due to renewed interest in AIXI as the theoretical "best intelligence," and the Arcade Learning Environment used to test the computable Monte Carlo AIXI approximation. Then this problem came to me.

Solomonoff Induction measures the complexity of a hypothesis by the length of the shortest Turing machine that outputs a given bitstring. I saw this as a problem. Say AIXI was reasoning about a fair coin. It would guess before each flip whether it would come up heads or tails. *Because Turing machines are deterministic, AIXI cannot make hypotheses involving randomness.* To model the fair coin, AIXI would come up with increasingly convoluted Turing machines, attempting to compress a bitstring that approaches Kolmogorov randomness as its length approaches infinity. Meanwhile, AIXI would be punished and rewarded randomly. This is not a satisfactory conclusion for a theoretical "best intelligence." So is the italicized statement a valid issue? An AI that can't delay reasoning about a problem by at least labeling it "sufficiently random, solve later" doesn't seem like a good AI, particularly in the real world where chance plays a significant part.
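To make the incompressibility point concrete, here is a rough illustration using zlib compression as a computable stand-in for Kolmogorov complexity (the true shortest-program length is uncomputable; compressed size only upper-bounds it, up to a constant):

```python
import os
import zlib

# A highly regular bitstring compresses to a short description,
# analogous to a small Turing machine that generates it.
patterned = b"01" * 500

# Fair-coin output is incompressible with overwhelming probability:
# no short deterministic program reproduces it, so the compressed
# size stays near the raw size.
random_bits = os.urandom(125)  # 1000 random bits

print(len(patterned), "->", len(zlib.compress(patterned)))       # shrinks a lot
print(len(random_bits), "->", len(zlib.compress(random_bits)))   # barely shrinks, if at all
```

This is only a proxy, but it shows the asymmetry the comment describes: deterministic description-length methods reward structure and can do nothing useful with a Kolmogorov-random sequence except store it verbatim.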

Naturally, Eliezer has already thought of this, and wrote about it in Occam's Razor:

The formalism of Solomonoff Induction measures the "complexity of a description" by the length of the shortest computer program which produces that description as an output. To talk about the "shortest computer program" that does something, you need to specify a space of computer programs, which requires a language and interpreter. Solomonoff Induction uses Turing machines, or rather, bitstrings that specify Turing machines. What if you don't like Turing machines? Then there's only a constant complexity penalty to design your own Universal Turing Machine that interprets whatever code you give it in whatever programming language you like. Different inductive formalisms are penalized by a worst-case constant factor relative to each other, corresponding to the size of a universal interpreter for that formalism.

In the better (IMHO) versions of Solomonoff Induction, the computer program does not produce a deterministic prediction, but assigns probabilities to strings. For example, we could write a program to explain a fair coin by writing a program that assigns equal probabilities to all 2^N strings of length N. This is Solomonoff Induction's approach to fitting the observed data. The higher the probability a program assigns to the observed data, the better that program fits the data. And probabilities must sum to 1, so for a program to better "fit" one possibility, it must steal probability mass from some other possibility which will then "fit" much more poorly. There is no superfair coin that assigns 100% probability to heads and 100% probability to tails.
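The "probability mass is conserved" point from the quote can be sketched numerically. Below, a "program" is just a probability assignment over length-N strings: the fair-coin program assigns 2^-N to every string, while a head-biased program (parameters hypothetical) fits head-heavy data better precisely by fitting balanced data worse:

```python
import math

def fair_coin_logprob(bits):
    # The fair-coin program assigns probability 2^-N to every
    # length-N string: each bit costs exactly log2(2) = 1 bit.
    return -float(len(bits))

def biased_logprob(bits, p_heads=0.9):
    # A program that bets heavily on heads must "steal" probability
    # mass from tail-heavy strings to give it to head-heavy ones.
    return sum(math.log2(p_heads if b == "1" else 1.0 - p_heads)
               for b in bits)

even = "0101010101"   # 5 heads, 5 tails
heady = "1111111110"  # 9 heads, 1 tail

print(fair_coin_logprob(even), biased_logprob(even))    # fair program fits better
print(fair_coin_logprob(heady), biased_logprob(heady))  # biased program fits better
```

Neither program can win on both strings; there is no assignment that gives 100% to heads and 100% to tails, which is exactly the "no superfair coin" remark above.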

Does this warrant further discussion, if only to validate or refute this claim? I don't think Eliezer's proposal for a version of SI that assigns probabilities to strings is strong enough; it doesn't describe what form the hypotheses would take. Would hypotheses in this new description be universal nondeterministic Turing machines, with the aforementioned probability distribution summed over the nondeterministic outputs?

Comment author: Qiaochu_Yuan 30 July 2013 04:14:36AM *  6 points [-]

Hypotheses in this description are probabilistic Turing machines. These can be cashed out to programs in a probabilistic programming language.

I think it's going too far to call this a "problem with Solomonoff induction." Solomonoff induction makes no claims; it's just a tool that you can use or not. Solomonoff induction as a mathematical construct should be cleanly separated from the claim that AIXI is the "best intelligence," which is wrong for several reasons.

Comment author: Transfuturist 30 July 2013 04:21:40AM *  1 point [-]

Can probabilistic Turing machines be considered a generalization of deterministic Turing machines, so that DTMs can be described in terms of PTMs?

Editing in reply to your edit: I thought Solomonoff Induction was made for a purpose. Quoting from Legg's paper:

Solomonoff's induction method is an attempt to design a general all purpose inductive inference system. Ideally such a system would be able to accurately learn any meaningful hypothesis from a bare minimum of appropriately formatted information.

I'm just pointing out what I see as a limitation in the domain of problems classical Solomonoff Induction can successfully model.

Comment author: Qiaochu_Yuan 30 July 2013 05:14:55AM 0 points [-]

Can probabilistic Turing machines be considered a generalization of deterministic Turing machines, so that DTMs can be described in terms of PTMs?

Yes.
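To expand on the "Yes": a DTM is the degenerate case of a PTM whose transition probabilities are all 0 or 1, i.e. a probabilistic program that never consults its randomness source. A minimal sketch (function names are hypothetical, for illustration only):

```python
import random

def probabilistic_program(rng, n):
    # A probabilistic program: samples n fair-coin bits from its
    # randomness source, so its output varies run to run.
    return "".join(str(rng.randint(0, 1)) for _ in range(n))

def deterministic_program(rng, n):
    # Same interface, but the randomness source is ignored entirely:
    # every "choice" has probability 1, so the output is fixed.
    # This is how a DTM embeds into the space of PTMs.
    return "01" * (n // 2)

rng = random.Random(0)
print(probabilistic_program(rng, 8))   # depends on the seed
print(deterministic_program(rng, 8))   # always "01010101"
```

Since deterministic machines sit inside the probabilistic ones as a special case, moving to probabilistic hypotheses loses nothing from classical Solomonoff induction while gaining the ability to model the fair coin directly.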

I'm just pointing out what I see as a limitation in the domain of problems classical Solomonoff Induction can successfully model.

I don't think anyone claims that this limitation doesn't exist (and anyone who claims this is wrong). But if your concern is with actual coins in the real world, I suppose the hope is that AIXI would eventually learn enough about physics to just correctly predict the outcome of coin flips.

Comment author: JoshuaZ 30 July 2013 12:50:52PM 6 points [-]

The steelman is to replace coin flips with radioactive decay and then go through with the argument.

Comment author: Pfft 30 July 2013 05:05:50AM 0 points [-]

Yes.