Versions of AIXI can be arbitrarily stupid

Stuart_Armstrong

Many people (including me) had the impression that AIXI was ideally smart. Sure, it was uncomputable, and there might be "up to finite constant" issues (as with anything involving Kolmogorov complexity), but it was, informally at least, "the best intelligent agent out there". This was reinforced by Pareto-optimality results, namely that there was no computable policy that performed at least as well as AIXI in all environments, and strictly better in at least one.

However, Jan Leike and Marcus Hutter have proved that AIXI can be, in some sense, arbitrarily bad. The problem is that AIXI is not fully specified, because the universal prior is not fully specified. It depends on a choice of an initial computing language (or, equivalently, of an initial Turing machine).

For the universal prior, this will only affect it up to a constant (though this constant could be arbitrarily large). However, for the agent AIXI, it could force it into continually bad behaviour that never ends.

For illustration, imagine that there are two possible environments:

The first one is Hell, which will give ε reward if the AIXI outputs "0", but, the first time it outputs "1", the environment will give no reward for ever and ever after that.
The second is Heaven, which gives ε reward for outputting "0" and 1 reward for outputting "1", and is otherwise memoryless.

Now simply choose a language/Turing machine such that the ratio P(Hell)/P(Heaven) is higher than the ratio 1/ε. In that case, for any discount rate, the AIXI will always output "0", and thus will never learn whether it's in Hell or not (because it's too risky to do so). It will observe the environment giving reward ε after outputting "0", behaviour which is compatible with both Heaven and Hell. This keeps P(Hell)/P(Heaven) constant and ensures the AIXI never does anything else.
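To make the arithmetic concrete, here is a minimal Python sketch of the two-environment comparison. It is a toy calculation, not an implementation of AIXI: it assumes that Heaven and Hell carry all of the prior mass, that rewards are discounted geometrically with factor gamma, and that outputting "1" in Hell yields zero reward from that step onward. The function names are made up for illustration.

```python
def value_always_zero(eps, gamma):
    """Discounted return of outputting "0" forever.

    The reward is eps per step in both Heaven and Hell.
    """
    return eps / (1.0 - gamma)


def value_try_one(p_heaven, gamma):
    """Expected discounted return of outputting "1" now, then acting optimally.

    Assumption: in Heaven the agent sees reward 1 and keeps outputting "1";
    in Hell it receives 0 from this step onward.
    """
    return p_heaven * 1.0 / (1.0 - gamma)


def posterior_after_zero(p_hell, p_heaven):
    """Bayes update after outputting "0" and observing reward eps.

    Both environments produce that observation with probability 1,
    so the posterior ratio equals the prior ratio.
    """
    like_hell, like_heaven = 1.0, 1.0
    z = p_hell * like_hell + p_heaven * like_heaven
    return p_hell * like_hell / z, p_heaven * like_heaven / z


def best_action(p_heaven, eps, gamma):
    """Return "0" if never exploring is at least as good as trying "1"."""
    if value_always_zero(eps, gamma) >= value_try_one(p_heaven, gamma):
        return "0"
    return "1"


# Example prior with P(Hell)/P(Heaven) = 20 > 1/eps = 10.
EPS = 0.1
P_HEAVEN = 1.0 / 21.0
P_HELL = 20.0 / 21.0
ACTIONS = [best_action(P_HEAVEN, EPS, g) for g in (0.0, 0.5, 0.9, 0.999)]
```

Both values carry the same factor 1/(1 - gamma), so the comparison reduces to ε versus P(Heaven) and the discount rate drops out. Delaying the first "1" only mixes the two values, so it cannot beat the better of them. Under these assumptions the condition P(Hell)/P(Heaven) > 1/ε forces P(Heaven) < ε, which is why "0" wins at every step.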

In fact, it's worse than this. If you use the prior to measure intelligence, then an AIXI that follows one prior can be arbitrarily stupid with respect to another.

Position 1 or 2 is correct. 3 isn't coherent; what is "reality fluid", and how can some things be more "real" than others? Where do subjective beliefs come from in this model? 4 has nothing to do with probability theory. Values and utility functions don't enter into it. Probability theory is about making predictions and doing statistics, not how much you care about different worlds which may or may not actually exist.

I interpret probability as expectation. I want to make predictions about things. I want to maximize the probability I assign to the correct outcomes. If I multiply together all the predictions I ever made (predictions of the correct outcome, that is), I want that number to be as high as possible. That would be the probability I gave to the world. Or at least to my observations of it.

So then it doesn't really matter what the numbers represent. Just that I want them to be as high as possible. When I make decisions based on the numbers using some decision theory/algorithm and utility function, the higher the numbers are, the better my results will be.

I'm reminded of someone's attempt to explain probability without using words like "likely", "certain" or "frequency", etc. It was basically an impossible task. If I was going to attempt that, I would say something like the previous two paragraphs. Saying things like "weights", "reality fluid", "measure", "possible world", etc, just pushes the meaning elsewhere.

In any case, all of your definitions should be mathematically equivalent. They might have philosophical implications, but they should all produce the same results on any real world problems. Or at least I think they should. You aren't disputing Bayes' theorem or standard probability theory or anything?

In that case the choice of prior should have the same consequences. And you still want to choose the prior that you think will assign the actual outcome the highest probability.
