
jacob_cannell13y20

its computed utilities will then be random samples from the utility function over the space of all programs, and should then converge to the mean of the utility function by the central limit theorem.

Well the mean of the utility function is just the expected utility.

There are a number of utility terms in the AIXI equation. The utility function is evaluated for every hypothesis/program/universe, forward evaluated over all future action paths, giving one best utility for just that universe. The total expected utility is then the sum over all valid universes, each weighted by its complexity penalty.
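As a rough sketch, that description can be written out as below. This is only a transcription of the sentence above, not the full AIXI expectimax equation, and the symbols are chosen here for illustration: P ranges over universe programs, O is the observation history, a ranges over future action paths, and U(P, a) is the utility of path a in universe P.

```latex
EU(O) \;=\; \sum_{P \,:\, P \text{ matches } O} 2^{-\mathrm{length}(P)} \, \max_{a} U(P, a)
```

With an empty O the sum runs over every program, which is the unfiltered case discussed further down.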

By 'mean of the utility function', I meant the mean of the utility function over all possible universes rather than just valid universes. The validity constraint forces the expected utility to diverge from the mean of the utility function - it must for the agent to make any useful decisions!

So the total expected utility is not normally the mean utility, but it reduces to it in the case where the observation filter is removed.

The way I'm approaching this is to ask whether most of the expected utility comes from high probability events or low probability ones

My entire post concerns the subset of universes with probabilities approaching 1/infinity, corresponding to programs with length going to infinity. The high probability scenarios (shorter program universes) don't matter in mugger scenarios: we categorically assume they all have boring, extremely low utilities (the mugger is joking/lying/crazy).

Your observations have some probability P(T|N) to retain a hypothesis of length N. I don't see why this would depend that strongly on the value of N.

In AIXI-models, hypothesis acceptance is not probabilistic; it is completely binary: a universe program either perfectly fits the observation history or it does not. If even 1 bit is off, the program is ignored.

It's unfortunate I started using N for program length in my prior post; that was a mistake. L was the term for program length in the EU equation. L (program length) matters because of the Solomonoff prior complexity penalty: 2^-L.

How did you get the number 2^-(L - length(O)) as a limit on the amount of the hypothesis space that is filtered (and do you mean retained by the filter or removed by the filter when you say 'filtered')?

This simply comes from the fact that an observation history O can filter out at most a fraction of the space of programs that are longer than it.

For example, start with an empty observation history O: {}. Clearly, this filters nothing. The space of valid programs of length L, for any L, is simply all possible programs of length L, which is expected to be a set of around 2^L in size. The sum over all programs for L going to infinity is thus the space of everything, the full Tegmark. In this case, the expected utility is simply the mean of the utility function over the full Tegmark.

Now consider O:{1}. We have cut out exactly half of the program space. O:{11} cuts out 3/4 of the Tegmark, and in general an observation history with length(O) filters the universe space down to 2^-length(O) of its previous size, removing a fraction 1 - 2^-length(O) of the possible universes. But there are an infinite number of total universes.
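A toy check of that halving, under a strong simplifying assumption: each "program" is treated as a raw bitstring whose leading bits are its predicted observations, so matching O just means starting with O. Real programs would have to be run on a universal machine; `surviving_fraction` is a hypothetical helper for illustration only.

```python
from itertools import product

def surviving_fraction(observations: str, length: int) -> float:
    """Fraction of all bitstrings of `length` that start with `observations`.

    Assumption: a program's predicted observations are simply its own
    leading bits, which is the idealization behind the halving argument.
    """
    total = 0
    survivors = 0
    for bits in product("01", repeat=length):
        total += 1
        if "".join(bits).startswith(observations):
            survivors += 1
    return survivors / total

# Each extra observed bit should halve the surviving fraction.
for obs in ["", "1", "11", "1101"]:
    print(repr(obs), surviving_fraction(obs, 10))
```

For the four histories above, the fractions come out as 1.0, 0.5, 0.25 and 0.0625, i.e. 2^-length(O) each time.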

Now, let's say we are ONLY interested in the contribution of universes of a certain prior likelihood (corresponding to a certain program length). These are the subsets of the Tegmark with programs P where length(P) = L for some L. This is a FINITE, enumerable set.

Then for JUST the subset of universes with length(P)=L, there are 2^L universes in this set. For an observation history O with length(O) > L, it is not guaranteed that there are any valid programs that match the observation history. It could be 1, could be 0.

However, for length(P) > length(O) + C, for some small C, valid programs are absolutely guaranteed. Specifically, for some constant C there are programs which simply directly encode random strings that happen to align with O. This set of programs corresponds to 'chaos'.

Now consider the limit behavior as complexity goes to infinity. For any fixed observation history with length(O), as length(P) goes to infinity, the chaos set grows at the maximum possible rate, as 2^length(P), and dominates (because the chaos programs just fill the extra length with any random bits).

In particular, for observation set O and the subset of universes with length(P)=L, there are expected to be roughly 2^-(length(O)+C) * 2^L observationally valid chaos universes. This simplifies to 2^(L-length(O)-C) valid chaos universes.

So when length(O)+C > L, there are unlikely to be any valid chaos universes. So the expected utility over this subset, EU[L], will be averaged over a small number of universes, possibly even 1 (if there are any at all that match O), or none. But as L grows larger than length(O)+C, the chaos universes suddenly appear (guaranteed) and their number grows exponentially with L, and the expected utility over that exponentially growing set quickly converges to the mean of the utility function (because the chaos universes are random).

Assuming a utility function with positive/negative bounds normalized around zero, the convergence should be to zero.
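The convergence claim can be illustrated with a small simulation. This is only a sketch under loud assumptions that are not in the argument itself: chaos-universe utilities are modeled as independent uniform draws from [-1, 1], the count 2^(L-length(O)-C) is rounded down to 0 when the exponent is negative, and `chaos_count` and `mean_chaos_utility` are hypothetical helper names.

```python
import random

def chaos_count(L: int, obs_len: int, C: int) -> int:
    """Rough count of valid chaos programs of length L: 2^(L - obs_len - C).

    Simplification: returns 0 when L < obs_len + C, where the expected
    count would be below 1.
    """
    exponent = L - obs_len - C
    return 2 ** exponent if exponent >= 0 else 0

def mean_chaos_utility(L: int, obs_len: int, C: int, rng: random.Random):
    """Average utility over the chaos universes of length L, or None if there are none.

    Assumption: each chaos universe gets an independent utility drawn
    uniformly from [-1, 1], i.e. bounded and centered on zero.
    """
    n = chaos_count(L, obs_len, C)
    if n == 0:
        return None
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed so runs are repeatable
for L in (8, 10, 14, 18, 22, 26):
    print(L, chaos_count(L, 8, 2), mean_chaos_utility(L, 8, 2, rng))
```

With length(O) = 8 and C = 2, the set is empty at L = 8, contains a single universe at L = 10, and by L = 26 the average is taken over 65536 draws, so it should sit very close to zero.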


The Generalized Anti-Pascal Principle: Utility Convergence of Infinitesimal Probabilities
