This post is part of the Asymptotic Logical Uncertainty series. In this post, I give a concrete example of how the Solomonoff Induction Inspired Approach fails to converge to the correct probabilities when predicting the output of a simple Turing machine.
Let f:N→{0,1} be a function such that (for large n) f(n) can be computed in time 2^n, but there is no algorithm that can predict f(n) in time less than 2^n consistently better than a coin which outputs 1 with probability 1/3.
Construct a Turing machine E as follows:
E(2k)=f(2k), and
E(2k+1)=f(2k+1) XOR f(10^k).
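To make the construction concrete, here is a minimal Python sketch of E. The function `f` is left as a placeholder stub; any actual f satisfying the assumption above is of course far from trivial, and the stub is only there to fix the interface.

```python
def f(n: int) -> int:
    """Placeholder for the hard-to-predict function f: N -> {0, 1}."""
    raise NotImplementedError("stand-in for the assumed function f")

def E(m: int) -> int:
    if m % 2 == 0:
        return f(m)               # even positions: E(2k) = f(2k)
    k = (m - 1) // 2
    return f(m) ^ f(10 ** k)      # odd positions: E(2k+1) = f(2k+1) XOR f(10^k)
```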
Imagine trying to predict E(k) with an algorithm which takes time at most k^2. If k is even, then E(k) is determined by an event indistinguishable in time k^2 from a 1/3 coin, so the predictor should output a 1 with probability 1/3. If k is odd, then E(k) is the XOR of two bits that are indistinguishable from bits which are independently 1 with probability 1/3, so the predictor should output a 1 with probability 4/9.
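As a quick sanity check on the 4/9 figure: the XOR of two independent bits that are each 1 with probability 1/3 is 1 exactly when the two bits differ.

```python
from fractions import Fraction

p = Fraction(1, 3)                  # each bit is 1 with probability 1/3
p_xor = p * (1 - p) + (1 - p) * p   # XOR = 1 iff exactly one of the two bits is 1
assert p_xor == Fraction(4, 9)
```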
Therefore, let T(k)=k^3 and R(k)=k^2. If SIA worked correctly, the probability that SIA(E,T,R) outputs a 1 on the even bits would converge to 1/3.
However, this is not the case. In particular, the probability that the 10^k-th bit output by SIA(E,T,R) is a 1 will not converge to 1/3, but will instead switch between 4/9 and 5/9.
This is because by the time a predictor needs to predict E(10^k), it will have already output a 1 with probability 4/9 for E(2k+1), and will have had enough time to compute f(2k+1). Therefore, if f(2k+1)=0, then any predictor that does not make its prediction for E(10^k) match its prediction for E(2k+1) will contradict itself. (Similarly, if f(2k+1)=1, the predictor must negate its earlier prediction to remain consistent.)
The machines that will do well when sampled in SIA(E,T,R) will be the ones that output a 1 independently with probability 1/3 for each even bit and with probability 4/9 for each odd bit, EXCEPT when predicting a bit at a location that is a power of 10, in which case they copy (or negate) a previous bit, and so output a 1 with probability either 4/9 or 5/9, both of which are greater than the desired 1/3.
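Here is a small arithmetic sketch of why such machines end up at 4/9 or 5/9 rather than 1/3; the variable names are purely illustrative.

```python
from fractions import Fraction

p_prev_one = Fraction(4, 9)   # probability the machine earlier output a 1 for E(2k+1)

# If f(2k+1) = 0, then E(10^k) = E(2k+1), so a consistent machine copies its old answer.
p_copy = p_prev_one           # = 4/9
# If f(2k+1) = 1, then E(10^k) = 1 - E(2k+1), so a consistent machine negates it.
p_negate = 1 - p_prev_one     # = 5/9

assert (p_copy, p_negate) == (Fraction(4, 9), Fraction(5, 9))
assert Fraction(1, 3) not in (p_copy, p_negate)   # neither matches the correct 1/3
```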
Further, a machine that outputs a 1 with probability 1/3 on E(10^k) for all k will have a likelihood going to 0 relative to the machines which just copy or negate their old answers.
Note that the correct probability with which to believe E(10^k)=1 really is 1/3, not 4/9 or 5/9. The prediction given to E(2k+1) was calculated from a state of ignorance about f(2k+1), and that prediction should have no effect on the prediction for E(10^k).