# roystgnr comments on Is it possible to build a safe oracle AI? - Less Wrong

-1 20 April 2011 12:54PM



Comment author: 21 April 2011 05:56:59AM *  16 points [-]

I never said the box was trying to minimize the variance of the true solution for its own sake, just that it was trying to find an efficient, accurate approximation to the true solution. Since this efficiency typically increases as the variance of the true solution decreases, the possibility of increasing efficiency by manipulating the true solution follows. Surely, no matter how goal-agnostic your oracle is, you're going to try to make it as accurate as possible for a given computational cost, right?

That's just the first failure mode that popped into my mind, and I think it's a good one for any real computing device, but let's try to come up with an example that even applies to oracles with infinite computational capability (and that explains how that manipulation occurs in either case). Here's a slightly more technical but still grossly oversimplified discussion:

Suppose you give me the sequence of real world data y1, y2, y3, y4... and I come up with a superintelligent way to predict y5, so I tell you y5 := x5. You tell me the true y5 later, I use this new data to predict y6 := x6.

But wait! No matter how good my rule xn = f(y1...y{n-1}) was, it's now giving me the wrong answers! Even if y4 was a function of {y1,y2,y3}, the very fact that you're using my prediction x5 to affect the future of the real world means that y5 is now a function of {y1, y2, y3, y4, x5}. Eventually I'm going to notice this, and now I'm going to have to come up with a new, implicit rule for xn = f(y1...y{n-1},xn).
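As a toy sketch of that feedback loop (the dynamics below are invented purely for illustration, not a claim about any real system): suppose the world follows a simple autoregressive rule when no oracle exists, but shifts partway toward whatever the oracle announces once predictions are published. A rule fit to pre-oracle data then mispredicts, and the error grows.

```python
# Hypothetical setup: without an oracle, the series follows y_n = 0.9 * y_{n-1}.
# Once a prediction x is announced, the world also moves by 0.5 * x
# (people act on the announcement). All coefficients are made up.

def world_next(history, prediction):
    # The world reacts to the published prediction.
    return 0.9 * history[-1] + 0.5 * prediction

def naive_predict(history):
    # The rule learned from pre-oracle data: x_n = 0.9 * y_{n-1}.
    return 0.9 * history[-1]

ys = [1.0]
errors = []
for _ in range(5):
    x = naive_predict(ys)   # oracle announces its prediction
    y = world_next(ys, x)   # the world reacts to the announcement
    errors.append(abs(y - x))
    ys.append(y)
# errors is nonzero and strictly increasing: the once-accurate rule
# now systematically undershoots, because y_n depends on x_n.
```

The point is not the particular numbers but the structure: the moment `world_next` takes `prediction` as an argument, any rule of the form f(y1...y{n-1}) alone is fitting the wrong function.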

So now we're not just trying to evaluate an f, we're trying to find fixed points of an f - where in this context "a fixed point" is math lingo for "a self-fulfilling prophecy". And depending on what predictions are called for, that's a very different problem. "What would the stock market be likely to do tomorrow in a world with no oracles?" may give you a much more stable answer than "What is the stock market likely to do tomorrow after everybody hears the announcement of what a super-intelligent AI thinks the stock market is likely to do tomorrow?" "Who would be likely to kill someone tomorrow in a world with no oracles?" will probably result in a much shorter list than "Who is likely to kill someone tomorrow, after the police receive this answer from the oracle and send SWAT to break down their doors?" "What would the probability of WW3 within ten years have been without an oracle?" may have a significantly more pleasant answer than "What would the probability of WW3 within ten years be, given that anyone whom the oracle convinces of a high probability has motivation to react with arms races and/or pre-emptive strikes?"
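The difference between the two kinds of question can be made concrete with another invented toy model (the linear reaction function here is an assumption for illustration only): let g(x) be what the market actually does once the oracle announces prediction x. The "no-oracle" answer is g(0), while an accurate oracle must announce a fixed point x* = g(x*) - the self-fulfilling prophecy - and the two can differ substantially.

```python
# Assumed reaction model: a baseline move of 100, plus the market
# pricing in half of whatever the oracle announces. Purely illustrative.

def g(x):
    return 100.0 + 0.5 * x

# Answer to "what would happen in a world with no oracle?": no announcement.
no_oracle = g(0.0)   # 100.0

# Answer to "what will happen once everyone hears the oracle?":
# iterate x <- g(x) until it converges to the fixed point.
x = 0.0
for _ in range(100):
    x = g(x)
# Converges to x* = 100 / (1 - 0.5) = 200.0, the only announcement
# that remains true after the world reacts to hearing it.
```

With a contraction like this, simple iteration finds the fixed point; with other reaction functions there may be multiple fixed points, or none, which is exactly why "what prediction should the oracle output?" becomes a very different problem from "what is the true answer?".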

Comment author: 21 April 2011 06:30:57AM *  7 points [-]

Ooh, this looks right. A predictor that "notices" itself in the outside world can output predictions that make themselves true, e.g. by stopping us from preventing predicted events, or something even more weird. Thanks!

(At first I thought Solomonoff induction doesn't have this problem, because it's uncomputable and thus cannot include a model of itself. But it seems that a computable approximation to Solomonoff induction may well exhibit such "UDT-ish" behavior, because it's computable.)

Comment author: 23 April 2011 07:48:22PM *  3 points [-]

This idea is probably hard to notice at first, since it requires recognizing that a future with a fixed definition can still be controlled by other things with fixed definitions (you don't need to replace the question in order to control its answer). So even if a "predictor" doesn't "act", it still does determine facts that control other facts, and anything that we'd call intelligent cares about certain facts. For a predictor, this would be the fact that its prediction is accurate, and this fact could conceivably be controlled by its predictions, or even by some internal calculations not visible to its builders. With acausal control, air-tight isolation is more difficult.

Comment deleted 01 December 2011 10:03:56PM [-]