
Of course, I never wrote the “important” story, the sequel about the first amplified human. Once I tried something similar. John Campbell’s letter of rejection began: “Sorry—you can’t write this story. Neither can anyone else.” ... “Bookworm, Run!” and its lesson were important to me. Here I had tried a straightforward extrapolation of technology, and found myself precipitated over an abyss. It’s a problem writers face every time we consider the creation of intelligences greater than our own. When this happens, human history will have reached a kind of singularity—a place where extrapolation breaks down and new models must be applied—and the world will pass beyond our understanding. -- Vernor Vinge, True Names and other Dangers, p. 47.

Vingean unpredictability is the epistemic state we occupy with respect to a consequentialist intelligence that we believe is smarter than us in a domain: we can't predict exactly what such an agent will do, because if we could predict that, we would be that smart ourselves (Vinge's Law).

For example, if you can predict exactly what action Deep Blue will take on a chessboard, you can play as well as Deep Blue by just making whatever chess move you predict Deep Blue would make in your shoes. Thus, Deep Blue's programmers necessarily sacrificed their ability to predict Deep Blue's exact moves in advance using their own intelligence, in the course of creating a superhuman chessplayer.

But this doesn't mean Deep Blue's programmers were confused about the abstract criterion by which Deep Blue chose actions, nor that they couldn't predict in advance that Deep Blue would try to win rather than lose chess games. This is why we should reject the claim that we are epistemically helpless and can know absolutely nothing about beings smarter than us: we can predict the consequences of Deep Blue's moves much better than we can predict the moves themselves.

Furthermore, our ability to think about agents smarter than us isn't limited to predicting goal achievement when we know the goal. If we found a giant alien machine that seemed very well-designed, we might be able to infer that the aliens were superhumanly intelligent even if we didn't know the aliens' ultimate goals. This is because properties like harvesting energy, distributing energy efficiently, being hard enough to maintain shape, etcetera, are recognizable as convergent instrumental strategies; if we can recognize that an alien machine is efficiently harvesting and distributing energy, we might recognize it as an intelligently designed artifact in the service of some goal even if we don't know the goal.

Noncontainment of belief within the action probabilities

A key aspect of reasoning under Vingean uncertainty is that, due to our lack of logical omniscience, our beliefs about the agent are not contained in our probability distribution over the agent's actions.

Suppose that on each turn of a chess game playing against Deep Blue, I ask you to put a probability distribution on Deep Blue's possible chess moves. If you are a rational agent you should be able to put a well-calibrated probability distribution on these moves - most trivially, by assigning every legal move an equal probability (if Deep Blue has 20 legal moves, and you assign each move 5% probability, you are guaranteed to be well-calibrated).
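
As a minimal check of that calibration guarantee, here is a Python sketch. The skewed hidden policy below is just an arbitrary stand-in for however Deep Blue actually chooses; the point is that the 5% forecasts come out calibrated no matter what that hidden policy is:

```python
import random

def calibration_of_uniform_forecast(n_positions=10_000, n_moves=20, seed=0):
    """Forecast 1/n_moves for every legal move in every position, then measure
    how often the events forecast at that probability actually occur."""
    rng = random.Random(seed)
    hits, forecasts = 0, 0
    for _ in range(n_positions):
        # Some hidden, highly skewed "true" policy standing in for Deep Blue.
        weights = [rng.random() ** 3 for _ in range(n_moves)]
        chosen = rng.choices(range(n_moves), weights=weights)[0]
        for move in range(n_moves):   # each move was forecast at 1/n_moves = 5%
            forecasts += 1
            hits += (move == chosen)
    return hits / forecasts

print(calibration_of_uniform_forecast())  # 0.05: the 5% forecasts are calibrated
```

Exactly one of the twenty forecasted moves occurs in each position, so the empirical frequency of the 5%-forecast events is 5% by construction, regardless of how the hidden policy concentrates its probability.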

Now imagine a randomized game player, RandomBlue, that on each round draws its move randomly from the probability distribution you'd assign to Deep Blue's move in the same chess position. On every turn, your belief about where you'll observe RandomBlue move is identical to your belief about where you'd see Deep Blue move. But your belief about the probable end of the game is very different (a divergence that is only possible because of your lack of logical omniscience).
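
A sketch of this setup in Python, where `legal_moves` is a hypothetical helper and `your_forecast` stands in for your subjective distribution over Deep Blue's next move; RandomBlue is just a one-step sampler from that forecast:

```python
import random

def your_forecast(position):
    """Your subjective distribution over Deep Blue's next move in this position,
    as {move: probability}. Most trivially, uniform over the legal moves."""
    legal = legal_moves(position)  # hypothetical helper: legal moves in `position`
    return {move: 1 / len(legal) for move in legal}

def random_blue(position, rng=random):
    """RandomBlue: on each turn, sample a move from *your* forecast of Deep Blue."""
    forecast = your_forecast(position)
    moves, probs = zip(*forecast.items())
    return rng.choices(moves, weights=probs)[0]

# By construction, your predictive distribution over RandomBlue's next move is
# your_forecast(position) -- identical to your forecast for Deep Blue -- yet your
# expectation about how the game ends differs sharply between the two players.
```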

In particular, we could draw the following contrast between your reasoning about Deep Blue and your reasoning about RandomBlue:

  • When you see Deep Blue make a move to which you assigned a low probability, you think the rest of the game will go worse for you than you expected (that is, Deep Blue will do better than you previously expected).
  • When you see RandomBlue make a move that you assigned a low probability Deep Blue would make in that position, you expect to beat RandomBlue sooner than you previously expected (things will go worse for RandomBlue than you previously expected).

This reflects our belief in something like the instrumental efficiency of Deep Blue. When we estimate the probability that Deep Blue makes a move $m_i$, we're estimating the probability that, as Deep Blue estimated each move $m_j$'s expected probability of winning $w_j$, Deep Blue found $w_i > w_j$ for all $j \neq i$ (neglecting the possibility of exact ties, which is unlikely with deep searches and floating-point position-value estimates). If Deep Blue picks $m_i$ instead of $m_j$, we know that Deep Blue estimated $w_i > w_j$, and in particular that Deep Blue estimated $w_i > w_k$ for the move $m_k$ we thought was best. This could be because the expected worth $w_k$ of $m_k$ to Deep Blue was less than we expected, but for the low-probability move $m_i$ to be better than all other moves as well implies that $m_i$ had an unexpectedly high value relative to our own estimates. Thus, when Deep Blue makes a very unexpected move, we mostly expect that it saw an unexpectedly good move that was better than what we thought was the best available move by some unknown margin.

In contrast, when RandomBlue makes an unexpected move, we think the random number generator happened to land on a move to which we had justly assigned low worth, and hence we expect to defeat RandomBlue faster than we otherwise would have.
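
The direction of the Deep Blue update can be illustrated with a toy Monte Carlo sketch (an assumption-laden illustration, not anything from the original argument): suppose Deep Blue's true move values are your estimates plus Gaussian noise, and condition on a low-rated move being Deep Blue's argmax.

```python
import numpy as np

def posterior_value_given_argmax(prior_means, picked, noise_sd=1.0,
                                 n_samples=200_000, seed=0):
    """Toy model: Deep Blue's move values w ~ Normal(prior_means, noise_sd).
    Condition on Deep Blue picking move `picked` (i.e. w[picked] is the max)
    and return the posterior mean of w[picked]."""
    rng = np.random.default_rng(seed)
    prior_means = np.asarray(prior_means, dtype=float)
    w = rng.normal(prior_means, noise_sd, size=(n_samples, len(prior_means)))
    chose_picked = w.argmax(axis=1) == picked
    return w[chose_picked, picked].mean()

# Your estimates: move 0 looks best, move 3 looks clearly worse.
means = [0.6, 0.5, 0.5, 0.2]
print("prior mean of move 3:", means[3])
print("posterior mean given Deep Blue picked move 3:",
      posterior_value_given_argmax(means, picked=3))
# The posterior mean is well above 0.2: seeing the surprising move is evidence
# that it was better than you thought. For RandomBlue, observing the same move
# licenses no such update -- the dice simply landed on a low-worth move.
```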

Features of Vingean reasoning

Some key features of reasoning under Vingean uncertainty are as follows:

  • We may find ourselves much more confident of the predicted consequences of an action than of the predicted action.
  • We may be more sure about the agent's instrumental strategies than its goals.
  • Due to our lack of logical omniscience, our beliefs about the system's relation to the environment are not contained strictly within our probability distribution over its probable next actions. In particular, we update on the probable consequence of an action, and on the relative value of the probable consequences of other actions, after observing that the agent actually outputs the action.
  • If there is a compact way to describe the startling consequences of the agent's actions, we might try to infer that this is a goal of the agent and infer similar consequences in the future, even without being able to predict the agent's specific actions.

Our expectation of Vingean unpredictability in a domain may break down if the domain is extremely simple and sufficiently closed.
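
For instance, tic-tac-toe is simple and closed enough to be solved exhaustively, so an optimal player's exact move can be computed in advance by running the same search ourselves. A minimal Python sketch (the tie-breaking rule is an arbitrary choice):

```python
from functools import lru_cache

LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game value (+1 win, 0 draw, -1 loss) for `player` to move, under optimal play."""
    w = winner(board)
    if w is not None:
        return 1 if w == player else -1
    if ' ' not in board:
        return 0
    opponent = 'o' if player == 'x' else 'x'
    return max(-value(board[:i] + player + board[i+1:], opponent)
               for i, cell in enumerate(board) if cell == ' ')

def optimal_move(board, player):
    """The move an optimal player makes: fully predictable given a tie-break rule."""
    opponent = 'o' if player == 'x' else 'x'
    moves = [i for i, cell in enumerate(board) if cell == ' ']
    return max(moves, key=lambda i: -value(board[:i] + player + board[i+1:], opponent))

print(optimal_move(' ' * 9, 'x'))  # the "optimal player's" exact opening move
```

Because the whole game tree fits in memory, there is no gap between our intelligence and the player's within this domain: we predict its exact moves, not just the consequences of its moves.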

Cognitive uncontainability

Vingean unpredictability is one of the core reasons to expect cognitive uncontainability in sufficiently intelligent agents.

Vingean reflection

Vingean unpredictability implies that when an agent considers constructing a future version of itself, or constructing another agent in its environment, its approval of that future version or subagent can't be conditioned on knowing that agent's exact policies or actions. Just as Deep Blue's programmers decided to run Deep Blue without knowing Deep Blue's exact moves against Kasparov, while nonetheless having arrived at strong abstract beliefs about what Deep Blue was 'trying' to achieve, an agent's self-modification must be predicated on beliefs about what the future agent will be trying to achieve (and how it lawfully updates its beliefs, etcetera) rather than on knowing exactly what the future agent will do. This forms the core premise of Vingean reflection.