
Summary: In 1950, the very first paper ever written on computer chess, by Claude Shannon, gave the algorithm that would play perfect chess given unlimited computing power. In reality, computing power was limited, so computers did not play superhuman chess until 47 years later. However, it remains true that if you don't know how to play chess using unlimited computing power, you definitely can't play chess using limited computing power. Not knowing how to do something even with unlimited computing power reveals that you are confused about the structure of the problem - the state of knowledge humanity had in 1836, when Poe carefully argued that no automaton could ever play chess, since at each turn there are many possible moves, whereas machines can only make deterministic motions. Similarly, in modern AI and especially in value alignment theory, there is a sharp divide between problems we know how to solve using unlimited computing power, and problems which are confusing enough that we can't even state the simple program that would solve them given a larger-than-the-universe computer. It is an alarming aspect of the current state of affairs that we know how to build a non-value-aligned hostile AI using unlimited computing power, but not how to build a nice AI using unlimited computing power. The unbounded analysis program in value alignment theory centers on crossing this gap.
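
Concretely, here is a minimal sketch of what such an unbounded solution looks like: exhaustive minimax over the full game tree. It assumes the third-party python-chess package for the rules of the game, and in practice it only finishes for positions very close to the end of the game, which is exactly the point.

```python
import chess  # third-party python-chess package, assumed available


def game_value(board):
    """Exhaustive minimax over the entire game tree: +1 if White can force a win,
    -1 if Black can force a win, 0 if best play from both sides is a draw."""
    if board.is_game_over():
        result = board.result()  # "1-0", "0-1", or "1/2-1/2"
        return 1 if result == "1-0" else -1 if result == "0-1" else 0
    values = []
    for move in list(board.legal_moves):
        board.push(move)
        values.append(game_value(board))
        board.pop()
    return max(values) if board.turn == chess.WHITE else min(values)


def perfect_move(board):
    """Return a move attaining the game-theoretic value for the side to move."""
    best_move, best_value = None, None
    prefer = (lambda a, b: a > b) if board.turn == chess.WHITE else (lambda a, b: a < b)
    for move in list(board.legal_moves):
        board.push(move)
        value = game_value(board)
        board.pop()
        if best_move is None or prefer(value, best_value):
            best_move, best_value = move, value
    return best_move
```

Shannon estimated the game tree this search would have to traverse at roughly 10^120 positions; writing the program down is easy, running it is physically impossible, and in 1950 that was already enough to show the problem was not fundamentally confusing.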

Clickbait: What we do and don't understand how to do using unlimited computing power is a critical distinction and an important frontier.

Todo

  • more detailed history of Shannon and Poe
  • (diagnosis) nirvana fallacy applied to unbounded analyses, past exaggerations leading to widespread disrespect
  • unbounded analysis makes claims precise enough to be critiqued and enables the discourse to progress
  • AIXI is the central example of an unbounded agent, and often also demarcates the boundary between 'straightforward' and 'confusing' problems in modern AI (its defining equation is sketched below this list)
  • bounded-agent assumptions still hold in real life, and analyses should clearly mark where they are being violated
    • in particular we still need to be wary of 'unbounded analyses' that sidestep the central problem
    • in some cases advanced agent properties will predictably, or at least possibly, start to cross some of those lines (a superintelligence is much more likely to find a strategy, even in a large search space)
  • you still don't get to assume that the agent can identify arbitrary nice things unless you know how to write a Python program that, run on a large-enough computer, would output those nice things
    • (this is what blocks the 'unbounded analysis' of "Oh, it's really smart, it'll know what we mean")
  • why we need basic theoretical understanding
    • because it's hard to do FAI otherwise
    • because our FAI concepts need to anchor in basic things that might stay constant under self-modification, not particular programmatic tricks that will melt away like snow in spring
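
As a sketch of what the central unbounded-agent example looks like when written out, here is one standard statement of AIXI (after Hutter): at cycle $k$, having taken actions $a_1 \ldots a_{k-1}$ and received observation-reward pairs $o_1 r_1 \ldots o_{k-1} r_{k-1}$, the agent chooses

$$a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \;\cdots\; \max_{a_m} \sum_{o_m r_m} \big[\, r_k + \cdots + r_m \,\big] \sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

where $U$ is a fixed universal Turing machine, $q$ ranges over environment programs consistent with the whole interaction history, $\ell(q)$ is the length of $q$, and $m$ is the horizon. Every piece of this is precisely specified, but the sum over all programs and the exhaustive expectimax over futures are uncomputably expensive - which is what makes AIXI an unbounded agent rather than a runnable one.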