The idiot savant AI isn't an idiot

Stuart_Armstrong

A stub on a point that's come up recently.

If I owned a paperclip factory, and casually told my foreman to improve efficiency while I'm away, and he planned a takeover of the country, aiming to devote its entire economy to paperclip manufacturing (apart from the armament factories he needed to invade neighbouring countries and steal their iron mines)... then I'd conclude that my foreman was an idiot (or being wilfully idiotic). He obviously had no idea what I meant. And if he misunderstood me so egregiously, he's certainly not a threat: he's unlikely to reason his way out of a paper bag, let alone to any position of power.

If I owned a paperclip factory, and casually programmed my superintelligent AI to improve efficiency while I'm away, and it planned a takeover of the country... then I can't conclude that the AI is an idiot. It is following its programming. Unlike a human that behaved the same way, it probably knows exactly what I meant to program in. It just doesn't care: it follows its programming, not its knowledge about what its programming is "meant" to be (unless we've successfully programmed in "do what I mean", which is basically the whole of the challenge). We can't therefore conclude that it's incompetent, unable to understand human reasoning, or likely to fail.

We can't reason by analogy with humans. When AIs behave like idiot savants with respect to their motivations, we can't deduce that they're idiots.


What I'm saying is a bit different from CEV: it would involve modelling only a single human's preferences, and would involve modelling their brain only in the short term (which would be a lot easier). Human beings have at least reasonable judgement about things such as, say, a paperclip factory, to the point where a human calling the shots will have no consequences that are too severe.

Would a human be bound to "at least reasonable judgement" if given superintelligent ability?

-2 ikrase 13y

That's sort of similar to what I keep talking about with 'obedient AI'.

3 Stuart_Armstrong 13y

Specifying that kind of thing (including specifying preference) is probably almost as hard as getting the AI's motivations right in the first place. Though Paul Christiano had some suggestions along those lines, which (in my opinion) needed uploads (human minds instantiated in a computer) to have a hope of working...
