Evidence for the orthogonality thesis

Stuart_Armstrong

One of the most annoying arguments when discussing AI is the perennial "But if the AI is so smart, why won't it figure out the right thing to do anyway?" It's often the ultimate curiosity stopper.

Nick Bostrom has defined the "Orthogonality thesis" as the principle that motivation and intelligence are essentially unrelated: superintelligences can have nearly any type of motivation (at least, nearly any utility function-based motivation). We're trying to get some rigorous papers out so that when that question comes up, we can point people to standard, published arguments. Nick has had a paper accepted that points out that the orthogonality thesis is compatible with a lot of philosophical positions that would seem to contradict it.

I'm hoping to complement this with a paper laying out the positive arguments in favour of the thesis. So I'm asking you for your strongest arguments for (or against) the orthogonality thesis. Think of trying to convince a conservative philosopher who's caught a bad case of moral realism - what would you say to them?

Many thanks! Karma and acknowledgements will shower on the best suggestions, and many puppies will be happy.

I've had several conversations that went like this:

Victim: But surely a smart artificial intelligence will be able to tell right from wrong, if we humans can do that?

Me: Forget about the word "intelligence" for a moment. Imagine a machine that looks at all actions in turn, and mechanically chooses the action that leads to producing the greatest number of paperclips, in whichever way possible. With enough computing power and enough knowledge about the outside world, the machine might find a way to convert the whole world into a paperclip factory. The machine will resist any attempts by humans to interfere, because the machine's goal function doesn't say anything about humans, only paperclips.

Victim: But such a machine would not be truly intelligent.

Me: Who cares about definitions of words? Humanity can someday find a way to build such a machine, and then we're all screwed.

Victim: ...okay, I see your point. Your machine is not intelligent, but it can be very dangerous because it's super-efficient.

Me (under my breath): Yeah. That's actually my definition of "superintelligent", but you seem to have a concept of "intelligence" that's entangled with many accidental facts about humans, so let's not go there.
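
The machine described in the exchange above can be sketched in a few lines. This is only a toy illustration under stated assumptions: the function names (`predict`, `choose_action`, `count_paperclips`, `count_flowers`), the three actions, and the numbers in the world model are all made up for the example and come from no real system. What it tries to show is that the search procedure never inspects what the goal function rewards, so swapping the goal changes the behaviour without changing how the search works.

```python
from typing import Callable, Dict, Iterable

World = Dict[str, int]

# Hypothetical toy world model: the effect each action has on the world.
EFFECTS: Dict[str, World] = {
    "build_paperclip_factory": {"paperclips": 1000, "flowers": -10},
    "plant_garden": {"paperclips": 0, "flowers": 50},
    "do_nothing": {"paperclips": 0, "flowers": 0},
}


def predict(world: World, action: str) -> World:
    """Return the world state that would result from taking `action`."""
    effect = EFFECTS[action]
    return {key: world[key] + effect.get(key, 0) for key in world}


def choose_action(
    world: World,
    actions: Iterable[str],
    goal: Callable[[World], int],
) -> str:
    """Look at all actions in turn and pick the one whose predicted
    outcome scores highest under `goal`. Nothing here depends on what
    the goal function actually rewards."""
    return max(actions, key=lambda action: goal(predict(world, action)))


def count_paperclips(world: World) -> int:
    # Goal function that says nothing about humans or flowers, only paperclips.
    return world["paperclips"]


def count_flowers(world: World) -> int:
    # A different goal plugged into the same search procedure.
    return world["flowers"]


world = {"paperclips": 0, "flowers": 20}
paperclip_choice = choose_action(world, EFFECTS, count_paperclips)
flower_choice = choose_action(world, EFFECTS, count_flowers)
print(paperclip_choice, flower_choice)
```

In this sketch the same `choose_action` picks the factory under one goal and the garden under the other; a better world model or more computing power would improve both agents equally.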

"But such a machine would not be truly intelligent." [...] "That's actually my definition of 'superintelligent'."

If no-one is actually working on that kind of intelligence, one that's highly efficient at arbitrary and rigid goals (an AOC), then what's the problem?

-2 TheAncientGeek 13y

If your beef is about unintelligent but super-efficient machines, why communicate with the AI community? That's generally not what they are trying to build.

2 satt 14y

Was anyone else's first reaction to attack the starting premise?

Victim: But surely a smart artificial intelligence will be able to tell right from wrong, if we humans can do that?

Me: But we humans can't even do that!