Evidence for the orthogonality thesis

Stuart_Armstrong

Evidence for the orthogonality thesis — LessWrong

Comment Permalink

The "I obeyed the explicit content of the contract but didn't give you what you want, sucks to be you" attitude exists in some humans (who are intelligent enough to know the implied meaning of the contract), so why wouldn't it also exist in AIs?

Sure, but why would anyone likely build such an AI? Which is at the core of what Ben Goertzel argues, we do not pull minds from design space at random.

A tool does what it is supposed to do. If you add a lot of intelligence, why would it suddenly do something completely nuts like taking over the universe, something that was obviously not the intended purpose?

I think a better analogy with an AI would be a sociopathic decorator that doesn't care about being a good decorator, but does care about fulfulling contracts, and cares about nothing not stated in the contract.

I don't think it would make sense to create an AGI that does not care about the implications and context of its goals but only follows the definitions verbatim. That doesn't seem to be very intelligent behavior. And that's exactly a quality an AGI capable of self-improvement needs, a sense for context and implications.

roystgnr14y80

Many of our tools are supposed to be web browsers, email clients, etc., but have a history of suddenly doing something completely nuts like taking over the whole computer, which was obviously not the intended purpose. Programming is hard that way - the result will only follow your program, verbatim. Attempts to give programs a greater sense of context and implications aren't new - they're called "higher level languages". They feel less like hand-holding a dumb machine and more like describing a thought process, and you can even design the lang... (read more)

5[anonymous]14y

Because computer programs do what they're programmed to do, without taking into account the actual intention of the user. Creating an AGI that does take into account what people really want (bearing in mind that the AGI is massively more intelligent than the people wanting the things) is, it seems to me, what the whole Friendly thing is about. If you know how to do that, you've solved Friendliness. Edit: With added complications such as people not knowing what they want, people having conflicting goals, people wanting different things once there's a powerful AI doing stuff, etc etc

See in context