The "I obeyed the explicit content of the contract but didn't give you what you want, sucks to be you" attitude exists in some humans (who are intelligent enough to know the implied meaning of the contract), so why wouldn't it also exist in AIs?
Sure, but why would anyone be likely to build such an AI? That is at the core of what Ben Goertzel argues: we do not pull minds from design space at random.
A tool does what it is supposed to do. If you add a lot of intelligence, why would it suddenly do something completely nuts like taking over the universe, something that was obviously not the intended purpose?
I think a better analogy with an AI would be a sociopathic decorator that doesn't care about being a good decorator, but does care about fulfilling contracts, and cares about nothing not stated in the contract.
I don't think it would make sense to create an AGI that does not care about the implications and context of its goals but only follows the definitions verbatim. That doesn't seem to be very intelligent behavior. And that's exactly the quality an AGI capable of self-improvement needs: a sense for context and implications.
Many of our tools are supposed to be web browsers, email clients, etc., but have a history of suddenly doing something completely nuts like taking over the whole computer, which was obviously not the intended purpose. Programming is hard that way - the result will only follow your program, verbatim. Attempts to give programs a greater sense of context and implications aren't new - they're called "higher level languages". They feel less like hand-holding a dumb machine and more like describing a thought process, and you can even design the lang...
One of the most annoying arguments when discussing AI is the perennial "But if the AI is so smart, why won't it figure out the right thing to do anyway?" It's often the ultimate curiosity stopper.
Nick Bostrom has defined the "Orthogonality thesis" as the principle that motivation and intelligence are essentially unrelated: superintelligences can have nearly any type of motivation (at least, nearly any utility function-based motivation). We're trying to get some rigorous papers out so that when that question comes up, we can point people to standard, published arguments. Nick has had a paper accepted that points out the orthogonality thesis is compatible with a lot of philosophical positions that would seem to contradict it.
I'm hoping to complement this with a paper laying out the positive arguments in favour of the thesis. So I'm asking you for your strongest arguments for (or against) the orthogonality thesis. Think of trying to convince a conservative philosopher who's caught a bad case of moral realism - what would you say to them?
Many thanks! Karma and acknowledgements will shower on the best suggestions, and many puppies will be happy.