Unfortunately (but not surprisingly), an unnoticed mistake in the definition of human utility has slipped through the safety checks.
Yes, that's the main difficulty behind friendly AI in general. This does not constitute a specific way that it could go wrong.
Oh, sure. My only intention was to show that limiting the AI's power to mere communication doesn't imply safety. There may be thousands of specific ways it could go wrong. For instance:
The Oracle answers that human utility is maximised by wireheading everybody to become a happiness automaton, and that it is a moral duty to do that to others even against their will. Most people believe the Oracle (because its previous answers always proved true and useful, and moreover it makes really neat PowerPoint presentations of its arguments) and wireheading becomes compulsory. After the minority of dissidents is defeated, all mankind turns into happiness automata and happily dies out a while later.
At the recent London meet-up someone (I'm afraid I can't remember who) suggested that one might be able to solve the Friendly AI problem by building an AI whose concerns are limited to some small geographical area, and which doesn't give two hoots about what happens outside that area. Cipergoth pointed out that this would probably result in the AI converting the rest of the universe into a factory to make its small area more awesome. In the process, he mentioned that you can make a "fun game" out of figuring out ways in which proposed utility functions for Friendly AIs can go horribly wrong. I propose that we play.
Here's the game: reply to this post with proposed utility functions, stated as formally or, at least, as accurately as you can manage; follow-up comments explain why a super-human intelligence built with that particular utility function would do things that turn out to be hideously undesirable.
There are three reasons I suggest playing this game. In descending order of importance, they are: