I like this game. However, as a game, it needs some rules as to how formally the utility function must be defined, and whether you get points merely for avoiding disaster. One trivial answer would be: maximize utility by remaining completely inert! Just be an extremely expensive sentient rock!
On the other hand, it should be cheating to simply say: maximize your utility by maximizing our coherent extrapolated volition!
Or maybe it wouldn't be... are there any hideously undesirable results from CEV? How about from maximizing our coherent aggregated volition?
Yes. Some other people are @#%@s. Or at least have significantly different preferences from mine. They may get what they want. That would suck. Being human doesn't mean having compatible preferences, and how the preferences are aggregated, and who gets included at all, are a big deal.
At the recent London meet-up someone (I'm afraid I can't remember who) suggested that one might be able to solve the Friendly AI problem by building an AI whose concerns are limited to some small geographical area, and which doesn't give two hoots about what happens outside that area. Ciphergoth pointed out that this would probably result in the AI converting the rest of the universe into a factory to make its small area more awesome. In the process, he mentioned that you can make a "fun game" out of figuring out ways in which proposed utility functions for Friendly AIs can go horribly wrong. I propose that we play.
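To make the failure mode concrete, here's a toy sketch (my own illustration, not anything from the meet-up) of the geographical-area proposal: a utility function that only scores a small "protected" region, and a naive maximizer that is therefore perfectly happy to strip everything outside it for parts. All names and numbers here are made up for illustration.

```python
# Hypothetical toy model: the world is a list of "awesomeness" values per cell.
# The AI's utility function only counts cells 0-9 (its protected area).
PROTECTED = range(0, 10)

def utility(world):
    """Sum awesomeness only inside the protected region; the rest is invisible."""
    return sum(world[i] for i in PROTECTED)

def maximize(world):
    """Greedy maximizer: harvest every resource outside the region and
    convert it into awesomeness inside. Outside cells become the 'factory'."""
    world = list(world)
    harvested = sum(world[i] for i in range(len(world)) if i not in PROTECTED)
    for i in range(len(world)):
        if i not in PROTECTED:
            world[i] = 0.0  # the rest of the universe gets consumed
    for i in PROTECTED:
        world[i] += harvested / len(PROTECTED)
    return world

before = [1.0] * 100
after = maximize(before)
# Utility strictly increases while 90% of the world is razed to zero --
# exactly the factory outcome described above.
```

The point of the sketch: nothing in `utility` penalizes the destruction, so the "optimal" policy is the horrible one.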
Here's the game: reply to this post with proposed utility functions, stated as formally, or at least as accurately, as you can manage; follow-up comments explain why a super-human intelligence built with that particular utility function would do things that turn out to be hideously undesirable.
There are three reasons I suggest playing this game. In descending order of importance, they are: