After reading the current comments I’ve come up with this:
1) Restrict the AI’s sphere of influence to a specific geographical area. (Define it in several different ways! You don’t want to confine the AI to “France” just to have it annex the rest of the world, or define the area by GPS location and have it hack satellites so they show different coordinates.)
2) Tell it not to make another AI. (This seems a bit vague, but I don’t know how to make it more specific. Maybe: all computing must come from one physical core location. This could prevent an AI from tricking someone into isolating a backup, effectively making a copy.)
3) Set an upper bound on the amount of physical space that all AIs combined in that specific area can use.
4) As a safeguard, if it does find a way around 2, have it incorporate the above rules, unaltered, in any new AI it makes.
Restrict the AI’s sphere of influence to a specific geographical area. (Define it in several different ways! You don’t want to confine the AI to “France” just to have it annex the rest of the world, or define the area by GPS location and have it hack satellites so they show different coordinates.)
You can't do so much as move an air molecule without starting a ripple of effects that changes things everywhere, including outside the specified area. How do you distinguish effects outside the area that matter from effects that don't?
At the recent London meet-up someone (I'm afraid I can't remember who) suggested that one might be able to solve the Friendly AI problem by building an AI whose concerns are limited to some small geographical area, and which doesn't give two hoots about what happens outside that area. Ciphergoth pointed out that this would probably result in the AI converting the rest of the universe into a factory to make its small area more awesome. In the process, he mentioned that you can make a "fun game" out of figuring out ways in which proposed utility functions for Friendly AIs can go horribly wrong. I propose that we play.
Here's the game: reply to this post with proposed utility functions, stated as formally or, at least, as accurately as you can manage; follow-up comments explain why a super-human intelligence built with that particular utility function would do things that turn out to be hideously undesirable.
There are three reasons I suggest playing this game. In descending order of importance, they are: