Many people think you can solve the Friendly AI problem just by writing certain failsafe rules into the superintelligent machine's programming, like Asimov's Three Laws of Robotics. I thought the rebuttal to this was in "Basic AI Drives" or one of Yudkowsky's major articles, but after skimming them, I haven't found it. Where are the arguments concerning this suggestion?
You have to understand that the basic argument rests on the mere possibility that AI might be dangerous and the high stakes associated with that. Even if such an outcome is unlikely, the vast amount of negative utility attached to it outweighs its low probability.
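To make the expected-value point concrete, here is a minimal sketch with made-up numbers (the probabilities and utilities below are purely illustrative assumptions, not estimates from the original argument):

```python
# Toy expected-utility comparison with purely illustrative numbers.
# The point: a tiny probability multiplied by an enormous negative
# utility can still dominate the whole calculation.

p_catastrophe = 1e-6       # assumed (made-up) probability of AI catastrophe
u_catastrophe = -1e12      # assumed (made-up) utility of that outcome

p_business_as_usual = 1 - p_catastrophe
u_business_as_usual = 1e3  # assumed (made-up) utility of the ordinary outcome

expected_utility = (p_catastrophe * u_catastrophe
                    + p_business_as_usual * u_business_as_usual)

print(expected_utility)    # ~ -999000: the one-in-a-million event dominates
```

With these invented numbers the expected utility comes out around -999,000 even though the catastrophe is a one-in-a-million event; that is the structure of the argument, whatever the true numbers are.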
I got that! The problem was more that I was thinking as if the world could be divided up into sealable boxes. In practice, we can do a lot of work focused on one area with 'no effect' on anything else. But that is only because the sorts of actions we take are limited, because we can't detect the low-level impact on things outside those boxes, and because we have certain unspoken understandings about what sort of thing might constitute an unacceptable effect elsewhere (if I only care about looking at pictures of LOLcats, I might be 'neutral to the rest of the internet' except for taking ...
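To illustrate why the 'boxes' intuition breaks down, here is a hypothetical toy sketch (the agent, the shared resource, and all the numbers are invented for illustration): an agent whose utility function only mentions its own box still draws on resources shared with everything outside it.

```python
# Hypothetical toy model: an agent that only "cares about" LOLcats.
# Its utility function never mentions bandwidth, but every action it
# takes depletes a resource shared with the rest of the network.

shared_bandwidth = 100.0       # invented: capacity shared by everyone

class LolcatAgent:
    def __init__(self):
        self.pictures_viewed = 0   # the only thing its utility tracks

    def utility(self):
        return self.pictures_viewed

    def act(self):
        # Side effect invisible to the agent's own utility function:
        global shared_bandwidth
        shared_bandwidth -= 1.0
        self.pictures_viewed += 1

agent = LolcatAgent()
for _ in range(50):
    agent.act()

print(agent.utility())      # 50: looks "neutral to the rest of the internet"
print(shared_bandwidth)     # 50.0: but half the shared resource is gone
```

The 'neutrality' here is an artifact of the agent's limited action set and of what its utility function happens to measure, not of any real seal around the box.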