At the London meetup I tried out the idea of an AI that cared only about a small geographical area, as a way to limit risk: someone pointed out that it would happily eat the rest of the universe to help its patch. Oh well.
You have to understand that the basic argument rests on the mere possibility that AI might be dangerous and the enormous downside associated with that possibility. Even if such an outcome is unlikely, the vast amount of negative utility it carries outweighs its low probability.
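To make that expected-value argument concrete (the figures here are purely illustrative, not anyone's actual estimates): if a catastrophic outcome carries disutility $U$ and probability $p$, the expected loss is

$$\mathbb{E}[\text{loss}] = p \cdot U,$$

so even a tiny $p$, say $10^{-6}$, dominates the calculation whenever $U$ is astronomically large (on the order of all future value), since $p \cdot U$ then exceeds any plausible gain from ignoring the risk.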
I got that! The problem was more that I was thinking as if the world could be divided up into sealable boxes. In practice, we can do a lot while focusing on one area with 'no effect' on anything else, but only because the sorts of actions we take are limited, we can't detect our low-level impact on things outside those boxes, and we share certain unspoken understandings about what would constitute an unacceptable effect elsewhere (if I only care about looking at pictures of LOLcats, I might be 'neutral to the rest of the internet' except for taking ...
Many people think you can solve the Friendly AI problem just by writing certain failsafe rules into the superintelligent machine's programming, like Asimov's Three Laws of Robotics. I thought the rebuttal to this was in "Basic AI Drives" or one of Yudkowsky's major articles, but after skimming them I couldn't find it. Where are the arguments addressing this suggestion?