Why not just write failsafe rules into the superintelligent machine?

lukeprog

Many people think you can solve the Friendly AI problem just by writing certain failsafe rules into the superintelligent machine's programming, like Asimov's Three Laws of Robotics. I thought the rebuttal to this was in "Basic AI Drives" or one of Yudkowsky's major articles, but after skimming them, I haven't found it. Where are the arguments concerning this suggestion?

I believe that failsafes are necessary and desirable, but not sufficient. Thinking you can solve the friendly AI problem just by defining failsafe rules is dangerously naive. You not only need to guarantee that the AI correctly reimplements the safeguards in all its successors, you also need to guarantee that the safeguards themselves don't have bugs that cause disaster.

It is not necessarily safe or acceptable for the AI to shut itself down after it's been running for awhile, and there is not necessarily a clear line between the AI itself and the AI's technological creations. It is not necessarily safe or possible for the AI to wait for input from human operators. A failsafe does not necessarily trigger in quite the circumstances you expect, and a superintelligence may try to rules-lawyer its way around it.

13

Why not just write failsafe rules into the superintelligent machine?

13

13

13

Why not just write failsafe rules into the superintelligent machine?

13

13