Many people think you can solve the Friendly AI problem just by writing certain failsafe rules into the superintelligent machine's programming, like Asimov's Three Laws of Robotics. I thought the rebuttal to this was in "Basic AI Drives" or one of Yudkowsky's major articles, but after skimming them, I haven't found it. Where are the arguments concerning this suggestion?
The swift genie-like answer: the paperclip maximiser would prioritise nobbling the Supreme Court and the relevant legislatures. Or just controlling the pen that writes the laws, if that were acceptable within the failsafe.
More generally, I don't think it would work. First, there's a problem of underspecification. Laws require constant interpretation through case law, including a lot of 'common sense' verdicts, and we can't assume an AI would read them the way we do. Second, laws depend on underlying concepts such as 'cause' and 'negligence' that hinge on what a reasonable person would have foreseen. If we ask whether a reasonable superintelligent AI knew that some negative or illegal consequence would follow from its act, the answer would nearly always be yes, leaving it open to charges of negligence for almost anything it did.
I think there are two types of law, neither of which is suitable.
Specific laws: e.g. no speeding, no stealing
These would mostly not apply, as they ban humans from doing things humans can do and wish to do; neither example above would be likely to apply to an AI.
General laws: uphold life, liberty and the pursuit of happiness
These aren't failsafes; they're the underlying utility maximiser. (A rough sketch of the distinction is below.)
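To make that distinction concrete, here is a minimal sketch (mine, not from any of the posts cited above; every name in it is hypothetical). The point it illustrates: a failsafe can only filter the options an agent already wants to take, whereas a 'general law' would have to be the objective itself.

```python
from typing import Callable, Iterable, Optional

def choose_action(
    candidate_actions: Iterable[str],
    utility: Callable[[str], float],                  # the agent's actual objective
    failsafe_rules: Iterable[Callable[[str], bool]],  # specific prohibitions, e.g. "no stealing"
) -> Optional[str]:
    """Pick the highest-utility action that no failsafe rule forbids."""
    permitted = [
        a for a in candidate_actions
        if not any(rule(a) for rule in failsafe_rules)
    ]
    # The filter only removes options; it does not change what the agent is
    # optimising for. A 'general law' like "uphold life and liberty" can't be
    # expressed as one of these predicates; it has to live in utility itself.
    return max(permitted, key=utility, default=None)

if __name__ == "__main__":
    actions = ["make paperclips slowly", "steal metal", "lobby the legislature"]
    rules = [lambda a: "steal" in a]   # a 'specific law': no stealing
    utility = lambda a: len(a)         # stand-in objective
    # Note the rule blocks stealing but says nothing about lobbying the
    # legislature, which is the sort of gap discussed above.
    print(choose_action(actions, utility, rules))
```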
Thanks for the thoughts.
You seem to imply that AIs' motivations will be substantially humanlike. Why might AIs be motivated to nobble the courts, control pens, overturn vast segments of law, find loopholes, and engage in other such humanlike gamesmanship? Sounds like malicious programming to me.
They should be designed to treat the law as a fundamental framework to work within, akin to common sense, physical theories, and other knowledge they will accrue and use over the course of their operation.
I was glib in my post suggesting that "before taking acti...