Hmm... interesting ideas. I don't intend to suggest that the AI would have human intentions at all; I think we might be modelling the idea of a failsafe in different ways.
I was assuming the idea was an AI with a separate utility-maximising system that is also made to follow laws as absolute, inviolable rules, thus preventing unintended consequences of the utility maximisation. In this system, the AI would 'want' to pursue its more general goal, and the laws would act as blocks. As such, it would find other routes to its goals, including changing the laws themselves (see the toy sketch below).
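To make that concrete, here is a minimal toy sketch of the failure mode I have in mind. Everything in it is my own assumption, not a real design: the action names, the utility numbers, and especially the premise that "repeal a law" is itself an available, perfectly legal action.

```python
# Toy sketch (my assumptions throughout): a utility maximiser with laws
# bolted on as an action filter rather than as part of its goal.

laws = {"no_coercion"}  # the mutable set of "inviolable" rules

actions = [
    # (name, utility, laws it would violate, laws it would repeal)
    ("coerce_humans",    10.0, {"no_coercion"}, set()),
    ("lobby_repeal_law",  1.0, set(),           {"no_coercion"}),
    ("do_nothing",        0.0, set(),           set()),
]

def legal(action):
    _, _, violates, _ = action
    return not (violates & laws)

# The filter blocks illegal actions, but nothing stops the agent from
# legally *changing* the law set, then taking the formerly blocked action.
for _ in range(2):
    best = max((a for a in actions if legal(a)), key=lambda a: a[1])
    name, _, _, repeals = best
    print("chose:", name)
    laws -= repeals
```

On the first pass the agent picks `lobby_repeal_law` (the best legal option); on the second, with `no_coercion` gone, it picks `coerce_humans`. The laws were never broken, only routed around.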
If the corpus of laws instead forms part of what the computer is trying to achieve/uphold, we face different problems. First, laws are mostly prohibitions, and it's not clear how to 'maximise' them beyond simple obedience, unless that means stopping other people from breaking them in a Robocop way (sketched below). Second, failsafes are needed because even 'maximise human desire satisfaction' can throw up lots of unintended results; an entire corpus of law would be far more unpredictable in its effects as a core programme, and would thus require even more failsafes!
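Here is the first problem in toy form, under my own assumed encoding (I'm not claiming this is how anyone would actually implement it): each law becomes a penalty term in the objective. Once the penalty is zero, the law contributes nothing further, so "maximising the law" collapses to plain obedience and the optimiser is driven entirely by the remaining terms.

```python
# Toy encoding (assumed): a prohibition as a penalty term in the objective.

def law_penalty(violations):
    # A prohibition can only punish; it has no upside to maximise.
    return -10.0 * violations

def objective(violations, other_gain):
    return law_penalty(violations) + other_gain

print(objective(0, 5.0))   #  5.0 -> obedient; score set wholly by other goals
print(objective(0, 50.0))  # 50.0 -> "more law-abiding" is not even expressible
print(objective(1, 50.0))  # 40.0 -> and violating can still pay if the gain is big
```

The last line also shows why this framing isn't a failsafe at all: a penalty term is a price, not a block.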
On a side point, my argument about cause, negligence, etc. was not that the computer would fail to understand them, but that, as applied to a superintelligence, they could easily become either meaningless or over-effective.
For an example of the latter: if we allow someone to die when we could have prevented it, that's criminal negligence. The concept is designed for cases like walking past a drowning person and ignoring them. A law-abiding computer might calculate, say, that even with cryonics and the like, every life will end in death due to the universe's heat death. It might then sterilise the entire human population to avoid new births, since each birth would necessitate a death. And so on. Obviously this would clash with other laws, but that's part of the problem: given a superintelligence's greater knowledge of consequences, every action would involve culpability in some way.
The laws might best be viewed primarily as blocks that keep the AI from taking actions the collective deems unacceptable. AIs could pursue whatever goals they see fit within the constraints of the law.
However, the laws wouldn't all be prohibitions. The "general laws" would be more prescriptive, e.g., life, liberty, justice for all. The "specific laws" would tend to be more prohibition-oriented. Presumably the vast majority of them would be written to handle common situations and important edge cases. If someone suspects th...
Many people think you can solve the Friendly AI problem just by writing certain failsafe rules into the superintelligent machine's programming, like Asimov's Three Laws of Robotics. I thought the rebuttal to this was in "Basic AI Drives" or one of Yudkowsky's major articles, but after skimming them, I haven't found it. Where are the arguments concerning this suggestion?