The fact that we have many very different scenarios means that there is (almost) no single intervention that could stop all of them. The exceptions are "destroy all computers" and "create a Singleton based on FAI as soon as possible".
In all other cases we should think not only about a correct AI safety theory, but also about ways to implement it all over the world. For example, we could prove that "multi-level AI boxing" creates enough uncertainty for an AI that it will always think a real human could punish it for wrongdoing, which would (maybe) result in perfect alignment. But this proof will be useless if we do not also find ways to implement it across the entire AI field. (And we still can't win against computer viruses in the computing field, even though we know a lot about how to prevent them, because many people invest effort in creating them.)
So we have three unknown and very complex tasks: AI, AI safety, and the delivery of AI safety theory to AI researchers. To solve the last one, we need a systemic model of global AI research, which should show us where to intervene to make global research safer.
The best interventions of this kind will help to solve all three hard problems simultaneously.
Arguments for risks from general AI are sometimes criticized on the grounds that they rely on a series of linear events, each of which has to occur for the proposed scenario to go through. For example, that a sufficiently intelligent AI could escape from containment, that it could then go on to become powerful enough to take over the world, that it could do this quickly enough without being detected, etc.
The intent of my following series of posts is to briefly demonstrate that AI risk scenarios are in fact disjunctive: composed of multiple possible pathways, each of which could be sufficient by itself. To successfully control AI systems, it is not enough to simply block one of the pathways: they all need to be dealt with.
I've got two posts in this series up so far:
AIs gaining a decisive advantage discusses four different ways by which AIs could achieve a decisive advantage over humanity. The one-picture version is:
AIs gaining the power to act autonomously discusses ways by which AIs might come to act as active agents in the world, despite possible confinement efforts or technology. The one-picture version (which you may wish to click to enlarge) is:
These posts draw heavily on my old paper, Responses to Catastrophic AGI Risk, as well as some recent conversations here on LW. Upcoming posts will try to cover more new ground.