Arguments for risks from general AI are sometimes criticized on the grounds that they rely on a linear series of events, each of which has to occur for the proposed scenario to go through: for example, that a sufficiently intelligent AI could escape from containment, that it could then become powerful enough to take over the world, that it could do this quickly enough to avoid detection, and so on.
The intent of the following series of posts is to briefly demonstrate that AI risk scenarios are in fact disjunctive: composed of multiple possible pathways, each of which could be sufficient by itself. To successfully control AI systems, it is not enough to block just one of the pathways: all of them need to be dealt with.
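To make the contrast concrete with purely made-up, illustrative numbers (not estimates from the posts): a conjunctive scenario that requires n independent steps, each with probability p_i, succeeds with probability \prod_i p_i, whereas a disjunctive scenario with n independent pathways, any one of which suffices, succeeds with probability 1 - \prod_i (1 - p_i). With three steps or pathways at 0.5 each, that works out to 0.125 for the conjunctive case versus 0.875 for the disjunctive one.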
I've got two posts in this series up so far:
AIs gaining a decisive advantage discusses four different ways by which AIs could achieve a decisive advantage over humanity. The one-picture version is:
AIs gaining the power to act autonomously discusses ways by which AIs might come to act as active agents in the world, despite possible confinement efforts or technology. The one-picture version (which you may wish to click to enlarge) is:
These posts draw heavily on my old paper, Responses to Catastrophic AGI Risk, as well as some recent conversations here on LW. Upcoming posts will try to cover more new ground.
Hmm. There may have been a miscommunication here.
This sounds like you're assuming that I'm trying to argue in favor of Friendly AI as the best solution. Now, I admittedly do currently find FAI one of the most promising options for trying to navigate AI risk, but I'm not committed to it. I just want to find whatever solution works, regardless of whether that happens to be FAI or something else entirely. But in order to find out what the best solution is, one needs a comprehensive idea of what the problem is like and how it's going to manifest itself, and that's what I'm trying to do: map out the problem, so that we can figure out what the best solutions are.
Would appreciate hearing more about these.
Isn't that the standard way of figuring out the appropriate corrective actions? First figure out what would happen absent any intervention, then see which points seem most amenable to correction.
(Responding to the whole paragraph, but I don't want to quote it all.) I would be interested to hear a definition of "AI risk" that does not reduce to "risk of unfriendly outcome", which is itself defined in terms of friendliness, i.e. relation to human morality. If, like me, you reject the idea of a consistent, discoverable morality in the first place, and therefore find friendliness to be an ill-formed, inconsistent idea, then it's hard ...