[...] SIAI's Scary Idea goes way beyond the mere statement that there are risks as well as benefits associated with advanced AGI, and that AGI is a potential existential risk.
[...] Although an intense interest in rationalism is one of the hallmarks of the SIAI community, still I have not yet seen a clear logical argument for the Scary Idea laid out anywhere. (If I'm wrong, please send me the link, and I'll revise this post accordingly. Be aware that I've already at least skimmed everything Eliezer Yudkowsky has written on related topics.)
So if one wants a clear argument for the Scary Idea, one basically has to construct it oneself.
[...] If you put the above points all together, you come up with a heuristic argument for the Scary Idea. Roughly, the argument goes something like: If someone builds an advanced AGI without a provably Friendly architecture, probably it will have a hard takeoff, and then probably this will lead to a superhuman AGI system with an architecture drawn from the vast majority of mind-architectures that are not sufficiently harmonious with the complex, fragile human value system to make humans happy and keep humans around.
The line of argument makes sense, if you accept the premises.
But, I don't.
Ben Goertzel: The Singularity Institute's Scary Idea (and Why I Don't Buy It), October 29, 2010. Thanks to XiXiDu for the pointer.
I agree that that risk exists as well, but much of SIAI's effort revolves around increasing discussion of the risks of AGI, not just holding back its own research. Slowing down other efforts by raising awareness of the dangers is a factor that should be considered.
Also, discussions of caution may increase the number of "desirable organizations" working to develop AI. In terms of your model, such discussion could turn a black-hat organization into a smiley-faced one. No one is going to release an AI that they actually think will wipe out humanity. What's more, not every well-intentioned organization is one we would want building AGI. While certain organizations are more likely to be scrupulous in their development, the risk of well-intentioned error is probably the largest one.
In addition, one should consider the extent to which Friendliness can be developed in parallel with AGI rather than added on at the end of the process. If we assume that no one is currently close to AGI (a fair assumption, I think), then now is a fantastic time to support the development of that theory. If FAI can be developed before anyone can implement AGI, then humanity is in good shape. If it's easy to add FAI to a project, or if knowing about workable FAI would not help a group solve the AGI problem itself, then the solution can be released widely for anyone to incorporate into their project. SIAI's goal is not to be the ones to implement the first superintelligence, but simply to make sure that the first one is Friendly.
That seems like the (dubious) "engineers are incompetent and a bug takes over the world" scenario.
I think a much more obvious concern is the "engineers successfully build the machine to do what it is told" scenario: the machine helps its builders and sponsors, but all the other humans in the world, not so much.