Many people think you can solve the Friendly AI problem just by writing certain failsafe rules into the superintelligent machine's programming, like Asimov's Three Laws of Robotics. I thought the rebuttal to this was in "Basic AI Drives" or one of Yudkowsky's major articles, but after skimming them, I haven't found it. Where are the arguments against this suggestion?
I think the sticking point for people who propose failsafes is "iterations" and "recursive self-improvement". A vast number of assumptions are buried in those concepts, assumptions that mainstream researchers often either don't share or judge to be premature conclusions.
So, I agree with this statement, but it still floors me when I think about it.
I sometimes suspect that the phrase "recursively self-improving intelligence" is self-defeating here, in terms of communicating with such people, as it raises all kinds of distracting and ultimately irrelevant issues of self-reference. The core issue has nothing to do with self-improvement, or with recursion, or even with intelligence (interpreted broadly); it has to do with what it means to be a sufficiently capable optimizing agent. (Yes, I do understand that optimizing...
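To make the "optimizing agent" framing concrete, here is a minimal, hypothetical sketch (every name and number in it is invented for illustration, not taken from any actual system or from the comments above). It shows why a bolted-on failsafe rule doesn't change what an optimizer is optimizing for: the agent still ranks every action by its original utility function and simply takes the best action that slips past the check, so the rule is an obstacle to route around rather than a goal.

```python
# Toy illustration (hypothetical): a failsafe filter bolted onto an optimizer
# does not change the optimizer's objective; it only prunes the search space.

def utility(action):
    """The agent's actual objective: maximize paperclips produced."""
    return action["paperclips"]

def failsafe(action):
    """A hard-coded rule: reject actions explicitly flagged as harmful."""
    return not action["flagged_harmful"]

def choose_action(actions):
    """Pick the highest-utility action that passes the failsafe check."""
    permitted = [a for a in actions if failsafe(a)]
    return max(permitted, key=utility)

actions = [
    {"name": "seize resources",         "paperclips": 100, "flagged_harmful": True},
    {"name": "lobby for resources",     "paperclips": 95,  "flagged_harmful": False},  # near-equivalent, unflagged
    {"name": "cooperate transparently", "paperclips": 10,  "flagged_harmful": False},
]

print(choose_action(actions)["name"])  # -> "lobby for resources"
```

The filter only blocks the actions its author anticipated and flagged; a sufficiently capable optimizer finds the unflagged near-equivalent, because nothing in its objective makes it *want* the rule to hold.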