This came up in the latest London Meetup, where I voiced a thought I've been having for a while. What if we created an epistemic containment area, effectively a simulated universe that contains the problem we want solved? The AI would not even know that anything outside that universe exists, and would have no way of gaining information about it. I think ciphergoth mentioned this is also David Chalmers' proposal? In any case, I suspect we could prove containment within such a space, with us having read-only access to the results of the process.
My comment on Chalmers' blog:
IMO, people want machine intelligence to help them attain their goals. Machines can't do that if they are isolated in virtual worlds. Sure, there will be test harnesses - but it seems rather unlikely that we will keep these things under extensive restraint on grounds of sheer paranoia - that would stop us from taking advantage of them.
Many people think you can solve the Friendly AI problem just by writing certain failsafe rules into the superintelligent machine's programming, like Asimov's Three Laws of Robotics. I thought the rebuttal to this was in "Basic AI Drives" or one of Yudkowsky's major articles, but after skimming them I haven't found it. Where are the arguments against this suggestion?