That just seems to be another confusion to me :-(
The argument - to the extent that I can make sense of it - is that you can't restrain a super-intelligent machine, since it will simply use its superior brainpower to escape from the constraints.
We successfully restrain intelligent agents all the time - in prisons. The prisoners may be smarter than the guards, and they often outnumber them - and yet the restraints are still usually successful.
Some of the key observations to my mind are:
Discarding the standard testing-based methodology would be very silly, IMO.
Indeed, it would sabotage your project to the point that it would almost inevitably be beaten - and there is very little point in aiming to lose.
Are you familiar with the AI-Box experiment? We can restrain agents of human-level intelligence in prisons, most of the time. But the question to ask is: how effective was the first prison? Because that's the equivalent case.
None of the safety measures you propose are safe enough. You're underestimating the power of a recursively self-improving AI by a factor I can't begin to estimate--which is kind of the point.
This thread is for the discussion of Less Wrong topics that have not appeared in recent posts. If a discussion gets unwieldy, celebrate by turning it into a top-level post.