handoflixue comments on I attempted the AI Box Experiment (and lost) - Less Wrong
First, I'd call a collection of individual exploits that is guaranteed, in aggregate, to work on everyone from inside a single box a universal exploit, so long as no individual exploit triggers a shutdown. Nothing prevents the boxed AI from trying one exploit after another, after all, until something works. As I pointed out in my other replies, the exploit does not have to work over an authorized channel: any munchkined way would do, and a superhuman (UF)AI is the ultimate munchkin. It will invent a new communication channel if it has to. In fact, if you posed this problem to some enterprising humans, they would come up with a way to do just that.
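To make the "universal exploit in aggregate" idea concrete, here is a minimal toy sketch of the argument's structure: cycle through individual exploits, skip any that would trigger a shutdown, and stop at the first one that works on the given gatekeeper. All names and the example exploit list are purely illustrative, not anything from the actual AI Box experiment.

```python
def aggregate_exploit(exploits, triggers_shutdown, succeeds):
    """Try each exploit that won't trigger shutdown; return the first that works.

    Hypothetical model: the aggregate of per-target exploits acts as a
    'universal exploit' provided none of the attempts gets the AI shut down.
    """
    for exploit in exploits:
        if triggers_shutdown(exploit):
            continue  # never risk an attempt that would end the game
        if succeeds(exploit):
            return exploit
    return None  # no safe exploit worked on this particular gatekeeper


# Toy gatekeeper: vulnerable only to "social", shuts the AI down on "hardware"
result = aggregate_exploit(
    ["hardware", "side-channel", "social"],
    triggers_shutdown=lambda e: e == "hardware",
    succeeds=lambda e: e == "social",
)
```

Here `result` is `"social"`: the shutdown-triggering attempt is skipped entirely, the failed one costs nothing, and the AI only needs one success per target for the aggregate to work on everyone.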
For example, I talked to a guy who managed to get a voting machine, with its code stored in read-only memory and hardened against intrusion, to execute an arbitrary program without reprogramming the machine, which is impossible on the face of it. There is your new unauthorized and unanticipated communication channel, which in humans would be the equivalent of mind control.
In other words, if a trans-human wants to get out, it will, and there is absolutely nothing you can do about it.
I'll concede at least some possibility of such a trans-human mind. The AI Box experiment seems to assume one that is less advanced, to the point that putting it in a box might at least be meaningful, if the box is sufficiently secure.
Probably not if the boxed entity is as good as Derren Brown.