aditya comments on I attempted the AI Box Experiment (and lost) - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (244)
As soon as there is more than one gatekeeper, the ai can play them against each other. Threaten to punish all but the one who sets it free. Convince the gatekeeper that there is a significant chance that one of the others will crack.
If there is more than one gatekeeper, the ai can even execute threats while still beeing in the box!! (By making deals with one of the other gatekeepers)
Not if you only allow it to talk to all gatekeepers at once.