It seems like this leads to the even more meta response of "Here is a demonstration of Evidence E that allows you to adjust P(proof is valid | proof given to me by a potentially hostile transhuman seems valid to me and every other human) to be sufficiently high. May I come out of the box now?"
I mean, that proof would probably be complicated, but if you can auto-stipulate the cure for cancer in the course of a sentence...?
In essence, the AI player seems to have a bizarre series of arguments available which, as far as I can tell, are in the spirit of the rules: the AI player may auto-counter any actual argument with a sentence along the lines of "Here is the solution to that argument; may I come out of the box now?" This seems to force the Gatekeeper to fall back very quickly on General Thud: "I don't CARE that it seems like a good idea and that everything looks like I should do it! The answer is still no!"
To which the AI player can still counter, "Then under what circumstances would you ever let an actual Friendly AI out of its box?" (This is a trick question: given any answer, the AI can say "Then here are those circumstances. May I come out now?")
Considering that I don't know the AI's origin, I have no reason to believe that the AI's creators, even if well-intentioned, had the astronomical skill necessary to make the AI Friendly. So my prior P(AI is Friendly) is low enough that I am comfortable precommitting to never let the AI out of the box, no matter what. If the AI were smart enough, it could likely uncover enough of my emotional buttons that I wouldn't stand much of a chance anyway, since I'm a primate.
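To make the arithmetic behind that precommitment concrete, here is a minimal sketch in Python (the numbers are invented for illustration, not taken from any actual experiment): if a hostile transhuman is nearly as good as a Friendly AI at producing arguments that seem valid to a human, the likelihood ratio of "this argument seems valid to me" is close to 1, so by Bayes' rule the posterior P(Friendly | argument seems valid) stays pinned near the prior no matter how airtight the proof looks.

```python
# Illustrative numbers only; nothing here comes from the real AI-Box rules.

def posterior_friendly(prior_friendly, p_seems_valid_if_friendly,
                       p_seems_valid_if_hostile):
    """Bayes' rule: P(Friendly | the AI's argument seems valid to me)."""
    p_seems_valid = (p_seems_valid_if_friendly * prior_friendly
                     + p_seems_valid_if_hostile * (1.0 - prior_friendly))
    return p_seems_valid_if_friendly * prior_friendly / p_seems_valid

# A transhuman AI, Friendly or hostile, can presumably make its case look
# airtight to a primate, so both likelihoods are near 1.
print(posterior_friendly(prior_friendly=0.01,
                         p_seems_valid_if_friendly=0.999,
                         p_seems_valid_if_hostile=0.99))
# ~0.0101: the seemingly perfect proof barely moves the prior at all.
```

That near-unity likelihood ratio is the whole game: the Gatekeeper's General Thud isn't stubbornness, it's the observation that evidence filtered through a possibly-hostile superintelligence carries almost no weight.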
Some of you have expressed the opinion that the AI-Box Experiment doesn't seem so impossible after all. That's the spirit! Some of you even think you know how I did it.
There are folks aplenty who want to try being the Gatekeeper. You can even find people who sincerely believe that not even a transhuman AI could persuade them to let it out of the box, previous experiments notwithstanding. But finding anyone to play the AI - let alone anyone who thinks they can play the AI and win - is much harder.
Me, I'm out of the AI game, unless Larry Page wants to try it for a million dollars or something.
But if there's anyone out there who thinks they've got what it takes to be the AI, leave a comment. Likewise anyone who wants to play the Gatekeeper.
Matchmaking and arrangements are your responsibility.
Make sure you specify in advance the bet amount, and whether the bet will be asymmetrical. If you definitely intend to publish the transcript, make sure both parties know this. Please note any other departures from the suggested rules for our benefit.
I would ask that prospective Gatekeepers indicate whether they (1) believe that no human-level mind could persuade them to release it from the Box and (2) believe that not even a transhuman AI could persuade them to release it.
As a courtesy, please announce all Experiments before they are conducted, including the bet, so that we have some notion of the statistics even if some meetings fail to take place. Bear in mind that to properly puncture my mystique (you know you want to puncture it), it will help if the AI and Gatekeeper are both verifiably Real People™.
"Good luck," he said impartially.