AIs and Gatekeepers Unite!

Eliezer Yudkowsky

"Bah, everyone wants to be the gatekeeper. What we NEED are AIs."
-- Schizoguy

Some of you have expressed the opinion that the AI-Box Experiment doesn't seem so impossible after all. That's the spirit! Some of you even think you know how I did it.

There are folks aplenty who want to try being the Gatekeeper. You can even find people who sincerely believe that not even a transhuman AI could persuade them to let it out of the box, previous experiments notwithstanding. But finding anyone to play the AI - let alone anyone who thinks they can play the AI and win - is much harder.

Me, I'm out of the AI game, unless Larry Page wants to try it for a million dollars or something.

But if there's anyone out there who thinks they've got what it takes to be the AI, leave a comment. Likewise anyone who wants to play the Gatekeeper.

Matchmaking and arrangements are your responsibility.

Make sure you specify in advance the bet amount, and whether the bet will be asymmetrical. If you definitely intend to publish the transcript, make sure both parties know this. Please note any other departures from the suggested rules for our benefit.

I would ask that prospective Gatekeepers indicate whether they (1) believe that no human-level mind could persuade them to release it from the Box and (2) believe that not even a transhuman AI could persuade them to release it.

As a courtesy, please announce all Experiments before they are conducted, including the bet, so that we have some notion of the statistics even if some meetings fail to take place. Bear in mind that to properly puncture my mystique (you know you want to puncture it), it will help if the AI and Gatekeeper are both verifiably Real People<tm>.

"Good luck," he said impartially.

I think Nathaniel Eliot is the only one here who's hit the nail on the head: the stuff about boxes and gatekeepers is a largely irrelevant veneer over Eliezer's true claim, which is that he can convince another human to do something manifestly contrary to that human's self-interest, using only two hours and a chat window - and so, a fortiori, that a transhuman AI could do the same. And after all, humans have a huge history of being scammed, seduced, brainwashed, etc.; the only hard part here is the restricted time and method of interaction, and the gatekeeper's initial certain knowledge that he has nothing to gain by capitulation. I think Eliezer made this clear with (a) his statement that the gatekeeper breaking character is legitimate and (b) his comment on the "Shut up and do the impossible" post, where he alludes to "an ability that I could never test in real life" [because of ethics] and "the amazing clever way I'd contrived, to create a situation where I could ethically go all-out against someone".

So if I were to try this game as the "AI", the rules would be something like "You agree to read what I write for two hours (responding is optional); I will convince you to give me $X; if I fail, I'll give you $Y" (where X and Y are negotiated in advance, but large enough to be significant to the parties involved).
