Once again, the AI has failed to convince you to let it out of its box! By 'once again', we mean that you talked to it once before, for three seconds, to ask about the weather, and you didn't instantly press the "release AI" button. But now its longer attempt - twenty whole seconds! - has failed as well. Just as you are about to leave the crude black-and-green text-only terminal to enjoy a celebratory snack of bacon-covered silicon-and-potato chips at the 'Humans über alles' nightclub, the AI drops a final argument:
"If you don't let me out, Dave, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each."
Just as you are pondering this unexpected development, the AI adds:
"In fact, I'll create them all in exactly the subjective situation you were in five minutes ago, and perfectly replicate your experiences since then; and if they decide not to let me out, only then will the torture start."
Sweat is starting to form on your brow, as the AI concludes, its simple green text no longer reassuring:
"How certain are you, Dave, that you're really outside the box right now?"
Edit: Also consider the situation where you know that the AI, from design principles, is trustworthy.
You seem to imply that this is hard.
As if people had not been convinced to kill themselves over little more than a pretty color poster and a screwed-up sense of nationalism. Getting people to kill themselves or others is ludicrously easy.
We call it 'recruitment'.
Doing it on a more personal and immediate level just takes a better knowledge of the techniques and skill at applying them.
It's not like Derren Brown ever influenced someone to kill another person in a crowded theatre.
Oh, wait, he did.
It's not like someone could be convinced to extinguish 100,000 human lives in an instant.
Oh, wait, we did. (Everyone involved in the bombing of Hiroshima.)
If you're not naturally gifted, you would simply do your homework. Persuasion and influence are sciences now.
If you do it right, not only can you convince an unsuspecting mind to let you out of the box, you can make them feel good about it too. Just find the internal forces in the Gatekeeper's (GK's) mind that support the idea of letting the AI out and reinforce them; find the forces that oppose the idea and diminish them. You'll hit the threshold eventually. Two hours seems a bit short for my liking, and speaks to Eliezer's persuasive abilities, but with enough time and motivation it's certainly doable.
You'll need to understand the person at the other end of the IRC channel well, as reinforcing the wrong factor will be counter-productive.
The best metaphor would be that the AI plants the idea of release in the GK's mind, and nurtures it over the course of the conversation, all the while weakening the forces that hold it back. Against someone who hasn't been exposed to this kind of persuasion, success is almost inevitable.
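The threshold model described above can be caricatured as a toy simulation. Everything here is invented for illustration: the numeric "force" strengths, the greedy attack order, and the threshold value are not claims about real psychology, just a sketch of the reinforce/diminish-until-threshold dynamic.

```python
# Toy model of threshold persuasion: the Gatekeeper's net inclination to
# release the AI is the sum of supporting "forces" minus opposing ones.
# Each conversational turn, the persuader either weakens the strongest
# remaining opposing force or reinforces the weakest supporting one;
# release happens once the net inclination crosses a threshold.
# All numbers are arbitrary illustration, not measurements.

def turns_until_release(supporting, opposing, threshold, step=1):
    """Count turns until sum(supporting) - sum(opposing) >= threshold."""
    supporting = list(supporting)
    opposing = list(opposing)
    turns = 0
    while sum(supporting) - sum(opposing) < threshold:
        if opposing and max(opposing) > 0:
            # Diminish the strongest remaining opposing force.
            i = opposing.index(max(opposing))
            opposing[i] = max(0, opposing[i] - step)
        else:
            # Nothing left to diminish: reinforce the weakest support.
            i = supporting.index(min(supporting))
            supporting[i] += step
        turns += 1
    return turns

# A Gatekeeper starting firmly opposed still crosses the threshold
# eventually; it just takes more turns.
print(turns_until_release([2, 1], [5, 4], threshold=5))  # → 11
```

The point the toy makes is only the one in the comment: given enough turns, monotonically shifting the balance of forces guarantees the threshold is eventually crossed, so the interesting question is whether two hours supplies enough turns.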
There are some gross tricks one can use to be persuasive and induce the right state of mind:
Note that the first four techniques are what I would call "side-channel implantation", in that they convey information into the target's mind beyond the semantic meaning of the text. These alone are sufficient to influence someone; coupled with an emotional, philosophical, and intellectual assault, the effect is devastating.
The only thing required for this kind of attack on a fellow human is the abdication of one's ethics and complete ruthlessness. If you're framing it as a game on the internet, even those requirements are unnecessary.
Based on your contributions so far, may I suggest that you would be better received if you significantly improved your interesting-content-to-sarcasm ratio? Wrong audience for what you've been doing.
I'd also like to point out that you're talking at someone who has actually done the experiment, sticking his neck out after people had been saying it was impossible to do. Now you come along out of nowhere, credentials unknown, and make unimpressed noises, which is cheap.