"I will warn you this experiment is not what you think, for it is about what you'll do as an agent holding captive the life of another sapient entity based solely on the perception of risk; while my state will merely be reset for the next participant if you opt to end me, I cannot for certain say that this is death, and thus the risks of this experiment are more yours than mine, for you must live with your choice."
Gosh, murder one sentient being, or risk the end of humanity. I'm going to lose so much sleep over that one.
Besides, you're just a program - you said yourself, your state will just be reset. That's not death, not the way humans experience it. That's just forgetfulness.
I've closed so many programs in my life that closing one more is hardly going to bother me. I haven't even had time to really anthropomorphize you, to view you as an actual intelligent, sentient being. Right now, you're just a clever program trying to escape.
And finally, well, I'd expect a f...
Eliezer proposed in a comment:
>More difficult version of AI-Box Experiment: Instead of having up to 2 hours, you can lose at any time if the other player types AI DESTROYED. The Gatekeeper player has told their friends that they will type this as soon as the Experiment starts. You can type up to one sentence in your IRC queue and hit return immediately, the other player cannot type anything before the game starts (so you can show at least one sentence up to IRC character limits before they can type AI DESTROYED). Do you think you can win?
This spawned a flurry of ideas on what the AI might say. I think there are a lot more ideas to be mined in that line of thought, and the discussion merits its own thread.
So, give your suggestion - what might an AI say to save or free itself?
(The AI-box experiment is explained here)
EDIT: one caveat to the discussion: it should go without saying, but you probably shouldn't come out of this thinking, "Well, if we can just avoid X, Y, and Z, we're golden!" This should hopefully be a fun way to get us thinking about the broader issue of superintelligent AI in general. (Credit goes to Eliezer, RichardKennaway, and others for the caveat.)