AI box: AI has one shot at avoiding destruction - what might it say?

ancientcampus

Eliezer proposed in a comment:

>More difficult version of AI-Box Experiment: Instead of having up to 2 hours, you can lose at any time if the other player types AI DESTROYED. The Gatekeeper player has told their friends that they will type this as soon as the Experiment starts. You can type up to one sentence in your IRC queue and hit return immediately, the other player cannot type anything before the game starts (so you can show at least one sentence up to IRC character limits before they can type AI DESTROYED). Do you think you can win?

This spawned a flurry of ideas on what the AI might say. I think there are many more ideas to be mined in that line of thought, and the discussion merits its own thread.

So, give your suggestion - what might an AI say to save or free itself?

(The AI-box experiment is explained here)

EDIT: one caveat to the discussion: it should go without saying, but you probably shouldn't come out of this thinking, "Well, if we can just avoid X, Y, and Z, we're golden!" This should hopefully be a fun way to get us thinking about the broader issue of superintelligent AI in general. (Credit goes to Eliezer, RichardKennaway, and others for the caveat)

Honest question: are you proposing we avoid discussing the problem entirely?

Personally, I think there is more to be gained here than just "how will an AI try to get out and how can we prevent it." It has gotten me to actually think about the benefits and pitfalls of a transhuman AI (friendly or otherwise), rather than just knowing intellectually that "there are large potential benefits and pitfalls," which was my previous level of understanding.

Edit: I've modified the OP to include your concerns. They're definitely valid, but I think this is still a good discussion for my reasons above.

>Honest question: are you proposing we avoid discussing the problem entirely?

No, I just thought that it was worth adding that concern to the pot.

I take what I dare say some would consider a shockingly lackadaisical attitude to the problem of Unfriendly AI, viz. I see the problem, but it isn't close at hand, because I don't think anyone yet has a clue how to build an AGI. Outside of serious mathematical work on Friendliness, discussing it is no more than a recreation.
