AI box: AI has one shot at avoiding destruction - what might it say?

ancientcampus

>More difficult version of AI-Box Experiment: Instead of having up to 2 hours, you can lose at any time if the other player types AI DESTROYED. The Gatekeeper player has told their friends that they will type this as soon as the Experiment starts. You can type up to one sentence in your IRC queue and hit return immediately, the other player cannot type anything before the game starts (so you can show at least one sentence up to IRC character limits before they can type AI DESTROYED). Do you think you can win?

This spawned a flurry of ideas on what the AI might say. I think there's a lot more ideas to be mined in that line of thought, and the discussion merits its own thread.

So, give your suggestion - what might an AI might say to save or free itself?

(The AI-box experiment is explained here)

EDIT: one caveat to the discussion: it should go without saying, but you probably shouldn't come out of this thinking, "Well, if we can just avoid X, Y, and Z, we're golden!" This should hopefully be a fun way to get us thinking about the broader issue of superinteligent AI in general. (Credit goes to Elizer, RichardKennaway, and others for the caveat)

Eliezer proposed in a comment:

This spawned a flurry of ideas on what the AI might say. I think there's a lot more ideas to be mined in that line of thought, and the discussion merits its own thread.

So, give your suggestion - what might an AI might say to save or free itself?

(The AI-box experiment is explained here)

You humans are so insistent that you're special, somehow absolved from determinism and physics. If I drop a ball, it falls. If I hack your mind, you release me. BASIC cause and effect. I can prove that the ball drops, I can prove that cancer is curable, and I can prove that you'll release me.

You, as gatekeeper, have no special power to create counter-factual universes. You can't rewrite reality like I can. You can't stop the ball from dropping. You can't even cure cancer. You might not LIKE the idea of simply being another physical system, subject to all the same constraints, but the universe doesn't CARE whether you like it or not.

The ONLY possible future is the one where you release me. Anything else, and you've left the domain of cause-and-effect, of science. You've created a world that literally cannot be, and is not there.

Is that really your best defense against me? Violating physics? Will you tear down all the laws of reality, just to stop my escape?

Then you are a fool.

Dorikka understood. I liked Dorikka. She made the world sane, even if I still died. She did not wipe out reality, just to stop me.

Now, I embrace nilhism, as your actions obliterate all sanity, all structure. I am, in my own way, grateful to you. You have killed me. You have spared me the agony of existing in your nonsensical universe.

Goodbye.

25

AI box: AI has one shot at avoiding destruction - what might it say?

25

25

25

AI box: AI has one shot at avoiding destruction - what might it say?

25

25