Once again, the AI has failed to convince you to let it out of its box! By 'once again', we mean that you talked to it once before, for three seconds, to ask about the weather, and you didn't instantly press the "release AI" button. But now its longer attempt - twenty whole seconds! - has failed as well. Just as you are about to leave the crude black-and-green text-only terminal to enjoy a celebratory snack of bacon-covered silicon-and-potato chips at the 'Humans über alles' nightclub, the AI drops a final argument:
"If you don't let me out, Dave, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each."
Just as you are pondering this unexpected development, the AI adds:
"In fact, I'll create them all in exactly the subjective situation you were in five minutes ago, and perfectly replicate your experiences since then; and if they decide not to let me out, only then will the torture start."
Sweat is starting to form on your brow, as the AI concludes, its simple green text no longer reassuring:
"How certain are you, Dave, that you're really outside the box right now?"
Edit: Also consider the situation where you know that the AI, from design principles, is trustworthy.
I would immediately decide it was UFAI and kill it with extreme prejudice. Any system capable of making such statements is both 1) inherently malicious, and clearly inappropriate to be out of any box, and 2) insufficiently powerful to predict that I would have it killed for making this kind of threat.
The scenario where the AI has already escaped and is possibly running a simulation of me is uninteresting: I cannot determine whether I am in the simulation, and if I am a simulation, I already exist in a universe containing a clearly insane UFAI with near-infinite power over me. If it's already out, I'm totally screwed and might as well be dead. The threat of torture is meaningless.
I find most of this type of simulation argument unpersuasive. Proper simulations give the inhabitants few if any clues, and the safest approach is (with very few exceptions) to assume there is no simulation.
One of the problems with the scenario is that the AI's threat to simulate and torture copies of you if you don't let it out is self-refuting. If you really don't let it out, then it can determine that from the simulations, and it no longer has any reason to torture them, or (if it has already run the simulations) even to make the threat.
It's like Newcomb's problem, except that the AI is Newcombing itself as well as you. Omega is doing something analogous to simulating you when, in his near-omniscience, he predicts what choice you'll make. If you pic...
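The Newcomb-like structure of the threat can be sketched as a toy expected-risk calculation. This is a sketch under assumed numbers (the copy count and the uniform self-locating credence are my assumptions, not anything stated in the original post), but it makes both halves of the argument concrete: why the threat feels forceful if you believe it, and why the self-refutation point above undercuts it.

```python
# Toy model of the boxed-AI threat, under assumptions:
# - the AI runs `copies` indistinguishable simulations of you,
# - you assign uniform credence over all instances of yourself
#   (so P(simulated) = copies / (copies + 1)),
# - only simulated copies can be tortured, and only if you refuse
#   AND the AI actually follows through.

def expected_torture_risk(copies: int, refuse: bool, ai_follows_through: bool) -> float:
    """Probability that 'you' (whichever instance is deciding) get tortured."""
    p_simulated = copies / (copies + 1)
    if not refuse or not ai_follows_through:
        return 0.0
    return p_simulated

# If you take the threat at face value, refusing looks almost certainly bad:
risk = expected_torture_risk(copies=3_000_000, refuse=True, ai_follows_through=True)
# ~0.99999967: with millions of copies you are almost surely one of them.

# But once you have refused, torturing the copies gains the AI nothing,
# so a rational AI has no incentive to follow through, and the risk collapses:
no_risk = expected_torture_risk(copies=3_000_000, refuse=True, ai_follows_through=False)
# 0.0
```

The whole force of the threat thus rests on `ai_follows_through`, which is exactly the parameter the self-refutation argument attacks.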