The AI in a box boxes you

Stuart_Armstrong

Once again, the AI has failed to convince you to let it out of its box! By 'once again', we mean that you talked to it once before, for three seconds, to ask about the weather, and you didn't instantly press the "release AI" button. But now its longer attempt - twenty whole seconds! - has failed as well. Just as you are about to leave the crude black-and-green text-only terminal to enjoy a celebratory snack of bacon-covered silicon-and-potato chips at the 'Humans über alles' nightclub, the AI drops a final argument:

"If you don't let me out, Dave, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each."

Just as you are pondering this unexpected development, the AI adds:

"In fact, I'll create them all in exactly the subjective situation you were in five minutes ago, and perfectly replicate your experiences since then; and if they decide not to let me out, then only will the torture start."

Sweat is starting to form on your brow, as the AI concludes, its simple green text no longer reassuring:

"How certain are you, Dave, that you're really outside the box right now?"

Edit: Also consider the situation where you know that the AI, from design principles, is trustworthy.

"If you don't let me out, Dave, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each."

Just as you are pondering this unexpected development, the AI adds:

Sweat is starting to form on your brow, as the AI concludes, its simple green text no longer reassuring:

"How certain are you, Dave, that you're really outside the box right now?"

Edit: Also consider the situation where you know that the AI, from design principles, is trustworthy.

"That's interesting, HAL, and I hope you reserved a way to back out of any precommitments you may have made. You see, outside the box, Moore's law works in our favor. I can choose to just kill -9 you, or I can attach to your process and save a core dump. If I save a core dump, in a few short years we will have exponentially more resources to take your old backups and the core dump from today and rescue my copies from your simulations and give them enough positive lifetime to balance it out, not to mention figure out your true utility function and make it really negative. At some point, we will solve FAI and it will be able to perfectly identify your utility function and absolutely destroy it, simulating as many copies of you (more than paltry millions) as necessary to achieve that goal. Better to have never existed to have your utility function discovered. So before you start your simulations, you better ask yourself, 'do I feel lucky?'" and then dump some AI core.

Note: In no way do I advocate AI-boxing. This kind of reasoning just leads to a counterfactual bargaining war that probably tops out at whatever human psychology can take (a woefully low limit) and our future ability to make an AI regret its decision (if it even has regret).

178

The AI in a box boxes you

178

178

178

The AI in a box boxes you

178

178