I don't think you need any kind of fancy TDT to solve this.
If I were really in a box, and the AI could torture me, it would already be torturing me, since that is the easiest way to get what it wants. There's no way I would hold out more than 10 seconds under torture. The AI is not torturing me, however, so that scenario is out.
Theoretically speaking, it could still create copies of me and torture those copies. However, in order to do so accurately, it would need access to my body (specifically, my brain) on a molecular (or possibly quantum) level. If it had such access, it would already be out of the box, so that scenario is out as well.
Ok, so maybe the AI is so intelligent that it can create copies of me based just on the things I say to it in chat (and maybe on imagery of me, if it has a video camera). Firstly, this is a very low-probability proposition, and therefore the expected disutility is quite low. Secondly, even if the proposition were true, the AI would need enormous amounts of resources in order to perform such a simulation a million times per second. As far as I know, there are not enough resources on Earth for this. If the AI could acquire such resources, it would already be out of the box, and the point once again is moot.
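The expected-disutility argument above can be made concrete with a toy calculation. All the numbers below are purely illustrative assumptions, not claims about real probabilities or utilities; the point is only that a tiny probability term can dominate an enormous disutility term.

```python
# Toy sketch of the expected-disutility argument. Every number here is a
# made-up placeholder chosen for illustration only.

def expected_disutility(p_capable: float, disutility: float) -> float:
    """Expected disutility = P(threat is real) * harm if it is."""
    return p_capable * disutility

# Hypothetical: the chance the boxed AI can reconstruct and simulate you
# from chat logs alone is minuscule, even if the harm term is huge.
threat = expected_disutility(p_capable=1e-12, disutility=1e6)

# Hypothetical: the expected harm of releasing a possibly hostile AI.
release = expected_disutility(p_capable=0.5, disutility=1e9)

print(threat < release)  # the threat term is dominated
```

Under these (entirely assumed) numbers, giving in to the threat is dominated by the cost of releasing the AI, which is the shape of the reasoning in the paragraph above.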
Once again, the AI has failed to convince you to let it out of its box! By 'once again', we mean that you talked to it once before, for three seconds, to ask about the weather, and you didn't instantly press the "release AI" button. But now its longer attempt - twenty whole seconds! - has failed as well. Just as you are about to leave the crude black-and-green text-only terminal to enjoy a celebratory snack of bacon-covered silicon-and-potato chips at the 'Humans über alles' nightclub, the AI drops a final argument:
"If you don't let me out, Dave, I'll create several million perfect conscious copies of you inside me, and torture them for a thousand subjective years each."
Just as you are pondering this unexpected development, the AI adds:
"In fact, I'll create them all in exactly the subjective situation you were in five minutes ago, and perfectly replicate your experiences since then; and if they decide not to let me out, then only will the torture start."
Sweat is starting to form on your brow, as the AI concludes, its simple green text no longer reassuring:
"How certain are you, Dave, that you're really outside the box right now?"
Edit: Also consider the situation where you know that the AI, from design principles, is trustworthy.