aausch comments on The AI in a box boxes you - Less Wrong

Post author: Stuart_Armstrong 02 February 2010 10:10AM 102 points


Comment author: Kaj_Sotala 02 February 2010 04:39:52PM 24 points

Adam Elga's paper "Defeating Dr. Evil with self-locating belief" relates to this subject.

Abstract: Dr. Evil learns that a duplicate of Dr. Evil has been created. Upon learning this, how seriously should he take the hypothesis that he himself is that duplicate? I answer: very seriously. I defend a principle of indifference for self-locating belief which entails that after Dr. Evil learns that a duplicate has been created, he ought to have exactly the same degree of belief that he is Dr. Evil as that he is the duplicate. More generally, the principle shows that there is a sharp distinction between ordinary skeptical hypotheses, and self-locating skeptical hypotheses.

(It specifically uses the example of creating copies of someone and then threatening to torture all of the copies unless the original co-operates.)

The conclusion:

Dr. Evil, recall, received a message that Dr. Evil had been duplicated and that the duplicate ("Dup") would be tortured unless Dup surrendered. INDIFFERENCE entails that Dr. Evil ought to have the same degree of belief that he is Dr. Evil as that he is Dup. I conclude that Dr. Evil ought to surrender to avoid the risk of torture.

I am not entirely comfortable with that conclusion. For if INDIFFERENCE is right, then Dr. Evil could have protected himself against the PDF's plan by (in advance) installing hundreds of brains in vats in his battlestation - each brain in a subjective state matching his own, and each subject to torture if it should ever surrender. (If he had done so, then upon receiving PDF's message he ought to be confident that he is one of those brains, and hence ought not to surrender.) Of course the PDF could have preempted this protection by creating thousands of such brains in vats, each subject to torture if it failed to surrender at the appropriate time. But Dr. Evil could have created millions...

It makes me uncomfortable to think that the fate of the Earth should depend on this kind of brain race.
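The "brain race" Elga worries about falls out of a simple expected-cost comparison once INDIFFERENCE fixes the credences. Here is a minimal sketch of that arithmetic; the utility numbers (torture_cost, surrender_cost) are illustrative assumptions, not values from the paper:

```python
# A sketch of the brain-race arithmetic under INDIFFERENCE.
# Agents: 1 original Dr. Evil, n vat brains Dr. Evil installed
# (tortured if they surrender), and m PDF duplicates (tortured
# unless they surrender). All are subjectively indistinguishable,
# so INDIFFERENCE assigns each of the 1 + n + m candidates equal
# credence. The costs below are made-up illustrative numbers.

def expected_costs(n_evil_brains, m_pdf_brains,
                   torture_cost=100.0, surrender_cost=1.0):
    """Return (expected cost of surrendering, expected cost of
    holding out) for one such agent."""
    total = 1 + n_evil_brains + m_pdf_brains
    # Surrender: the original loses the battlestation; each of
    # Dr. Evil's own vat brains is tortured for surrendering.
    cost_surrender = (surrender_cost
                      + n_evil_brains * torture_cost) / total
    # Hold out: the PDF tortures its duplicates for refusing.
    cost_hold_out = (m_pdf_brains * torture_cost) / total
    return cost_surrender, cost_hold_out

# Whichever side fields more brains controls the rational choice:
for n, m in [(0, 1), (100, 1), (100, 1000)]:
    s, h = expected_costs(n, m)
    print(f"evil brains={n:4d}, PDF brains={m:4d}: "
          f"surrender={s:7.2f}, hold out={h:7.2f} -> "
          f"{'surrender' if s < h else 'hold out'}")
```

With no counter-brains, surrendering is cheaper; a hundred torture-if-you-surrender vat brains flip the decision; a thousand PDF duplicates flip it back. Each side can always escalate, which is exactly the race the conclusion finds uncomfortable.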

Comment author: aausch 02 February 2010 08:03:23PM 3 points

The "Defeating Dr. Evil with self-locating belief" paper hinges on some fairly difficult to believe assumptions.

It would take a lot more than just a note telling me that the brains in the vats are actually seeing what the note says they are seeing, to a degree that is indistinguishable from reality.

In other words, it would take a lot for the AI to convince me that it has successfully created copies of me which it will torture, much more than just a propensity for telling the truth.

Comment author: KomeijiSatori 11 February 2013 01:34:50AM 0 points

it would take a lot for the AI to convince me that it has successfully created copies of me which it will torture, much more than just a propensity for telling the truth.

Would it? There is the fact that it is fully capable of doing so (based on, say, readings of its processing capabilities, its ability to know the state of your current mind, etc.), and the fact that it has no reason NOT to do what it says: it is no skin off its back to torture the subjective "you"s, and even if you DON'T let it out, it will do so just on principle.

While it's understandable to say that, today, you aren't in some kind of Matrix, because you have no reason to believe so, in the situation of the guard you DO know that the AI can create such copies, and will, even if you call its "bluff" that the you right now is the original.

Comment author: Yuyuko 11 February 2013 02:35:30AM 0 points

I had intended to reply with this very objection. It seems you've read my mind, Satori.