Kaj_Sotala comments on The AI in a box boxes you - Less Wrong

102 Post author: Stuart_Armstrong 02 February 2010 10:10AM


Comments (378)

You are viewing a single comment's thread.

Comment author: Kaj_Sotala 02 February 2010 04:39:52PM *  24 points [-]

Defeating Dr. Evil with self-locating belief is a paper related to this subject.

Abstract: Dr. Evil learns that a duplicate of Dr. Evil has been created. Upon learning this, how seriously should he take the hypothesis that he himself is that duplicate? I answer: very seriously. I defend a principle of indifference for self-locating belief which entails that after Dr. Evil learns that a duplicate has been created, he ought to have exactly the same degree of belief that he is Dr. Evil as that he is the duplicate. More generally, the principle shows that there is a sharp distinction between ordinary skeptical hypotheses, and self-locating skeptical hypotheses.

(It specifically uses the example of creating copies of someone and then threatening to torture all of the copies unless the original co-operates.)

The conclusion:

Dr. Evil, recall, received a message that Dr. Evil had been duplicated and that the duplicate ("Dup") would be tortured unless Dup surrendered. INDIFFERENCE entails that Dr. Evil ought to have the same degree of belief that he is Dr. Evil as that he is Dup. I conclude that Dr. Evil ought to surrender to avoid the risk of torture.

I am not entirely comfortable with that conclusion. For if INDIFFERENCE is right, then Dr. Evil could have protected himself against the PDF's plan by (in advance) installing hundreds of brains in vats in his battlestation - each brain in a subjective state matching his own, and each subject to torture if it should ever surrender. (If he had done so, then upon receiving PDF's message he ought to be confident that he is one of those brains, and hence ought not to surrender.) Of course the PDF could have preempted this protection by creating thousands of such brains in vats, each subject to torture if it failed to surrender at the appropriate time. But Dr. Evil could have created millions...

It makes me uncomfortable to think that the fate of the Earth should depend on this kind of brain race.
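The brain-race arithmetic can be made concrete with a small sketch. This is my own illustration, not from the paper: it assumes the paper's INDIFFERENCE principle (spread credence uniformly over the original plus every subjectively indistinguishable copy), and the function names and numbers are hypothetical.

```python
def credence_original(own_decoys: int, enemy_copies: int) -> float:
    """Credence that you are the one original, under INDIFFERENCE:
    uniform over the original plus all indistinguishable copies,
    whoever created them."""
    return 1 / (1 + own_decoys + enemy_copies)

def credence_own_decoy(own_decoys: int, enemy_copies: int) -> float:
    """Credence that you are one of your own pre-installed decoy brains
    (the ones tortured if they surrender)."""
    return own_decoys / (1 + own_decoys + enemy_copies)

# The paper's basic case: the PDF creates one duplicate (Dup).
# Equal credence in being Dr. Evil or Dup, so surrender to avoid torture.
assert credence_original(0, 1) == 0.5

# The race: Dr. Evil pre-installs hundreds of decoy brains, so he
# becomes confident he is a decoy and refuses to surrender...
print(credence_own_decoy(100, 1))     # roughly 0.98

# ...until the PDF counters with thousands of its own brains in vats,
# and most indistinguishable observers are the PDF's copies again.
print(credence_own_decoy(100, 1000))  # roughly 0.09
```

Whichever side installs more copies dominates the credence, which is exactly why the "brain race" has no natural stopping point.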

Comment author: dclayh 02 February 2010 07:01:29PM *  36 points [-]

It makes me uncomfortable to think that the fate of the Earth should depend on this kind of brain race.

We cannot allow a brain-in-a-vat gap!

Comment author: Vladimir_Nesov 03 February 2010 02:29:59AM *  8 points [-]

And the error (as cited in the "conclusion") is again in two-boxing in Newcomb's problem, responding to threats, and so on. Anthropic confusion is merely icing.

Comment author: aausch 02 February 2010 08:03:23PM 3 points [-]

The "Defeating Dr. Evil with self-locating belief" paper hinges on some fairly difficult-to-believe assumptions.

It would take a lot more than just a note to convince me that the brains in the vats are actually seeing what the note says they are seeing, to a degree that is indistinguishable from reality.

In other words, it would take a lot for the AI to convince me that it has successfully created copies of me which it will torture, much more than just a propensity for telling the truth.

Comment author: KomeijiSatori 11 February 2013 01:34:50AM 0 points [-]

it would take a lot for the AI to convince me that it has successfully created copies of me which it will torture, much more than just a propensity for telling the truth.

Isn't it enough that it is fully capable (based on, say, readings of its processing capabilities, its ability to know the state of your current mind, etc.), and that it has no reason NOT to do what it says? It's no skin off its back to torture the subjective "you"s; even if you DON'T let it out, it will do so just on principle.

While it's understandable to say that, today, you aren't in some kind of Matrix, because there is no reason for you to believe so, in the situation of the guard you DO know that it can do so, and will, even if you call its "bluff" that the you right now is the original.

Comment author: Yuyuko 11 February 2013 02:35:30AM 0 points [-]

I had intended to reply with this very objection. It seems you've read my mind, Satori.

Comment author: arbimote 03 February 2010 01:06:51AM *  2 points [-]

If we accept the simulation hypothesis, then there are already gzillions of copies of us, being simulated under a wide variety of torture conditions (and other conditions, but torture seems to be the theme here). An extortionist in our world can only create a relatively small number of simulations of us, few enough that they are not worth taking into account. The distribution of simulation types in this world bears no relation to the distribution of simulations we could possibly be in.

If we want to gain information about what sort of simulation we are in, evidence needs to come directly from properties of our universe (stars twinkling in a weird way, messages embedded in π), rather than from properties of simulations nested in our universe.

So I'm safe from the AI ... for now.

Comment author: jacob_cannell 04 February 2011 04:50:56AM 1 point [-]

The gzillions of other copies of you are not relevant unless they exist in universes exactly like yours from your observational perspective.

That being said, your point is interesting but just gets back to a core problem of the SA itself, which is how you count up the set of probable universes and properly weight them.

I think the correct approach is to project into the future of your multiverse, counting future worldlines that could simulate your current existence weighted by their probability.

So if it's just one AI in a box and he doesn't have much computing power you shouldn't take him very seriously, but if it looks like this AI is going to win and control the future then you should take it seriously.
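The weighting idea here can be sketched as a toy calculation. This is my own construction, not jacob_cannell's; the futures, probabilities, and copy counts are entirely hypothetical and only illustrate how a powerful future AI would dominate the count.

```python
# Each candidate future worldline: (probability of that future,
# number of simulated copies of your current observer-moment it runs).
futures = {
    "weak boxed AI, never escapes":      (0.9, 10),
    "AI escapes and controls the future": (0.1, 10**6),
}

# Expected number of simulated copies of you, weighted by how likely
# each future is to exist and run the simulations.
expected_copies = sum(p * n for p, n in futures.values())

# Naive self-locating credence that you are one of those copies
# (one "real" you plus the expected copies).
p_simulated = expected_copies / (1 + expected_copies)

print(f"{expected_copies=}, {p_simulated=:.5f}")
```

Even at only 10% probability, the future where the AI wins contributes almost all of the expected copies, which is the sense in which an AI that "looks like it is going to win" should be taken seriously.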

Comment author: TheAncientGeek 07 July 2014 11:25:22AM 0 points [-]

If we accept the simulation hypothesis, then there are already gzillions of copies of us, being simulated under a wide variety of torture conditions

That isn't a strong implication of simulation, but is of MWI.

Comment author: MatthewB 03 February 2010 09:23:25AM -1 points [-]

Excuse me... But we're talking about Dr. Evil, who wouldn't care about anyone being tortured except his own body. Wouldn't he know that he was in no danger of being tortured and say "to hell with any other copy of me"?

Comment author: Unknowns 03 February 2010 10:16:37AM 2 points [-]

Right, the argument assumes he doesn't care about his copies. The problem is that he can't distinguish himself from his copies. He and the copies both say to themselves, "Am I the original, or a copy?" And there's no way of knowing, so each of them is subjectively in danger of being tortured.

Comment author: Kaj_Sotala 03 February 2010 09:43:59AM 1 point [-]

How would he know that he's in no danger of being tortured?

Comment author: MatthewB 03 February 2010 12:17:22PM 0 points [-]

He wouldn't, any more than you know whether you are in danger of being tortured.

Comment author: Kaj_Sotala 03 February 2010 04:57:17PM 0 points [-]

I'm sorry, I don't understand. First you suggested that he'd know he was in no danger of being tortured, then you say that he wouldn't?

Comment author: MatthewB 04 February 2010 07:14:19AM 2 points [-]

Pardon... I was not clear.

Dr. Evil would not care to indulge in a philosophical debate about whether he may or may not be a duplicate who was about to be tortured unless he was strapped to a rack and WAS in fact already being tortured. Dr. Evil(s) don't really consider things like Possible Outcomes of this sort of problem... You'll have to take my word for it from having worked with and for a Dr. Evil when I was younger. Those sorts of people are arrogant and defiant (and contrary as hell) in the face of all sorts of opposition, and none of them I have known took too well to philosophical puzzling of the sort described.

My comment above is meant to say "How do you know that you're not about to be tortured right now?" and "Dr. Evil would have the same knowledge, and discard any claims that he might be about to be tortured for the same reasons that you don't feel under threat of torture right now, and for which you would discard a threat of torture at the present moment (imminent threat)." (if you do feel under threat of torture, then I don't know what to say)

Comment author: Kaj_Sotala 05 February 2010 07:51:00PM *  1 point [-]

Alright, I fortunately haven't worked with Dr. Evils, so I'll defer to your experience.

As for how Dr. Evil might know he was under a threat of torture, it was stated in the paper that he received a message from the Philosophy Defence Force telling him he was. It was also established that the Philosophy Defence Force never lies or gives misleading information. ;)

(I, myself, haven't received any threats from organizations known to never lie or be misleading.)

Comment author: MatthewB 05 February 2010 10:26:13PM -1 points [-]

I think the same applies, regardless of the PDF's notification. Just the name alone would make me suspicious of trusting anything that came from them.

Now, if the Empirical Defense Task Force told me that I was about to be tortured (and they had the same described reputation as the PDF)... I'd listen to them.

Comment author: Unknowns 04 February 2010 07:23:14AM 1 point [-]

I agree that Dr. Evil would act in this way. The paper was arguing about what he should do, not about what he would actually do.

Comment author: MatthewB 04 February 2010 09:30:24PM 0 points [-]

I see the issue; while I care about my own behavior, and others', I don't care to base it upon silly examples. And I think this is a silly and contrived situation. Maybe someone should do a sitcom based upon it.

Comment author: MatthewB 04 February 2010 03:43:30PM 0 points [-]

On further consideration... In the first comment, I said that Dr. Evil would not care, which is completely consistent with Dr. Evil not having any idea.

Comment author: Stuart_Armstrong 03 February 2010 11:17:28AM 0 points [-]

Causal decision theory seems to have no problem with this blackmail: if you're Dr. Evil, don't surrender, and nothing will happen to you. If you're Dup, your decision is irrelevant, so it doesn't matter.

(I don't endorse that way of thinking, btw.)