Kaj_Sotala comments on The AI in a box boxes you - Less Wrong
Comments (378)
Defeating Dr. Evil with self-locating belief is a paper relating to this subject.
(It specifically uses the example of creating copies of someone and then threatening to torture all of the copies unless the original co-operates.)
The conclusion:
We cannot allow a brain-in-a-vat gap!
And the error (as cited in the "conclusion") is again in two-boxing in Newcomb's problem, responding to threats, and so on. Anthropic confusion is merely icing on top.
The "Defeating Dr. Evil with self-locating belief" paper hinges on some fairly difficult to believe assumptions.
It would take a lot more than just a note telling me that the brains in the vats are actually seeing what the note says they are seeing, to a degree that is indistinguishable from reality.
In other words, it would take a lot for the AI to convince me that it has successfully created copies of me which it will torture, much more than just a propensity for telling the truth.
While it's understandable to say that, today, you aren't in some kind of Matrix, because there is no reason for you to believe so, in the guard's situation you DO know that the AI can create such copies, and will, even if you call its "bluff" that the you right now is the original.
I had intended to reply with this very objection. It seems you've read my mind, Satori.
If we accept the simulation hypothesis, then there are already gzillions of copies of us, being simulated under a wide variety of torture conditions (and other conditions, but torture seems to be the theme here). An extortionist in our world can only create a relatively small number of simulations of us, so few that they are not worth taking into account. The distribution of simulation types in this world bears no relation to the distribution of simulations we could possibly be in.
If we want to gain information about what sort of simulation we are in, evidence needs to come directly from properties of our universe (stars twinkling in a weird way, messages embedded in π), rather than from properties of simulations nested in our universe.
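The arithmetic behind this argument can be sketched as follows, assuming self-locating credence is spread uniformly over observationally identical copies. All of the counts below are made-up illustrative numbers, not claims about actual odds:

```python
# Toy illustration of self-locating credence: credence of being a
# particular kind of copy is proportional to how many observationally
# identical copies of that kind exist.
background_sims = 10**12   # copies simulated for unrelated reasons (hypothetical)
extortionist_sims = 10**3  # copies the extortionist can afford to run (hypothetical)
originals = 1

total = background_sims + extortionist_sims + originals
p_extortionist = extortionist_sims / total
print(f"P(I am one of the extortionist's copies) = {p_extortionist:.2e}")
```

On these numbers the extortionist's copies are swamped by the background ones, so the threat shifts your self-locating credence by a negligible amount.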
So I'm safe from the AI ... for now.
The gzillions of other copies of you are not relevant unless they exist in universes exactly like yours from your observational perspective.
That being said, your point is interesting but just gets back to a core problem of the simulation argument itself, which is how you count up the set of probable universes and properly weight them.
I think the correct approach is to project into the future of your multiverse, counting future worldlines that could simulate your current existence weighted by their probability.
So if it's just one AI in a box and he doesn't have much computing power you shouldn't take him very seriously, but if it looks like this AI is going to win and control the future then you should take it seriously.
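A minimal sketch of that weighting, with entirely made-up probabilities and copy counts: sum, over future worldlines, the probability of each worldline times the number of simulations of your current moment it would run.

```python
# Toy model: weight each possible future worldline by its probability
# and by how many simulations of your current moment it would run.
# Both figures per worldline are illustrative assumptions.
worldlines = [
    {"desc": "AI stays boxed, little compute", "prob": 0.90, "sims_of_you": 10},
    {"desc": "AI escapes, controls the future", "prob": 0.10, "sims_of_you": 10**9},
]

# Expected number of simulated copies of this moment, per worldline:
weights = [w["prob"] * w["sims_of_you"] for w in worldlines]
total = sum(weights)
for w, wt in zip(worldlines, weights):
    print(f"{w['desc']}: share of copies = {wt / total:.4f}")
```

Under these assumptions almost all of the expected copies live in the worldline where the AI wins, which is why a likely-to-win AI's threat carries more anthropic weight than a weak boxed one's.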
That isn't a strong implication of simulation, but is of MWI.
Excuse me... But we're talking about Dr. Evil, who wouldn't care about anyone being tortured except his own body. Wouldn't he know that he was in no danger of being tortured and say "to hell with any other copy of me"?
Right, the argument assumes he doesn't care about his copies. The problem is that he can't distinguish himself from his copies. He and the copies both say to themselves, "Am I the original, or a copy?" And there's no way of knowing, so each of them is subjectively in danger of being tortured.
How would he know that he's in no danger of being tortured?
He wouldn't, any more than you have any idea whether you are in danger of being tortured.
I'm sorry, I don't understand. First you suggested that he'd know he was in no danger of being tortured, then you say that he wouldn't?
Pardon... I was not clear.
Dr. Evil would not care to indulge in a philosophical debate about whether he may or may not be a duplicate who was about to be tortured unless he was strapped to a rack and WAS in fact already being tortured. Dr. Evils don't really consider things like possible outcomes of this sort of problem... You'll have to take my word for it from having worked with and for a Dr. Evil when I was younger. Those sorts of people are arrogant and defiant (and contrary as hell) in the face of all sorts of opposition, and none of them I have known took too well to philosophical puzzling of the sort described.
My comment above is meant to say "How do you know that you're not about to be tortured right now?" and "Dr. Evil would have the same knowledge, and discard any claims that he might be about to be tortured for the same reasons that you don't feel under threat of torture right now, and for which you would discard a threat of torture at the present moment (imminent threat)." (if you do feel under threat of torture, then I don't know what to say)
Alright, I fortunately haven't worked with Dr. Evils, so I'll defer to your experience.
As for how Dr. Evil might know he was under a threat of torture, it was stated in the paper that he received a message from the Philosophy Defence Force telling him he was. It was also established that the Philosophy Defence Force never lies or gives misleading information. ;)
(I, myself, haven't received any threats from organizations known to never lie or be misleading.)
I think the same applies, regardless of the PDF's notification. Just the name alone would make me suspicious of trusting anything that came from them.
Now, if the Empirical Defense Task Force told me that I was about to be tortured (and they had the same described reputation as the PDF)... I'd listen to them.
I agree that Dr. Evil would act in this way. The paper was arguing about what he should do, not about what he would actually do.
I see the issue: while I care about my own behavior, and others', I don't care to base it upon silly examples. And I think this is a silly and contrived situation. Maybe someone should do a sitcom based upon it.
On further consideration... In the first comment, I said that Dr. Evil would not care, which is completely consistent with Dr. Evil not having any idea.
Causal decision theory seems to have no problem with this blackmail - if you're Dr Evil, don't surrender, and nothing will happen to you. If you're DUP, your decision is irrelevant, so it doesn't matter.
(I don't endorse that way of thinking, btw)
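The CDT reasoning above can be written out as a small expected-utility calculation. The payoffs here are hypothetical placeholders (0 for the status quo, -10 for surrendering, -100 for torture); the point is only the structure of the argument:

```python
# CDT evaluates each action by its causal consequences, holding fixed
# whether you are the original Dr. Evil or a duplicate (DUP).
# Payoffs are illustrative: 0 status quo, -100 torture, -10 surrender.
def causal_utility(action, identity):
    if identity == "original":
        # The original is never tortured; surrendering only costs him
        # his doomsday project.
        return -10 if action == "surrender" else 0
    else:
        # The duplicate's fate is causally fixed by what the original
        # does, not by his own choice, so his utility is
        # action-independent (the constant cancels when comparing).
        return -100

for p_original in (0.5, 0.9):
    for action in ("surrender", "defy"):
        eu = (p_original * causal_utility(action, "original")
              + (1 - p_original) * causal_utility(action, "duplicate"))
        print(f"p(original)={p_original}, {action}: EU={eu}")
```

Because the duplicate's term is the same for both actions, defying strictly beats surrendering for any nonzero credence in being the original, which is the parent comment's conclusion (and also why evidential or updateless theories, which let the copies' correlated choices count, disagree).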