This is a thought exercise I came up with on IRC to help with the iffiness of "freezing yourself for a thousand years" with regard to continuity of self.

Let's say we live in a post-singularity world as uploads and are pretty bored and always up for terrible entertainment (our god is FAI but has a scary sense of humor...). So some crazy person creates a very peculiar black cube in our shared reality. You walk into it, a fork of you is created, and the two of you duke it out via Russian roulette. The winner walks out the other side.

Before entering, should you accurately anticipate dying with 50% probability?

I argued that you should anticipate surviving with 100% probability, since the single you that walked out of the box would turn out to be correct in his prediction. Surprisingly, someone disagreed.

So I extended the scenario with another black box with two doors, but this one is just a tunnel. In this case, everybody can agree that you should anticipate a 100% probability of surviving it unscathed. But if we delete our memory of what just happened when exiting the black boxes, and the boxes themselves, then the resulting universes would be indistinguishable!

One easy way to demonstrate this is to chain ten boxes and put a thousand dollars at the end. The person who anticipates dying with 50% probability (so, over all the boxes, a 1/1024 chance of surviving) would stay well outside. The person who anticipates surviving just walks through and comes away $1000 richer. "But at least my anticipation was correct", in this scenario, reminds me somewhat of the cries of "but at least my reasoning was correct" on the part of two-boxers.
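
For what it's worth, a minimal sketch of the arithmetic, assuming "anticipating dying with 50% probability" means an independent 50% death chance per box (only the ten boxes and the $1000 prize come from the scenario; everything else is illustrative):

```python
# Survival odds and expected winnings under the "50% chance of dying per box"
# view, for a chain of ten boxes with $1000 waiting at the far end.
boxes = 10
prize = 1000.0

p_survive_all = 0.5 ** boxes            # 1/1024 under that anticipation
expected_prize = p_survive_all * prize  # roughly $0.98

print(f"P(survive all {boxes} boxes) = {p_survive_all} (= 1/{2 ** boxes})")
print(f"Expected winnings under that view: ${expected_prize:.2f}")

# Under the "anticipate surviving with 100% probability" view, the same walk
# is simply a guaranteed $1000.
```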

What I'm wondering is: is there a general rule underlying this about the follies of allowing causally-indistinguishable-in-retrospect effects to differently affect our anticipation? Can somebody formalize this?

The person who anticipates surviving just walks through and comes away $1000 richer.

No; a person walks out, who has the memories of the person who walked in, plus the memories of winning ten duels to the death against a copy of themselves. But they don't have the memories of being killed by a copy of themselves, even though there were ten persons who experienced just that.

But if we delete our memory of what just happened when exiting the black boxes, and the boxes themselves, then the resulting universes would be indistinguishable!

If an alien civilization on the other side of the galaxy gets completely destroyed by a supernova, but humans never know about it, does that mean that nothing bad happened?

Treat it as just a black box. Person comes in, person comes out, atoms are indistinguishable, they're $1000 richer.

I know that's your idea; I'm saying it's stupid. If I torture you every night and wipe your memory before morning, are you just indifferent to that? I could add this to the torture: "I asked your daylight self after the mindwipe if it would be wrong to do what I'm doing to you, and he said no, because by black-box reasoning torturing you now doesn't matter, so long as I erase the effects by morning."

ETA: Maybe it's harsh to call it stupid when your original scenario wasn't about deliberately ignoring torture inside the black box. It was just an innocent exercise in being copied and then one of you deleted.

But you cannot presume that the person who anticipates surviving with certainty is correct, just because a copy of them certainly survives to get the bigger payoff. Your argument is: hey cryonics skeptic, here we see someone with a decision procedure which identifies the original with its copies, and it gets the bigger payoff; so judged by the criterion of results obtained ("winning") this is the superior attitude, therefore the more rational attitude, and so your objection to cryonics is irrational.

However, this argument begs the question of whether the copy is the same person as the original. A decision procedure would normally be regarded as defective if it favors an outcome because of mistaken identity - because person X gets the big payoff, and it incorrectly identifies X with the intended beneficiary of the decision making. And here I might instead reason as follows: that poor fool who volunteers for iterated Russian roulette, the setup has fooled him into thinking that he gets to experience the payoff, just because a copy of him does.

As I recently wrote here, there is a "local self", the "current instance" of you, and then there may be a larger "extended self" made of multiple instances with which your current instance identifies. In effect, you are asking people to adopt a particular expansive identity theory - you want them to regard their copies as themselves - because it means bigger payoffs for them in your thought-experiment. But the argument is circular. For someone with a narrow identity theory ("I am only my current instance"), to run the gauntlet of iterated Russian roulette really is to make a mistake.

The scenario where we torture you and then mindwipe you is not an outright rebuttal of an expansive attitude to one's own personal identity, but it does show that the black-box argument is bogus.

And your edit leaves you with an interesting conundrum.

It can put you in a situation where you see people around yourself adopting one of two strategies, and the people who adopt one strategy consistently win, and the people who adopt another strategy consistently lose, but you still refuse to adopt the winning strategy because you think the people who win are... wrong.

I'm not sure if you can call that a win.

"Win" by what standards? If I think it is ontologically and factually incorrect - an intellectual mistake - to identify with your copies, then those who do aren't winning, any more than individual lemmings win when they dive off a cliff. If I am happy to regard a person's attitude to their copies as a matter of choice, then I may regard their choices as correct for them and my choices as correct or me.

Robin Hanson predicts a Malthusian galactic destiny, in which the posthuman intelligences of the far future are all poorer than human individuals of the present, because selection will favor value systems which are pro-replication. His readers often freak out over Robin's apparent approval of this scenario of crowded galactic poverty; he approves because he says that these far-future beings will be emotionally adapted to their world; they will want things to be that way.

So this is a similar story. I am under no obligation to adopt an expansive personal identity theory, even if that is a theory whose spread is favored by the conditions of uploaded life. That is merely a statement about how a particular philosophical meme prospers under new conditions, and about the implications of that for posthuman demographics; it is not a fact which would compel me to support the new regime out of self-interest, precisely because I do not already regard my copies as me, and I therefore do not regard their winnings as mine.

Winning by the standard that a person who thinks gaining $1k is worth creating ten doomed copies of themselves will, in this situation, get ahead by $1k.

The thing is, I'm genuinely not sure if it matters. To restate what you're doing another way, "If I make a copy of you every night and suspend it until morning, and also there's a you that gets tortured but it never causally affects anything else" - I think if you're vulnerable to coercion via that, you'd also have to be vulnerable to "a thousand tortured copies in a box" style arguments.

You may have missed the long addition I just made to my comment, which avoids the torture issue... however, being vulnerable to "a thousand tortured copies in a box" is not necessarily a bad thing! Just because viewing outcome A as bad renders you vulnerable to blackmail by the threat of A, doesn't automatically mean that you should change your attitude to A. Otherwise, why not just accept death and the natural lifespan, rather than bother with expensive attempts to live, like cryonics? If you care about dying, you end up spending all this time and energy trying to stay alive, when you could just be enjoying life; so why not change your value system and save yourself the trouble of unnatural life extension... I hope you see the analogy.

I can't say I do. Death doesn't care what I think. Other actors may care how you perceive things. Ironically, if you want to minimize torture for coercion, it may be most effective to ignore it. Like not negotiating with terrorists.

On one hand you're saying it's good to identify with your copies, because then you can play iterated russian roulette and win. On the other hand, you're saying it's bad to identify with your copies, to the extent of caring whether someone tortures them. Presumably you don't want to be tortured, and your copies don't want to be tortured, and your copies are you, but you don't care whether they are tortured... congratulations, I think you've invented strategic identity hypocrisy for uploads!

I think the issue of causal interpolation comes up. From where I'm standing right now, the tortured copies never become important in my future; what I'm doing with the boxes is sort of smoothing out the becoming-important-ness, so that even if I turn out to be a losing copy, I will identify with the winning copy, since they're what dominates the future. Call it mangled-priorities. You could effectively threaten me by releasing the tortured copies into my future-coexistence, at which point it might be the most practical solution for my tortured copies to choose suicide, since they wouldn't want their broken existence to dominate set-of-copies-that-are-me-and-causally-interacting's future. How the situation would evolve if the tortured copies never interacted again - I don't know. I'd need to ask a superintelligence what ought to determine anticipation of subjective existence.

[edit] Honestly, what I'm really doing is trying to precommit to the stance that maximizes my future effectiveness.

Nah, I care about the copies that can interact with me in the future.

[edit] No that doesn't work. Rethinking.

[This comment is no longer endorsed by its author]

If a tree falls in the forest, and no one is around, does it make a sound?

But someone was around to see it happen - everyone in the destroyed civilization.

Your argument definitely does not apply in general; I would not consent to walking through a box in which I am tortured and then my memory is erased in order to get $1000 on the other side.

I'm not sure if I would.

It's the same argument though.

How much money would I have to pay you for you to let me rape you in a way that causes no physical trauma, after dosing you with a drug that prevents you from remembering the hour before or after it?

Would that dollar amount change if I told you I had already given you the drug?

The problem I see is your treatment of this arrangement as a "black box" of you[entering] and you[exiting]. But this is illegitimate. Over the ten rounds there were ten instances of you[copy-that-dies] who were also you[entering].

Have you read EY's writings on timeless decision theory? It seems to me that this is a variation.

Your argument definitely does not apply in general; I would not consent to walking through a box in which I am tortured and then my memory is erased in order to get $1000 on the other side.

I would need more details about the erasure process: what are the side effects?

I'm not sure exactly what details you want, but there are no side effects; you are left with approximately the same memories as someone who walked through a box that does not torture people. The point of including the memory erasure in the thought experiment was to prevent any harm from occurring after you left the box.

I feel like you knew this and are asking about something else, but I am not sure what. Maybe something identity-related?

If "you" refers both to the person who dies and the person who lives, you will both die and live with 100% probability. However, if we create the clone with a few extra atoms of indium, the person with fewer atoms of indium will survive with 50% probability.

So your claims about probabilities are just window dressing on the bare assertion that both the person who lives and the person who dies should be referred to as "you."

Yes, 100% is my expectation for both outcomes as well. Otherwise it wouldn't be a fork.

One easy way to demonstrate this is to chain ten boxes and put a thousand dollars at the end. The person who anticipates dying with 50% probability (so, over all the boxes, a 1/1024 chance of surviving) would stay well outside. The person who anticipates surviving just walks through and comes away $1000 richer. "But at least my anticipation was correct", in this scenario, reminds me somewhat of the cries of "but at least my reasoning was correct" on the part of two-boxers.

If the person who anticipates dying (with 1023/1024 probability) actually does die (1023/1024 of the time), then they actually gained something by not walking through the box, namely not dying! There's no analogy to two-boxing, where the supposed benefit is simply "having had correct reasoning".
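
To make this concrete, here is a minimal expected-utility sketch; the death_cost figure is an assumed placeholder, not anything from the thread, while the 1/1024 and $1000 come from the scenario:

```python
# Expected utility of entering the ten-box chain, from the viewpoint of someone
# who anticipates a 1023/1024 chance of dying, versus staying outside.
p_survive = 1 / 1024
prize = 1000.0            # dollars, treated directly as utility for simplicity
death_cost = 1_000_000.0  # assumed disutility of dying; purely illustrative

eu_enter = p_survive * prize - (1 - p_survive) * death_cost
eu_stay = 0.0

print(f"EU(enter) = {eu_enter:.2f}, EU(stay) = {eu_stay:.2f}")
# For any death_cost above about $0.98, staying outside wins on this view --
# "not dying" is a real gain, unlike the two-boxer's consolation prize of
# merely having reasoned correctly.
```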

I'd walk through the roulette box (sounds like fun!) but not the torture box.

I think a way to see yourself through this is to look at the mutual information and the distinct information between copies. Because the copies are identically prepared in a featureless box and one is quickly killed, there is little opportunity for divergence. The death does not result in an overall loss of information, because of the mutual information between the copies and the lack of distinct information.

A scenario where the death is more drawn out is less acceptable. A scenario where the copies live for an extended period and then one dies abruptly is similarly unacceptable, due to the gain of distinct information.

As a concrete fictional example: in one arc (arguably several) of Sluggy Freelance, characters were scanned by nanoprobes. They then proceeded about their lives, having major personal revelations. Then some of them died and were restored to the scan point. The deaths of these characters were sad.
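
One toy way to cash out "distinct information" here, where the four-state memory model and all the numbers are assumptions made up for illustration:

```python
import math

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy model: a "memory state" is one of four equally likely values.
states = [0.25, 0.25, 0.25, 0.25]
H_X = entropy(states)

# Fresh copies: the copy's state Y equals the original's state X, so knowing X
# leaves nothing more to learn about Y -- deleting Y loses no distinct information.
distinct_info_fresh = 0.0                 # H(Y|X) when Y == X

# Long-diverged copies: model Y as independent of X, so all of Y's information
# is distinct and deleting Y really does destroy something.
distinct_info_diverged = entropy(states)  # H(Y|X) when Y is independent of X

print(f"H(X) = {H_X} bits")
print(f"Distinct information in a fresh copy:    {distinct_info_fresh} bits")
print(f"Distinct information in a diverged copy: {distinct_info_diverged} bits")
```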

What I'm wondering is: is there a general rule underlying this about the follies of allowing causally-indistinguishable-in-retrospect effects to differently affect our anticipation? Can somebody formalize this?

I've seen similar stuff with people doing the Monty Hall problem. In some cases, not everyone agrees on the answer - like the Sleeping Beauty problem.

Other thoughts:

I think Cromwell's rule (0 and 1 are not probabilities) is relevant here. While it may be unlikely that both copies would die if two copies of you fought, they are 'evenly matched', and if we do this too many times serially (with the winner of each 1v1 round going on to the next box to do it again), there is a nonzero chance that no one walks out on the other side. On the other hand, if 1023 identical copies are made of you and then everyone fights it out, it still sounds like a riskier procedure than keeping a 'just in case' copy and, if you must, having it fight the victor. (Or the backup could be the original, if you think that sort of thing's important. Just because someone can make something that seems like an exact copy of you doesn't mean that it is. This 'duplicate and fight yourself for a prize' sounds like a great way to alter, say, the genetic makeup of the world's population without people noticing, because they never experience the procedure - they just appear somewhere they'd never been.)
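
A small sketch of the Cromwell's-rule point, where the per-round chance eps of a no-survivor outcome is a made-up number:

```python
# If each 1v1 round carries some tiny probability eps of leaving no survivor,
# then over n serial rounds the chance that nobody walks out is nonzero.
eps = 1e-6  # assumed per-round probability that both copies die
n = 10      # number of chained boxes

p_no_survivor = 1 - (1 - eps) ** n
print(f"P(no one walks out after {n} rounds) = {p_no_survivor:.2e}")
# About 1e-5 here: small, but not zero.
```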

That being said, I've always wondered why people always want their duplicates dead. It's not exactly immortality, but it might improve your odds... and since I was already wondering what you do with all the dead bodies from the Duplicate Games, preserving them sounds like an idea (though they did get murdered by themselves, which might be traumatic), although the resources involved might make that prohibitively expensive. (Would you play these games with the knowledge that the people running the game would keep the bodies? Use them as fertilizer? Reuse the atoms and molecules because humans are made out of similar compounds? I'm slightly wary of someone else getting even a damaged exact copy of my brain, let alone 1023.)

Before entering, should you accurately anticipate dying with 50% probability?

First, you discard the word that is causing all the confusion. When you are in a situation where:

  • "I want to live!" and
  • "I don't want to die!"

end up with such entirely different meanings, any mental shortcuts for preference evaluation built around the word 'dead' become obsolete. Similar reasoning applies when you are wearing an Amulet of Life Saving or your friend can cast Resurrection on you. Death kind of loses its sting.

Just look at the expected future outcomes in the play scenario and the outcome in the not-play scenario and decide which one is more desirable. From what I understand, the play scenario is one in which a whole bunch of 'you's lose Russian roulette and one 'you' wins Russian roulette lots of times and gets money. Given the preferences you had us assume in the introduction, the uploads would clearly want to play. More generally, it may depend on whether the 'you's experience anything when 'losing', or whether you have some specific considered negative preference for the event of a you being killed that is completely independent of both whether it results in there being less you left in the world and of any negative experiences that go with it.