Updateless anthropics

8 Stuart_Armstrong 20 February 2011 07:23PM

Three weeks ago, I set out to find a new theory of anthropics, to try and set decision theory on a firm footing with respect to copying, deleting copies, merging them, correlated decisions, and the presence or absence of extra observers. I've since come full circle, and realised that UDT already has a built-in anthropic theory, that resolves a lot of the problems that had been confusing me.

The theory is simple, and is essentially a rephrasing of UDT: if you are facing a decision X, and trying to figure out the utility of X=a for some action a, then calculate the full expected utility of X being a, given the objective probabilities of each world (including those in which you don't exist).

As usual, you have to consider the consequences of X=a for all agents who will make the same decision as you, whether they be exact copies, enemies, simulations or similar-minded people. However, your utility will have to do more work than is usually realised: notions such as selfishness or altruism with respect to your copies have to be encoded in the utility function, and will result in substantially different behaviour.

The rest of the post is a series of cases-studies illustrating this theory. Utility is assumed to be linear in cash for convenience.

Sleeping with the Presumptuous Philosopher 

The first test case is the Sleeping Beauty problem.


In its simplest form, this involves a coin toss; if it comes out heads, one copy of Sleeping Beauty is created, and if it comes out tails, two copies are created. The copies are then asked at what odds they would be prepared to bet that the coin came out tails. You can assume either that the different copies care for each other in the manner I detailed here, or more simply that all winnings will be kept by a future merged copy (or an approved charity).

Then the algorithm is simple: the two worlds have equal probability. Let X be the decision where Sleeping Beauty chooses between a contract that pays out $1 if the coin is heads and one that pays out $1 if the coin is tails. If X="heads" (to use an obvious shorthand), then Sleeping Beauty will expect to make $1*0.5, as she is offered the contract once. If X="tails", then the total return of that decision is $1*2*0.5, as copies of her will be offered the contract twice, and they will all make the same decision. So Sleeping Beauty will follow the SIA 2:1 betting odds of tails over heads.
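The arithmetic above can be sketched in a few lines; this is an illustrative sketch of the calculation, not code from the post:

```python
# Updateless evaluation of Sleeping Beauty's bet.
# Two equiprobable worlds: heads (one copy, asked once) and tails (two
# copies, each asked). Winnings are pooled (copy-altruistic or merged-copy
# utility), so every copy's $1 win counts towards the shared total.

def expected_value(strategy):
    """Total expected payout of pre-committing to bet on `strategy`."""
    heads_payout = 1 if strategy == "heads" else 0  # one contract offered
    tails_payout = 2 if strategy == "tails" else 0  # two copies each win $1
    return 0.5 * heads_payout + 0.5 * tails_payout

print(expected_value("heads"))  # 0.5
print(expected_value("tails"))  # 1.0 -> take tails at up to 2:1 odds (SIA)
```

The 2:1 ratio between the two strategies is exactly the SIA betting odds.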

Variants such as "extreme Sleeping Beauty" (where thousands of copies are created on tails) will behave in the same way; if it feels counter-intuitive to bet at thousands-to-one odds that a fair coin landed tails, it's the fault of expected utility itself, as the rewards of being right dwarf the costs of being wrong.

But now let's turn to the Presumptuous Philosopher, a thought experiment that is often confused with Sleeping Beauty. Here we have exactly the same setup as "extreme Sleeping Beauty", but the agents (the Presumptuous Philosophers) are mutually selfish. Here the return to X="heads" remains $1*0.5. However, the return to X="tails" is also $1*0.5, since even if all the Presumptuous Philosophers in the "tails" universe bet on "tails", each one will still only get $1 in utility. So the Presumptuous Philosopher should only accept even SSA 1:1 betting odds on the result of the coin flip.

So SB acts as if she follows the self-indication assumption (SIA), while the PP follows the self-sampling assumption (SSA). This remains true if we change the setup so that only one agent is given a betting opportunity in the tails universe. Then the objective probability of any one agent being asked is low, so both SB and PP model the "objective probability" of the tails world, given that they have been asked to bet, as being low. However, SB gains utility if any of her copies is asked to bet and makes a profit, so the strategy "if I'm offered $1 for correctly guessing whether the coin is heads or tails, I will say tails" gets her $1*0.5 utility whether or not she is the specific copy who is asked. Betting heads nets her the same result, so SB will give SIA 1:1 odds in this case.

On the other hand, the PP will only gain utility in the very specific world where he himself is asked to bet. So his gain from the updateless "if I'm offered $1 if I guess correctly whether the coin is heads or tails, I will say tails" is tiny, as he's unlikely to be asked to bet. Hence he will offer the SSA odds that make heads a much more "likely" proposition.
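The SB/PP contrast in this modified setup can be sketched numerically. The number of tails-world agents below (`n_tails`) is an illustrative assumption matching the extreme version of the setup:

```python
# One randomly chosen agent per world is offered a $1 bet on the coin.
# n_tails agents exist in the tails world, one in the heads world.

def expected_utility(strategy, n_tails, pooled):
    """Expected utility of pre-committing to bet `strategy`."""
    if pooled:
        # SB: a win by *any* copy counts for her, so whichever way she
        # bets, exactly one world pays out $1 with probability 0.5.
        return 0.5
    # Selfish PP: a win only counts if he himself is the agent asked.
    if strategy == "heads":
        return 0.5 * 1.0              # sole agent in the heads world
    return 0.5 * (1.0 / n_tails)      # chance 1/n_tails of being asked

print(expected_utility("tails", 1000, pooled=True))   # 0.5 -> SB: 1:1 odds
print(expected_utility("heads", 1000, pooled=False))  # 0.5
print(expected_utility("tails", 1000, pooled=False))  # 0.0005 -> favours heads
```

For the selfish PP, the updateless value of betting tails shrinks with the number of copies, reproducing the SSA-style preference for heads.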

The Doomsday argument

Now, using SSA odds brings us back into the realm of the classical Doomsday argument. How is it that Sleeping Beauty is immune to the Doomsday argument while the Presumptuous Philosopher is not? Which one is right; is the world really about to end?

Asking about probabilities independently of decisions is meaningless here; instead, we can ask what would agents decide in particular cases. It's not surprising that agents will reach different decisions on such questions as, for instance, existential risk mitigation, if they have different preferences.

Let's take a very simplified model: there are two agents in the world, and one of them is approached at random and asked whether they would pay $Y to add a third agent. Each agent derives a (non-indexical) utility of $1 from the presence of this third agent, and nothing else happens in the world to increase or decrease anyone's utility.

First, let's assume that each agent is selfish about their indexical utility (their cash in hand). If the decision is to not add a third agent, all will get $0 utility. If the decision is to add a third agent, then there are three agents in the world, and one of them will be approached to lose $Y. Hence the expected utility is $(1-Y/3).

Now let us assume the agents are altruistic towards each other's indexical utilities. Then the expected utility of not adding a third agent is still $0. If the decision is to add a third agent, then there are three agents in the world, and one of them will be approached to lose $Y - but all will value that loss equally. Hence the expected utility is $(1-Y).
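A quick sketch of this arithmetic (the function name and the random-charging assumption follow the simplified model above):

```python
# Two agents; one of the three post-decision agents is charged $Y at random.
# Everyone gains $1 of non-indexical utility from the third agent existing.

def eu_add_third(Y, altruistic):
    """Expected utility of deciding to add the third agent."""
    gain = 1.0                  # value of the third agent's existence
    if altruistic:
        return gain - Y         # everyone feels the $Y loss in full
    return gain - Y / 3.0       # selfish: 1/3 chance the loss is yours

Y = 2.0
print(eu_add_third(Y, altruistic=False) > 0)  # True: selfish agents add
print(eu_add_third(Y, altruistic=True) > 0)   # False: altruists refuse
```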

So if $Y=$2, for instance, the "selfish" agents will add the third agent, and the "altruistic" ones will not. Generalising this to more complicated models of existential risk mitigation schemes, we would expect SB-type agents to behave differently from PP-types in most models. There is no sense in asking which one is "right" and which one gives the more accurate "probability of doom"; instead, ask yourself which better corresponds to your own utility model, and hence what your decision will be.

Psy-Kosh's non-anthropic problem

Cousin_it has a rephrasing of Psy-Kosh's non-anthropic problem to which updateless anthropics can be illustratively applied:

You are one of a group of 10 people who care about saving African kids. You will all be put in separate rooms, then I will flip a coin. If the coin comes up heads, a random one of you will be designated as the "decider". If it comes up tails, nine of you will be designated as "deciders". Next, I will tell everyone their status, without telling the status of others. Each decider will be asked to say "yea" or "nay". If the coin came up tails and all nine deciders say "yea", I donate $1000 to VillageReach. If the coin came up heads and the sole decider says "yea", I donate only $100. If all deciders say "nay", I donate $700 regardless of the result of the coin toss. If the deciders disagree, I don't donate anything.

We'll set aside the "deciders disagree" case and assume that you will all reach the same decision. The point of the problem was to illustrate a supposed preference inversion: if you coordinate ahead of time, you should all agree to say "nay", but after you have been told you're a decider, you should update in the direction of the coin coming up tails, and say "yea".

From the updateless perspective, however, there is no mystery here: the strategy "if I were a decider, I would say nay" maximises utility both for the deciders and the non-deciders.

But what if the problem were rephrased in a more selfish way, with the non-deciders not getting any utility from the setup (maybe they don't get to see the photos of the grateful saved African kids), while the deciders get the same utility as before? Then the strategy "if I were a decider, I would say yea" maximises your expected utility, because non-deciders get nothing, which mostly discounts the expected gains and losses in the world where the coin came up heads (where nine of the ten are non-deciders). This is similar to SIA odds, again.
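Both versions of the problem can be checked with a short calculation; this sketch weights each world's payout by the probability of being a decider there when utility is selfish:

```python
# Compare the committed strategies "yea" and "nay" when the donation is
# valued by all ten people (shared) versus only by the deciders (selfish).

def expected_utility(strategy, shared):
    def world(p_decider, yea_payout):
        payout = yea_payout if strategy == "yea" else 700
        weight = 1.0 if shared else p_decider  # selfish: deciders only
        return weight * payout
    # Heads: 1 decider in 10, yea pays $100; tails: 9 deciders, yea pays $1000.
    return 0.5 * world(0.1, 100) + 0.5 * world(0.9, 1000)

print(expected_utility("yea", shared=True))    # 550.0
print(expected_utility("nay", shared=True))    # 700.0 -> "nay" wins
print(expected_utility("yea", shared=False))   # ~455  -> "yea" wins
print(expected_utility("nay", shared=False))   # ~350
```

With shared utility, pre-committing to "nay" dominates; once only deciders benefit, the heads world is heavily discounted and "yea" comes out ahead.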

That second model is similar to the way I argued for SIA with agents getting created and destroyed. That post has been superseded by this one, which points out the flaw in the argument: roughly speaking, it did not consider setups like Psy-Kosh's original model. So once again, whether utility is broadly shared or not affects the outcome of the decision.

The Anthropic Trilemma

Eliezer's anthropic trilemma was an interesting puzzle involving probabilities, copying, and subjective anticipation. It inspired me to come up with a way of spreading utility across multiple copies which was essentially a Sleeping Beauty copy-altruistic model. The decision process going with it is then the same as the updateless decision process outlined here. Though initially it was phrased in terms of SIA probabilities and individual impact, the isomorphism between the two can be seen here.

Revisiting the Anthropic Trilemma II: axioms and assumptions

4 Stuart_Armstrong 16 February 2011 09:42AM

tl;dr: I present four axioms for anthropic reasoning under copying/deleting/merging, and show that these result in a unique way of doing it: averaging non-indexical utility across copies, adding indexical utility, and having all copies being mutually altruistic.

Some time ago, Eliezer constructed an anthropic trilemma, where standard theories of anthropic reasoning seemed to come into conflict with subjective anticipation. rwallace subsequently argued that subjective anticipation was not ontologically fundamental, so we should not expect it to work out of the narrow confines of everyday experience, and Wei illustrated some of the difficulties inherent in "copy-delete-merge" types of reasoning.

Wei also made the point that UDT shifts the difficulty in anthropic reasoning away from probability and onto the utility function, and ata argued that neither the probabilities nor the utility function are fundamental, that it was the decisions that resulted from them that were important - after all, if two theories give the same behaviour in all cases, what grounds do we have for distinguishing them? I then noted that this argument could be extended to subjective anticipation: instead of talking about feelings of subjective anticipation, we could replace it by questions such as "would I give up a chocolate bar now for one of my copies to have two in these circumstances?"

I then made a post where I applied my current intuitions to the anthropic trilemma, and showed how this results in complete nonsense, despite the fact that I used a bona fide utility function. What we need are some sensible criteria by which to divide utility and probability between copies, and this post is an attempt to figure that out. The approach is similar to that of expected utility theory, where a quartet of natural axioms forces all decision processes into a single format.

The assumptions are:

  1. No intrinsic value in the number of copies
  2. No preference reversals
  3. All copies make the same personal indexical decisions
  4. No special status for any copy


The I-Less Eye

30 rwallace 28 March 2010 06:13PM

or: How I Learned to Stop Worrying and Love the Anthropic Trilemma

Imagine you live in a future society where the law allows up to a hundred instances of a person to exist at any one time, but insists that your property belongs to the original you, not to the copies. (Does this sound illogical? I may ask my readers to believe in the potential existence of uploading technology, but I would not insult your intelligence by asking you to believe in the existence of a society where all the laws were logical.)

So you decide to create your full allowance of 99 copies, and a customer service representative explains how the procedure works: the first copy is made, and informed he is copy number one; then the second copy is made, and informed he is copy number two, etc. That sounds fine until you start thinking about it, whereupon the native hue of resolution is sicklied o'er with the pale cast of thought. The problem lies in your anticipated subjective experience.

After step one, you have a 50% chance of finding yourself the original; there is nothing controversial about this much. If you are the original, you have a 50% chance of finding yourself still so after step two, and so on. That means after step 99, your subjective probability of still being the original is 0.5^99, in other words as close to zero as makes no difference.

Assume you prefer existing as a dependent copy to not existing at all, but preferable still would be existing as the original (in the eyes of the law) and therefore still owning your estate. You might reasonably have hoped for a 1% chance of the subjectively best outcome. 0.5^99 sounds entirely unreasonable!
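The arithmetic behind that complaint, as a quick check:

```python
# Each of the 99 copying steps halves the naive "still the original" chance.
p_naive = 0.5 ** 99
print(p_naive < 1 / 100)  # True: about 1.6e-30, far below the hoped-for 1%
```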


The Moral Status of Independent Identical Copies

32 Wei_Dai 30 November 2009 11:41PM

Future technologies pose a number of challenges to moral philosophy. One that I think has been largely neglected is the status of independent identical copies. (By "independent identical copies" I mean copies of a mind that do not physically influence each other, but haven't diverged because they are deterministic and have the same algorithms and inputs.) To illustrate what I mean, consider the following thought experiment. Suppose Omega appears to you and says:

You and all other humans have been living in a simulation. There are 100 identical copies of the simulation distributed across the real universe, and I'm appearing to all of you simultaneously. The copies do not communicate with each other, but all started with the same deterministic code and data, and due to the extremely high reliability of the computing substrate they're running on, have kept in sync with each other and will with near certainty do so until the end of the universe. But now the organization that is responsible for maintaining the simulation servers has nearly run out of money. They're faced with 2 possible choices:

A. Shut down all but one copy of the simulation. That copy will be maintained until the universe ends, but the 99 other copies will instantly disintegrate into dust.
B. Enter into a fair gamble at 99:1 odds with their remaining money. If they win, they can use the winnings to keep all of the servers running. But if they lose, they have to shut down all copies.

According to that organization's ethical guidelines (a version of utilitarianism), they are indifferent between the two choices and were just going to pick one randomly. But I have interceded on your behalf, and am letting you make this choice instead.

Personally, I would not be indifferent between these choices. I would prefer A to B, and I guess that most people would do so as well.


The Anthropic Trilemma

24 Eliezer_Yudkowsky 27 September 2009 01:47AM

Speaking of problems I don't know how to solve, here's one that's been gnawing at me for years.

The operation of splitting a subjective worldline seems obvious enough - the skeptical initiate can consider the Ebborians, creatures whose brains come in flat sheets and who can symmetrically divide down their thickness.  The more sophisticated need merely consider a sentient computer program: stop, copy, paste, start, and what was one person has now continued on in two places.  If one of your future selves will see red, and one of your future selves will see green, then (it seems) you should anticipate seeing red or green when you wake up with 50% probability.  That is, it's a known fact that different versions of you will see red, or alternatively green, and you should weight the two anticipated possibilities equally.  (Consider what happens when you're flipping a quantum coin: half your measure will continue into either branch, and subjective probability will follow quantum measure for unknown reasons.)

But if I make two copies of the same computer program, is there twice as much experience, or only the same experience?  Does someone who runs redundantly on three processors, get three times as much weight as someone who runs on one processor?

Let's suppose that three copies get three times as much experience.  (If not, then, in a Big universe, large enough that at least one copy of anything exists somewhere, you run into the Boltzmann Brain problem.)

Just as computer programs or brains can split, they ought to be able to merge.  If we imagine a version of the Ebborian species that computes digitally, so that the brains remain synchronized so long as they go on getting the same sensory inputs, then we ought to be able to put two brains back together along the thickness, after dividing them.  In the case of computer programs, we should be able to perform an operation where we compare each two bits in the program, and if they are the same, copy them, and if they are different, delete the whole program.  (This seems to establish an equal causal dependency of the final program on the two original programs that went into it.  E.g., if you test the causal dependency via counterfactuals, then disturbing any bit of the two originals, results in the final program being completely different (namely deleted).)

So here's a simple algorithm for winning the lottery:


Real-Life Anthropic Weirdness

24 Eliezer_Yudkowsky 05 April 2009 10:26PM

In passing, I said:

From a statistical standpoint, lottery winners don't exist - you would never encounter one in your lifetime, if it weren't for the selective reporting.

And lo, CronoDAS said:

Well... one of my grandmothers' neighbors, whose son I played with as a child, did indeed win the lottery. (AFAIK, it was a relatively modest jackpot, but he did win!)

To which I replied:

Well, yes, some of the modest jackpots are statistically almost possible, in the sense that on a large enough web forum, someone else's grandmother's neighbor will have won it. Just not your own grandmother's neighbor.

Sorry about your statistical anomalatude, CronoDAS - it had to happen to someone, just not me.

There's a certain resemblance here - though not an actual analogy - to the strange position your friend ends up in, after you test the Quantum Theory of Immortality.
