Anthropic Decision Theory VI: Applying ADT to common anthropic problems

Stuart_Armstrong

A near-final version of my Anthropic Decision Theory paper is available on the arXiv. Since anthropics problems have been discussed quite a bit on this list, I'll be presenting its arguments and results in this and previous posts 1 2 3 4 5 6.

Having presented ADT previously, I'll round off this mini-sequence by showing how it behaves with common anthropic problems, such as the Presumptuous Philosopher, Adam and Eve problem, and the Doomsday argument.

The Presumptuous Philosopher

The Presumptuous Philosopher was introduced by Nick Bostrom as a way of pointing out the absurdities in SIA. In the setup, the universe either has a trillion observers, or a trillion trillion trillion observers, and physics is indifferent as to which one is correct. Some physicists are preparing to do an experiment to determine the correct universe, until a presumptuous philosopher runs up to them, claiming that his SIA probability makes the larger one nearly certainly the correct one. In fact, he will accept bets at a trillion trillion to one odds that he is in the larger universe, repeatedly defying even strong experimental evidence with his SIA probability correction.

What does ADT have to say about this problem? Implicitly, when the problem is discussed, the philosopher is understood to be selfish towards any putative other copies of himself (similarly, Sleeping Beauty is often implicitly assumed to be selfless, which may explain the divergence of intuitions that people have on the two problems). Are there necessarily other similar copies? Well, in order to use SIA, the philosopher must believe that there is nothing blocking the creation of presumptuous philosophers in the larger universe; for if there were, the odds would shift away from the larger universe (in the extreme case when only one presumptuous philosopher is allowed in any universe, SIA finds them equi-probable). So the expected number of presumptuous philosophers in the larger universe is a trillion trillion times greater than the expected number in the small universe.

Now if the philosopher is indeed selfish towards his copies, then ADT reduces to SSA-type behaviour: the philosopher will correctly deduce that in the larger universe, the other trillion trillion philosophers or so will have their decisions linked with his. However, he doesn’t care about them: any benefits that accrue to them are not his concern, and so if he correctly guesses that he resides in the larger universe, he will accrue a single benefit. Hence there will be no paradox: he will bet at 1:1 odds of residing in either the larger or the smaller universe.

If the philosopher is an altruistic total utilitarian, on the other hand, he will accept bets at odds of a trillion trillion to one of residing in the larger universe. But this is no longer counter-intuitive (or at least, no more counter-intuitive than maximising expected utility with very small probabilities): the other presumptuous philosophers will make the same bet, so in the larger universe, their total profit and loss will be multiplied by a trillion trillion. And since the philosopher is altruistic, the impact on his own utility is multiplied by a trillion trillion in the large universe, making his bets rational.
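The two cases above can be checked with a small calculation. The following is only a sketch under stated assumptions: the bet is modelled as a coupon that pays 1 in the larger universe and is bought at price x, the two universes are equiprobable, and there is one philosopher in the small universe for every trillion trillion in the large one. The function names are hypothetical and are not taken from the paper.

```python
from fractions import Fraction

def break_even_price(n_small, n_large, altruistic):
    """Price x at which paying x for a coupon worth 1 in the large
    universe has zero expected utility, with equiprobable universes.

    A selfish agent only counts his own gain or loss in each universe.
    An altruistic total utilitarian counts the gains and losses of
    every linked copy, so each universe is weighted by its copy count.
    """
    w_small = n_small if altruistic else 1
    w_large = n_large if altruistic else 1
    # Solve (1/2) * w_large * (1 - x) - (1/2) * w_small * x = 0 for x.
    return Fraction(w_large, w_small + w_large)

def odds_in_favour(price):
    """Convert a break-even price into betting odds for the large universe."""
    return price / (1 - price)

RATIO = 10**24  # a trillion trillion

selfish = break_even_price(1, RATIO, altruistic=False)
altruist = break_even_price(1, RATIO, altruistic=True)
print(odds_in_favour(selfish))   # odds the selfish philosopher accepts
print(odds_in_favour(altruist))  # odds the total utilitarian accepts
```

The probabilities are the same in both calls; only the weight given to the linked copies changes, which is the sense in which decisions rather than probabilities carry the result.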

At this point, it might be fair to ask what would happen if some of the philosophers were altruistic while others were selfish. How would the two interact; would the selfless philosopher be incorrectly believing his own decision was somehow ‘benefiting’ the selfish ones? Not at all. The decisions of the selfless and selfish philosophers are not linked: they both use ADT, but because they have very different utilities, they cannot prove that their decisions are linked. Which is fortunate, because they aren’t.

Adam and Eve

The presumptuous philosopher thought experiment was designed to show up problems with SIA reasoning. Another thought experiment by the same author was designed to show up problems with SSA reasoning.

In this thought experiment, Adam and Eve, the first two humans, have the opportunity to breed or not to breed. If they breed, they will produce trillions of trillions of descendants. Under SSA odds, the probability of being Adam or Eve in a universe with trillions of trillions of humans is tiny, while the corresponding probability in a universe with just two observers is one. Therefore Adam and Eve should conclude that whatever they do, it is tremendously unlikely that they will succeed in breeding. Nick Bostrom then proceeds to draw many amusing consequences from this ‘future affecting’ paradox, such as the couple forming the firm intention of having sex (and hence risking pregnancy) if an edible animal doesn’t wander through the entrance of their cave in the next few minutes. There seems to be something very wrong with the reasoning, but it is a quite natural consequence of SSA.

What does ADT have to say on the problem? This depends on whether the decisions of Adam and Eve and their (potential) descendants are linked. There is prima facie no reason for this to be the case; in any case, how could potential future descendants make the decision as to whether Adam and Eve should breed? One way of imagining this is if each human is born fully rational, aware of the possible world they are born into, but in ignorance as to their identity and position in that world. Call this the ignorant rational baby stage. They can then make a decision as to what they would do, conditional upon discovering their identity. They may then decide whether or not to stick to that decision. Hence we can distinguish several scenarios:

1. These agents have no ‘ignorant rational baby stage’, and do not take it into account.
2. These agents have an ‘ignorant rational baby stage’, but do not allow precommitments.
3. These agents have an ‘ignorant rational baby stage’, but do allow precommitments.
4. These agents have no ‘ignorant rational baby stage’, but require themselves to follow the hypothetical precommitments they would have made had they had such a stage.

To make this into a decision problem, assume all agents are selfish, and know they will be confronted by a choice between a coupon C₁ that pays out £1 to Adam and Eve if they have no descendants and C₂ that pays out £1 to Adam and Eve if they have (trillions of trillions of) descendants. Assume Adam and Eve will have sex exactly once, and the (objective) chance of them having a successful pregnancy is 50%. Now each agent must decide on the relative values of the two coupons.

Obviously in situation 1, the decisions of Adam and Eve and their descendants are not linked, and ADT means that Adam and Eve will value C₁ compared with C₂ according to their objective estimates of having descendants, i.e. they will value them equally. There is no SSA-like paradox here. Their eventual descendants will also value the two coupons equally, as worthless, since they will never derive any value from them.

Now assume there is an ‘ignorant rational baby stage’. During this stage, the decisions of all agents are linked, as they have the same information, the same (selfish) preferences, and they all know this. Each rational baby can then reason:

“If I am in the world with no descendants, then I am Adam or Eve, and the C₁ coupon is worth £1 to me (and C₂ is worthless). If, on the other hand, I am in the world with trillions of trillions of descendants, there are only two chances in 2+10²⁴ of me being Adam or Eve, so I value the C₂ coupon at £2/(2+10²⁴) (and C₁ is worthless). These worlds are equiprobable. So I would value C₁ as being 1+0.5×10²⁴ times more valuable than C₂.”
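The baby's arithmetic can be reproduced exactly with rational numbers. This is a minimal sketch, assuming (as the quoted reasoning does) 10²⁴ descendants and equiprobable worlds; the function name is hypothetical.

```python
from fractions import Fraction

N_DESCENDANTS = 10**24  # "trillions of trillions", as in the quoted reasoning

def baby_coupon_values(n_descendants):
    """Value of each coupon to a selfish rational baby, conditional on
    the world in which that coupon pays out."""
    # In the world with no descendants, the baby is certainly Adam or Eve.
    c1 = Fraction(1)
    # In the populated world, only 2 of the 2 + n agents are Adam or Eve.
    c2 = Fraction(2, 2 + n_descendants)
    return c1, c2

c1, c2 = baby_coupon_values(N_DESCENDANTS)
# The worlds are equiprobable, so the factor of 1/2 cancels in the ratio.
ratio = c1 / c2
print(ratio == 1 + Fraction(N_DESCENDANTS, 2))
```

Using `Fraction` rather than floats keeps the "1 +" term visible; in floating point it would be lost next to 0.5×10²⁴.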

So the rational babies in situations 2 and 3 would take C₁ as much more valuable than C₂, if the deal was proposed to them immediately. Since there are no precommitments in situation 2, once the rational babies discover who they actually are, they would revert to situation 1 and take them as equally valuable. If precommitments are allowed, then the rational babies would further reason:

“The strategy ‘upon discovering I am Adam or Eve, take C₁’ nets me an expected £1/2, while the strategy ‘upon discovering I am Adam or Eve, take C₂’ nets me an expected £1/(2+10²⁴), because it is very unlikely that I would actually discover that I am Adam or Eve. Hence the strategy ‘upon discovering I am Adam or Eve, accept trades between C₁ and C₂ at 2:(2+10²⁴) ratios’ is neutral in expected utility, and so I will now precommit to accepting any trade at any ratios slightly better than this.”
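The expected values of the two strategies, and the neutral trading ratio, follow from the same numbers. Again this is only a sketch: the 50% pregnancy chance and the 10²⁴ descendants come from the setup above, while the function and variable names are hypothetical.

```python
from fractions import Fraction

def strategy_values(n_descendants, p_descendants=Fraction(1, 2)):
    """Expected payoff, evaluated at the ignorant rational baby stage,
    of the precommitments 'take C1' and 'take C2' upon discovering
    that one is Adam or Eve."""
    # Chance of turning out to be Adam or Eve in the populated world.
    p_adam_or_eve = Fraction(2, 2 + n_descendants)
    # C1 pays 1 in the childless world, where the baby is surely Adam or Eve.
    ev_c1 = (1 - p_descendants) * 1
    # C2 pays 1 in the populated world, but only to Adam and Eve.
    ev_c2 = p_descendants * p_adam_or_eve
    return ev_c1, ev_c2

ev_c1, ev_c2 = strategy_values(10**24)
# Relative value of C2 to C1 at which the baby is indifferent.
neutral_ratio = ev_c2 / ev_c1
print(ev_c1, neutral_ratio == Fraction(2, 2 + 10**24))
```

Varying `p_descendants` shows how the precommitted ratio would shift if the objective chance of pregnancy were not 50%.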

So in situation 3, even after discovering that they are Adam or Eve, they will continue to accept deals at ratios that would seem to imply that they believe in the SSA odds, i.e. that they are nearly certain to not have descendants. But it is a lot less of a paradox now; it simply arises because there was a time when they were uncertain as to what their actual position was, and the effects of this uncertainty were ‘locked in’ by their precommitment.

Situation 4 is very interesting. By construction, it reproduces the seemingly paradoxical behaviour, but here there was never a rational baby stage where the behaviour made sense. Why would any agent follow such a behaviour? Well, mainly because it allows trade between agents who might not otherwise be able to agree on a ‘fair’ distribution of goods. If all agents agree to the division that they would have wanted had they been ignorant of their identity (a Rawlsian ‘veil of ignorance’ situation), then they can trade between each other without threats or bargaining in these simple cases.

If the agents are simply altruistic average utilitarians, then Adam and Eve would accept SSA odds in all four situations; things that benefit them specifically are weighted more highly in a universe with few people. So the varying behaviour above is a feature of selfishness, not of SSA-type behaviour, and it seems precommitments become very important in the selfish case. This certainly merits further study. Temporal consistency is a virtue, but does it extend to situations like this, where the agent makes binding decisions before knowing their identity? Certainly if the agent had to make a decision immediately, and if there were anyone around to profit from temporal inconsistencies, the agent should remain consistent, which means following precommitments. However, it is not entirely obvious that this should still be the case if there were no-one to exploit the inconsistency.

This is not, incidentally, a problem only of ADT - SIA has similar problems under the creation of ‘irrelevant’ selfless agents who don’t yet know who they are, while SSA has problems under the creation of agents who don’t yet know what reference class they are in.

The Doomsday argument

Closely related to the Adam and Eve paradox, though discovered first, is the Doomsday argument. Based on SSA’s preference for ‘smaller’ universes, it implies that there is a high probability of the human race becoming extinct within a few generations - at the very least, a much higher probability than objective factors would imply.

Under SIA, the argument goes away, so it would seem that ADT must behave oddly: depending on the selfishness and selflessness of the agents, they would give different probabilities to the extinction of the human race. This is not the case, however. Recall that under ADT, decisions matter, not probabilities. And agents that are selfish or average utilitarians would not be directly concerned with the extinction of the human race, so would not act in bets to prevent this.

This is not a specious point - there are ways of constructing the doomsday argument in ADT, but they all rely on odd agents who are selfish with respect to their own generation but selfless with respect to the future survival of the human race. This lacks the potency of the original formulation: having somewhat odd agents behaving in a somewhat odd fashion is not very surprising. For the moment, until a better version is produced, we should simply say that the doomsday argument is not present in ADT.

Sleeping Anti-Beauty

Sleeping Anti-Beauty is a thought experiment similar to the Sleeping Beauty experiment, but with one important caveat: the two copies in the tails world hate each other. This works best if the two copies are duplicates, rather than the same person at different times. One could imagine, for instance, that a week after the experiment, all copies of Sleeping Beauty are awakened and made to fight to the death - maybe they live in a civilization that prohibits more than one copy of an agent from existing. The single copy in the heads world will be left unmolested, as she has nobody to fight.

That means that the Sleeping Beauties in the tails world are in a zero-sum game; any gain for one is a loss for the other, and vice-versa. Actually, we need a few more assumptions for this to be true: the Sleeping Beauties have to be entirely selfish apart from their rivalry, and they do not get offered any goods for immediate consumption. Given all these assumptions, what does ADT have to say about their decisions?

As usual, each existent copy of Sleeping Beauty is offered a coupon that pays out £1 if the coin fell tails, and is asked how much she would be willing to pay for it. ADT reasoning proceeds for each agent as follows:

“In the heads world, if I pay £x, all that happens is that I lose £x. In the tails world, if I pay £x, I gain £(1-x). However my other hated copy will make the same decision, and also gain £(1-x). This causes me the same amount of loss as the gain of £(1-x) does, so I gain nothing at all in the tails world, whatever £x is. So I value the coupon precisely at zero: I would not pay any amount to get it.”

In this, and other similar decisions, Sleeping Beauty would act as if she had an absolute certainty of being in the heads world, offering infinity to one odds of this being the case, as she cannot realise any gains - or losses! - in the tails world.

It should be noted that selfish Sleeping Beauty can be correctly modelled by seeing it as a 50-50 mix of selfless Sleeping Beauty and Sleeping Anti-Beauty.
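Both the zero valuation and the 50-50 mix can be illustrated with one parameter. The sketch below makes an assumption not spelled out in this post: each agent's utility in the tails world is her own gain plus a weight w times her copy's gain, with w = +1 for the selfless Beauty, w = 0 for the selfish one and w = -1 for Anti-Beauty. The function names are hypothetical.

```python
from fractions import Fraction

def expected_utility(x, w):
    """Expected utility of paying x for a coupon worth 1 if tails.

    w is the weight the agent places on her copy's gain:
    +1 selfless, 0 selfish, -1 Sleeping Anti-Beauty.
    """
    heads = -x              # single copy, and the coupon pays nothing
    own_gain = 1 - x        # tails: her own net gain
    copy_gain = 1 - x       # tails: the linked copy makes the same trade
    tails = own_gain + w * copy_gain
    return Fraction(1, 2) * heads + Fraction(1, 2) * tails

def break_even_price(w):
    """Solve -x + (1 + w) * (1 - x) = 0 for x."""
    return Fraction(1 + w, 2 + w)

for name, w in [("selfless", 1), ("selfish", 0), ("anti", -1)]:
    print(name, break_even_price(w))

# Averaging the selfless and anti utilities recovers the selfish one.
x = Fraction(1, 3)
mix = (expected_utility(x, 1) + expected_utility(x, -1)) / 2
print(mix == expected_utility(x, 0))
```

Under this modelling the Anti-Beauty utility is -x/2 for every price, so no positive price is ever worth paying, and the mix identity holds for every x because the utility is linear in w.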

turchin

Do you mean that the DA does not matter, because we still have to invest in x-risk prevention if we are selfless, or go partying if we are selfish? But I could suggest a class of DA with doom very soon, within 20 years. If it is true, do I have to invest more in preventing it, as it would mean my death? (This version of the DA is the one where the only intelligent observers from which I am randomly chosen are the observers who know the DA. This class appeared only in 1983 and will probably end in the 21st century.)

[This comment is no longer endorsed by its author]

moved to more recent post