A near-final version of my Anthropic Decision Theory paper is available on the arXiv. Since anthropics problems have been discussed quite a bit on this list, I'll be presenting its arguments and results in this and subsequent posts (1, 2, 3, 4, 5, 6).
Many thanks to Nick Bostrom, Wei Dai, Anders Sandberg, Katja Grace, Carl Shulman, Toby Ord, Anna Salamon, Owen Cotton-Barratt, and Eliezer Yudkowsky.
The Sleeping Beauty problem, and the incubator variant
The Sleeping Beauty problem is a central problem in anthropics, and my paper establishes anthropic decision theory (ADT) through a careful analysis of it. So we should start with an explanation of what it is.
In the standard setup, Sleeping Beauty is put to sleep on Sunday, and awoken again Monday morning, without being told what day it is. She is put to sleep again at the end of the day. A fair coin was tossed before the experiment began. If that coin showed heads, she is never reawakened. If the coin showed tails, she is fed a one-day amnesia potion (so that she does not remember being awake on Monday) and is reawakened on Tuesday, again without being told what day it is. At the end of Tuesday, she is put to sleep for ever. This is illustrated in the next figure:
The incubator variant of the problem, due to Nick Bostrom, has no initial Sleeping Beauty; instead, one or two copies of her are created (in different but identical rooms), depending on the result of the coin flip. The name 'incubator' derives from the machine that does the birthing of these observers. This is illustrated in the next figure:
The question then is what probability a recently awoken or created Sleeping Beauty should give to the coin falling heads or tails and it being Monday or Tuesday when she is awakened (or whether she is in Room 1 or 2).
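To see why the answer is contested, here is a minimal Python sketch (my own illustration, assuming only the setup above): it runs the experiment many times and prints two frequencies that could each be called "the probability of heads": the fraction of coin flips that land heads, and the fraction of awakenings that follow a heads flip.

```python
# Illustrative sketch (not from the paper): run the standard Sleeping Beauty
# setup many times and report two different frequencies of "heads".
import random

def simulate(n_runs=100_000, seed=0):
    rng = random.Random(seed)
    heads_runs = 0          # experiments in which the coin landed heads
    heads_awakenings = 0    # awakenings that happen after a heads flip
    total_awakenings = 0    # all awakenings across all experiments
    for _ in range(n_runs):
        heads = rng.random() < 0.5          # fair coin, tossed before the experiment
        awakenings = 1 if heads else 2      # Monday only, or Monday and Tuesday
        total_awakenings += awakenings
        if heads:
            heads_runs += 1
            heads_awakenings += awakenings
    print("heads, counted per experiment:", heads_runs / n_runs)                  # about 1/2
    print("heads, counted per awakening: ", heads_awakenings / total_awakenings)  # about 1/3

simulate()
```

Both numbers are easy to compute; the dispute is over which of them (if either) a freshly awakened Beauty should treat as her credence in heads.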
Selfishness, selflessness and altruism
I will be using these terms in precise ways in ADT, somewhat differently from how they are usually used. A selfish agent is one whose preferences are only about their own personal welfare; a pure hedonist would be a good example. A selfless agent, on the other hand, is one that cares only about the state of the world, not about their own personal welfare (or anyone else's). They might not be nice (patriots are, arguably, selfless), but they do not care about their own welfare as a terminal goal.
Altruistic agents, by contrast, care about the welfare of everyone, not just themselves. They can be divided into total utilitarians and average utilitarians (there are other altruistic motivations, but they aren't relevant to the paper). In summary:
Selfish | "Give me that chocolate bar" |
---|---|
Selfless | "Save the rainforests" |
Average Utilitarian | "We must increase per capita GDP" |
Total Utilitarian | "Every happy child is a gift to the world" |
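To make these definitions concrete, here is a small sketch of how the four preference types could be written as utility functions; the data structure (a `world` with per-person welfare levels and a non-welfare feature) and the names are hypothetical conveniences, not anything from the paper.

```python
# Illustrative sketch: the four preference types as utility functions over a
# hypothetical world record. `world["welfares"]` lists each person's welfare,
# `world["rainforest_area"]` stands in for some non-welfare feature of the
# world, and `me` is the index of the agent whose utility we are computing.

def selfish(world, me):
    return world["welfares"][me]              # only my own welfare counts

def selfless(world, me):
    return world["rainforest_area"]           # a feature of the world, nobody's welfare

def average_utilitarian(world, me):
    welfares = world["welfares"]
    return sum(welfares) / len(welfares)      # per capita welfare

def total_utilitarian(world, me):
    return sum(world["welfares"])             # summed welfare over everyone

# Example: a world with two people, each at welfare 3, and some rainforest left.
example = {"welfares": [3, 3], "rainforest_area": 100}
assert average_utilitarian(example, 0) == 3 and total_utilitarian(example, 0) == 6
```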
Yes.
But notice that you need all three elements (utility function, probabilities, and impact of the decision) in order to figure out the decision. So if you observe only the decision, you can't get at any of the three directly.
With some assumptions and a lot of observation, you can disentangle the utility function from the other two, but in anthropic situations, you can't generally disentangle the anthropic probabilities from the impact of the decision.
Given only the decisions, you can't disentangle the probabilities from the utility function anyway. You'd have to do something like ask the agent directly about its utility or its probabilities, or calculate one from first principles, to recover the other. So I don't feel the situation is qualitatively different: if everything but the probabilities can be seen as a fixed property of the agent, then the agent has some properties, and for each outcome it assigns some probabilities.
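As a worked illustration of that point (my own numbers, not anything from the paper): suppose each awakened copy of Beauty is offered a bet that pays $1 if the coin was tails and loses $x if it was heads. An agent who gives heads probability 1/3 and values each awakening's bet on its own, and an agent who gives heads probability 1/2 but notes that under tails the linked bet gets taken at two awakenings, accept exactly the same bets, so their decisions alone cannot tell the two apart.

```python
# Illustrative sketch: different probabilities plus different accounting of the
# decision's impact can yield identical decisions.

def thirder_accepts(x):
    # P(heads) = 1/3, each awakening's bet valued separately.
    return (2/3) * 1 + (1/3) * (-x) > 0

def halfer_accepts(x):
    # P(heads) = 1/2, but under tails the linked bet is taken at two
    # awakenings, so its impact is doubled.
    return (1/2) * (2 * 1) + (1/2) * (-x) > 0

for x in [0.5, 1.0, 1.5, 1.9, 2.1, 3.0]:
    assert thirder_accepts(x) == halfer_accepts(x)
print("Both agents accept the bet exactly when x < 2.")
```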