Restatement of: If you don't know the name of the game, just tell me what I mean to you. Alternative to: Why you must maximize expected utility. Related to: Harsanyi's Social Aggregation Theorem.
Summary: This article presents a theorem, previously described by Stuart Armstrong, that tells you to maximize the expectation of a linear aggregation of your values. Unlike the von Neumann-Morgenstern theorem, this theorem gives you a reason to behave rationally.1
The von Neumann-Morgenstern theorem is great, but it is descriptive rather than prescriptive. It tells you that if you obey four axioms, then you are an optimizer. (Let us call an "optimizer" any agent that always chooses an action that maximizes the expected value of some function of outcomes.) But you are a human and you don't obey the axioms; the VNM theorem doesn't say anything about you.
There are Dutch-book theorems that give us reason to want to obey the four VNM axioms: E.g., if we violate the axiom of transitivity, then we can be money-pumped, and we don't want that; therefore we shouldn't want to violate the axiom of transitivity. The VNM theorem is somewhat helpful here: It tells us that the only way to obey the four axioms is to be an optimizer.2
So now you have a reason to become an optimizer. But there is an infinitude of decision-theoretic utility functions3 to adopt; which, if any, ought you adopt? And there is an even bigger problem: If you are not already an optimizer, then any utility function that you're considering will recommend actions that run counter to your preferences!
To give a silly example, suppose you'd rather be an astronaut when you grow up than a mermaid, and you'd rather be a dinosaur than an astronaut, and you'd rather be a mermaid than a dinosaur. You have circular preferences. There's a decision-theoretic utility function that says
$\mbox{mermaid} \prec \mbox{astronaut} \prec \mbox{dinosaur}$, which preserves some of your preferences, but if you have to choose between being a mermaid and being a dinosaur, it will tell you to become a dinosaur, even though you really really want to choose the mermaid. There's another decision-theoretic utility function that will tell you to pass up being a dinosaur in favor of being an astronaut even though you really really don't want to. Not being an optimizer means that any rational decision theory will tell you to do things you don't want to do.
So why would you ever want to be an optimizer? What theorem could possibly convince you to become one?
Stuart Armstrong's theorem
Suppose there is a set $P$ (for "policies") and some functions $v_1, \dots, v_n$ ("values") from $P$ to $\mathbb{R}$. We want these functions to satisfy the following convexity property:

For any policies $\pi_0, \pi_1 \in P$ and any $p \in [0, 1]$, there is a policy $\pi \in P$ such that for all $i$, we have $v_i(\pi) = (1 - p) \, v_i(\pi_0) + p \, v_i(\pi_1)$.
For policies $\pi, \pi'$, say that $\pi'$ is a Pareto improvement over $\pi$ if for all $i$, we have $v_i(\pi') \geq v_i(\pi)$. Say that it is a strong Pareto improvement if in addition there is some $i$ for which $v_i(\pi') > v_i(\pi)$. Call a policy a Pareto optimum if no policy is a strong Pareto improvement over it.
Theorem. Suppose $P$ and $v_1, \dots, v_n$ satisfy the convexity property. If a policy in $P$ is a Pareto optimum, then it is a maximum of the function $c_1 v_1 + \dots + c_n v_n$ for some nonnegative constants $c_1, \dots, c_n$.
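To make the statement concrete, here is a minimal numerical sketch; it is not from the original post, and the two-value setup, the four pure policies, and the brute-force grid search are all illustrative assumptions:

```python
# Toy illustration of the theorem with n = 2 values and a policy set
# generated by mixing four "pure" policies (all numbers are made up).
from itertools import product

pure = [(3.0, 0.0), (0.0, 3.0), (2.0, 2.0), (1.0, 1.0)]  # (v_1, v_2) of each pure policy

def mix(weights):
    """Value vector of the mixed policy with the given mixing weights."""
    return tuple(sum(w * p[i] for w, p in zip(weights, pure)) for i in range(2))

# A finite grid of mixed policies; the convexity property guarantees these exist.
grid = [mix((a / 10, b / 10, c / 10, 1 - (a + b + c) / 10))
        for a, b, c in product(range(11), repeat=3) if a + b + c <= 10]

def strong_pareto_improvement(x, y):
    """True if value vector x strongly Pareto-improves on value vector y."""
    return all(xi >= yi for xi, yi in zip(x, y)) and any(xi > yi for xi, yi in zip(x, y))

target = (2.0, 2.0)  # the pure policy we claim is a Pareto optimum

# 1. No policy on the grid strongly Pareto-improves on `target`.
assert not any(strong_pareto_improvement(p, target) for p in grid)

# 2. The theorem then promises nonnegative c_1, c_2 such that `target`
#    maximizes c_1*v_1 + c_2*v_2; here c_1 = c_2 = 1 works.
c1, c2 = 1.0, 1.0
assert all(c1 * target[0] + c2 * target[1] >= c1 * p[0] + c2 * p[1] - 1e-9 for p in grid)
print("(2, 2) is a Pareto optimum and maximizes v_1 + v_2 over the grid")
```

The grid merely stands in for the convex set of achievable value vectors; the hyperplane separation argument behind the theorem is what guarantees such weights exist in general.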
This theorem previously appeared in If you don't know the name of the game, just tell me what I mean to you. I don't know whether there is a source prior to that post that uses the hyperplane separation theorem to justify being an optimizer. The proof is basically the same as the proof for the complete class theorem and the hyperplane separation theorem and the second fundamental theorem of welfare economics. Harsanyi's utilitarian theorem has a similar conclusion, but it assumes that you already have a decision-theoretic utility function. The second fundamental theorem of welfare economics is virtually the same theorem, but it's interpreted in a different way.
What does the theorem mean?
Suppose you are a consequentialist who subscribes to Bayesian epistemology. And in violation of the VNM axioms, you are torn between multiple incompatible decision-theoretic utility functions. Suppose you can list all the things you care about, and the list looks like this:
- Your welfare
- Your family's welfare
- Everyone's total welfare
- The continued existence of human civilization
- All mammals' total welfare
- Your life satisfaction
- Everyone's average welfare
- ...
Suppose further that you can quantify each item on that list with a function from world-histories to real numbers, and you want to optimize for each function, all other things being equal. E.g., $v_1(x)$ is large if $x$ is a world-history where your welfare is great; and $v_5(x)$ somehow counts up the welfare of all mammals in world-history $x$. If the expected value of $v_1$ is at stake (but none of the other values are at stake), then you want to act so as to maximize the expected value of $v_1$.4 And if only $v_5$ is at stake, you want to act so as to maximize the expected value of $v_5$. What I've said so far doesn't specify what you do when you're forced to trade off value 1 against value 5.
If you're VNM-rational, then you are an optimizer whose decision-theoretic utility function is a linear aggregation $c_1 v_1 + \dots + c_n v_n$ of your values, and you just optimize for that function. (The $c_i$ are nonnegative constants.) But suppose you make decisions in a way that does not optimize for any such aggregation.
You will make many decisions throughout your life, depending on the observations you make and on random chance. If you're capable of making precommitments and we don't worry about computational difficulties, it is as if today you get to choose a policy for the rest of your life that specifies a distribution of actions for each sequence of observations you can make.5 Let $P$ be the set of all possible policies. For any policy $\pi \in P$ and any value $i$, let us say that $v_i(\pi)$ is the expected value of value number $i$ given that we adopt policy $\pi$. Let's assume that these expected values are all finite. Note that if $\pi$ is a policy where you make every decision by maximizing a decision-theoretic utility function $u$, then the policy $\pi$ itself maximizes the expected value of $u$, compared to other policies.
In order to apply the theorem, we must check that the convexity property holds. That's easy: If $\pi_0$ and $\pi_1$ are two policies and $p \in [0, 1]$, the mixed policy where today you randomly choose policy $\pi_1$ with probability $p$ and policy $\pi_0$ with probability $1 - p$ is also a policy.
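As a sanity check, here is a minimal sketch of that mixing argument; the two policies, their outcome lotteries, and the two value functions are made up purely for illustration:

```python
# Two hypothetical policies, represented by the outcome lotteries they induce;
# each outcome is scored by two value functions (v_1, v_2).
outcomes_pi0 = {"stay_home": 0.7, "go_out": 0.3}   # P(outcome | pi_0)
outcomes_pi1 = {"stay_home": 0.2, "go_out": 0.8}   # P(outcome | pi_1)
values = {"stay_home": (1.0, 0.0), "go_out": (0.0, 1.0)}

def expected_values(outcome_dist):
    """(E[v_1], E[v_2]) under the given outcome lottery."""
    return tuple(sum(prob * values[o][i] for o, prob in outcome_dist.items())
                 for i in range(2))

def mixed_policy(p):
    """Today, commit to pi_1 with probability p and to pi_0 with probability 1 - p."""
    return {o: (1 - p) * outcomes_pi0[o] + p * outcomes_pi1[o] for o in values}

p = 0.25
lhs = expected_values(mixed_policy(p))
rhs = tuple((1 - p) * a + p * b
            for a, b in zip(expected_values(outcomes_pi0), expected_values(outcomes_pi1)))
assert all(abs(l - r) < 1e-12 for l, r in zip(lhs, rhs))
print(lhs)  # v_i(mixed) = (1 - p) * v_i(pi_0) + p * v_i(pi_1) for each i
```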
What the theorem says is that if you really care about the values on that list (and the other assumptions in this post hold), then there are linear aggregations that you have reason to start optimizing for. That is, there is a set of linear aggregations $c_1 v_1 + \dots + c_n v_n$, and if you choose one of them and start optimizing for it, you will get more expected welfare for yourself, more expected welfare for others, less risk of the fall of civilization, ....
Adopting one of these decision-theoretic utility functions is a Pareto improvement over your current policy, in the sense that doing so will get you more of the things you value without sacrificing any of them.
What's more, once you've chosen a linear aggregation, optimizing for it is easy. The ratio $c_i / c_j$ is a price at which you should be willing to trade off value $i$ against value $j$. E.g., a particular hour of your time should be worth some number of marginal dollars to you.
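For instance (with made-up weights, just to illustrate the ratio): if your aggregation weights an hour of your time at $c_{\mathrm{time}} = 60$ and a marginal dollar at $c_{\mathrm{money}} = 2$, then giving up $\Delta$ hours in exchange for $D$ dollars raises the aggregation exactly when $c_{\mathrm{money}} D - c_{\mathrm{time}} \Delta > 0$, i.e. when $D / \Delta > c_{\mathrm{time}} / c_{\mathrm{money}} = 30$ dollars per hour, so you should accept any such trade above that price and refuse any below it.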
Addendum: Wei_Dai and other commenters point out that the set of decision-theoretic utility functions that will Pareto dominate your current policy very much depends on your beliefs. So a policy that seems Pareto dominant today will not have seemed Pareto dominant yesterday. It's not clear if you should use your current (posterior) beliefs for this purpose or your past (prior) beliefs.
More applications
There's a lot more that could be said about the applications of this theorem. Each of the following bullet points could be expanded into a post of its own:
- Philanthropy: There's a good reason to not split your charitable donations among charities.
- Moral uncertainty: There's a good reason to linearly aggregate conflicting desires or moral theories that you endorse.
- Population ethics: There's a good reason to aggregate the welfare or decision-theoretic utility functions of a population, even though there's no canonical way of doing so.
- Population ethics: It's difficult to sidestep Parfit's Repugnant Conclusion if your only desiderata are total welfare and average welfare.
1This post evolved out of discussions with Andrew Critch and Julia Galef. They are not responsible for any deficiencies in the content of this post. The theorem appeared previously in Stuart Armstrong's post If you don't know the name of the game, just tell me what I mean to you.
2That is, the VNM theorem says that being an optimizer is necessary for obeying the axioms. The easier-to-prove converse of the VNM theorem says that being an optimizer is sufficient.
3Decision-theoretic utility functions are completely unrelated to hedonistic utilitarianism.
4More specifically, if you have to choose between a bunch of actions, and for all $i \neq 1$ the expected value of $v_i$ is independent of which action you take, then you'll choose an action that maximizes the expected value of $v_1$.
5We could formalize this by saying that for each sequence of observations $o_1, \dots, o_t$, the policy determines a distribution over the possible actions at time $t$.
Let's walk through a simplified example, and see if we can find the point of disagreement. The primary simplification here is that I'll assume consequentialism, where utilities are mappings from outcomes to reals and the mapping from policies (i.e. a probabilistic collection of outcomes) to reals is the probabilistically weighted sum of the outcome utilities. Even without consequentialism, this should work, but there will be many more fiddly bits.
So, let's suppose that the two of us have a joint pool of money, which we're going to spend on a lottery ticket, which could win one of three fabulous prizes (that we would then jointly own): a Koala, a Lemur, or a Macaw.
We can express the various tickets (which all cost the same, and together we can only afford one) as vectors, like a=(.1,.1,.1,.7), which has a 10% chance of delivering each animal, and a 70% chance of delivering Nothing, or b=(.2,.02,.02,.76), which has a 20% chance of delivering a Koala, 76% chance of Nothing, and 2% chance for each of the Lemur and Macaw. Suppose there are three tickets, and the third is c=(0,.3,.04,.66).
By randomly spinning a wheel to determine which ticket we want to buy, we have access to a convex combination of any of the tickets. If half the wheel points to the a ticket, and the other half points to the b ticket, our final chance of getting any of the animals will be (.15,.06,.06,.73).
Now, before we look at the tickets actually available to us, you and I each sit down separately and imagine four 'ideal tickets': (1,0,0,0), (0,1,0,0), (0,0,1,0), and (0,0,0,1). We can express our preferences for those as another vector: mine, V, would be, say, (3,2,1,0). (That means, for example, that I would be indifferent between a Lemur for sure and a half chance of a Koala or a Macaw, because 2=(1+3)/2.) This is a column vector, and we can multiply a*V to get .6, b*V to get .66, and c*V to get .64, which says that I would prefer the b ticket to the c ticket to the a ticket. The magnitude of V doesn't matter, just the direction, and suppose we adjust it so that the least preferred outcome is always 0. I don't know what W, your preference vector, is; it could be any four-vector with non-negative values.
Note that any real ticket can be seen as a convex combination of the ideal tickets. It's a lottery, and so they won't let us just walk up and buy a koala for the price of a ticket, but if they did that'd be my preferred outcome. Instead, I look at the real tickets for sale, right multiply them by my preference column vector, and pick one of the tickets with the highest value, which is the b ticket.
But, the pool of money is partly yours, too; you have some preference ordering W. Suppose it's (2,4,0,1), and so a*W=1.3, b*W=1.24, and c*W=1.86, meaning you prefer c to a to b.
We can think of lots of different algorithms for determining which ticket (or convex combination of tickets) we end up buying. Suppose we want it to be consistent, i.e. there's some preference vector J that describes our joint decision. Any algorithm that doesn't depend on just your and my preference scores for the ticket being considered (suppose you wanted to scratch off our least favorite options until only one is left) will run into problems (how do you scratch off the infinite variety of convex combinations, and what happened to the probabilistic encoding of preferences?), and any function that maps from (V,W) to J that isn't a linear combination of V and W with nonnegative weights on V and W will introduce new preferences that we disagree with (assuming the combination was normed, or you have an affine combination of V and W). Suppose we pick some v and w, such that J=vV+wW; if we pick v=1 and w=1 then J=(5,6,1,1)->(4,5,0,0), a and b have the same score, and c is the clear winner. Note that, regardless of v and w, c will always be preferred to a, and the primary question is whether c or b is preferred, and that a wide range of v and w would lead to c being picked.
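Here is a minimal sketch (plain Python, using only the numbers already quoted in this example) that re-checks the dot products above:

```python
# Re-check the ticket arithmetic above; vectors are (Koala, Lemur, Macaw, Nothing).
tickets = {
    "a": (0.1, 0.1, 0.1, 0.7),
    "b": (0.2, 0.02, 0.02, 0.76),
    "c": (0.0, 0.3, 0.04, 0.66),
}
V = (3, 2, 1, 0)  # my preference vector
W = (2, 4, 0, 1)  # your preference vector

def score(ticket, pref):
    """Expected preference value of a ticket: the dot product ticket * pref."""
    return sum(t * p for t, p in zip(ticket, pref))

for name, t in tickets.items():
    print(name, round(score(t, V), 4), round(score(t, W), 4))
# expected: a -> 0.6, 1.3   b -> 0.66, 1.24   c -> 0.64, 1.86

# Joint preference with weights v = w = 1: J = 1*V + 1*W = (5, 6, 1, 1).
v, w = 1, 1
J = tuple(v * vi + w * wi for vi, wi in zip(V, W))
for name, t in tickets.items():
    print(name, round(score(t, J), 4))
# expected: a -> 1.9, b -> 1.9 (tie), c -> 2.5 (the joint winner)
```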
So far, we should be in agreement, since we haven't gotten to the issue that I think you're discussing, which sounds like: this is all fine and dandy for a, b, and c, but:
I think that the ideal tickets suggest that 1 isn't a serious concern. We may not have measured v and w very carefully with the tickets we had before, since even a rough estimate is sufficient to pin down our ticket choice (unless we were close to the edge), and we might be near the edge now, but supposing that we measured v and w exactly, we should be able to apply J as before.
I think that 2 is a slightly more serious concern, but I think it can be addressed.
First, we could have some constructive method of picking the weights. You and I, when deciding to pool our money to buy a lottery ticket, might have decided to normalize our preference functions some way and then combine them with weights relative to our financial contribution, or we might decide that your taste in animals is totally better than mine, and so v would be 0 and w 1, or we might decide that I'm better at arm wrestling, and v/w should be 5 after normalization. The outcomes don't play into the weighting, and so we can be confident in the weights.
Second, we could find the weights with both lotteries in mind. The first lottery will give us an acceptable range for v/w, the second lottery will give us an acceptable range for v/w, and the two should overlap, so we can pick a ratio from the overlap that satisfies both. (Is the issue that you're not sure they will overlap?)
Ok, I think what's going on is that we have different ideas in mind about how two people make joint decisions. What I have in mind is something like the Nash Bargaining Solution or the Kalai-Smorodinsky Bargaining Solution (both described in this post), for which the VNM-equivalent weights do change depending on the set of feasible outcomes. I have to read your comment more carefully and think over your suggestions, but I'm going to guess that there are situations where they do not work or do not make sense, otherwise the NBS and KSBS would not be "the two most popular ways of doing this".