# Expected utility without the independence axiom

John von Neumann and Oskar Morgenstern developed a system of four axioms that they claimed any rational decision maker must follow. The major consequence of these axioms is that when faced with a decision, you should always act solely to increase your expected utility. All four axioms have been attacked at various times and from various directions; but three of them are very solid. The fourth - independence - is the most controversial.

To understand the axioms, let A, B and C be lotteries - processes that result in different outcomes, positive or negative, with a certain probability of each. For 0<p<1, the mixed lottery pA + (1-p)B implies that you have p chances of being in lottery A, and (1-p) chances of being in lottery B. Then writing A>B means that you prefer lottery A to lottery B, A<B is the reverse and A=B means that you are indifferent between the two. Then the von Neumann-Morgenstern axioms are:

- (Completeness) For every A and B either A<B, A>B or A=B.
- (Transitivity) For every A, B and C with A>B and B>C, then A>C.
- (Continuity) For every A>B>C, there exists a probability p with B=pA + (1-p)C.
- (Independence) For every A, B and C with A>B, and for every 0<t≤1, then tA + (1-t)C > tB + (1-t)C.
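To make the mixed lottery pA + (1-p)B concrete, here is a minimal Python sketch. It assumes lotteries are finite and represented as dictionaries from outcome to probability; that representation and the `mix` helper name are illustrative choices, not part of the von Neumann-Morgenstern set-up.

```python
def mix(p, lottery_a, lottery_b):
    """Return the mixed lottery p*A + (1-p)*B.

    Each lottery is a dict mapping an outcome to its probability
    (an assumed representation, chosen for illustration).
    """
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    mixed = {}
    for outcome, prob in lottery_a.items():
        mixed[outcome] = mixed.get(outcome, 0.0) + p * prob
    for outcome, prob in lottery_b.items():
        mixed[outcome] = mixed.get(outcome, 0.0) + (1 - p) * prob
    return mixed


# A pays "win" for certain; B is a fair coin flip between "win" and "lose".
certain_win = {"win": 1.0}
coin_flip = {"win": 0.5, "lose": 0.5}
quarter_mix = mix(0.25, certain_win, coin_flip)
```

Here `quarter_mix` gives "win" with probability 0.25 + 0.75 × 0.5 and "lose" with the remainder, so the probabilities still sum to one.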

In this post, I'll try and prove that even without the Independence axiom, you should continue to use expected utility in most situations. This requires some mild extra conditions, of course. The problem is that although these conditions are considerably weaker than Independence, they are harder to phrase. So please bear with me here.

The whole insight in this post rests on the fact that a lottery that has 99.999% chance of giving you £1 is very close to being a lottery that gives you £1 with certainty. I want to express this fact by looking at the narrowness of the probability distribution, using the standard deviation. However, this narrowness is not an intrinsic property of the distribution, but of our utility function. Even in the example above, if I decide that receiving £1 gives me a utility of one, while receiving zero gives me a utility of minus ten billion, then I no longer have a narrow distribution, but a wide one. So, unlike the traditional set-up, we have to assume a utility function as being given. Once this is chosen, this allows us to talk about the mean and standard deviation of a lottery.
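As a rough illustration of how the choice of utility function changes the width of the same lottery, the sketch below computes the mean and SD of the 99.999% lottery under the two utility assignments just described. The function and variable names are hypothetical, introduced only for this example.

```python
import math

def mean_and_sd(lottery, utility):
    """Mean and standard deviation of a lottery, measured in utility units.

    lottery: list of (probability, outcome) pairs whose probabilities sum to 1.
    utility: function mapping an outcome to a real number.
    """
    mean = sum(p * utility(x) for p, x in lottery)
    variance = sum(p * (utility(x) - mean) ** 2 for p, x in lottery)
    return mean, math.sqrt(variance)


# 99.999% chance of one pound, otherwise nothing.
lottery = [(0.99999, 1), (0.00001, 0)]

def linear(pounds):
    # Utility equal to the cash amount.
    return float(pounds)

def harsh(pounds):
    # Utility of one for the pound, minus ten billion for getting nothing.
    return 1.0 if pounds == 1 else -1e10

narrow_mean, narrow_sd = mean_and_sd(lottery, linear)
wide_mean, wide_sd = mean_and_sd(lottery, harsh)
```

Under the linear utility the SD is a small fraction of the mean; under the harsh utility the SD is in the tens of millions and the mean is negative, even though the probabilities have not changed.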

Then if you define c(μ) as the lottery giving you a certain return of μ, you can use the following axiom instead of independence:

- (Standard deviation bound) For all ε>0, there exists a δ>0 such that for all μ>0, then any lottery B with mean μ and standard deviation less than μδ has B>c((1-ε)μ).

This seems complicated, but all that it says, in mathematical terms, is that if we have a probability distribution that is "narrow enough" around its mean μ, then we should value it as being very close to a certain return of μ. The narrowness is expressed in terms of its standard deviation - a lottery with zero SD is a guaranteed return of μ, and as the SD gets larger, the distribution gets wider, and the chances of getting values far away from μ increase. So risk, in other words, scales (approximately) with the SD.

We also need to make sure that we are not risk loving - if we are inveterate gamblers for the point of being gamblers, our behaviour may be a lot more complicated.

- (Not risk loving) If A has mean μ>0, then A≤c(μ).

I.e. we don't love a worse rate of return just because of the risk. This axiom can and maybe should be weakened, but it's a good approximation for the moment - most people are not risk loving with huge risks.

*Assume you are going to have to choose n different times whether to accept independent lotteries with fixed mean β>0, and all with SD less than a fixed upper-bound K. Then if you are not risk loving and n is large enough, you must accept an arbitrarily large proportion of the lotteries.*

Proof: From now on, I'll use a different convention for adding and scaling lotteries. Treating them as random variables, A+B will mean the lottery consisting of A and B together, while xA will mean the same lottery as A, but with all returns (positive or negative) scaled by x.

Let X_{1}, X_{2}, ... , X_{n} be these n independent lotteries, with means β and variances v_{j}. Then since the standard deviations are less than K, the variances must be less than K^{2}.

Let Y = X_{1} + X_{2} + ... + X_{n}. The mean of Y is nβ. The variance of Y is the sum of the v_{j}, which is less than nK^{2}. Hence the SD of Y is less than K√(n). Now pick an ε>0, and the resulting δ>0 from the standard deviation bound axiom. For large enough n (specifically, n > (K/(βδ))^{2}), nβδ must be larger than K√(n); hence, for large enough n, Y > c((1-ε)nβ). Now, if we were to refuse more than εn of the lotteries, we would be left with a distribution with mean ≤ (1-ε)nβ, which, since we are not risk loving, is worse than c((1-ε)nβ), which is worse than Y. Hence we must accept at least a proportion (1-ε) of the lotteries on offer. **♦**
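To get a feel for how large "large enough" is, the sketch below finds an n at which nβδ first exceeds K√(n). The numbers are hypothetical; in particular δ comes from the agent's own standard deviation bound, so the value used here is only an assumption for illustration.

```python
import math

def lotteries_needed(beta, sd_bound, delta):
    """Return an n for which n*beta*delta > sd_bound*sqrt(n).

    Algebraically the inequality holds once n > (sd_bound / (beta * delta))**2;
    the loop re-checks it directly to guard against rounding.
    """
    n = max(1, math.floor((sd_bound / (beta * delta)) ** 2))
    while n * beta * delta <= sd_bound * math.sqrt(n):
        n += 1
    return n

def relative_sd_bound(n, beta, sd_bound):
    """Upper bound on SD(Y) / mean(Y) for the sum Y of n lotteries."""
    return sd_bound * math.sqrt(n) / (n * beta)

# Hypothetical numbers: mean 1, SD at most 10, and delta = 0.1.
beta, sd_bound, delta = 1.0, 10.0, 0.1
n_required = lotteries_needed(beta, sd_bound, delta)
```

With these made-up numbers the bound is only met after roughly ten thousand lotteries, a reminder that the required n grows like 1/δ^{2}.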

This only applies to lotteries that share the same mean, but we can generalise the result as:

*Assume you are going to have to choose n different times whether to accept independent lotteries all with means greater than a fixed β>0, and all with SD less than a fixed upper-bound K. Then if you are not risk loving and n is large enough, you must accept lotteries whose means represent an arbitrarily large proportion of the total mean of all lotteries on offer.*

Proof: The same proof works as before, with nβ now being a lower bound on the true mean μ of Y. Thus we get Y > c((1-ε)μ), and we must accept lotteries whose total mean is greater than (1-ε)μ. **♦**
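The same tightening can be checked numerically when the lotteries differ. The sketch below is a minimal illustration, assuming a made-up repeating pattern of (mean, SD) pairs with means of at least 1 and SDs of at most 5; it tracks the ratio of the total SD to the total mean as more independent lotteries are added.

```python
import math

def aggregate(lotteries):
    """Mean and SD of the sum of independent lotteries.

    lotteries: list of (mean, sd) pairs. Independence is what lets
    the variances simply add.
    """
    total_mean = sum(mean for mean, _ in lotteries)
    total_sd = math.sqrt(sum(sd ** 2 for _, sd in lotteries))
    return total_mean, total_sd

def offers(n):
    """n made-up lotteries with means of at least 1 and SDs of at most 5."""
    means = [1.0, 2.0, 4.0]
    sds = [5.0, 3.0, 4.0]
    return [(means[i % 3], sds[i % 3]) for i in range(n)]

# Ratio of total SD to total mean for growing numbers of lotteries.
ratios = {}
for n in (3, 30, 300, 3000):
    total_mean, total_sd = aggregate(offers(n))
    ratios[n] = total_sd / total_mean
```

The ratio starts above 1 for three lotteries and shrinks in proportion to 1/√(n), which is the sense in which the bundle looks more and more like a certain return of its mean.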

**Analysis:** Since we rejected independence, we must now consider the lotteries when taken as a whole, rather than just seeing them individually. When considered as a whole, "reasonable" lotteries are more tightly bunched around their total mean than they are individually. Hence the more lotteries we consider, the more we should treat them as if only their mean mattered. So if we are not risk loving, and expect to meet many lotteries with bounded SD in our lives, we should follow expected utility. Deprived of independence, expected utility sneaks in via aggregation.

**Note:** This restates the first half of my previous post - a post so confusingly written it should be staked through the heart and left to die on a crossroad at noon.

**Edit:** Rewrote a part to emphasise the fact that a utility function needs to be chosen in advance - thanks to Peter de Blanc and Nick Hay for bringing this up.

## Comments (65)

Folks, please write at least short reviews on technical articles: if someone parsed the math, whether it appears sensible, whether the message appears interesting, and what exactly this message consists in. Also, this article lacks references: is the stuff it describes standard, how does it relate to the field?

This article had an interesting title so I scanned it - but it lacked an abstract, a conclusion, had lots of maths in it - and I haven't liked most of Stuart's other articles - so I gave up on it early.

The article attempts to show that you don't need the independence axiom to justify using expected utility. So I replaced the independence axiom with another axiom that basically says that a very thin distribution is pretty much the same as a guaranteed return.

Then I showed that if you had a lot of "reasonable" lotteries and put them together, you should behave approximately according to expected utility.

There's a lot of maths in it because the result is novel, and therefore has to be firmly justified. I hope to explore non-independent lotteries in future posts, so the foundations need to be solid.

The result is my own work, but the reasoning is not particularly complex, and might well have been done before.

It's kind of a poor man's version of the central limit theorem, for differing distributions.

By this I mean that it's known that if you take the mean of identical independent distributions, it will tend to a narrow spike as the number of distributions increases. This post shows that similar things happen with non-identical distributions, if we bound the variances.

And please do point out any errors that anyone finds!

The math looks valid - I believe the content is original to Stuart_Armstrong, attempting to show a novel set of preferences which imply expected-value calculation in (sufficiently) iterated cases but not in isolated cases.

Edit: For example, an agent whose decision-making criteria satisfy Stuart_Armstrong's criteria might refuse to bet $1 for a 50% chance of winning $2.50 and 50% chance of losing his initial dollar if it were a one-off gamble, but would be willing to make 50 such bets in a row if the odds of winning each were independent. In both cases, the expected value is positive, but only in the latter case is the probable *variation* from the expected value small enough to overcome the risk aversion.

I think the post is saying "if your preferences are somewhat coupled to the preferences of an expectation maximizer, then in some limit, your preferences match that expectation maximizer."

But so what? Why should your preferences have any relation to a real-valued function of the world? If you satisfy all the axioms, your preferences are *exactly* expectation-maximizing for a function that vN and M tell you how to build. But if the whole point is to drop one of the axioms, why should you still expect such a function to be relevant?

(this has been said elsewhere on the thread, but not too tentatively, and not at the top level.)

The results are on the "expected" part of expected utility, not on the "utility" part. Independence is overstrong; replacing it with a partial coupling to an expectation maximizer is much weaker. And yet in the limit it mimics the expectation requirement, which is a very useful result.

(dropping independence completely leaves you flailing all over the place)

Is the new axiom sufficient to show that the agent cannot be money-pumped?

It's enough to show that an agent cannot be repeatedly money-pumped. The more opportunities for money pumping, the lower the chance of it succeeding.

Contrast household appliance insurance versus health insurance. Both are a one-shot money-pump, as you get less than your expected utility out of them. An agent following these axioms will probably health-insure, but will not appliance-insure.

Can you write out the math on that? To me it looks like the Allais Paradox or a simple variant would still go through. It is easy for the expected variance of a bet to increase as a result of learning additional information - in fact the Allais Paradox describes exactly this. So you could prefer A to B when they are bundled with variance-reducing most probable outcome C, and then after C is ruled out by further evidence, prefer B to A. Thus you'd pay a penny at the start to get A rather than B if not-C, and then after learning not-C, pay another penny to get B rather than A.

I'll try and do the maths. This is somewhat complex without independence, as you have to estimate what the total results of following a certain strategy are, over all the bets you are likely to face. Obviously you can't money pump me if I know you are going to do it; I just combine all the bets and see it's a money pump, and so don't follow it.

So if you tried to money pump me repeatedly, I'd estimate it was likely that I'd be money pumped, and adjust my strategy accordingly.

I believe SilasBarta has correctly (if that is the word) noted that it does not - it is perfectly possible for an agent to satisfy the new axioms and fall victim to the Allais Paradox.

Edit: correction - he does not state this.

That sounds more like the exact opposite of my position.

I apologize. In the course of conversation with you, I came to that conclusion, but you reject that position.

To summarize my point: if you follow the new axioms, you will act differently in one-shot vs. massive-shot scenarios. Acting like the former in the latter will cause you to be money-pumped, but per the axioms, you never actually do it. So you can follow the new axioms, and still not get money-pumped.

Your axiom talks about expected utility, but you have not defined that term yet.

The post assumes a knowledge of basic statistics throughout - in such a context, the meaning of "expected utility" is transparent.

Sorry, I meant the definition of utility.

[edit: this should have been a reply to Stuart Armstrong's comment below RobinZ's.]

Utility is the thing you want to maximize in your decision-making.

A decision-maker in general isn't necessarily maximizing anything. Von Neumann and Morgenstern showed that if you satisfy axioms 1 through 4, then you *do* in fact take actions which maximize expected utility for some utility function. But this post is ignoring axiom 4 and assuming only axioms 1 through 3. In that case, why should we expect there to be a utility function?

Thanks for bringing this up, and I've changed my post to reflect your comments. Unfortunately, I have to decree a utility function ahead of time for this to make any sense, as I can change the mean and SD of any distribution by just changing my utility function.

I have a new post up that argues that where small sums are concerned, you have to have a utility function linear in cash.

? This is just the standard definition. The mean of the random variable, when it is expressed in terms of utils.

Should this be specified in the post, or is it common knowledge on this list?

The von Neumann-Morgenstern axioms talk just about preference over lotteries, which are simply probability distributions over outcomes. That is, you have an unstructured set O of outcomes, and you have a total preordering over Dist(O), the set of probability distributions over O. They do not talk about a utility function. This is quite elegant, because to make decisions you must have preferences over distributions over outcomes, but you don't need to assume that O has a certain structure, e.g. that of the reals.

The expected utility theorem says that preferences which satisfy the first four axioms are exactly those which can be represented by:

A <= B iff E[U;A] <= E[U;B]

for some utility function U: O -> R, where

E[U;A] = \sum_{o} A(o) U(o)

However, U is only defined up to positive affine transformation i.e. aU+b will work equally well for any a>0. In particular, you can amplify the standard deviation as much as you like by redefining U.
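A small numerical sketch of this point, assuming a made-up three-outcome utility function and hypothetical helper names: replacing U with aU+b leaves the ranking of two lotteries by expected utility unchanged, while the standard deviation is multiplied by a.

```python
import math

def expectation(lottery, utility):
    """E[U; lottery] for a lottery given as {outcome: probability}."""
    return sum(prob * utility[o] for o, prob in lottery.items())

def spread(lottery, utility):
    """Standard deviation of the lottery measured in units of `utility`."""
    mean = expectation(lottery, utility)
    return math.sqrt(
        sum(prob * (utility[o] - mean) ** 2 for o, prob in lottery.items())
    )

def rescale(utility, a, b):
    """Positive affine transformation a*U + b of a utility table."""
    if a <= 0:
        raise ValueError("a must be positive")
    return {o: a * u + b for o, u in utility.items()}

# Made-up utilities and lotteries, purely for illustration.
utility = {"x": 0.0, "y": 10.0, "z": 4.0}
lottery_a = {"x": 0.5, "y": 0.5}
lottery_b = {"z": 1.0}
scaled = rescale(utility, a=1000.0, b=-7.0)

prefers_a_before = expectation(lottery_a, utility) > expectation(lottery_b, utility)
prefers_a_after = expectation(lottery_a, scaled) > expectation(lottery_b, scaled)
sd_ratio = spread(lottery_a, scaled) / spread(lottery_a, utility)
```

The preference between the two lotteries is the same before and after, but the SD of `lottery_a` grows by the factor a, so any axiom stated in terms of SD depends on which representative of U is fixed.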

Your axioms require you to pick a particular representation of U for them to make sense. How do you choose this U? Even with a mechanism for choosing U, e.g. assume bounded nontrivial preferences and pick the unique U such that \sup_{x} U(x) = 1 and \inf_{x} U(x) = 0, this is still less elegant than talking directly about lotteries.

Can you redefine your axioms to talk only about lotteries over outcomes?

Alas no. I've changed my post to explain the difficulties as I can change the mean and SD of any distribution by just changing my utility function.

I have a new post up that argues that where small sums are concerned, you have to have a utility function linear in cash.

You started out by assuming a preference relation on lotteries with various properties. The completeness, transitivity, and continuity axioms talk about this preference relation. Your "standard deviation bound" axiom, however, talks about a utility function. What utility function?

You are absolutely correct, and it pains me because this issue should have been settled a long time ago.

When Eliezer Yudkowsky first brought up the breakdown of independence in humans, way, way back during the discussion of the Allais Paradox, the poster "Gray Area" explained why people *aren't* being money-pumped, even though they violate independence. He/she came to the same conclusion in the quote above.

Here's what Gray Area said back then:

I didn't see anyone even reply to Gray Area anywhere in that series, or anytime since.

So I bring up essentially the same point whenever Eliezer uses the Allais result, always concluding with a zinger like:

If getting lottery tickets is being exploited, I don't want to be empowered.

Please, folks, stop equating a hypothetical money pump with the actual scenario.

The Allais Paradox is *not about* risk aversion or lack thereof; it's about people's decisions being *inconsistent*. There are definitely situations in which you would want to choose a 50% chance of $1M over a 10% chance of $10M. However, if you would do so, you should *also* then choose a 5% chance of $1M over a 1% chance of $10M, because the relative risk is the same. See Eliezer's followup post, Zut Allais.

Turning a person into a money pump also isn't about playing the same gamble a zillion times (as any good investor will tell you, if you play the gamble a zillion times, all the risk disappears and you're left with only expected return, which leaves you with a different problem). The money pump works thusly: I sell you gamble A for $5. You then trade with me gamble A for gamble B. You then sell me back gamble B for $4. I then sell you gamble A for $5... wash, rinse, repeat. Nowhere in the cycle is either gamble actually paid out.
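The cycle described here can be written out as simple bookkeeping. This is only a sketch of the trades as stated (buy A for $5, swap A for B, sell B for $4); the function name and defaults are hypothetical, and it assumes the subject accepts every trade, which is exactly the point in dispute.

```python
def run_money_pump(cycles, buy_price_a=5.0, sell_price_b=4.0):
    """Track the subject's cash and holdings over repeated pump cycles.

    Assumes the subject accepts every trade; neither gamble is ever
    played out, so only the prices matter.
    """
    cash_change = 0.0
    holding = None
    for _ in range(cycles):
        cash_change -= buy_price_a   # subject buys gamble A
        holding = "A"
        holding = "B"                # subject swaps A for B at no charge
        cash_change += sell_price_b  # subject sells B back
        holding = None
    return cash_change, holding

cash_after_ten, holding_after_ten = run_money_pump(10)
```

After each full cycle the subject holds no gamble and is one dollar poorer, so ten cycles cost ten dollars.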

Are you sure you're responding to the right person here?

1) I *wasn't* claiming that Allais is about risk aversion.

2) I *was* claiming it doesn't show an inconsistency (and IMO succeeded).

3) I *did* read Zut Allais, and the other Allais article with the other ridiculous French pun, and it wasn't responsive to the point that Gray Area raised. (You may note that a strapping lad named "Silas" even noted this at the time.)

4) You cannot substantiate the charge that you *should* do the latter if you did the former, since no negative consequence actually results from violating that "should" in the one-shot case. You know, the one people were actually tested on.

**ETA:** (I think the second paragraph was just added in tommccabe's post.)

My point never hinged on it being otherwise.

Okay, and where in the Allais experiment did it permit any of those exchanges to happen? Right, nowhere.

Believe it or not, when I say, "I prefer B to A", it doesn't mean "I hereby legally obligate myself to redeem on demand any B for an A", yet your money pump requires that.

The problem is that you're losing money doing it *once*. You would agree that c(0) > c(-2), yes? If they are willing to *trade A for B* in a one-shot game, they shouldn't be willing to *pay more for A than for B* in a one-shot - you don't trade the more valuable item for the less valuable. That their preferences may reverse in the iterated situation has no bearing on the Allais problem.

Edit: The text above following the question mark is incorrect. See my later comment quoting Eliezer for the correct statement.

Again, if suddenly being offered the choice of 1A/1B then 2A/2B as described here, but being "inconsistent", is what you call "losing money", then I don't want to *gain* money!

But that's not what's happening in the paradox. They're (doing something isomorphic to) preferring A to B *once* and then p\*B to p\*A *once*. At no point do they "pay" more for B than A while preferring A to B. At no point does anyone make or offer the money-pumping trades with the subjects, *nor have they obligated themselves to do so!*

Consider Eliezer's final remarks in The Allais Paradox (I link purely for the convenience of those coming in in the middle):

You're right insofar as Eliezer invokes the Axiom of Independence when he resolves the Allais Paradox using expected value; I do not yet see any way in which Stuart_Armstrong's criteria rule out the preferences (1A > 1B)u(2A < 2B). However, in the scenario Eliezer describes, an agent with those preferences either loses one cent or two cents relative to the agent with (1A > 1B)u(2A > 2B).

Your preferences between A and B might reasonably change if you *actually receive* the money from either gamble, so that you have more money in your bank account now than you did before. However, that's *not* what's happening; the experimenter can use you as a money pump without ever actually paying out on either gamble.

Yes, I know that a money pump doesn't involve doing the gamble itself. You don't have to repeat yourself, but apparently, I do have to repeat myself when I say:

The money pump *does* require that the experimenter make *actual* further trades with you, not just imagine hypothetical ones. The subjects didn't make these trades, and if they saw many more lottery tickets potentially coming into play, so as to smooth out returns, they would quickly revert to standard EU maximization, as predicted by Armstrong's derivation.

"Potentially coming into play, so as to smooth out returns" requires that there be the possibility of the subject actually *taking* more than one gamble, which never happens. If you mean that people might get suspicious after the tenth time the experimenter takes their money and gives them nothing in return, and thereafter stop doing it, I agree with you; however, all this proves is that making the original trade was stupid, and that people are able to learn to not make stupid decisions given sufficient repetition.

The *possibility* has to happen, if you're cycling all these tickets through the subject's hands. What, are they fake tickets that can't actually be used now?

There are factors that come into play when you get to do lots of runs, but aren't present with only one run. A subject's choice in a one-shot scenario does not imply that they'll make the money-losing trades you describe. They *might*, but you would have to actually test it out. They don't become irrational until such a thing actually happens.

"What, are they fake tickets that can't actually be used now?"

No, they're just the *same* tickets. There's only ever one of each. If I sell you a chocolate bar, trade the chocolate bar for a bag of Skittles, buy the bag of Skittles, and repeat ten thousand times, this does not mean I have ten thousand of each; I'm just re-using the same ones.

"They might, but you would have to actually test it out. They don't become irrational until such a thing actually happens."

We did test it out, and yes, people did act as money pumps. See The Construction of Preference by Sarah Lichtenstein and Paul Slovic.

"1) I wasn't claiming that Allais is about risk aversion."

The difference between your preferences over choosing lottery A vs. lottery B when both are performed a million times, and your preferences over choosing A vs. B when both are performed once, *is* a measurement of your risk aversion; this is what Gray Area was talking about, is it not?

"Believe it or not, when I say, "I prefer B to A", it doesn't mean "I hereby legally obligate myself to redeem on demand any B for an A""

Then you must be using a different (and, I might add, quite unusual) definition of the word "preference". To quote dictionary.com:

pre⋅fer /prɪˈfɜr/ [pri-fur] –verb (used with object), -ferred, -fer⋅ring. 1. to set or hold before or above other persons or things in estimation; like better; *choose rather than*: to prefer beef to chicken.

What does it mean to say that you prefer B to A, if you wouldn't trade B for A if the trade is offered? Could I say that I prefer torture to candy, even if I always choose candy when the choice is offered to me?

Typo: Did you mean "prefer A to B"?

I prefer B to A does not imply I prefer 10B to 10A, or even I prefer 2B to 2A. Expected utility != expected return.

I agree pretty much completely with Silas. If you want to prove that people are money pumps, you need to actually get a random sample of people and then actually pump money out of them. You can't just take a single-shot hypothetical and extrapolate to other hypotheticals when the whole issue is how people deal with the variability of returns.

Strictly speaking, Eliezer's formulation of the Allais Paradox is not the one that has been experimentally tested. I believe a similar money pump can be implemented for the canonical version, however -- and Zut Allais! shows that people can be turned into money pumps in other situations.

"I prefer B to A does not imply I prefer 10B to 10A, or even I prefer 2B to 2A. Expected utility != expected return."

Of course, but, as I've said (I think?) five times now, you *never actually get* 2B or 2A at any point during the money-pumping process. You go from A, to B, to nothing, to A, to B... etc.

For examples of Vegas gamblers actually having money pumped out of them, see The Construction of Preference by Sarah Lichtenstein and Paul Slovic.

No, it's not, and the problem asserted by the Allais paradox is that the utility function is inconsistent, no matter what the risk preference.

I don't see anything in there about how many times the choice has to happen, which is the very issue at stake.

If there's any unusualness, it's definitely on your side. When you buy a chocolate bar for a dollar, that "preference of a chocolate bar to a dollar" does not somehow mean that you are willing to trade every dollar you have for a chocolate bar, nor have you legally obligated yourself to redeem chocolate bars for dollars on demand (as a money pump would require), nor does anyone expect that you will trade the rest of your dollars this way.

It's called diminishing marginal utility. In fact, it's called marginal analysis in general.

It means you would trade B for A on the *next* opportunity to do so, not that you would indefinitely do it forever, as the money pump requires.

"When you buy a chocolate bar for a dollar, that "preference of a chocolate bar to a dollar" does not somehow mean that you are willing to trade every dollar you have for a chocolate bar, nor have you legally obligated yourself to redeem chocolate bars for dollars on demand (as a money pump would require), nor does anyone expect that you will trade the rest of your dollars this way."

Under normal circumstances, this is true, because the situation has changed after I bought the chocolate bar: I now have an additional chocolate bar, or (more likely) an additional bar's worth of chocolate in my stomach. My preferences change, because the situation has changed.

However, after you have bought A, and swapped A for B, and sold B, you have not gained anything (such as a chocolate bar, or a full stomach), and you have not lost anything (such as a dollar); you are in precisely the same position that you were before. Hence, consistency dictates that you should make the same decision as you did before. If, after buying the chocolate bar, it fell down a well, and another dollar was added to my bank account because of the chocolate bar insurance I bought, then yes, I should keep buying chocolate bars forever if I want to be consistent (assuming that there is no cost to my time, which there essentially isn't in this case).

I actually think that (for some examples) it's actually simpler than that. The Allais paradox assumes that the proposal of the bet itself has no effect on the utility of the proposee. In reality, if I took a 5% chance at $100M, instead of a 100% chance at $4M, there's a 95% chance I'd be kicking myself every time I opened my wallet for the rest of my life. Thus, taking the bet and losing is significantly worse than never having the bet proposed at all. If this is factored in correctly, EY's original formulation of the Allais Paradox is no longer functional: I prefer certainty, because losing when certainty was an option carries lower utility than never having bet.

This is more about how you calculate outcomes than it is about independence directly. If losing when you could have had a guaranteed (or nearly-guaranteed) win carries negative utility, and if you can only play once, it does not seem like it contradicts independence.

Glad this formulation is useful! I do indeed think that people often behave like you describe, without generally losing huge sums of cash.

However, the conclusion of my post is that it *is* irrational to deviate from expected utility for small sums. Aggregating every small decision you make will give you expected utility.

It's a good result, but I wonder if the standard deviation is the best parameter. Loss-averse agents react differently to asymmetrical distributions allowing large losses than those allowing large gains.

Edit: For example, an exponential distribution f(x; L) = L * e^(-L*x) has mean and standard deviation 1/L, but a loss-averse agent is likely to prefer it to the normal distribution N(1/L, 1/L^2), which has the same mean and standard deviation.
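One way to see the asymmetry is to compare the probability of an outright loss (a result below zero) under the two distributions. The sketch below uses the standard closed-form CDFs; the rate value is an arbitrary assumption and the helper names are hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def exponential_cdf(x, rate):
    """P(X <= x) for an exponential distribution with the given rate."""
    return 0.0 if x <= 0 else 1.0 - math.exp(-rate * x)

rate = 2.0               # arbitrary illustrative value of L
mean = sd = 1.0 / rate   # both distributions share this mean and SD

p_loss_exponential = exponential_cdf(0.0, rate)
p_loss_normal = normal_cdf(0.0, mean, sd)
```

The exponential lottery never pays less than zero, while the normal one with the same mean and SD falls below zero about 16% of the time (one standard deviation below the mean), whatever value of L is chosen.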

Once you abandon independence, the possibilities are literally infinite - and not just easily controllable infinities, either. I worked with SD as that's the simplest model I could use; but skewness, kurtosis or, Bayes help us, the higher moments, are also valid choices.

You just have to be careful that your choice of units is consistent; the SD and the mean are in the same unit, the variance is in units squared, the skewness and kurtosis are unitless, the k-th moment is in units to the power k, etc...

That's true - and it occurred to me after I posted the comment that your criteria don't define the decision system *anyway*, so even using some other method you might still be able to prove that it meets your conditions.

See also semivariance in the context of investment (and betting in general). NB: "semivariance" has a different meaning in the context of spatial statistics.

"The mean of Y is nβ. The variance of Y is the sum of the v_{j}, which is less than nK^{2}." Been a while for me, but doesn't this require the lotteries to be uncorrelated? If so, that should be listed with your axioms.

It requires the lotteries to be independent, which implies uncorrelated. Stuart_Armstrong specified independence.

Ugh, color me stupid - I assumed the "independence" we were relaxing was probability-related. Thanks RobinZ.

You know, I didn't even realise I'd used "independence" both ways! Most of the time, it's only worth pointing out the fact if the random variables are *not* independent.

No problem. (Don't you love it when people use the same symbol for multiple things in the same work? I know as a mechanical engineer, I got so much *joy* from remembering which "h" is the heat transfer coefficient and which is the height!)