I believe this assumption typically comes from the Von Neumann–Morgenstern utility theorem, which says that, if your preferences are complete, transitive, continuous, and independent, then there is some utility function u such that your preferences are equivalent to "maximize expected u".
Those four assumptions have technical meanings:
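For reference, one common textbook formulation is sketched below. The notation is assumed here rather than taken from the comment: L, M, N are lotteries over outcomes, ⪯ is weak preference, ∼ is indifference, and pL + (1-p)N is the lottery that plays L with probability p and N otherwise.

```latex
% Assumed notation: \preceq = weak preference over lotteries, \sim = indifference.
\begin{itemize}
  \item \textbf{Completeness:} for all $L, M$: $L \preceq M$ or $M \preceq L$.
  \item \textbf{Transitivity:} if $L \preceq M$ and $M \preceq N$, then $L \preceq N$.
  \item \textbf{Continuity:} if $L \preceq M \preceq N$, then there is a $p \in [0,1]$
        with $pL + (1-p)N \sim M$.
  \item \textbf{Independence:} for all $N$ and all $p \in (0,1]$:
        $L \preceq M \iff pL + (1-p)N \preceq pM + (1-p)N$.
\end{itemize}
% Conclusion of the theorem: there exists a function u on outcomes such that
\[
  L \preceq M \iff \mathbb{E}[u(L)] \le \mathbb{E}[u(M)].
\]
```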
In my opinion, the continuity assumption is the one most likely to be violated (in particular, it excludes preferences where "no chance of any X is worth any amount of Y"), so these aren't necessarily a given, but if you do satisfy them, then there's some utility function that describes your preferences by maximizing it.
There is a problem with completeness that requires studying the actual theorem and its construction of utility from preference. The preference function does not range over just the possible “outcomes” (which we suppose are configurations of the world or of some local part of it). It ranges over lotteries among these outcomes, as explicitly stated on the VNM page linked above. This implies that the idea of being indifferent between a sure gain of 1 util and a 0.1% chance of 1000 utils is already baked into the setup of these theorems, even before the proof constructs the utility function. A theorem cannot be used to support its assumptions.
The question, “Why should I maximise expected utility (or if not, what should I do instead)?” is a deep one which I don’t think has ever been fully answered.
The argument from iterated gambles requires that the utility of the gambles can be combined linearly. The OP points out that this is not the case if utility is nonlinear in the currency of the gambles.
Even when the utilities do combine linearly, there may be no long run. An example now well-known is one constructed by Ole Peters, in which the longer you play, the greater the linearly expected ...
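A minimal sketch of one commonly cited version of Peters's coin-toss gamble may help here. Since the comment above is cut off, the specific numbers are an assumption and may not match the variant the commenter had in mind: each round a fair coin multiplies your wealth by 1.5 on heads and by 0.6 on tails. The function names are hypothetical.

```python
from math import comb

# Assumed multipliers for a fair coin (heads / tails); not taken from the thread.
UP, DOWN = 1.5, 0.6

def expected_multiplier(n):
    """Expected wealth after n rounds, as a multiple of the starting wealth."""
    return (0.5 * UP + 0.5 * DOWN) ** n

def typical_multiplier(n):
    """Wealth multiple when exactly half of the n tosses are heads."""
    return (UP * DOWN) ** (n / 2)

def prob_ahead(n):
    """Exact probability of ending n rounds above the starting wealth."""
    winning = sum(comb(n, h) for h in range(n + 1)
                  if UP**h * DOWN**(n - h) > 1.0)
    return winning / 2**n

for n in (10, 100):
    print(n, expected_multiplier(n), typical_multiplier(n), prob_ahead(n))
```

Under these assumptions the expected multiplier grows with the number of rounds while the typical multiplier and the probability of being ahead both shrink, which is the tension between the "linear" expectation and what a single player experiences over time.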
Completeness is also in question for any real-world application, because it implies consistency over time.
It does seem quite reasonable to maximize utility when you can choose an option that makes that possible; my question is why you should maximize expected utility when the choice is made under uncertainty.
This is not a complete answer, it's just a way of thinking about the matter that was helpful to me in the past, and so might be to you too:
Saying that you ought to maximise the expected value of a real valued function of everything still leaves a huge amount of freedom; you can encode what you want by picking the right function over the right things.
So you can think of it as a language: a conventional way of expressing decision strategies. If you can write a decision strategy as the maximisation of the expected value of some function, then you have written the problem in the language of utility.
Like any generic language, this won't stop you from expressing anything in general, but it will make certain things easier to express than others. If you know at least two languages, you'll have sometimes encountered short words that can't be efficaciously translated to a single word in the other language.
Similarly, thinking that you ought to maximise expected utility, and then asking "what is my utility then?", naturally suggests to your mind certain kinds of strategies rather than others.
Some decisions may need many epicycles to be cast as utility maximisation. Whether this indicates a problem with utility maximisation, with the specific decision, or with the utility function is left to your judgement.
There is currently no theory of decision that just works for everything, so there is no totally definitive argument for maximum expected utility. With further experience, you'll have to learn when and how not to apply it.
Thank you for your insight. The problem with this view of utility "just as a language" is that sometimes I feel that the conclusions of utility maximization are not "rational", and I cannot figure out why they should be rational if the language is not saying anything that is meaningful to my intuition.
In the context of decision making under uncertainty we consider two strategies: maximizing the expected monetary revenue and maximizing the expected utility. We provide an argument to show that under certain hypotheses it is rational to maximize expected monetary revenue, then we show that the argument doesn't apply to expected utility. We are left with the question of how to justify the rationality of the strategy of maximizing expected utility.
Expected monetary revenue
Suppose you have to choose one of two games, A and B, each with its own probability distribution, whose expected monetary returns are 1$ and 2$ respectively.
If you play many times, say N, the law of large numbers and the central limit theorem become relevant: the probability distributions of the total winnings from N repetitions of A and of B concentrate more and more sharply around N and 2N respectively, so their masses become increasingly well separated.
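A rough numeric sketch of this separation, under assumptions that are not in the post: the plays are independent, each game has a standard deviation of 5$ per play (a made-up value), and the totals are treated as normally distributed, as the central limit theorem suggests for large N. The function name is hypothetical.

```python
from math import erf, sqrt

def prob_b_total_beats_a(n, mean_a=1.0, mean_b=2.0, sd_a=5.0, sd_b=5.0):
    """Normal approximation of P(total of n plays of B > total of n plays of A).

    The standard deviations are hypothetical; only the means come from the text.
    """
    gap = n * (mean_b - mean_a)             # difference of expected totals grows like n
    spread = sqrt(n * (sd_a**2 + sd_b**2))  # std of the difference grows like sqrt(n)
    z = gap / spread
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF evaluated at z

for n in (1, 10, 100, 1000):
    print(n, round(prob_b_total_beats_a(n), 4))
```

The gap between the expected totals grows like N while the spread grows only like √N, so the probability that B's total ends up ahead approaches 1 as N increases. For a single play with these made-up spreads it is only slightly above one half.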
At this point, it is clear that it is better for you to play B many times than to play A many times. You can predict this in advance by calculating the expected winnings of A and B. So assuming you can make "lots" of choices between A and B, you must prefer the one with the higher expected profit. But what if you can only make one choice? If the distributions of A and B overlap, does the expected profit still matter?
Even if you only have to make this particular choice once, it could be one of a long sequence of many different choices, not always between A and B, but between other sets of options. Even if the choices are different, we can still manage to take advantage of the LLN and the central limit theorem. Suppose we have a large number N of choices, the k-th of which is between two games Ak and Bk.
We can hope that, again, if you always choose the game with the higher expected return, the statistical randomness will become increasingly irrelevant for large N, and you will accumulate more money than a person following a different strategy.
For this hope to be realized, the revenues of the games Ak and Bk need to be uniformly bounded and "small enough" compared to N. This could not happen if for example:
This could indeed happen if all the games have revenues that lie inside a "bounded" range of values while N is "big enough" compared to this range.
The elements that will make it happen are:
To summarize: the strategy of always choosing the option that maximizes the expected monetary revenue is indeed rational if you can store or spend all the money and the choice is one of many choices whose individual means and standard deviations are small compared to the total amount of money expected to be accumulated.
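One way to make "small compared to the total" slightly more precise is Chebyshev's inequality. The sketch below is an added illustration rather than part of the argument above. It assumes that the N games are independent, that every game's revenue has variance at most σ², and that the higher-expectation strategy gains at least δ per choice on average. Both σ and δ are symbols introduced here.

```latex
% S_N  : total revenue when always picking the game with the higher expectation
% S'_N : total revenue under some other strategy
% Assumptions (not from the post): independent games, Var(X_k) <= sigma^2,
%              E[S_N] - E[S'_N] >= delta * N with delta > 0.
\begin{align*}
D_N &= S_N - S'_N, \qquad
\operatorname{Var}(D_N) \le 2\operatorname{Var}(S_N) + 2\operatorname{Var}(S'_N)
                        \le 4N\sigma^2, \\
\Pr\left(S_N \le S'_N\right)
  &\le \Pr\left(\lvert D_N - \mathbb{E}[D_N] \rvert \ge \delta N\right)
   \le \frac{\operatorname{Var}(D_N)}{\delta^2 N^2}
   \le \frac{4\sigma^2}{\delta^2 N}.
\end{align*}
```

The bound goes to zero as N grows, but it says nothing useful while σ² is comparable to δ²N, which is the regime where single games are not small compared to the whole sequence.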
Expected utility
The effect of adding new money to our wealth can be different depending on the wealth we already have at the present moment (10'000 $ can be much more meaningful to someone who has no money than to a millionaire). We model this fact by defining a "utility" function which represents the impact of the extra money on the life of a specific person. We could expect this utility function U(x) to be increasing with the amount of money x (the more the better) and its slope to be always decreasing (the richer we are the less we care about 1 more dollar), like this:
A rational agent will not be expected to maximize his monetary gain but rather his "utility" gain, if he can. But what happens when he has to make a choice under uncertainty with this new target?
Suppose we have to choose to play game A or game B, with some probability distributions for the revenues. In analogy with what we said in the previous section about monetary revenue, we could be tempted to say that instead of choosing the game with the maximum expected monetary revenue we choose the one with the maximum expected utility.
Example: if game A lets us win 1$ with probability 0.5, or otherwise nothing, and game B lets us win 2$ with probability 0.3, or otherwise nothing, we are no longer interested in the expected revenue, that is:
E(A)=0.5×1=0.5, E(B)=0.3×2=0.6
We do not choose B just because 0.6>0.5; instead we compute the expected utilities
E(U(A))=0.5×U(1), E(U(B))=0.3×U(2)
So if we have for example U(x)=√x (which is a function with a shape like in the graph above) then we have E(U(A))=0.5 and E(U(B))≈0.42 and therefore we choose A.
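The arithmetic of this example can be checked with a few lines of Python. This is only a sketch: the representation of a game as a list of (probability, winnings) pairs and the helper name `expected` are choices made here, not something from the post.

```python
from math import sqrt

# Each game is a list of (probability, winnings in dollars) pairs.
GAME_A = [(0.5, 1.0), (0.5, 0.0)]
GAME_B = [(0.3, 2.0), (0.7, 0.0)]

def expected(game, f=lambda x: x):
    """Expected value of f(winnings); with the default f this is the expected revenue."""
    return sum(p * f(x) for p, x in game)

print("expected revenue:", expected(GAME_A), expected(GAME_B))
print("expected utility:", expected(GAME_A, sqrt), expected(GAME_B, sqrt))
```

With U(x)=√x the ranking by expected utility (A ahead) is the opposite of the ranking by expected revenue (B ahead).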
But what if we have to make the same choice more than one time? Now the difference between monetary revenue and utility becomes more extreme:
So basically there is no point in computing expected utility of every single choice, if you have to make a sequence of choices: you actually need to compute the expected utility of every possible sequence of choices and then choose the sequence of choices that makes the expected utility as big as possible.
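To illustrate the point about sequences, here is a small sketch using the two games from the example (A: 1$ with probability 0.5, B: 2$ with probability 0.3) and U(x)=√x. It relies on assumptions that the post does not state: utility is applied once to the total winnings at the end, starting from zero wealth, the plays are independent, and the sequence is fixed in advance, so only the number of A plays and B plays matters. The function names are hypothetical.

```python
from math import comb, sqrt

def eu_total(n_a, n_b, u=sqrt):
    """Exact expected utility of total winnings from n_a plays of A and n_b plays of B."""
    total = 0.0
    for i in range(n_a + 1):        # i = number of wins in game A (1$ each)
        p_i = comb(n_a, i) * 0.5**i * 0.5**(n_a - i)
        for j in range(n_b + 1):    # j = number of wins in game B (2$ each)
            p_j = comb(n_b, j) * 0.3**j * 0.7**(n_b - j)
            total += p_i * p_j * u(1 * i + 2 * j)
    return total

def best_number_of_a_plays(n):
    """Number of A plays (out of n) that maximizes the expected utility of the total."""
    return max(range(n + 1), key=lambda n_a: eu_total(n_a, n - n_a))

for n in (1, 2, 50):
    print(n, eu_total(n, 0), eu_total(0, n), best_number_of_a_plays(n))
```

Under these assumptions, A has the higher expected utility for one or two plays, while for fifty plays playing only B has a higher expected utility than playing only A. So the game with the higher single-play expected utility is not the one to repeat over a long sequence.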
But how did we come to the conclusion that it was meaningful to compare expected utilities in order to make a rational decision? Because we derived a similar conclusion about expected monetary revenue, and utility looked like a refinement of that line of thought. But expected monetary revenue behaved very differently! It was additive, and this allowed us to use the central limit theorem, which was crucial to justify the role of the expectation in the decision under certain specific circumstances. The situation with expected utility is completely different: we cannot reproduce the argument above to justify the value of expected utility.
So here is the question we are left with: why should a rational agent maximize expected utility?