The idea is to compare not the results of actions, but the results of decision algorithms. The question that the agent should ask itself is thus:
"Suppose everyone[1] who runs the same thinking procedure as I do uses decision algorithm X. What utility would I get at the 50th percentile (not: what expected utility would I get) after my life is finished?"
Then, he should of course look for the X that maximizes this value.
Now, if you allow a Turing-complete "decision algorithm", this leads to an infinite loop. But suppose that a "decision algorithm" is defined as a huge table that lists many possible situations together with the appropriate output for each.
Let's see what results such a thing should give:
- If the agent has the opportunity to take a gamble, and the probabilities involved are not small, and he expects to be allowed to take many gambles like this in the future, he should decide exactly as if he were maximizing expected utility: if he makes many decisions like this, he will get a positive utility difference at the 50th percentile if and only if his expected utility from taking the gamble is positive.
- However, if Pascal's mugger comes along, he will decline: the total probability of living in a universe where people like this mugger ought to be taken seriously is small. In the probability distribution over utility at the end of the agent's lifetime, the possibility of getting tortured will show up only very slightly at the 50th percentile, much less than the possibility of losing 5 dollars.
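The two cases above can be checked with a small Python sketch. Everything in it is an illustrative assumption rather than part of the original argument: the payoffs of the repeated gamble, the mugger's probability `eps`, the size of the threatened harm `harm`, and the helper names `median_of`, `mean_of` and `repeated_gamble` are all made up for the example. It only treats the utility at the end of life as a discrete distribution and compares the 50th percentile with the mean.

```python
from math import comb

def median_of(outcomes):
    """50th percentile of a discrete distribution given as (value, probability) pairs."""
    cdf = 0.0
    for value, prob in sorted(outcomes):
        cdf += prob
        if cdf >= 0.5:
            return value
    # Only reached if rounding keeps the summed probabilities below 0.5.
    return max(value for value, _ in outcomes)

def mean_of(outcomes):
    """Expected value of the same kind of distribution."""
    return sum(value * prob for value, prob in outcomes)

def repeated_gamble(n, p, win, loss):
    """Distribution of the total payoff after n independent plays of a gamble
    that pays +win with probability p and -loss otherwise."""
    return [
        (k * win - (n - k) * loss, comb(n, k) * p**k * (1 - p) ** (n - k))
        for k in range(n + 1)
    ]

# Case 1: an ordinary gamble, played many times (assumed numbers).
good = repeated_gamble(n=100, p=0.5, win=1.2, loss=1.0)  # positive expectation
bad = repeated_gamble(n=100, p=0.5, win=0.8, loss=1.0)   # negative expectation

# Case 2: Pascal's mugger (assumed numbers).
eps = 1e-20    # assumed probability that the threat is real
harm = 1e30    # assumed disutility if it is carried out
pay = [(-5.0, 1.0)]                          # hand over 5 dollars for sure
decline = [(-harm, eps), (0.0, 1.0 - eps)]   # keep the money, risk the harm

if __name__ == "__main__":
    for name, dist in [("good gamble", good), ("bad gamble", bad),
                       ("pay mugger", pay), ("decline mugger", decline)]:
        print(name, "median:", median_of(dist), "mean:", mean_of(dist))
```

With these numbers, the median and the mean agree in sign for the repeated gamble, while for the mugger the mean favors paying and the median favors declining.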
The reason why humans intuitively decline to give money to the mugger might be similar: they imagine not the expected utility under each decision, but the typical outcome of giving the mugger some money versus declining to.
[1] I say this to make agents of the same type cooperate in prisoner's-dilemma-like situations.
The original justification for the Kelly criterion isn't that it maximizes a utility function that's logarithmic in wealth, but that it provides a strategy that, in the infinite limit, does better than any other strategy with probability 1. This doesn't mean that it maximizes expected utility (as your examples for linear utility show), but it's not obvious to me that the attractiveness of this property comes mainly from assigning infinite negative value to zero wealth, or that using the Kelly criterion is a similar error to the one Weitzman made.
Yes, if we have large populations of "all-in bettors" and Kelly bettors, then as the number of bets increases, the all-in bettors' lead in total wealth increases exponentially, while the probability of an all-in bettor being ahead of a Kelly bettor falls exponentially. And as you go to infinity, the wealth multiplier of the all-in bettors goes to infinity, while the probability of an all-in bettor leading a Kelly bettor goes to zero. And that was the originally cited reasoning.
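A rough numerical illustration of the two trends, as a sketch under assumptions that are not in the comment itself: an even-money bet won with probability `p = 0.6`, both bettors facing the same sequence of outcomes, and hypothetical function names. Under those assumptions the Kelly fraction is `2p - 1`, the all-in bettor is ahead only if every single bet wins, and the expected multipliers have simple closed forms.

```python
def all_in_expected(p, n):
    """Expected wealth multiplier after n even-money bets when staking everything:
    2**n with probability p**n, and 0 otherwise."""
    return (2 * p) ** n

def kelly_expected(p, n):
    """Expected wealth multiplier after n even-money bets at the Kelly fraction f = 2p - 1.
    Each bet multiplies wealth by (1 + f) with probability p and (1 - f) otherwise."""
    f = 2 * p - 1
    return (p * (1 + f) + (1 - p) * (1 - f)) ** n

def prob_all_in_ahead(p, n):
    """Probability that the all-in bettor is ahead of the Kelly bettor after n shared bets:
    this happens only if all n bets are won."""
    return p ** n

if __name__ == "__main__":
    p = 0.6  # assumed win probability
    for n in (1, 10, 50, 100):
        ratio = all_in_expected(p, n) / kelly_expected(p, n)
        print(n, "expected-wealth ratio:", ratio, "P(all-in ahead):", prob_all_in_ahead(p, n))
```

The ratio of expected wealth grows geometrically in `n` while the probability of the all-in bettor leading shrinks geometrically, which is the pair of trends described above.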
Now, one might be confused by the "beats any other constant bankroll ...