What happens if you're using this method and you're offered a gamble where you have a 49% chance of gaining 1000000 utils and a 51% chance of losing 5 utils (if you don't take the deal you gain and lose nothing)? Isn't the "typical outcome" here a loss, even though we might really really want to take the gamble? Or have I misunderstood what you propose?
Depending on the rest of your utility distribution, that is probably true. Note, however, that an additional 10^6 utility in the right half of the utility function will change the median outcome of your "life": if 10^6 is larger than all the other utility you could ever receive, and you add a 49% chance of receiving it, the 50th percentile utility after that should look like the 98th percentile utility before.
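To make the percentile claim concrete, here is a rough sketch. It assumes a purely hypothetical baseline in which lifetime utility is uniform over 0..9999; the helper name `weighted_quantile` and all the numbers other than 49%, 10^6 and 5 are illustrative assumptions, not anything from the original proposal.

```python
from fractions import Fraction

def weighted_quantile(outcomes, q):
    """Smallest value whose cumulative probability reaches q."""
    running = Fraction(0)
    for value, prob in sorted(outcomes):
        running += prob
        if running >= q:
            return value
    raise ValueError("probabilities sum to less than q")

# Hypothetical baseline: lifetime utility uniform on 0..9999.
n = 10_000
baseline = [(u, Fraction(1, n)) for u in range(n)]

# The gamble from the question: 49% chance of +10^6, 51% chance of -5.
p_win, gain, loss = Fraction(49, 100), 10**6, 5
with_gamble = (
    [(u - loss, prob * (1 - p_win)) for u, prob in baseline]
    + [(u + gain, prob * p_win) for u, prob in baseline]
)

median_before = weighted_quantile(baseline, Fraction(1, 2))
median_after = weighted_quantile(with_gamble, Fraction(1, 2))
old_98th = weighted_quantile(baseline, Fraction(98, 100))

print(median_before, median_after, old_98th)
```

Under these assumptions the median after accepting lands within a few utils of the old 98th percentile, because the losing 51% of the probability mass has to supply the whole bottom half of the new distribution.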
Could you rephrase this somehow? I'm not understanding it. If you actually won the bet and got the extra utility, your median expected utility would be higher, but you wouldn't take the bet, because your median expected utility is lower if you do.
In such a case, the median outcome of all agents will be improved if every agent with the option to do so takes that offer, even if they are assured that it is a once-in-a-lifetime offer (because presumably there is variance of more than 5 utils between agents).
But the median outcome is losing 5 utils?
Edit: Oh, wait! You mean the median total utility after some other stuff happens (with a variance of more than 5 utils)?
Suppose we have 200 agents, 100 of which start with 10 utils, the rest with 0. After taking this offer, we have 51 with -5, 51 with 5, 49 with 10000, and 49 with 10010. The median outcome would be a loss of 5 for half the agents, a gain of 5 for half, but only the half that would lose could actually get that outcome...
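A quick check of the 200-agent arithmetic, as a sketch. It uses the 10000-util gain that the figures in this example imply (not the 1000000 from the original question); the variable names are made up for illustration.

```python
from statistics import median

LOSS, GAIN = -5, 10_000

# Each agent is a (starting utility, change from the gamble) pair:
# in each group of 100, 51 lose and 49 win.
agents = []
for start in (10, 0):
    agents += [(start, LOSS)] * 51 + [(start, GAIN)] * 49

before = [start for start, _ in agents]
after = [start + change for start, change in agents]
changes = [change for _, change in agents]

print("median total before:", median(before))
print("median total after: ", median(after))
print("median change:      ", median(changes))
```

With these particular numbers the median total across the population does not move, while the median change per agent is the 5-util loss, which seems to be the tension being pointed at.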
And what do you mean by "the possibility of getting tortured will manifest itself only very slightly at the 50th percentile"? I thought you were restricting yourself to median outcomes, not distributions? How do you determine the median distribution?
I don't. I didn't write that.
Your formulation requires that there be a single, high probability event that contributes most of the utility an agent has the opportunity to get over its lifespan. In situations where this is not the case (e.g. real life), the decision agent in question would choose to take all opportunities like that.
The closest real-world analogy I can draw to this is the decision of whether or not to start a business. If you fail (which there is a slightly more than 50% chance you will), you are likely to be in debt for quite some time. If you succeed, you will be very rich. This is not quite a perfect analogy, because you will have more than one chance in your life to start a business, and the outcomes of business ownership are not orders of magnitude larger than the outcomes in real life. However, it is much closer than the "51% chance to lose $5, 49% chance to win $10000" that your example intuitively brings to mind.
Ah! Sorry for the mixed-up identities. Likewise, I didn't come up with that "51% chance to lose $5, 49% chance to win $10000" example.
But, ah, are you retracting your prior claim about a variance of greater than 5? Clearly this system doesn't work on its own, though it still looks like we don't know A) how decisions are made using it or B) under what conditions it works. Or in fact C) why this is a good idea.
Certainly for some distributions of utility, if the agent knows the distribution of utility across many agents, it won't make the wrong decision on that particular example by following this algorithm. I need more than that to be convinced!
For instance, it looks like it'll make the wrong decision on questions like "I can choose to 1) die here quietly, or 2) go get help, which has a 1/3 chance of saving my life but will be a little uncomfortable." The utility of surviving presumably swamps the rest of the utility function, right?
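Here is a small sketch of that objection. The utilities (1000 for surviving, 0 for dying, 1 for the discomfort) are hypothetical placeholders, and it applies the median rule to this one decision in isolation, which may not be what the proposal intends.

```python
from fractions import Fraction

def weighted_median(lottery):
    """Smallest outcome whose cumulative probability reaches 1/2."""
    running = Fraction(0)
    for value, prob in sorted(lottery):
        running += prob
        if running >= Fraction(1, 2):
            return value

def expected_value(lottery):
    return sum(value * prob for value, prob in lottery)

# Hypothetical utilities, chosen only for illustration.
SURVIVE, DIE, DISCOMFORT = 1000, 0, 1

die_quietly = [(DIE, Fraction(1))]
get_help = [
    (SURVIVE - DISCOMFORT, Fraction(1, 3)),
    (DIE - DISCOMFORT, Fraction(2, 3)),
]

for name, lottery in [("die quietly", die_quietly), ("get help", get_help)]:
    print(name, "median:", weighted_median(lottery),
          "expected:", float(expected_value(lottery)))
```

On these numbers the median outcome of getting help is "die anyway, slightly uncomfortable", so a naive median rule prefers dying quietly even though the expected utility of getting help is far higher.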
Ah, it appears that I'm mixing up identities as well. Apologies.
Yes, I retract the "variance greater than 5". I think it would have to be variance of at least 10,000 for this method to work properly. I do suspect that this method is similar to decision-making processes real humans use (optimizing the median outcome of their lives), but when you have one or two very important decisions instead of many routine decisions, methods that work for many small decisions don't work so well.
If, instead of optimizing for the median outcome, you optimized for the average of outcomes within 3 standard deviations of the median, I suspect you would come up with a decision outcome quite close to what people actually use (ignoring very small chances of very high risk or reward).
A bounded utility function, on which increasing years of happy life (or money, or whatever) give only finite utility in the infinite limit, does not favor taking vanishing probabilities of immense payoffs. It also preserves normal expected utility calculations so that you can think about 90th percentile and 10th percentile, and lets you prefer higher payoffs in probable cases.
Basically, this "median outcome" heuristic looks like just a lossy compression of a bounded utility function's choice outputs, subject to new objections like APMason's. Why not just go with the bounded utility function?
I want it to be possible to have a very bad outcome: if I can play a lottery that has a cost of 1 utilium, a payoff of 10^7 and a winning chance of 10^-6, and if I can play this lottery enough times, I want to play it.
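For a sense of scale, here is a sketch of how many plays that is, if "enough times" is read as "enough that at least one win is more likely than not". That reading is an assumption, as is treating the plays as independent.

```python
import math

p_win = 1e-6   # winning chance per play
cost = 1       # utilium per play
payoff = 10**7

# Smallest n with 1 - (1 - p_win)**n > 0.5, i.e. (1 - p_win)**n < 0.5.
n = math.ceil(math.log(0.5) / math.log1p(-p_win))

p_at_least_one_win = 1 - (1 - p_win) ** n
expected_gain_per_play = payoff * p_win - cost

print("plays needed:", n)
print("P(at least one win):", p_at_least_one_win)
print("expected gain per play:", expected_gain_per_play)
```

That comes to roughly 693,000 plays. Below that count the median outcome is "never won, paid for every ticket", even though each individual play has positive expected value.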
"Enough times" to make it >50% likely that you will win, yes? Why is this the correct cutoff point?
Kelly asked a question: given you have finite wealth, how do you decide how much to bet on a given offered bet in order to maximize the rate at which your expected wealth grows?
The Kelly criterion doesn't maximize expected wealth, it maximizes expected log wealth, as the article you linked mentions:
The conventional alternative is utility theory which says bets should be sized to maximize the expected utility of the outcome (to an individual with logarithmic utility, the Kelly bet maximizes utility, so there is no conflict)
Suppose that I can make n bets, each time wagering any proportion of my bankroll that I choose and then getting three times the wagered amount if a fair coin comes up Heads, and losing the wager on Tails. Expected wealth is maximized if I always bet the entire bankroll, with an expected wealth of (initial bankroll)(3^n)(2^-n), where 2^-n is the probability of all Heads. The Kelly criterion trades off from that maximum expected wealth in favor of log wealth.
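A sketch of that trade-off for the coin-flip bet. It assumes "three times the wagered amount" includes the returned stake (net odds of 2 to 1), which is the reading consistent with the 3^n figure; the function names and the choice of n = 10 are illustrative.

```python
import math

P_HEADS = 0.5
NET_ODDS = 2.0  # assumption: a winning wager w comes back as 3w, a net gain of 2w

def expected_log_growth(f):
    """Expected log of the per-bet wealth multiplier when betting fraction f."""
    return (P_HEADS * math.log(1 + NET_ODDS * f)
            + (1 - P_HEADS) * math.log(1 - f))

def expected_wealth(f, n, bankroll=1.0):
    """Expected wealth after n independent bets at fraction f."""
    per_bet = P_HEADS * (1 + NET_ODDS * f) + (1 - P_HEADS) * (1 - f)
    return bankroll * per_bet ** n

# Standard Kelly formula for a binary bet: f* = p - q / b.
kelly = P_HEADS - (1 - P_HEADS) / NET_ODDS

# Cross-check by brute force; f = 1 is excluded because log(0) is undefined.
grid = [i / 1000 for i in range(1000)]
best_on_grid = max(grid, key=expected_log_growth)

n = 10
print("Kelly fraction:", kelly, "grid maximum:", best_on_grid)
print("expected wealth, all-in:", expected_wealth(1.0, n))
print("expected wealth, Kelly: ", expected_wealth(kelly, n))
print("P(not ruined) when all-in:", P_HEADS ** n)
```

Betting everything gives the larger expected wealth, (3/2)^n times the bankroll, but almost all of it sits in the single all-Heads branch; the Kelly fraction gives up expected wealth in exchange for expected log growth.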
A utility function that goes with log wealth values gains less, but it also values losses much more, with insane implications at the extremes. With log utility, multiplying wealth by 1,000,000 has the same marginal utility whatever your wealth, and dividing wealth by 1,000,000 has the negative of that utility. Consider these two gambles:
Gamble 1) Wealth of $1 with certainty.
Gamble 2) Wealth of $0.00000001 with 50% probability, wealth of $1,000,000 with 50% probability.
Log utility would favor $1, but for humans Gamble 2 is clearly better; there is very little difference for us between total wealth levels of $1 and a millionth of a cent.
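The comparison between Gambles 1 and 2 can be checked directly. This is only a sketch of the arithmetic, using natural log as the utility function; the helper names are illustrative.

```python
import math

def expected_log_utility(gamble):
    """Expected utility under u(w) = ln(w); gamble is a list of (wealth, prob)."""
    return sum(prob * math.log(wealth) for wealth, prob in gamble)

def expected_wealth(gamble):
    return sum(prob * wealth for wealth, prob in gamble)

gamble_1 = [(1.0, 1.0)]
gamble_2 = [(0.00000001, 0.5), (1_000_000.0, 0.5)]

for name, gamble in [("Gamble 1", gamble_1), ("Gamble 2", gamble_2)]:
    print(name,
          "expected log utility:", expected_log_utility(gamble),
          "expected wealth:", expected_wealth(gamble))
```

Gamble 2 works out to an expected log utility of -ln(10), below Gamble 1's 0, because the drop from $1 to $0.00000001 is eight orders of magnitude while the rise to $1,000,000 is only six.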
Worse, consider these gambles:
Gamble 3) Wealth of $0.000000000000000000000000001 with certainty.
Gamble 4) Wealth of $1,000,000,000 with probability (1-1/3^^^3) and wealth of $0 with probability 1/3^^^3
Log utility favors Gamble 3, since it assigns $0 wealth infinite negative utility, and will sacrifice any finite gain to avoid it. But for humans Gamble 4 is vastly better, and a 1/3^^^3 chance of bankruptcy is negligibly worse than wealth of $1. Every day humans drive to engage in leisure activities, eat pleasant but not maximally healthful foods, go white-water rafting, and otherwise accept small (1 in 1,000,000, not 1 in 3^^^3) probabilities of death for local pleasure and consumption.
This is not my utility function. I have diminishing utility over a range of wealth levels, which log utility can represent, but it weights losses around zero too highly, and still buys a 1 in 10^100 chance of $3^^^3 in exchange for half my current wealth if no higher EV bets are available, as in Pascal's Mugging.
Abuse of a log utility function (chosen originally for analytical convenience) is what led Martin Weitzman astray in his "Dismal Theorem" analysis of catastrophic risk, suggesting that we should pay any amount to avoid zero world consumption (and not on astronomical waste grounds or the possibility of infinite computation or the like, just considering the limited populations Earth can support using known physics).
The original justification for the Kelly criterion isn't that it maximizes a utility function that's logarithmic in wealth, but that it provides a strategy that, in the infinite limit, does better than any other strategy with probability 1. This doesn't mean that it maximizes expected utility (as your examples for linear utility show), but it's not obvious to me that the attractiveness of this property comes mainly from assigning infinite negative value to zero wealth, or that using the Kelly criterion is a similar error to the one Weitzman made.
The idea is to compare not the results of actions, but the results of decision algorithms. The question that the agent should ask itself is thus:
"Suppose everyone[1] who runs the same thinking procedure as me uses decision algorithm X. What utility would I get at the 50th percentile (not: what expected utility should I get), after my life is finished?"
Then, he should of course look for the X that maximizes this value.
Now, if you formulate a Turing-complete "decision algorithm", this heads into an infinite loop. But suppose that "decision algorithm" is defined as a huge table for lots of different possible situations, and the appropriate outputs.
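A toy sketch of what such a table-based algorithm might look like. Everything here is a hypothetical stand-in: the two offers, their payoffs, the assumed "lifetime" of 20 + 5 independent situations, and the function names. It searches all accept/decline tables for the one with the highest 50th-percentile lifetime utility.

```python
from fractions import Fraction
from itertools import product

# Hypothetical offers: each is a list of (utility change, probability).
GAMBLES = {
    "risky": [(-5, Fraction(51, 100)), (100, Fraction(49, 100))],
    "bad_bet": [(-10, Fraction(3, 5)), (5, Fraction(2, 5))],
}
# Hypothetical lifetime: which situations come up, and how often.
LIFE = ["risky"] * 20 + ["bad_bet"] * 5

def add_gamble(dist, gamble):
    """Combine a distribution over totals with one independent gamble."""
    out = {}
    for total, p in dist.items():
        for change, q in gamble:
            out[total + change] = out.get(total + change, Fraction(0)) + p * q
    return out

def median_lifetime_utility(policy, life):
    """Exact 50th-percentile total utility when following the policy table."""
    dist = {0: Fraction(1)}
    for situation in life:
        if policy[situation]:  # the table says: accept this kind of offer
            dist = add_gamble(dist, GAMBLES[situation])
    running = Fraction(0)
    for total in sorted(dist):
        running += dist[total]
        if running >= Fraction(1, 2):
            return total

# Every possible table: one accept/decline entry per situation.
policies = [dict(zip(GAMBLES, choice))
            for choice in product([False, True], repeat=len(GAMBLES))]
best = max(policies, key=lambda pol: median_lifetime_utility(pol, LIFE))

one_shot = median_lifetime_utility({"risky": True, "bad_bet": False}, ["risky"])
print("best table:", best)
print("median over the lifetime:", median_lifetime_utility(best, LIFE))
print("median if 'risky' comes up only once:", one_shot)
```

In this toy setup the table that accepts the repeated "risky" offer wins, even though a single "risky" offer on its own has a losing median. That is the difference between evaluating decision algorithms over a lifetime and evaluating individual actions.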
Let's see what results such a thing should give:
The reason why humans will intuitively decline to give money to the mugger might be similar: They imagine not the expected utility with both decisions, but the typical outcome of giving the mugger some money, versus declining to.
[1] I say this to make agents of the same type cooperate in prisoner-like dilemmas.