A plan for Pascal's mugging?

yttrium

The idea is to compare not the results of actions, but the results of decision algorithms. The question that the agent should ask itself is thus:

"Suppose everyone¹ who runs the same thinking procedure as I do uses decision algorithm X. What utility would I get at the 50th percentile (not: what expected utility would I get) after my life is finished?"
Then, he should of course look for the X that maximizes this value.

Now, if you formulate a Turing-complete "decision algorithm", this heads into an infinite loop. But suppose that the "decision algorithm" is defined as a huge table that lists many possible situations and the appropriate output for each.

Let's see what results such a thing should give:

If the agent has the possibility to play a gamble, and the probabilities involved are not small, and he expects to be allowed to play many gambles like this in the future, he should decide exactly as if he were maximizing expected utility: once he has made many decisions like this, he will get a positive utility difference at the 50th percentile if and only if his expected utility from playing the gamble is positive.
However, if Pascal's mugger comes along, he will decline: the total probability of living in a universe where people like this mugger ought to be taken seriously is small. In the probability distribution over utility at the end of the agent's lifetime, the possibility of getting tortured will manifest itself only very slightly at the 50th percentile, much less than the possibility of losing 5 dollars.
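
A minimal sketch of these two cases, under assumptions that are not in the post: the repeated gamble pays +1.5 or -1 utility on a fair coin and is played 100 times, and the mugger's threat is real with probability 1e-9 and costs 1e12 utility (a small stand-in for the numbers usually used in the thought experiment). It only looks at one agent's lifetime distribution and ignores the "everyone who runs the same procedure" clause and the lookup-table formulation. All function names are hypothetical.

```python
from math import comb

def percentile(outcomes, q=0.5):
    """Smallest value v with P(utility <= v) >= q, for (value, prob) pairs."""
    cumulative = 0.0
    for value, prob in sorted(outcomes):
        cumulative += prob
        if cumulative >= q:
            return value
    return max(v for v, _ in outcomes)  # guard against rounding error

def expected(outcomes):
    """Expected utility of a list of (value, prob) pairs."""
    return sum(v * p for v, p in outcomes)

def repeated_gamble(n, p_win, gain, loss):
    """Exact lifetime-utility distribution after n independent plays."""
    return [(k * gain - (n - k) * loss,
             comb(n, k) * p_win**k * (1 - p_win)**(n - k))
            for k in range(n + 1)]

# Case 1: many ordinary gambles (assumed numbers).
accept = repeated_gamble(n=100, p_win=0.5, gain=1.5, loss=1.0)
decline = [(0.0, 1.0)]

# Case 2: Pascal's mugger (assumed stand-in numbers).
P_REAL = 1e-9      # probability that the threat is genuine
TORTURE = -1e12    # utility if the threat is genuine and we refuse
pay = [(-5.0, 1.0)]
refuse = [(TORTURE, P_REAL), (0.0, 1.0 - P_REAL)]

for name, dist in [("accept", accept), ("decline", decline),
                   ("pay", pay), ("refuse", refuse)]:
    print(name, "median:", percentile(dist), "expected:", expected(dist))
```

With these numbers the median and the expectation agree on the repeated gamble, while for the mugger the expectation favors paying and the median favors refusing.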

The reason why humans will intuitively decline to give money to the mugger might be similar: They imagine not the expected utility with both decisions, but the typical outcome of giving the mugger some money, versus declining to.

¹I say this to make agents of the same type cooperate in prisoner-like dilemmas.


Kelly asked a question: given that you have finite wealth, how do you decide how much to bet on a given offered bet in order to maximize the rate at which your expected wealth grows?

The Kelly criterion doesn't maximize expected wealth; it maximizes expected log wealth, as the article you linked mentions:

The conventional alternative is utility theory which says bets should be sized to maximize the expected utility of the outcome (to an individual with logarithmic utility, the Kelly bet maximizes utility, so there is no conflict)

Suppose that I can make n bets, each time wagering any proportion of my bankroll that I choose, and then getting three times the wagered amount if a fair coin comes up Heads and losing the wager on Tails. Expected wealth is maximized if I always bet the entire bankroll, with an expected wealth of (initial bankroll) × 3^n × 2^-n, where 2^-n is the probability of all Heads. The Kelly criterion gives up some of that maximum expected wealth in favor of expected log wealth.
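
A rough numeric check of this bet, assuming that "three times the wagered amount" means a fraction f of the bankroll turns the bankroll into (1 + 2f) times itself on Heads and (1 - f) times itself on Tails. The function names and the grid of fractions are illustrative choices.

```python
import math

def expected_wealth(f, n, bankroll=1.0):
    """Expected wealth after n bets, always staking fraction f."""
    per_bet = 0.5 * (1 + 2 * f) + 0.5 * (1 - f)
    return bankroll * per_bet ** n

def expected_log_growth(f):
    """Expected log of the per-bet wealth multiplier for fraction f."""
    if f >= 1.0:
        return float("-inf")  # Tails leaves zero wealth, log(0) = -infinity
    return 0.5 * math.log(1 + 2 * f) + 0.5 * math.log(1 - f)

fractions = [i / 100 for i in range(101)]
best_for_wealth = max(fractions, key=lambda f: expected_wealth(f, n=10))
best_for_log = max(fractions, key=expected_log_growth)

print("fraction maximizing expected wealth:", best_for_wealth)
print("fraction maximizing expected log wealth:", best_for_log)
print("expected wealth when all-in for 10 bets:", expected_wealth(1.0, 10))
print("chance the all-in bettor is not broke after 10 bets:", 0.5 ** 10)
```

Under that reading of the payoff, the all-in strategy has the highest expected wealth, while the log-wealth maximizer stakes a quarter of the bankroll each time.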

A utility function that goes with log wealth values gains less, but it also values losses much more, with insane implications at the extremes. With log utility, multiplying wealth by 1,000,000 has the same marginal utility whatever your wealth, and dividing wealth by 1,000,000 has the negative of that utility. Consider these two gambles:

Gamble 1) Wealth of $1 with certainty.

Gamble 2) Wealth of $0.00000001 with 50% probability, wealth of $1,000,000 with 50% probability.

Log utility would favor $1, but for humans Gamble 2 is clearly better; there is very little difference for us between total wealth levels of $1 and a millionth of a cent.
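
The comparison between Gambles 1 and 2 can be checked directly. This is a small sketch using natural-log utility of wealth in dollars; the helper names are hypothetical.

```python
import math

def expected_log_utility(gamble):
    """Expected log-wealth of a list of (wealth, probability) pairs."""
    return sum(p * math.log(w) for w, p in gamble)

def expected_wealth(gamble):
    """Expected wealth of a list of (wealth, probability) pairs."""
    return sum(p * w for w, p in gamble)

gamble_1 = [(1.0, 1.0)]
gamble_2 = [(0.00000001, 0.5), (1_000_000.0, 0.5)]

for name, g in [("Gamble 1", gamble_1), ("Gamble 2", gamble_2)]:
    print(name, "log utility:", expected_log_utility(g),
          "expected wealth:", expected_wealth(g))
```

Log utility ranks Gamble 1 higher because the drop from $1 to a millionth of a cent costs more log-units than the rise to $1,000,000 gains, even though Gamble 2 has an expected wealth of about $500,000.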

Worse, consider these gambles:

Gamble 3) Wealth of $0.000000000000000000000000001 with certainty.

Gamble 4) Wealth of $1,000,000,000 with probability (1-1/3^^^3) and wealth of $0 with probability 1/3^^^3

Log utility favors Gamble 3, since it assigns $0 wealth infinite negative utility, and will sacrifice any finite gain to avoid it. But for humans Gamble 4 is vastly better, and a 1/3^^^3 chance of bankruptcy is negligibly worse than wealth of $1. Every day humans drive to engage in leisure activities, eat pleasant but not maximally healthful foods, go white-water rafting, and otherwise accept small (1 in 1,000,000, not 1 in 3^^^3) probabilities of death for local pleasure and consumption.
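
A sketch of why log utility prefers Gamble 3. The number 1/3^^^3 cannot be represented as a float, so the code uses 1e-300 as a stand-in (an assumption, and an enormously larger probability than the real one); the conclusion is the same for any positive probability of zero wealth. Helper names are hypothetical.

```python
import math

def log_utility(wealth):
    # math.log(0) raises ValueError, so zero wealth is mapped to -infinity explicitly.
    return math.log(wealth) if wealth > 0 else float("-inf")

def expected_log_utility(gamble):
    """Expected log utility of a list of (wealth, probability) pairs."""
    return sum(p * log_utility(w) for w, p in gamble)

TINY = 1e-300  # stand-in for 1/3^^^3 (assumption)

gamble_3 = [(0.000000000000000000000000001, 1.0)]
gamble_4 = [(1_000_000_000.0, 1.0 - TINY), (0.0, TINY)]

print("Gamble 3:", expected_log_utility(gamble_3))
print("Gamble 4:", expected_log_utility(gamble_4))
```

Any positive probability multiplied by negative infinity is still negative infinity, so the billion-dollar branch of Gamble 4 cannot compensate, while Gamble 3 keeps a finite (if very negative) utility.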

This is not my utility function. I have diminishing utility over a range of wealth levels, which log utility can represent, but it weights losses around zero too highly, and still buys a 1 in 10^100 chance of $3^^^3 in exchange for half my current wealth if no higher EV bets are available, as in Pascal's Mugging.

Abuse of a log utility function (chosen originally for analytical convenience) is what led Martin Weitzman astray in his "Dismal Theorem" analysis of catastrophic risk, suggesting that we should pay any amount to avoid zero world consumption (and not on astronomical waste grounds or the possibility of infinite computation or the like, just considering the limited populations Earth can support using known physics).

The original justification for the Kelly criterion isn't that it maximizes a utility function that's logarithmic in wealth, but that it provides a strategy that, in the infinite limit, does better than any other strategy with probability 1. This doesn't mean that it maximizes expected utility (as your examples for linear utility show), but it's not obvious to me that the attractiveness of this property comes mainly from assigning infinite negative value to zero wealth, or that using the Kelly criterion is a similar error to the one Weitzman made.
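
One way to illustrate the "does better with probability 1 in the limit" property is with the coin bet from earlier in the thread, assuming a staked fraction f multiplies wealth by (1 + 2f) on Heads and (1 - f) on Tails, so that the Kelly fraction is 0.25. The rival fraction of 0.5 is an arbitrary choice, and the function names are hypothetical. The probabilities are computed exactly from the binomial distribution, not by simulation.

```python
import math
from math import comb

def log_wealth(f, heads, n):
    """Log of final wealth (initial wealth 1) after n bets with the given number of Heads."""
    return heads * math.log(1 + 2 * f) + (n - heads) * math.log(1 - f)

def prob_kelly_ahead(rival_f, n, kelly_f=0.25):
    """Probability that the Kelly bettor ends with more wealth than the rival."""
    return sum(comb(n, k) / 2**n
               for k in range(n + 1)
               if log_wealth(kelly_f, k, n) > log_wealth(rival_f, k, n))

for n in (10, 100, 1000):
    print(n, "bets: P(Kelly ahead of f=0.5) =", prob_kelly_ahead(0.5, n))
```

The rival has the higher expected wealth per bet (a factor of 1.25 against 1.125), yet the probability that it finishes ahead shrinks as the number of bets grows, which is the sense in which the two criteria come apart.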

mwengler

Please enlighten a poor physicist. You write: [...] I thought the log function operating on real positive numbers was real and monotonically increasing with wealth. I thought wealth, for the purposes of the Wikipedia article and Kelly criterion calculations, was real and positive. So how can something which is said to maximize log(wealth) not also be said to maximize wealth with identical meaning? Seriously, if there is some meaningful sense in which something that maximizes log(wealth) does not also maximize wealth, I am at a loss to even guess what it is and would appreciate being enlightened.
