banx comments on Open thread, January 25- February 1 - Less Wrong

8 Post author: NancyLebovitz 25 January 2014 02:52PM


Comment author: banx 26 January 2014 12:27:13AM *  5 points [-]

Is it always correct to choose that action with the highest expected utility?

Suppose I have a choice between action A, which grants -100 utilons with 99.9% chance and +1000000 utilons with 0.1% chance, or action B which grants +1 utilon with 100% chance. A has an expected utility of +900.1 utilons, while B has an expected utility of +1 utilon. This decision will be available to me only once, and all future decision will involve utility changes on the order of a few utilons.

Intuitively, it seems like action A is too risky. I'll almost certainly end up with a huge decrease in utility, just because there's a remote chance of a windfall. Risk aversion doesn't apply here, since we're dealing in utility, right? So either I'm failing to truly appreciate the chance at getting 1M utilons -- I'm stuck thinking about it as I would money -- or this is a case where there's reason to not take the action that maximizes expected value. Help?

EDIT: Changed the details of action A to what was intended
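The arithmetic in the comment above can be checked with a quick sketch (a hypothetical calculation that just restates the numbers given in the comment):

```python
# Expected utility of each action, using the numbers from the comment above.
# A lottery is a list of (probability, utilons) pairs.
action_a = [(0.999, -100), (0.001, 1_000_000)]
action_b = [(1.0, 1)]

def expected_utility(lottery):
    """Sum of probability-weighted payoffs."""
    return sum(p * u for p, u in lottery)

print(expected_utility(action_a))  # ≈ 900.1
print(expected_utility(action_b))  # 1.0
```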

Comment author: Alejandro1 26 January 2014 01:00:27AM *  14 points [-]

I think the non-intuitive nature of the A choice is because we naturally think of utilons as "things". For any valuable thing (money, moments of pleasure, whatever), anybody who is minimally risk averse would choose B. But utilons are not things; they are abstractions defined by one's preferences. So the claim that A is the rational choice is a tautology, in the standard versions of utility theory.

It may help to think of it the other way around, starting from the actual preferences. You would choose a 99.9% chance of losing ten cents and a 0.1% chance of winning 10000 dollars over winning one cent with certainty, right? So then perhaps, as long as we don't think of other bets and outcomes, we can map winning 1 cent to +1 utilon, losing 10 cents to -100 utilons, and winning 10000 dollars to +10000 utilons. Then we can refine and extend the "outcomes <=> utilons" map by considering your actual preferences under more and more bets. As long as your preferences are self-consistent in the sense of the VNM axioms, there will be a mapping that can be constructed.

ETA: of course, it is possible that your preferences are not self-consistent. The Allais paradox is an example where many people's intuitive preferences are not self-consistent in the VNM sense. But constructing such a case is more complicated than just considering risk aversion on a single bet.

Comment author: [deleted] 26 January 2014 01:26:49AM 11 points [-]

Also, it's quite possible that your utility function doesn't evaluate to +10000 for any value of its argument, i.e. it's bounded above.

Comment author: Matt_Simpson 31 January 2014 05:14:58AM 0 points [-]

Since utility functions are only unique up to affine transformation, I don't know what to make of this comment. Do you have some sort of canonical representation in mind or something?

Comment author: [deleted] 31 January 2014 07:25:49AM -1 points [-]

In the context of this thread, you can consider U(status quo) = 0 and U(status quo, but with one more dollar in my wallet) = 1. (OK, that makes +10000 an unreasonable estimate of the upper bound; pretend I said +1e9 instead.)

Comment author: jsteinhardt 26 January 2014 07:04:43AM -1 points [-]

Yes, this seems almost certainly true (and I think is even necessary if you want to satisfy the VNM axioms, otherwise you violate the continuity axiom).

Comment author: [deleted] 26 January 2014 09:48:15AM 2 points [-]

(and I think [a bounded utility function] is even necessary if you want to satisfy the VNM axioms, otherwise you violate the continuity axiom)

An unbounded function is one that can take arbitrarily large finite values, not necessarily one that actually evaluates to infinity somewhere.

Comment author: jsteinhardt 27 January 2014 08:41:18AM 1 point [-]

Yes, I'm quite aware... note that if there's a sequence of outcomes whose values increase without bound, then you could construct a lottery with infinite value by appropriately mixing those outcomes together, e.g. put probability 2^-k on the outcome with value 2^k. Then this lottery would be problematic from the perspective of continuity (or even of having an evaluable utility function).
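The construction described above (a St. Petersburg-style lottery) can be sketched; each term contributes probability times value = 2^-k * 2^k = 1, so the partial expected values grow without bound:

```python
# St. Petersburg-style lottery: outcome k has probability 2**-k and value 2**k.
# Each term contributes 2**-k * 2**k = 1, so the partial expected values grow
# linearly in n and the full expectation diverges.
def partial_expected_value(n):
    return sum(2**-k * 2**k for k in range(1, n + 1))

print(partial_expected_value(10))   # 10.0
print(partial_expected_value(100))  # 100.0
```

(The probabilities themselves are fine: they sum to 1 as k ranges over all positive integers. It is only the expectation that blows up.)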

Comment author: [deleted] 27 January 2014 09:03:51AM 0 points [-]

Are lotteries allowed to have infinitely many possible outcomes? (The Wikipedia page about the VNM axioms only says "many"; I might look it up on the original paper when I have time.)

Comment author: Oscar_Cunningham 27 January 2014 12:27:29PM *  3 points [-]

There are versions of the VNM theorem that allow infinitely many possible outcomes, but they either

1) require additional continuity assumptions so strong that they force your utility function to be bounded

or

2) they apply only to some subset of the possible lotteries (i.e. there will be some lotteries for which your agent is not obliged to define a utility).

I might look it up on the original paper when I have time.

The original statement and proof given by VNM are messy and complicated. They have since been neatened up a lot. If you have access to it, try Follmer, H., and Schied, A., Stochastic Finance: An Introduction in Discrete Time, de Gruyter, Berlin, 2004.

EDIT: It's online.

Comment author: Matt_Simpson 31 January 2014 05:13:47AM 0 points [-]

See also Kreps, Notes on the Theory of Choice. Note that one of these two restrictions is required specifically to prevent infinite expected utility. So if a lottery spits out infinite expected utility, you broke something in the VNM axioms.

For anyone who's interested, a quick and dirty explanation is that the preference relation is primitive, and we're trying to come up with an index (a utility function) that reproduces the preference relation. In the case of certainty, we want a function U: O -> R, where O is the outcome space and R is the real numbers, such that U(o1) > U(o2) if and only if o1 is preferred to o2. In the case of uncertainty, U is defined on the set of probability distributions over O, i.e. U: M(O) -> R. With the VNM axioms, we get U(L) = E_L[u(o)], where L is some lottery (i.e. a probability distribution over O). U is strictly prohibited from taking the value of infinity in these definitions. You probably could extend them a little to allow for such infinities (at the cost of VNM utility, perhaps), but you would need every lottery with infinite expected value to be tied for the best lottery according to the preference relation.
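The representation U(L) = E_L[u(o)] described above can be sketched for a finite outcome space (the outcome labels and the index u below are made up for illustration; nothing in the thread pins them down):

```python
# A VNM-style representation for finite lotteries: the utility of a lottery
# is the expectation of an index u: O -> R over outcomes.
u = {"worst": 0.0, "middle": 0.4, "best": 1.0}  # hypothetical index

def U(lottery):
    """U(L) = E_L[u(o)] for a lottery given as {outcome: probability}."""
    return sum(p * u[o] for o, p in lottery.items())

# The agent prefers L1 to L2 iff U(L1) > U(L2):
L1 = {"best": 0.5, "worst": 0.5}   # a 50/50 gamble
L2 = {"middle": 1.0}               # a sure thing
print(U(L1), U(L2))  # 0.5 0.4, so L1 is preferred
```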

Comment author: jsteinhardt 27 January 2014 09:14:24AM 0 points [-]

I'm not sure, although I would expect VNM to invoke the Hahn-Banach theorem, and it seems hard to do that if you only allow finite lotteries. If you find out I'd be quite interested. I'm only somewhat confident in my original assertion (say 2:1 odds).

Comment author: Oscar_Cunningham 26 January 2014 11:48:49AM 3 points [-]

A, which grants -100 utilons with 99.9% chance and +10000 utilons with 0.1%

A has an expected utility of +900.1 utilons

Um, A actually has an expected utility of -89.9.

That explains why it seems less appealing!

Comment author: ThrustVectoring 26 January 2014 03:48:54AM 3 points [-]

I'd flip that around. Whatever action you end up choosing reveals what you think has highest utility, according to the information and utility function you have at the time. It's almost a definition of what utility is - if you consistently make choices that rank lower according to what you think your utility function is, then your model of your utility function is wrong.

If the utility function you think you have prefers B over A, and you prefer A over B, then there's some fact that's missing from the utility function you think you have (probably related to risk).

I've recently come to terms with how much fear/anxiety/risk avoidance is in my revealed preferences. I'm working on working with that to do effective long-term planning -- the best trick I have so far is weighing "unacceptable status quo continues" as a risk. That, and making explicit comparisons between anticipated and experienced outcomes of actions (consistently over-estimating risks doesn't help any, and I've been doing that).

Comment author: TylerJay 26 January 2014 07:48:33PM 0 points [-]

I sometimes have the same intuition as banx. You're right that the problem is not in the choice, but in the utility function and it most likely stems from thinking about utility as money.

Let's examine the previous example and make it into money (dollars): -100 [dollars] with 99.9% chance and +10,000 [dollars] with 0.1% chance, vs. a 100% chance at +1 [dollar].

When doing the math, you have to take future consequences into account as well. For example, if you knew you would be offered 100 loaded bets in the future, each with an expected payoff of $0.50 and each costing only $1 to participate in, then you have to count this in your original payoff calculation if losing the $100 would prevent you from taking those other bets.

Basically, you have to think through all the long-term consequences when calculating expected payoff, even in dollars.
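The bankroll effect described above can be sketched with the comment's own hypothetical numbers (100 future bets, each with an expected payoff of $0.50; the interpretation that $0.50 is the expected net profit per bet is an assumption):

```python
# Hypothetical follow-on bets from the comment: 100 future loaded bets, each
# assumed to yield an expected net profit of $0.50, available only if you can
# still stake the $1 entry fee.
future_bets = 100
profit_per_bet = 0.50

# If losing $100 on the risky action wipes out your bankroll, the effective
# cost of that loss includes the foregone expected profit from future bets:
foregone = future_bets * profit_per_bet
effective_loss = 100 + foregone
print(effective_loss)  # 150.0: the loss "costs" more than its face value
```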

Then when you try to convert this to utility, it's even more complicated. Is the utility per dollar gained in the +$10,000 case equivalent to the utility per dollar lost in the -$100 case? Would you feel guilty and beat yourself up afterwards if you took a bet that you had a 99.9% chance of losing? Even though a purely rational agent probably shouldn't feel this, it's still likely a factor in most actual humans' utility functions.

ThrustVectoring summed it up well above: if the utility function you think you have prefers B over A, and you prefer A over B, then there's some fact that's missing from the utility function you think you have.

If you still prefer picking the +1 option, then your assessment that the first choice only gives a negative utility of 100 is probably wrong. There are some other factors that make it a less attractive choice.

Comment author: Qiaochu_Yuan 27 January 2014 07:29:36PM *  0 points [-]

Depending on your preferred framework, this is in some sense backwards: utility is, by definition, the thing whose expected value it is always correct to maximize (say, in the framework of the von Neumann-Morgenstern theorem).

Comment author: IlyaShpitser 27 January 2014 11:05:47AM *  -1 points [-]

Is it always correct to choose that action with the highest expected utility?

People who play with money don't like high variance, and sometimes trade off some of the mean to reduce variance.
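The mean-variance tradeoff mentioned above can be made concrete with the thread's two actions (using the payoff numbers from the edited original comment, and treating utilons as plain numbers):

```python
# Mean and variance of the two actions from the original comment.
def mean_var(lottery):
    """Mean and variance of a lottery given as (probability, payoff) pairs."""
    m = sum(p * x for p, x in lottery)
    v = sum(p * (x - m) ** 2 for p, x in lottery)
    return m, v

action_a = [(0.999, -100), (0.001, 1_000_000)]
action_b = [(1.0, 1)]

print(mean_var(action_a))  # mean ≈ 900.1, variance ≈ 1e9 (enormous)
print(mean_var(action_b))  # mean 1.0, variance 0.0
```

A has the higher mean but huge variance; B has a lower mean and zero variance, which is exactly the tradeoff a variance-averse trader would weigh.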