
Stuart_Armstrong comments on Median utility rather than mean? - Less Wrong Discussion

6 Post author: Stuart_Armstrong 08 September 2015 04:35PM


Comments (86)


Comment author: Stuart_Armstrong 09 September 2015 10:40:13AM *  0 points [-]

I don't understand your argument that the median utility maximizer would buckle its seat belt in the real world.

It derives from the fact that median maximalisation doesn't consider decisions independently, even if their gains and losses are independent.

For illustration, compare the following deal: you pay £q, and get £1 with probability p. There are n independent deals (assume your utility is linear in £).

If n=1, the median maximiser accepts the deal iff q<1 and p>0.5. Not a very good performance! Now let's look at larger n. For m ≤ n, accepting m deals gets you an expected reward of £m(p-q). The median is a bit more complicated (see https://en.wikipedia.org/wiki/Binomial_distribution#Mode_and_median ), but it's within £1 of the mean reward.

So if p<q, the mean maximiser will reject all deals, and if p>q, it will accept all n deals.

For p<q, the median maximiser will accept at most 1/(q-p) deals. And for p>q, it will accept at least n - 1/(p-q) deals. In all cases, its expected loss, compared with the mean maximiser, is less than £1.
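The deal example above can be checked numerically. What follows is only a minimal sketch: the function names are made up for illustration, the median is taken to be the lower median (the smallest k whose cumulative probability reaches 0.5), and ties between numbers of deals go to the smaller number. For p>q the check uses the looser bound n - 2/(p-q), which is what the within-£1 property alone guarantees.

```python
from math import comb


def binomial_median(m, p):
    """Lower median of Binomial(m, p): smallest k with P(X <= k) >= 0.5."""
    cdf = 0.0
    for k in range(m + 1):
        cdf += comb(m, k) * p**k * (1 - p) ** (m - k)
        if cdf >= 0.5:
            return k
    return m  # only reachable through floating-point rounding


def mean_reward(m, p, q):
    """Expected reward in £ from accepting m deals."""
    return m * (p - q)


def median_reward(m, p, q):
    """Median reward in £ from accepting m deals (wins minus total price)."""
    return binomial_median(m, p) - m * q


def best_number_of_deals(reward, n, p, q):
    """Number of deals in 0..n maximising `reward`; ties go to fewer deals."""
    return max(range(n + 1), key=lambda m: reward(m, p, q))


if __name__ == "__main__":
    n = 200  # illustrative value, not from the discussion
    for p, q in [(0.30, 0.35), (0.35, 0.30)]:
        m_mean = best_number_of_deals(mean_reward, n, p, q)
        m_median = best_number_of_deals(median_reward, n, p, q)
        loss = mean_reward(m_mean, p, q) - mean_reward(m_median, p, q)
        print(f"p={p}, q={q}: mean maximiser takes {m_mean} deals, "
              f"median maximiser takes {m_median}, expected loss £{loss:.2f}")
```

With n=1 the same functions reproduce the one-shot behaviour: the median maximiser takes the deal only when p>0.5, whatever the mean says.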

There's a similar effect going on when considering the seat-belt situation. Aggregation concentrates the distribution in a way that moves the median and mean towards each other.

Comment author: AlexMennen 09 September 2015 04:24:43PM 0 points [-]

You appear to now be making an argument that you already conceded was incorrect in OP:

This means that the decision of a median maximiser will be close to those of a utility maximiser - they take almost the same precautions - though the outcomes are still pretty far apart: the median maximiser accepts a 49.99999...% chance of death.

You then go on to say that if the agent also faces many decisions of a different nature, it won't do that. That's where I get lost.
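The quoted claim from the original post can be made concrete with a toy lifetime model. This is a sketch under loud assumptions that are not in the thread: 1000 independent trips, a 1% accident chance per unbuckled trip, a loss of 100 per accident, a cost of 0.1 per buckled trip, buckling removing the loss entirely, and utility being additive across trips. The names are hypothetical and the median is the lower median.

```python
from math import comb


def binomial_median(m, p):
    """Lower median of Binomial(m, p): smallest k with P(X <= k) >= 0.5."""
    cdf = 0.0
    for k in range(m + 1):
        cdf += comb(m, k) * p**k * (1 - p) ** (m - k)
        if cdf >= 0.5:
            return k
    return m  # only reachable through floating-point rounding


def lifetime_median_utility(buckled, trips, r, loss, cost):
    """Median lifetime utility when the belt is worn on `buckled` of `trips` trips."""
    unbuckled = trips - buckled
    return -cost * buckled - loss * binomial_median(unbuckled, r)


def best_buckled(trips, r, loss, cost):
    """Number of buckled trips that maximises median lifetime utility."""
    return max(range(trips + 1),
               key=lambda k: lifetime_median_utility(k, trips, r, loss, cost))


if __name__ == "__main__":
    trips, r, loss, cost = 1000, 0.01, 100.0, 0.1  # assumed toy numbers
    k = best_buckled(trips, r, loss, cost)
    p_accident = 1 - (1 - r) ** (trips - k)
    print(f"one-shot: buckles on {best_buckled(1, r, loss, cost)} of 1 trips")
    print(f"lifetime: buckles on {k} of {trips} trips, "
          f"accepting a {p_accident:.1%} chance of at least one accident")
```

Under these assumptions the one-shot agent never buckles, while the lifetime agent buckles on most trips but leaves just enough unbuckled to keep the chance of any accident under 50%. That is the "almost the same precautions, very different outcomes" pattern the quote describes.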

Comment author: Stuart_Armstrong 09 September 2015 05:00:47PM 0 points [-]

The median maximiser accepts a 49.99999...% chance of death, only because "death", "trivial cost" and "no cost" are the only options here. If I add "severe injury" and "light injury" to the outcomes, the maximiser will now accept less than a 49.9999...% chance of light injury. If we make light injury additive, and make the trivial cost also additive and not incomparable to light injuries, we get something closer to my illustrative example above.

Comment author: AlexMennen 09 September 2015 08:34:32PM 1 point [-]

Suppose it comes up with 2 possible policies, one of which involves a 49% chance of death and no chance of injury, and another which involves a 49% chance of light injury, and no chance of heavy injury or death. The median maximizer sees no reason to prefer the second policy if they have the same effects the other 51% of the time.

Comment author: Stuart_Armstrong 10 September 2015 08:48:26AM 0 points [-]

Er, yes, constructing single-choice examples where the median behaves oddly/wrongly is trivial. My whole point is about what happens to the median when you aggregate decisions.

Comment author: AlexMennen 10 September 2015 04:15:00PM -1 points [-]

You were claiming that, in a situation where a median-maximizing agent has a large number of trivially inconvenient actions that prevent small risks of death, heavy injury, or light injury, it would accept a 49% chance of light injury, but you seemed to imply that it would not accept a 49% chance of death. I was trying to point out that this appears to be incorrect.

Comment author: Stuart_Armstrong 11 September 2015 08:30:29AM 1 point [-]

I'm not entirely sure what your objection is; we seem to be talking at cross purposes.

Let's try it simpler. If we assume that the cost of buckling seat belts is incommensurable (in practice) with light injury (and heavy injury, and death), then the median maximising agent will accept a 49.99..% chance of (light injury or heavy injury or death), over their lifetime. Since light injury is much more likely than death, this in effect forces the probability of death down to a very low amount.

It's just an illustration of the general point that median maximising seems to perform much better in real-world problems than its failure in simple theoretical ones would suggest.

Comment author: AlexMennen 11 September 2015 04:27:46PM -2 points [-]

Since light injury is much more likely than death, this in effect forces the probability of death down to a very low amount.

No, it doesn't. That does not address the fact that the agent will not preferentially accept light injury over death. Adopting a policy of immediately committing suicide once you've been injured enough to force you into the bottom half of outcomes does not decrease median utility. The agent has no incentive to prevent further damage once it is in the bottom half of outcomes. As a less extreme example, the value of house insurance to a median maximizer is 0, just because losing your house is a bad outcome even if you get insurance money for it. This isn't a weird hypothetical that relies on it being an isolated decision; it's a real-life decision that a median maximizer would get wrong.
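To put numbers on the house insurance point, here is a small sketch with made-up figures: a 5% chance of losing the house, a loss of 100 uninsured, a residual loss of 10 when insured (losing the house is still bad), and a premium of 1. None of these values come from the thread, and the median used is the lower median of a discrete distribution.

```python
def weighted_median(outcomes):
    """Lower median of a discrete distribution given as (value, probability) pairs."""
    total = 0.0
    for value, prob in sorted(outcomes):
        total += prob
        if total >= 0.5:
            return value
    raise ValueError("probabilities must sum to at least 0.5")


def mean(outcomes):
    """Expected value of the same (value, probability) pairs."""
    return sum(value * prob for value, prob in outcomes)


# Hypothetical distributions over total utility.
uninsured = [(0.0, 0.95), (-100.0, 0.05)]
insured_free = [(0.0, 0.95), (-10.0, 0.05)]   # insurance at no cost
insured_paid = [(-1.0, 0.95), (-11.0, 0.05)]  # insurance with a premium of 1

if __name__ == "__main__":
    for name, dist in [("uninsured", uninsured),
                       ("insured, free", insured_free),
                       ("insured, premium 1", insured_paid)]:
        print(f"{name}: median {weighted_median(dist)}, mean {mean(dist):.2f}")
```

In this toy case free insurance leaves the median unchanged, so it is worth exactly nothing to the median maximizer, and any positive premium makes it look strictly worse, even though the mean clearly favours buying it.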

Comment author: Stuart_Armstrong 14 September 2015 11:39:59AM 0 points [-]

A more general way of stating how multiple decisions improve median maximalisation: the median maximaliser is indifferent to outcomes not at the median (eg suicide vs light injury). But as the decision tree grows, and the number of possible situations with it, the probability increases that outcomes not at the median in a one-shot decision will affect the median in the more complex situation.

Comment author: AlexMennen 14 September 2015 05:27:14PM 0 points [-]

This argument relies on your utility being a sum of effects from each of the decisions you made, but in reality, your decisions interact in much more complicated ways, so that isn't a realistic model.

Also, if your defense of median maximization consists entirely of an argument that it approximates mean maximization, then what's the point of all this? Why not just use expected utility maximization? I'm expecting you to bring up Pascal's mugging here, but since VNM-rationality does not force you to pay the mugger, you'll have to do better than that.

Comment author: Stuart_Armstrong 14 September 2015 11:37:58AM *  0 points [-]

Look, we're arguing past each other here. My logical response here would be to add more options to the system, which would remove the problem you identified (and I don't understand your house insurance example - this is just the seat-belt decision again as a one-shot, and I would address it by looking at all the financial decisions you make in your life - and if that's not enough, all the decisions, including all the "don't do something clearly stupid and pointless" ones).

What I think is clear is:

a) Median maximalisation makes bad decisions in isolated problems.

b) If we combine all the likely decisions that a median maximiser will have to make, the quality of the decisions increases.

If you want to argue against it, either say that a) is bad enough we should reject the approach anyway, even if it decides well in practice, or find examples where a real world median maximaliser will make bad decisions even in the real world (if you would pay Pascal's mugger, then you could use that as an example).

Comment author: AlexMennen 14 September 2015 05:11:35PM 0 points [-]

I don't understand your house insurance example - this is just the seat-belt decision again as a one-shot

We were modeling the seat-belt decision as something that makes the difference between being dead and being completely fine in the event of an accident (which I suppose is not very realistic, but whatever). I was trying to point to a situation where an event can happen which is bad enough to put you in the bottom half of outcomes either way, so that nothing that happens conditional on the event can affect the median outcome, but a decision you can make ahead of time would make the difference between bad and worse.

I do think that a) is bad enough, because a decision procedure that does poorly in isolated problems is wrong, and thus cannot be expected to do well in realistic situations, as I mentioned previously. I guess b) is probably technically true, but it is not enough for the quality of the decisions to increase when the number increases; it should actually increase towards a limit that isn't still awful, and come close to achieving that limit (I'm pretty sure it fails on at least one of those, though which step it fails on might depend on how you make things precise). I've given examples where median maximizers make bad decisions in the real world, but you've dismissed them with vague appeals to "everything will be fine when you consider it in the context of all the other decisions it has to make".