AlexMennen comments on Median utility rather than mean? - Less Wrong Discussion

6 Post author: Stuart_Armstrong 08 September 2015 04:35PM

Comments (86)

Comment author: AlexMennen 09 September 2015 04:59:45AM 0 points [-]

How do you know that it's right to buckle your seatbelt? If you are only going to ride in a car once, never again.

I do think that the isolation of the decision is a red herring, but for the sake of the point I was trying to make, it is probably easier to replace the example with a structurally similar one in which the right answer is obvious: suppose you have the opportunity to press a button that will kill you with 49% probability, and give you $5 otherwise. This is the only decision you will ever make. Should you press the button?
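
To make the button example concrete, here is a minimal sketch comparing the two decision rules on it. The utility assigned to death (`DEATH_UTILITY`), the choice to measure money in utility units, and the use of the lower median (the smallest utility whose cumulative probability reaches 1/2) are all assumptions made for illustration; none of them come from the thread.

```python
from fractions import Fraction

# Hypothetical utility numbers, chosen only for illustration.
DEATH_UTILITY = -1_000_000
FIVE_DOLLARS = 5
STATUS_QUO = 0

def expected_utility(lottery):
    """Mean utility of a lottery given as (probability, utility) pairs."""
    return sum(p * u for p, u in lottery)

def median_utility(lottery):
    """Lower median: smallest utility whose cumulative probability is >= 1/2."""
    cumulative = Fraction(0)
    for p, u in sorted(lottery, key=lambda pair: pair[1]):
        cumulative += p
        if cumulative >= Fraction(1, 2):
            return u
    raise ValueError("probabilities must sum to at least 1/2")

press = [(Fraction(49, 100), DEATH_UTILITY), (Fraction(51, 100), FIVE_DOLLARS)]
decline = [(Fraction(1), STATUS_QUO)]

# The median rule sees only the 51% outcome, so it prefers pressing.
median_prefers_press = median_utility(press) > median_utility(decline)
# The mean rule weighs the 49% chance of death, so it prefers declining.
eu_prefers_press = expected_utility(press) > expected_utility(decline)

print("median rule presses:", median_prefers_press)
print("EU rule presses:", eu_prefers_press)
```

Under these assumed numbers the median rule presses the button and the expected-utility rule does not, which is the divergence the example is meant to expose.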

Perhaps there is some compromise between them that gets the behavior we want.

As I was saying in my previous comment, I think that's the wrong approach. It isn't enough to kludge together a decision procedure that does what you want on the problems you thought of, because then it will do something you don't want on something you haven't thought of. You need a decision procedure that will reliably do the right thing, and in order to get that, you need it to do the right thing for the right reasons. EU maximization, applied properly, will tell you to do the correct things, and will do so for the correct reasons.

So there is no inherent reason to prefer mean over median.

Actually, there is: https://en.wikipedia.org/wiki/Von_Neumann%E2%80%93Morgenstern_utility_theorem

Comment author: Houshalter 09 September 2015 05:55:02AM 0 points [-]

suppose you have the opportunity to press a button that will kill you with 49% probability, and give you $5 otherwise.

Yes, I said that median utility is not optimal. I'm proposing that there might be policies better than both EU and median.

Actually, there is: https://en.wikipedia.org/wiki/Von_Neumann%E2%80%93Morgenstern_utility_theorem

Please reread the OP and my comment. If you allow selection over policies instead of individual decisions, you can be perfectly consistent. EU and median are both special cases of ways to pick policies, based on the probability distribution of utility they produce.

You need a decision procedure that will reliably do the right thing, and in order to get that, you need it to do the right thing for the right reasons. EU maximization, applied properly, will tell you to do the correct things, and will do so for the correct reasons.

There is no law of the universe that some procedures are correct and others aren't. You just have to pick one that you like, and your choice is going to be arbitrary.

If you go with EU, you are Pascal-muggable. If you go with median, you are muggable in certain cases as well, though you should usually (with >50% probability) end up with better outcomes in the long run, whereas EU could possibly fail 100% of the time. So median is exploitable, but at least it is less exploitable.

Comment author: AlexMennen 09 September 2015 07:46:52AM 0 points [-]

If you allow selection over policies instead of individual decisions, you can be perfectly consistent.

I don't see how selecting policies instead of actions removes the motivation for independence.

You just have to pick one that you like, and your choice is going to be arbitrary.

Ultimately, it isn't the policy that you care about; it's the outcome. So you should pick a policy because you like the probability distributions over outcomes that you get from implementing it more than you like the probability distributions over outcomes that you would get from implementing other policies. Since there are many decision problems to use your policy on, this quite heavily constrains what policy you choose. In order to get a policy that reliably picks the actions that you decide are correct in the situations where you can tell what the correct action is, it will have to make those decisions for the same reason you decided that it was the best action (or at least something equivalent to or approximating the same reason). So no, the choice of policy is not at all arbitrary.

If you go with EU you are pascal muggable.

That is not true. EU maximizers with bounded utility functions reject Pascal's wager.
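
As a rough illustration of the bounded-utility point, the sketch below compares an unbounded (linear) utility function with a bounded one on a mugging-style offer. Every number here (the probability `P_MUGGER_HONEST`, the promised payoff, the cost, and the bound `BOUND`) is a made-up assumption, and `tanh` is just one convenient bounded shape; whether a bounded maximizer refuses a given offer depends on those choices.

```python
import math

# All values below are hypothetical, chosen only for illustration.
P_MUGGER_HONEST = 1e-20   # assumed probability the mugger delivers
PROMISED_PAYOFF = 1e30    # assumed size of the promised reward
COST = 5.0                # what the mugger asks for
BOUND = 1000.0            # assumed upper bound on utility

def linear_utility(x):
    """Unbounded utility: grows without limit in the payoff."""
    return x

def bounded_utility(x):
    """Bounded utility: roughly linear near 0, never exceeds BOUND in size."""
    return BOUND * math.tanh(x / BOUND)

def eu_of_paying(utility, p, payoff, cost):
    """Expected utility of paying: tiny chance of payoff, otherwise lose cost."""
    return p * utility(payoff - cost) + (1 - p) * utility(-cost)

def pays(utility):
    """True if paying has higher expected utility than refusing (utility of 0)."""
    return eu_of_paying(utility, P_MUGGER_HONEST, PROMISED_PAYOFF, COST) > utility(0.0)

print("linear utility pays:", pays(linear_utility))
print("bounded utility pays:", pays(bounded_utility))
```

With the linear function the promised payoff can always be made large enough to outweigh its tiny probability. With the bounded one, the gain term is capped at `P_MUGGER_HONEST * BOUND`, so inflating the promise further does not help the mugger.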

Comment author: Stuart_Armstrong 09 September 2015 10:52:25AM 1 point [-]

I don't see how selecting policies instead of actions removes the motivation for independence.

There are two reasons to like independence. First of all, you might like it for philosophical/aesthetic reasons: "these things really should be independent, these really should be irrelevant". Or you could like it because it prevents you from being money pumped.

When considering policies, money pumping is (almost) no longer an issue, because a policy that allows itself to be money-pumped is (almost) certainly inferior to one that doesn't. So choosing policies removes one of the motivations for independence, to my mind the important one.

Comment author: AlexMennen 09 September 2015 08:29:59PM 0 points [-]

While it's true that this does not tell you to pay each time to switch the outcomes around in a circle over and over again, it still falls prey to one step of a similar problem. Suppose there are 3 possible outcomes: A, B, and C, and there are 2 possible scenarios: X and Y. In scenario X, you get to choose between A and B. In scenario Y, you can attempt to choose between A and B, and you get what you picked with 50% probability, and you get outcome C otherwise. In each scenario, this is the only decision you will ever make. Suppose in scenario X, you prefer A over B, but in scenario Y, you prefer (B+C)/2 over (A+C)/2. But suppose you had to pay to pick A in scenario X, and you had to pay to pick (B+C)/2 in scenario Y, and you still make those choices. If Y is twice as likely as X a priori, then you are paying to get a probability distribution over outcomes that you could have gotten for free by picking B given X, and (A+C)/2 given Y. Since each scenario only involves you ever getting to make one decision, picking a policy is equivalent to picking a decision.
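
The arithmetic behind this example can be checked with a short sketch. It assumes prior probabilities of 1/3 for X and 2/3 for Y (which is what "Y is twice as likely as X" gives when they are the only scenarios) and ignores the payments themselves, so it only compares the distributions over A, B, and C. The function and variable names are made up for illustration.

```python
from collections import Counter
from fractions import Fraction

# Assumed priors: Y is twice as likely as X, and they are the only scenarios.
P_X = Fraction(1, 3)
P_Y = Fraction(2, 3)

def outcome_distribution(choice_in_x, choice_in_y):
    """Prior distribution over outcomes for a policy (one choice per scenario)."""
    dist = Counter()
    # Scenario X: you get exactly what you pick.
    dist[choice_in_x] += P_X
    # Scenario Y: you get your pick with probability 1/2, otherwise outcome C.
    dist[choice_in_y] += P_Y * Fraction(1, 2)
    dist["C"] += P_Y * Fraction(1, 2)
    return dict(dist)

# The policy that pays in both scenarios: A in X, (B+C)/2 in Y.
paid_policy = outcome_distribution("A", "B")
# The policy that costs nothing: B in X, (A+C)/2 in Y.
free_policy = outcome_distribution("B", "A")

print(paid_policy)
print(free_policy)
print("same distribution:", paid_policy == free_policy)
```

Both policies give A, B, and C a probability of 1/3 each, so the payments in the first policy buy no change in the prior distribution over outcomes.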

Comment author: Houshalter 09 September 2015 09:22:01PM 0 points [-]

Your example is difficult to follow, but I think you are missing the point. If there is only one decision, then its actions can't be inconsistent. By choosing a policy only once - one that maximizes its desired probability distribution of utility outcomes - it's not money pumpable, and it's not inconsistent.

Now by itself it still sucks because we probably don't want to maximize for the best median future. But it opens up the door to more general policies for making decisions. You no longer have to use expected utility if you want to be consistent. You can choose a tradeoff between expected utility and median utility (see my top level comment), or a different algorithm entirely.

Comment author: AlexMennen 09 September 2015 11:52:42PM *  0 points [-]

If there is only one decision point in each possible world, then it is impossible to demonstrate inconsistency within a world, but you can still be inconsistent between different possible worlds.

Edit: as V_V pointed out, the VNM framework was designed to handle isolated decisions. So if you think that considering an isolated decision rather than multiple decisions removes the motivation for the independence axiom, then you have misunderstood the motivation for the independence axiom.

Comment author: Stuart_Armstrong 10 September 2015 08:46:45AM 1 point [-]

So if you think that considering an isolated decision rather than multiple decisions removes the motivation for the independence axiom, then you have misunderstood the motivation for the independence axiom.

I understand the two motivations for the independence axiom, and the practical one ("you can't be money pumped") is much more important than the theoretical one ("your system obeys this here philosophically neat understanding of irrelevant information").

But this is kind of a moot point, because humans don't have utility functions. And therefore we will have to construct them. And the process of constructing them is almost certainly going to depend on facts about the world, making the construction process almost certainly inconsistent between different possible worlds.

Comment author: AlexMennen 10 September 2015 11:00:40PM 0 points [-]

And the process of constructing them is almost certainly going to depend on facts about the world

It shouldn't. If your preferences among outcomes depend on what options are actually available to you, then I don't see how you can justify claiming to have preferences among outcomes, as opposed to tendencies to make certain choices.

Comment author: Stuart_Armstrong 11 September 2015 08:37:05AM 1 point [-]

It shouldn't.

Then define me a process that takes people's current mess of preferences, makes these into utility functions, and, respecting bounded rationality, is independent of options available in the real world. Even then, we have the problem that this mess of preferences is highly dependent on real world experiences in the first place.

I don't see how you can justify claiming to have preferences among outcomes, as opposed to tendencies to make certain choices.

If I always go left at a road, I have a tendency to make certain choices. If I have a full model of the entire universe with labelled outcomes ranked on a utility function, and use it with unbounded rationality to make decisions, I have preferences among outcomes. The extremes are clear.

I feel that a bounded human being with a crude mental model that is trying to achieve some goal, imperfectly (because of ingrained bad habits, for instance) is better described as having preferences among outcomes. You could argue that they have mere tendencies, but this seems to stretch the term. But in any case, this is a simple linguistic dispute. Real human beings cannot achieve independence.

Comment author: Houshalter 10 September 2015 12:08:00AM 0 points [-]

It can't be inconsistent within a world no matter how many decision points there are. If we agree it's not inconsistent, then what are you arguing against?

I don't care about the VNM framework. As you said, it is designed to be optimal for decisions made in isolation. Because we don't need to make decisions in isolation, we don't need to be constrained by it.

Comment author: AlexMennen 10 September 2015 12:29:28AM 0 points [-]

If we agree it's not inconsistent...

No. Inconsistency between different possible worlds is still inconsistency.

Because we don't need to make decisions in isolation, we don't need to be constrained by it.

The difference doesn't matter that much in practice. If there are multiple decision points, you can combine them into one by selecting a policy, or by considering them sequentially and using your beliefs about what your choices will be in the future to compute the expected utilities of the possible decisions available to you now. The reason that the VNM framework was designed for one-shot decisions is that it makes things simpler without actually constraining what it can be applied to.

Comment author: Houshalter 11 September 2015 12:01:04AM 0 points [-]

No. Inconsistency between different possible worlds is still inconsistency.

It's perfectly consistent in the sense that it's not money pumpable, and always makes the same decisions given the same information. It will make different decisions in different situations, given different information. But that is not inconsistent by any reasonable definition of "inconsistent".

The difference doesn't matter that much in practice.

It makes a huge difference. If you want to get the best median future, then you can't make decisions in isolation. You need to consider every possible decision you will have to make, along with its probability, and choose a decision policy that selects the best median outcome.