Wei_Dai comments on A fungibility theorem - Less Wrong

21 Post author: Nisan 12 January 2013 09:27AM

Comment author: Wei_Dai 12 January 2013 06:35:52PM * 6 points

Suppose further that you can quantify each item on that list with a function from world-histories to real numbers, and you want to optimize for each function, all other things being equal.

If fairness is one of my values, it can't necessarily be represented by such a function. (I.e., it may need to be a function from lotteries over world-histories to real numbers.)

What the theorem says is that if you really care about the values on that list, then there are linear aggregations that you should start optimizing for.

I think before you make this conclusion, you have to say something about how one is supposed to pick the weights. The theorem itself seems to suggest that I can pick the weights by choosing a Pareto-optimal policy/outcome that's mutually acceptable to all of my values, and then work backwards to a set of weights that would generate a utility function (or more generally, a way to pick such weights based on a coin-flip) that would then end up optimizing for the same outcome. But in this case, it seems to me that all of the real "optimizing" was already done prior to the time you form the linear aggregation.

(ETA: I guess the key question here is whether the weights ought to logically depend on the actual shape of the Pareto frontier. If yes, then you have to compute the Pareto frontier before you choose the weights, in which case you've already "optimized" prior to choosing the weights, since computing the Pareto frontier involves optimizing against the individual values as separate utility functions.)
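
To make the dependence concrete, here is a minimal Python sketch, not anything taken from the post. It assumes a toy finite set of outcomes scored by two values; the numbers and the names `pareto_frontier` and `weights_for` are made up for illustration. It picks an outcome first and then searches for weights under which that outcome maximizes the linear aggregation.

```python
from typing import List, Optional, Tuple

Outcome = Tuple[float, float]  # (score under value 1, score under value 2)


def pareto_frontier(outcomes: List[Outcome]) -> List[Outcome]:
    """Return the outcomes that no other outcome Pareto-dominates."""
    def dominates(a: Outcome, b: Outcome) -> bool:
        return a[0] >= b[0] and a[1] >= b[1] and a != b

    return [o for o in outcomes if not any(dominates(p, o) for p in outcomes)]


def weights_for(target: Outcome, outcomes: List[Outcome],
                steps: int = 100) -> Optional[Tuple[float, float]]:
    """Search a grid of weights (w, 1 - w) for one under which `target`
    maximizes the weighted sum. Returns None if no grid point works."""
    for i in range(steps + 1):
        w = i / steps

        def score(o: Outcome) -> float:
            return w * o[0] + (1 - w) * o[1]

        if all(score(target) >= score(o) for o in outcomes):
            return (w, 1 - w)
    return None


# Hypothetical outcomes; (3.0, 3.0) is dominated by (6.0, 8.0).
outcomes = [(0.0, 10.0), (6.0, 8.0), (9.0, 4.0), (10.0, 0.0), (3.0, 3.0)]
frontier = pareto_frontier(outcomes)

# The "real" decision: pick a mutually acceptable frontier point first...
chosen = (6.0, 8.0)
# ...and only then work backwards to weights that would have selected it.
weights = weights_for(chosen, outcomes)
```

Note that `weights_for` has to compare the chosen outcome against every alternative, so in this toy setting the weights are recovered from the decision rather than the other way around.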

Also, even if my values can theoretically be represented by functions from world-histories to real numbers, I can't obtain encodings of such functions since I don't have introspective access to my values, and therefore I can't compute linear aggregations of them. So I don't know how I can start optimizing for a linear aggregation of my values, even if I did have a reasonable way to derive the weights.

If you're capable of making precommitments and we don't worry about computational difficulties [...]

I'm glad you made these assumptions explicit, but shouldn't there be a similar caveat when you make the final conclusions? The way I see it, I have a choice between (A) a solution known to be optimal along some dimensions not including considerations of logical uncertainty and dynamical consistency, or (B) a very imperfectly optimized solution that nevertheless probably does take them into account to some degree (i.e., the native decision making machinery that evolution gave me). Sticking with B for now doesn't seem unreasonable to me (especially given the other difficulties I mentioned with trying to implement A).

(I've skipped some of the supporting arguments in this comment since I already wrote about them under the recent Harsanyi post. Let me know if you want me to clarify anything.)

Comment author: endoself 12 January 2013 08:32:09PM 3 points

I think before you make this conclusion, you have to say something about how one is supposed to pick the weights.

I agree with this concern. The theorem is basically saying that, given any sensible aggregation rule, there is a linear aggregation rule that produces the same decisions. However, it assumes that we already have a prior; the linear coefficients are allowed to depend on what we think the world actually looks like, rather than being a pure representation of values. I think people, especially those who don't understand the proof of this theorem, are likely to misinterpret it.

Comment author: Nisan 15 January 2013 12:49:50AM 1 point

I guess the key question here is whether the weights ought to logically depend on the actual shape of the Pareto frontier.

Yes, whether a set of weights leads to Pareto-dominance depends logically on the shape of the Pareto frontier. So the theorem does not help with the computational part of figuring out what one's values are.

I have a choice between (A) a solution known to be optimal along some dimensions not including considerations of logical uncertainty and dynamical consistency, or (B) a very imperfectly optimized solution that nevertheless probably does take them into account to some degree (i.e., the native decision making machinery that evolution gave me). Sticking with B for now doesn't seem unreasonable to me

Sticking with B by default sounds reasonable except when we know something about the ways in which B falls short of optimality and the ways in which B takes dynamical consistency issues into account. E.g., I can pretty confidently recommend that minor philanthropists donate all their charity to the single best cause, modulo a number of important caveats and exceptions. It's natural to feel that one should diversify their (altruistic, outcome-oriented) giving; but once one sees the theoretical justification for single-cause giving under ideal conditions and one explains away their intuitions with motives they don't endorse and heuristics that work okay in the EEA but not on this particular problem, I think they have a good reason to go with choice A.

Even then, the philanthropist still has to decide which cause to donate to. It's possible that once they believe they should construct a utility function for a particular domain, they'll be able to use other tools to come up with a utility function they're happy with. But this theorem doesn't guarantee that.

I tried not to claim too much in the OP. I hope no one reads this post and makes a really bad decision because of an overly-naive expected-utility calculation.

Comment author: Wei_Dai 15 January 2013 01:21:26PM 0 points

Yes, whether a set of weights leads to Pareto-dominance depends logically on the shape of the Pareto frontier. So the theorem does not help with the computational part of figuring out what one's values are.

Do you mean "figuring out what one's weights are"? Assuming yes, I think my point was a bit stronger than that, namely there's not necessarily a reason to figure out the weights at all, if in order to figure out the weights, you actually have to first come to a decision using some other procedure.

Sticking with B by default sounds reasonable except when we know something about the ways in which B falls short of optimality and the ways in which B takes dynamical consistency issues into account.

I think there are probably local Pareto improvements that we can make to B, but that's very different from switching to A (which is what your OP was arguing for).

E.g., I can pretty confidently recommend that minor philanthropists donate all their charity to the single best cause, modulo a number of important caveats and exceptions. It's natural to feel that one should diversify their (altruistic, outcome-oriented) giving;

I agree this seems like a reasonable improvement to B, but I'm not sure what relevance your theorem has for it. You may have to write that post you mentioned in the OP to explain.

I tried not to claim too much in the OP. I hope no one reads this post and makes a really bad decision because of an overly-naive expected-utility calculation.

Besides that, I'm concerned about many people seemingly convinced that VNM is rationality and working hard to try to justify it, instead of working on a bunch of open problems that seem very important and interesting to me, one of which is what rationality actually is.

Comment author: Nisan 18 January 2013 12:04:28AM * 0 points

Yes, whether a set of weights leads to Pareto-dominance depends logically on the shape of the Pareto frontier. So the theorem does not help with the computational part of figuring out what one's values are.

Do you mean "figuring out what one's weights are"?

Yes.

Assuming yes, I think my point was a bit stronger than that, namely there's not necessarily a reason to figure out the weights at all, if in order to figure out the weights, you actually have to first come to a decision using some other procedure.

I think any disagreement we have here is subsumed by our discussion elsewhere in this thread.

I think there are probably local Pareto improvements that we can make to B, but that's very different from switching to A (which is what your OP was arguing for).

Perhaps I will write that philanthropy post, and then we will have a concrete example to discuss.

Besides that, I'm concerned about many people seemingly convinced that VNM is rationality and working hard to try to justify it, instead of working on a bunch of open problems that seem very important and interesting to me, one of which is what rationality actually is.

I appreciate your point.

ETA: Wei_Dai and I determined that part of our apparent disagreement came from the fact that an agent with a policy that happens to optimize a function does not need to use a decision algorithm that computes expected values.

Comment author: [deleted] 13 January 2013 11:07:21AM * 1 point

If fairness is one of my values, it can't necessarily be represented by such a function. (I.e., it may need to be a function from lotteries over world-histories to real numbers.)

You refer to cases such as A = “I give the last candy to Alice”, B = “I give the last candy to Bob” and you strictly prefer the lottery {50% A, 50% B} to {100% A} or {100% B}?

But remember that we're talking about entire world histories, not just world states. If you take A0 = “I arbitrarily give the last candy to Alice”, A1 = “I flip a coin to decide whom to give the last candy to, and Alice wins”, etc., you can easily have A1 = B1 > A0 = B0, since A1 and A0 are different (one includes you flipping a coin, the other doesn't). So a function from world histories would suffice, after all.
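
As a rough Python sketch of that proposal (the `History` type, the utility numbers and the variable names are hypothetical, chosen only for illustration): once the coin flip is recorded inside the history, an ordinary function from histories to real numbers can prefer the fair lottery. This only illustrates the suggestion in this comment; whether the theorem's notion of world-histories admits such functions is a separate question.

```python
from typing import Dict, NamedTuple


class History(NamedTuple):
    coin_flipped: bool  # did the giver flip a coin before handing out the candy?
    recipient: str      # "Alice" or "Bob"


def utility(h: History) -> float:
    # Hypothetical numbers: the recipient does not matter, but histories
    # that include a coin flip get a fairness bonus.
    return 1.0 + (0.5 if h.coin_flipped else 0.0)


def expected_utility(lottery: Dict[History, float]) -> float:
    """Expected utility of a lottery given as {history: probability}."""
    return sum(p * utility(h) for h, p in lottery.items())


A0 = History(coin_flipped=False, recipient="Alice")
B0 = History(coin_flipped=False, recipient="Bob")
A1 = History(coin_flipped=True, recipient="Alice")
B1 = History(coin_flipped=True, recipient="Bob")

arbitrary_alice = {A0: 1.0}        # "100% A"
fair_lottery = {A1: 0.5, B1: 0.5}  # "50% A, 50% B", with the flip in the history
```

Here `utility(A1) == utility(B1) > utility(A0) == utility(B0)`, matching the ordering A1 = B1 > A0 = B0 described above.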

Comment author: Wei_Dai 13 January 2013 02:29:23PM 0 points

I'm pretty sure Nisan meant to define "world-histories" in a way to exclude utility functions like that, otherwise it's hard to make sense of the convexity property that he assumes in his theorem. (Hopefully he will jump in and confirm or deny this.)

Comment author: Nisan 15 January 2013 01:07:11AM 0 points

Yes, we should assume the agent has access to a source of uncertainty with respect to which the functions v_i are invariant.

In fact, let's assume a kind of Cartesian dualism, so that the agent (and a single fair coin) are not part of the world. That way the agent can't have preferences over its own decision procedure.

Comment author: Vaniver 13 January 2013 04:08:52AM 0 points

I think before you make this conclusion, you have to say something about how one is supposed to pick the weights.

I think these weights are descriptive, not prescriptive. Eliciting values is very important, and there's some work in the decision analysis literature on that, but there isn't much to be done theoretically, since most of the work is "how do we work around the limitations of human psychology?" rather than "how do we get the math right?".

Comment author: Wei_Dai 13 January 2013 05:22:56AM 2 points

I think these weights are descriptive, not prescriptive.

What do you mean by that? Are you saying humans already maximize expected utility using some linear aggregation of individual values, so these weights already exist? But the whole point of the OP is to convince people who are not already EU maximizers to become EU maximizers.

Comment author: Vaniver 13 January 2013 04:23:56PM 1 point

Are you saying humans already maximize expected utility using some linear aggregation of individual values, so these weights already exist?

I think my answer would be along the lines of "humans have preferences that could be consistently aggregated but they are bad at consistently aggregating them due to the computational difficulties involved." For example, much of the early statistical prediction rule work fit a linear regression to a particular expert's output on training cases, and found that the regression of that expert beat the expert on new cases; that is, it captured enough of their expertise but did not capture as much of their mistakes, fatigue, and off days. If you're willing to buy that a simple algorithm based on a doctor can diagnose a disease better than that doctor, then it doesn't seem like a big stretch to claim that a simple algorithm based on a person can satisfy that person's values better than that person's decisions made in real time. (In order to move from 'diagnose this one disease' to 'make choices that impact my life trajectory' you need much, much more data, and probably more sophisticated aggregation tools than linear regression, but the basic intuition should hold.)
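
A toy Python sketch of that effect, on synthetic data only: it assumes the expert's judgments equal a hypothetical linear rule (`truth`) plus zero-mean "off day" noise, which is exactly the situation in which a regression fitted to the expert should average the noise away. The names `fit_line`, `truth` and `expert`, and all the numbers, are made up for illustration and are not drawn from the literature mentioned above.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x with a single cue x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = mean_y - b * mean_x
    return a, b


rng = random.Random(0)  # fixed seed so the run is repeatable


def truth(x):
    # Hypothetical "correct" judgment for a case with cue value x.
    return 2.0 * x + 1.0


def expert(x):
    # Assumption: the expert follows the right rule, plus day-to-day noise.
    return truth(x) + rng.gauss(0.0, 1.0)


# Fit the regression to the expert's own outputs, not to the truth.
train_x = [rng.uniform(0.0, 10.0) for _ in range(200)]
train_y = [expert(x) for x in train_x]
a, b = fit_line(train_x, train_y)


def mse(predictions, xs):
    """Mean squared error of predictions against the true judgments."""
    return sum((p - truth(x)) ** 2 for p, x in zip(predictions, xs)) / len(xs)


# Compare the expert and the model of the expert on new cases.
test_x = [rng.uniform(0.0, 10.0) for _ in range(200)]
expert_mse = mse([expert(x) for x in test_x], test_x)
model_mse = mse([a + b * x for x in test_x], test_x)
```

If the expert's errors were systematic rather than random, the fitted line would inherit them, so this sketch says nothing about that case.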

And so I think the methodology is (sort of) prescriptive: whatever you do, if it isn't equivalent to a linear combination of your subvalues, then your aggregation procedure is introducing new subvalues, which is probably a bug.* (The 'equivalent to' is what makes it only 'sort of' prescriptive.) If the weights aren't all positive, that's probably also a bug (since that means one of your subvalues has no impact on your preferences, and thus it's not actually a subvalue). But what should the relative weights of any two subvalues be? Well, that depends on the tradeoffs that the person is willing to make; it's not something we can pin down theoretically.

*Or you erroneously identified two subvalues as distinct, when they are related and should be mapped jointly.

Comment author: Wei_Dai 13 January 2013 08:30:19PM 0 points

And so I think the methodology is (sort of) prescriptive: whatever you do, if it isn't equivalent to a linear combination of your subvalues, then your aggregation procedure is introducing new subvalues, which is probably a bug.

I tried to argue against this in the top level comment of this thread, but may not have been very clear. I just came up with a new argument, and would be interested to know whether it makes more sense to you.