endoself comments on A fungibility theorem - Less Wrong

Post author: Nisan 12 January 2013 09:27AM


Comment author: Wei_Dai 12 January 2013 06:35:52PM 6 points

Suppose further that you can quantify each item on that list with a function from world-histories to real numbers, and you want to optimize for each function, all other things being equal.

If fairness is one of my values, it can't necessarily be represented by such a function. (I.e., it may need to be a function from lotteries over world-histories to real numbers.)
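To illustrate why fairness resists representation as a function of world-histories, here's a toy Python sketch (the scenario, names, and numbers are all hypothetical, not from the post):

```python
# A toy "fairness" value defined on lotteries (probability distributions
# over two world-histories), not on the histories themselves.

def fairness(lottery):
    """Prefers giving each person an equal *chance* at an indivisible good."""
    p = lottery["alice_gets_it"]   # P(history where Alice gets the good)
    return 1.0 - abs(p - 0.5)      # maximized by the 50/50 coin flip

coin_flip = {"alice_gets_it": 0.5}
alice_sure = {"alice_gets_it": 1.0}
bob_sure = {"alice_gets_it": 0.0}

# The coin flip is strictly preferred to both certain outcomes. No utility
# function u on histories can represent this, since the expected value of u
# under the lottery is a mixture:
#   0.5*u(alice) + 0.5*u(bob) <= max(u(alice), u(bob)).
assert fairness(coin_flip) > fairness(alice_sure)
assert fairness(coin_flip) > fairness(bob_sure)
```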

What the theorem says is that if you really care about the values on that list, then there are linear aggregations that you should start optimizing for.

I think before you make this conclusion, you have to say something about how one is supposed to pick the weights. The theorem itself seems to suggest that I can pick the weights by choosing a Pareto-optimal policy/outcome that's mutually acceptable to all of my values, and then work backwards to a set of weights that would generate a utility function (or more generally, a way to pick such weights based on a coin-flip) that would then end up optimizing for the same outcome. But in this case, it seems to me that all of the real "optimizing" was already done prior to the time you form the linear aggregation.

(ETA: I guess the key question here is whether the weights ought to logically depend on the actual shape of the Pareto frontier. If yes, then you have to compute the Pareto frontier before you choose the weights, in which case you've already "optimized" prior to choosing the weights, since computing the Pareto frontier involves optimizing against the individual values as separate utility functions.)
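To make the "work backwards from a Pareto-optimal outcome to weights" picture concrete, here is a toy Python sketch (the outcomes, value functions, and candidate weight vectors are all hypothetical):

```python
# Each "value" is a function from outcomes to reals, here as a dict.
outcomes = ["A", "B", "C", "D"]
values = [
    {"A": 3.0, "B": 2.0, "C": 1.0, "D": 0.0},   # value 1
    {"A": 0.0, "B": 2.0, "C": 2.5, "D": 3.0},   # value 2
]

def dominates(x, y):
    """x Pareto-dominates y: at least as good on every value, better on one."""
    return (all(v[x] >= v[y] for v in values)
            and any(v[x] > v[y] for v in values))

# Computing the Pareto frontier already consults each value separately --
# this is the "optimizing before choosing the weights" step.
frontier = [x for x in outcomes if not any(dominates(y, x) for y in outcomes)]

def best_under(weights):
    """Outcome maximizing the linear aggregation with these weights."""
    return max(outcomes,
               key=lambda x: sum(w * v[x] for w, v in zip(weights, values)))

# Working backwards: having settled on a Pareto-optimal outcome, search for
# a weight vector whose linear aggregation reproduces that choice.
chosen = "B"
assert chosen in frontier
weights = next(w for w in [(0.5, 0.5), (0.7, 0.3), (0.3, 0.7)]
               if best_under(w) == chosen)
```

Note that the weights found this way are an after-the-fact rationalization of the choice; all the substantive work happened in computing the frontier and picking `chosen`.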

Also, even if my values can theoretically be represented by functions from world-histories to real numbers, I can't obtain encodings of such functions since I don't have introspective access to my values, and therefore I can't compute linear aggregations of them. So I don't know how I can start optimizing for a linear aggregation of my values, even if I did have a reasonable way to derive the weights.

If you're capable of making precommitments and we don't worry about computational difficulties [...]

I'm glad you made these assumptions explicit, but shouldn't there be a similar caveat when you make the final conclusions? The way I see it, I have a choice between (A) a solution known to be optimal along some dimensions not including considerations of logical uncertainty and dynamic consistency, or (B) a very imperfectly optimized solution that nevertheless probably does take them into account to some degree (i.e., the native decision-making machinery that evolution gave me). Sticking with B for now doesn't seem unreasonable to me (especially given the other difficulties I mentioned with trying to implement A).

(I've skipped some of the supporting arguments in this comment since I already wrote about them under the recent Harsanyi post. Let me know if you want me to clarify anything.)

Comment author: endoself 12 January 2013 08:32:09PM 3 points

I think before you make this conclusion, you have to say something about how one is supposed to pick the weights.

I agree with this concern. The theorem is basically saying that, given any sensible aggregation rule, there is a linear aggregation rule that produces the same decisions. However, it assumes that we already have a prior; the linear coefficients are allowed to depend on what we think the world actually looks like, rather than being a pure representation of values. I think people, especially those who don't understand the proof of this theorem, are likely to misinterpret it.
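A toy Python sketch of this dependence on the prior (all actions, payoffs, and numbers hypothetical): the same weight vector yields different decisions under different priors, so coefficients that reproduce a fixed policy must be chosen with the prior already in hand.

```python
# payoffs[action][value] = (payoff if world 0 obtains, payoff if world 1)
payoffs = {
    "safe":  {"v1": (1.0, 1.0), "v2": (1.0, 1.0)},
    "risky": {"v1": (4.0, 0.0), "v2": (0.0, 0.0)},
}

def aggregate(action, weights, p0):
    """Weighted sum of each value's expected payoff under prior P(world 0) = p0."""
    return sum(w * (p0 * pay[0] + (1 - p0) * pay[1])
               for w, pay in zip(weights, payoffs[action].values()))

def best(weights, p0):
    return max(payoffs, key=lambda a: aggregate(a, weights, p0))

weights = (0.5, 0.5)
assert best(weights, p0=0.2) == "safe"    # under one prior, these weights pick "safe"
assert best(weights, p0=0.9) == "risky"   # under another prior, the same weights pick "risky"
```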