army1987 comments on A fungibility theorem - Less Wrong

Post author: Nisan 12 January 2013 09:27AM

Comment author: Wei_Dai 12 January 2013 06:35:52PM  6 points

Suppose further that you can quantify each item on that list with a function from world-histories to real numbers, and you want to optimize for each function, all other things being equal.

If fairness is one of my values, it can't necessarily be represented by such a function. (I.e., it may need to be a function from lotteries over world-histories to real numbers.)

What the theorem says is that if you really care about the values on that list, then there are linear aggregations that you should start optimizing for.

I think before you draw this conclusion, you have to say something about how one is supposed to pick the weights. The theorem itself seems to suggest that I can pick the weights by choosing a Pareto-optimal policy/outcome that's mutually acceptable to all of my values, and then working backwards to a set of weights whose linear aggregation (or, more generally, a coin-flip over such sets of weights) would end up optimizing for that same outcome. But in that case, it seems to me that all of the real "optimizing" was already done before the linear aggregation was formed.

(ETA: I guess the key question here is whether the weights ought to logically depend on the actual shape of the Pareto frontier. If yes, then you have to compute the Pareto frontier before you choose the weights, in which case you've already "optimized" prior to choosing the weights, since computing the Pareto frontier involves optimizing against the individual values as separate utility functions.)
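To make the worry concrete, here's a toy sketch (the policies, numbers, and helper function are invented purely for illustration; nothing here is from the post): two values score three candidate policies, we compute the Pareto frontier, pick a frontier point the values find "mutually acceptable", and only then search for a weight w whose linear aggregation w*v1 + (1-w)*v2 rationalizes that same choice.

```python
# Toy sketch (invented numbers): choosing weights by working backwards from a
# Pareto-optimal outcome that is acceptable to both values.

# Three candidate policies, each scored by two values v1 and v2.
policies = {"P": (3.0, 1.0), "Q": (2.3, 2.1), "R": (1.0, 3.0)}

def pareto_frontier(options):
    """Return the options that no other option weakly dominates on both values."""
    frontier = {}
    for name, (a, b) in options.items():
        dominated = any(
            a2 >= a and b2 >= b and (a2 > a or b2 > b)
            for other, (a2, b2) in options.items()
            if other != name
        )
        if not dominated:
            frontier[name] = (a, b)
    return frontier

print(pareto_frontier(policies))  # here every policy is on the frontier

chosen = "Q"  # the frontier point my values "negotiated" their way to

# Work backwards: find a weight w such that the chosen policy maximizes the
# linear aggregation w*v1 + (1 - w)*v2 over all candidates.
for w in (i / 100 for i in range(101)):
    scores = {name: w * a + (1 - w) * b for name, (a, b) in policies.items()}
    if max(scores, key=scores.get) == chosen:
        print(f"w = {w:.2f} rationalizes choosing {chosen}")
        break
```

The weight search at the end is trivial; the work that actually narrowed things down to Q happened when the frontier was computed and a point on it was selected.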

Also, even if my values can theoretically be represented by functions from world-histories to real numbers, I can't obtain encodings of such functions since I don't have introspective access to my values, and therefore I can't compute linear aggregations of them. So I don't know how I can start optimizing for a linear aggregation of my values, even if I did have a reasonable way to derive the weights.

If you're capable of making precommitments and we don't worry about computational difficulties [...]

I'm glad you made these assumptions explicit, but shouldn't there be a similar caveat when you make the final conclusions? The way I see it, I have a choice between (A) a solution known to be optimal along some dimensions, not including considerations of logical uncertainty and dynamical consistency, and (B) a very imperfectly optimized solution that nevertheless probably does take them into account to some degree (i.e., the native decision-making machinery that evolution gave me). Sticking with B for now doesn't seem unreasonable to me (especially given the other difficulties I mentioned with trying to implement A).

(I've skipped some of the supporting arguments in this comment since I already wrote about them under the recent Harsanyi post. Let me know if you want me to clarify anything.)

Comment author: [deleted] 13 January 2013 11:07:21AM  1 point

If fairness is one of my values, it can't necessarily be represented by such a function. (I.e., it may need to be a function from lotteries over world-histories to real numbers.)

You're referring to cases such as A = “I give the last candy to Alice” and B = “I give the last candy to Bob”, where you strictly prefer the lottery {50% A, 50% B} to either {100% A} or {100% B}?

But remember that we're talking about entire world-histories, not just world states. If you take A0 = “I arbitrarily give the last candy to Alice”, A1 = “I flip a coin to decide whom to give the last candy to, and Alice wins”, and B0, B1 analogously for Bob, you can easily have A1 = B1 > A0 = B0, since A1 and A0 are different histories (one includes you flipping a coin, the other doesn't). So a function from world-histories would suffice after all.
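To spell it out with toy numbers (only the ordering A1 = B1 > A0 = B0 matters; the particular values are placeholders I made up): u below is defined on world-histories alone, yet its expectation already prefers deciding by coin flip to handing the candy over arbitrarily.

```python
# Toy encoding (invented values): fairness as a utility over world-histories
# that include *how* the decision was made.

u = {
    "A0": 0.0,  # arbitrarily give the candy to Alice
    "B0": 0.0,  # arbitrarily give the candy to Bob
    "A1": 1.0,  # flip a coin, Alice wins
    "B1": 1.0,  # flip a coin, Bob wins
}

def expected_utility(lottery):
    """lottery: dict mapping world-histories to probabilities."""
    return sum(p * u[h] for h, p in lottery.items())

flip      = {"A1": 0.5, "B1": 0.5}   # decide by coin flip
arbitrary = {"A0": 1.0}              # just hand it to Alice

print(expected_utility(flip))       # 1.0
print(expected_utility(arbitrary))  # 0.0
```

So the "fairness" preference over lotteries falls out of ordinary expected utility once the flip itself is part of the history.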

Comment author: Wei_Dai 13 January 2013 02:29:23PM  0 points

I'm pretty sure Nisan meant to define "world-histories" in a way that excludes utility functions like that; otherwise it's hard to make sense of the convexity property that he assumes in his theorem. (Hopefully he will jump in and confirm or deny this.)

Comment author: Nisan 15 January 2013 01:07:11AM  0 points

Yes, we should assume the agent has access to a source of uncertainty with respect to which the functions v_i are invariant.

In fact, let's assume a kind of Cartesian dualism, so that the agent (and a single fair coin) are not part of the world. That way the agent can't have preferences over its own decision procedure.
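Spelled out a bit more (this is my paraphrase and invented notation, not anything from the post): if we write an extended history as a pair of an external world-history and the coin's outcomes, the assumption is that each v_i ignores the coin, so the expected value of any coin-based mixture is determined by the v_i on external world-histories alone.

```latex
% My paraphrase of the invariance assumption (notation invented here):
% split an extended history into the external world-history $w$ and the
% coin's outcome sequence $c$, and require each value to ignore the coin:
\[
  v_i(w, c) = v_i(w, c') \qquad \text{for all coin outcomes } c, c'.
\]
% Then for a plan that mixes over actions using the coin, producing the
% external history $w(c)$ when the coin comes up $c$,
\[
  \mathbb{E}[v_i] \;=\; \sum_c \Pr(c)\, v_i\bigl(w(c), c\bigr)
               \;=\; \sum_c \Pr(c)\, v_i\bigl(w(c)\bigr),
\]
% which depends only on the induced distribution over external world-histories.
```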