Vaniver comments on A fungibility theorem - Less Wrong
If fairness is one of my values, it can't necessarily be represented by such a function. (I.e., it may need to be a function from lotteries over world-histories to real numbers.)
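A minimal sketch of why fairness can resist this representation: if utility is assigned to world-histories, the expected utility of a lottery is a probability-weighted average of outcome utilities, so it can never exceed the best sure outcome in that lottery. A preference for flipping a fair coin over either sure outcome therefore cannot be expressed this way. The outcome names and utility numbers below are hypothetical, chosen only for illustration.

```python
def expected_utility(lottery, utility):
    """Expected utility of a lottery given as {outcome: probability}."""
    return sum(p * utility[outcome] for outcome, p in lottery.items())

# Hypothetical outcomes: an indivisible good goes to Alice or to Bob.
utility = {"alice_gets_it": 1.0, "bob_gets_it": 1.0}

coin_flip = {"alice_gets_it": 0.5, "bob_gets_it": 0.5}
sure_alice = {"alice_gets_it": 1.0}
sure_bob = {"bob_gets_it": 1.0}

eu_flip = expected_utility(coin_flip, utility)
eu_best_sure = max(
    expected_utility(sure_alice, utility),
    expected_utility(sure_bob, utility),
)

# A weighted average never exceeds its largest term, so the coin flip
# cannot be strictly preferred to both sure outcomes under this model.
print(eu_flip <= eu_best_sure)
```

Changing the numbers in `utility` does not help: whatever values are assigned to the two outcomes, the coin flip scores somewhere between them.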
I think before you make this conclusion, you have to say something about how one is supposed to pick the weights. The theorem itself seems to suggest that I can pick the weights by choosing a Pareto-optimal policy/outcome that's mutually acceptable to all of my values, and then work backwards to a set of weights that would generate a utility function (or more generally, a way to pick such weights based on a coin-flip) that would then end up optimizing for the same outcome. But in this case, it seems to me that all of the real "optimizing" was already done prior to the time you form the linear aggregation.
(ETA: I guess the key question here is whether the weights ought to logically depend on the actual shape of the Pareto frontier. If yes, then you have to compute the Pareto frontier before you choose the weights, in which case you've already "optimized" prior to choosing the weights, since computing the Pareto frontier involves optimizing against the individual values as separate utility functions.)
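The "work backwards" step can be sketched as follows, assuming just two values and a small, made-up list of candidate policies scored on each (higher is better). The function names, the policy scores, and the grid search are all illustrative assumptions, not anything taken from the theorem itself. The sketch first computes the Pareto frontier, then looks for weights under which an already-chosen frontier point maximizes the weighted sum.

```python
def pareto_frontier(points):
    """Return the points not dominated by any other point (higher is better)."""
    def dominates(a, b):
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))
    return [p for p in points if not any(dominates(q, p) for q in points)]

def supporting_weights(target, points, steps=100):
    """Grid-search weights (w, 1 - w) under which `target` maximizes the weighted sum.

    Scores are compared in integer units of 1/steps to avoid rounding issues,
    which assumes the points have integer coordinates.
    """
    found = []
    for i in range(steps + 1):
        def score(p):
            return i * p[0] + (steps - i) * p[1]
        if all(score(target) >= score(p) for p in points):
            found.append(i / steps)
    return found

# Hypothetical policies, each scored as (value_1, value_2).
policies = [(0, 10), (4, 8), (7, 5), (9, 1), (3, 3), (6, 4)]

frontier = pareto_frontier(policies)
chosen = (7, 5)  # a Pareto-optimal compromise picked *before* any weights exist
weights = supporting_weights(chosen, policies)

print(frontier)
print(weights[0], weights[-1])
```

In this toy case a whole range of weights is consistent with the chosen point, which illustrates the worry above: the weights are read off from a choice that was already made, rather than driving it.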
Also, even if my values can theoretically be represented by functions from world-histories to real numbers, I can't obtain encodings of such functions since I don't have introspective access to my values, and therefore I can't compute linear aggregations of them. So I don't know how I can start optimizing for a linear aggregation of my values, even if I did have a reasonable way to derive the weights.
I'm glad you made these assumptions explicit, but shouldn't there be a similar caveat when you make the final conclusions? The way I see it, I have a choice between (A) a solution known to be optimal along some dimensions not including considerations of logical uncertainty and dynamical consistency, or (B) a very imperfectly optimized solution that nevertheless probably does take them into account to some degree (i.e., the native decision making machinery that evolution gave me). Sticking with B for now doesn't seem unreasonable to me (especially given the other difficulties I mentioned with trying to implement A).
(I've skipped some of the supporting arguments in this comment since I already wrote about them under the recent Harsanyi post. Let me know if you want me to clarify anything.)
I think these weights are descriptive, not prescriptive. Eliciting values is very important, and there's some work in the decision analysis literature on that, but there isn't much to be done theoretically, since most of the work is "how do we work around the limitations of human psychology?" rather than "how do we get the math right?".
What do you mean by that? Are you saying humans already maximize expected utility using some linear aggregation of individual values, so these weights already exist? But the whole point of the OP is to convince people who are not already EU maximizers to become EU maximizers.
I think my answer would be along the lines of "humans have preferences that could be consistently aggregated but they are bad at consistently aggregating them due to the computational difficulties involved." For example, much of the early statistical prediction rule work fit a linear regression to a particular expert's output on training cases, and found that the regression of that expert beat the expert on new cases; that is, it captured enough of their expertise but did not capture as much of their mistakes, fatigue, and off days. If you're willing to buy that a simple algorithm based on a doctor can diagnose a disease better than that doctor, then it doesn't seem like a big stretch to claim that a simple algorithm based on a person can satisfy that person's values better than that person's decisions made in real time. (In order to move from 'diagnose this one disease' to 'make choices that impact my life trajectory' you need much, much more data, and probably more sophisticated aggregation tools than linear regression, but the basic intuition should hold.)
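The regression-beats-the-expert effect can be illustrated with a toy simulation. This is only a sketch under a strong assumption that is built in by hand: the simulated expert weighs two cues correctly on average but adds random noise to each judgment. The cue weights, noise level, and sample sizes are hypothetical, and the regression is a bare two-feature least-squares fit written out in plain Python.

```python
import random

def fit_two_feature_regression(xs, ys):
    """Least-squares fit of y ~ a*x1 + b*x2 (no intercept) via the normal equations."""
    s11 = sum(x1 * x1 for x1, _ in xs)
    s12 = sum(x1 * x2 for x1, x2 in xs)
    s22 = sum(x2 * x2 for _, x2 in xs)
    t1 = sum(x1 * y for (x1, _), y in zip(xs, ys))
    t2 = sum(x2 * y for (_, x2), y in zip(xs, ys))
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - s12 * t2) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b

rng = random.Random(0)  # fixed seed so the run is repeatable

def true_outcome(x1, x2):
    # Hypothetical ground truth the expert is trying to judge.
    return 2.0 * x1 + 1.0 * x2

def expert_judgment(x1, x2):
    # Assumption: the expert uses the right cues but adds noise (fatigue, off days).
    return true_outcome(x1, x2) + rng.gauss(0.0, 1.0)

def make_cases(n):
    return [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(n)]

# Fit the model to the expert's judgments, not to the true outcomes.
train_cases = make_cases(200)
a, b = fit_two_feature_regression(
    train_cases, [expert_judgment(*c) for c in train_cases]
)

new_cases = make_cases(200)

def mean_abs_error(predict):
    return sum(abs(predict(*c) - true_outcome(*c)) for c in new_cases) / len(new_cases)

expert_error = mean_abs_error(expert_judgment)
model_error = mean_abs_error(lambda x1, x2: a * x1 + b * x2)
print(model_error < expert_error)
```

The model wins here because averaging over many training cases cancels the expert's case-by-case noise. If the expert were systematically biased rather than noisy, the fitted model would inherit that bias.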
And so I think the methodology is (sort of) prescriptive: whatever you do, if it isn't equivalent to a linear combination of your subvalues, then your aggregation procedure is introducing new subvalues, which is probably a bug.* (The 'equivalent to' is what makes it only 'sort of' prescriptive.) If the weights aren't all positive, that's probably also a bug (since a zero weight means one of your subvalues has no impact on your preferences, and thus it's not actually a subvalue). But what should the relative weights for [...] and [...] be? Well, that depends on the tradeoffs that the person is willing to make; it's not something we can pin down theoretically.
*Or you erroneously identified two subvalues as distinct, when they are related and should be mapped jointly.
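Both points about weights can be shown in a small sketch: a subvalue with zero weight cannot affect any comparison, and the ranking of two options can flip depending on the relative weights. The subvalue names, option scores, and weights are hypothetical placeholders.

```python
def aggregate(weights, scores):
    """Linear aggregation: weighted sum of subvalue scores."""
    return sum(weights[name] * scores[name] for name in weights)

# A zero-weighted subvalue has no impact: these options differ only in "status".
weights = {"health": 0.7, "wealth": 0.3, "status": 0.0}
option_a = {"health": 5, "wealth": 2, "status": 0}
option_b = {"health": 5, "wealth": 2, "status": 100}
print(aggregate(weights, option_a) == aggregate(weights, option_b))

# The relative weights encode a tradeoff: different weights, different choice.
option_c = {"health": 6, "wealth": 1}
option_d = {"health": 4, "wealth": 8}
balanced = {"health": 0.7, "wealth": 0.3}
health_heavy = {"health": 0.9, "wealth": 0.1}
print(aggregate(balanced, option_c) < aggregate(balanced, option_d))
print(aggregate(health_heavy, option_c) > aggregate(health_heavy, option_d))
```

Nothing in the arithmetic says which of `balanced` or `health_heavy` is right; that has to come from the tradeoffs the person is actually willing to make.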
I tried to argue against this in the top level comment of this thread, but may not have been very clear. I just came up with a new argument, and would be interested to know whether it makes more sense to you.