Just as the word "rational" is sometimes used instead of "optimal" or "good", the words "utility function" are probably used to mean "good" or "our values" or something like that.
Therefore, by analogy with the suggestion of only using the word 'rational' when talking about cognitive algorithms and thinking techniques, we should only use the words 'utility function' when talking about computer programs. When speaking about humans, "good / better / the best" probably expresses what we need well enough.
It seems worth reflecting on the fact that the point of the foundational LW material discussing utility functions was to make people better at reasoning about AI behavior and not about human behavior.
I think part of Eliezer's point was also to introduce decision theory as an ideal for human rationality. (See http://lesswrong.com/lw/my/the_allais_paradox/ for example.) Without talking about utility functions, we can't talk about expected utility maximization, so we can't define what it means to be ideally rational in the instrumental sense (and we also can't justify Bayesian epistemology based on decision theory).
So I agree with the problem stated here, but "let's stop talking about utility functions" can't be the right solution. Instead we need to emphasize more that having the wrong values is often worse than being irrational, so until we know how to obtain or derive utility functions that aren't wrong, we shouldn't try to act as if we have utility functions.
It's more than a metaphor; a utility function is the structure any consistent preference ordering that respects probability must have. It may or may not be a useful conceptual tool for practical human ethical reasoning, but "just a metaphor" is too strong a judgment.
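For readers who want the claim spelled out, here is a rough statement of the von Neumann-Morgenstern representation result it rests on. The notation is an assumption made for illustration and does not appear in the thread: $A$ and $B$ are lotteries assigning probabilities $p_i^A$ and $p_i^B$ to outcomes $x_i$, and $\succeq$ is the agent's preference relation. If $\succeq$ satisfies completeness, transitivity, continuity, and independence, then there is a function $u$ on outcomes such that:

```latex
A \succeq B
\iff
\sum_i p_i^{A}\, u(x_i) \;\ge\; \sum_i p_i^{B}\, u(x_i),
\qquad
\text{and } u' = a\,u + b \text{ with } a > 0 \text{ represents the same preferences.}
```

Note that the result is conditional: it says nothing about agents whose preferences fail any of the four axioms.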
a utility function is the structure any consistent preference ordering that respects probability must have.
This is the sort of thing I mean when I say that people take utility functions too seriously. I think the von Neumann-Morgenstern theorem is much weaker than it initially appears. It's full of hidden assumptions that are constantly violated in practice, e.g. that an agent can know probabilities to arbitrary precision, can know utilities to arbitrary precision, can compute utilities in time to make decisions, makes a single plan at the beginning of time about how they'll behave for eternity (or else you need to take into account factors like how the agent should behave in order to acquire more information in the future and that just isn't modeled by the setup of vNM at all), etc.
The biggest problematic unstated assumption behind applying VNM-rationality to humans, I think, is the assumption that we're actually trying to maximize something.
To elaborate, the VNM theorem defines preferences by the axiom of completeness, which states that for any two lotteries A and B, one of the following holds: A is preferred to B, B is preferred to A, or one is indifferent between them.
So basically, a “preference” as defined by the axioms is a function that (given the state of the agent and the state of the world in general) outputs an agent’s decision between two or more choices. Now suppose that the agent’s preferences violate the Von Neumann-Morgenstern axioms, so that in one situation it prefers to make a deal that causes it to end up with an apple rather than an orange, and in another situation it prefers to make a deal that causes it to end up with an orange rather than an apple. Is that an argument against having circular preferences?
By itself, it's not. It simply establishes that the function that outputs the agent’s actions behaves differently in different situations. Now the normal way to establish that this is bad is to assume that all choices are between monetar...
Essentially every post would have been better if it had included some additional thing. Based on various recent comments I was under the impression that people want more posts in Discussion so I've been experimenting with that, and I'm keeping the burden of quality deliberately low so that I'll post at all.
"a utility function is the structure any consistent preference ordering that respects probability must have."
Yes, but humans still don't have one. It's not even clear they can make themselves have one.
I don't think I have much to add to this discussion that you guys aren't already going to have covered, except to note that Qiaochu definitely understands what a utility function is and all of the standard arguments for why they "should" exist, so his beliefs are not a function of not having heard these arguments (just noting this because this thread and some of the siblings seem to be trying to explain basic concepts to Qiaochu that I'm confident he already knows, and I'm hoping that pointing this out will speed up the discussion).
There's a problem with discussing ethics in terms of UFs, which is that no attempt is made to separate morally relevant preferences from others. Which is a wider issue than UFs. There may be some further issue with UFs.
Part of the problem is that utility functions are used both a) as a metaphor and b) as an exact mathematical tool with exact properties.
a) can be used to elucidate terminal values in a discussion or to structure and focus a discussion away from vague concepts in ethics. But as a metaphor it cannot be used to derive any strong conclusions. b) on the other hand can strictly only be used where the preconditions are satisfied. Mixing a) and b) means committing the mathematical fallacy: believing that having formulated something in an exact way solves the issue in practice.
Yes! Thank you for saying this clearly and distinctly.
Real-world objects are never perfect spheres or other mathematical entities. However, math is quite useful for modeling them. But the way we decide which math is the right math to use to model a particular sort of object is through repeated experiment. And sometimes the trajectory through spacetime of a given object (say, a gold coin of a certain mass) is best modeled by certain math (e.g. ballistics) and sometimes by other very different math (e.g. economics).
Utility functions belong to the math, not the territory.
It seems worth reflecting on the fact that the point of the foundational LW material discussing utility functions was to make people better at reasoning about AI behavior and not about human behavior.
For the value extrapolation problem, you need to consider both what an AI could do with a goal (how to use it, what kind of thing it is), and which goal represents humane values (how to define it).
I still think there's too much confusion between ethics-for-AI and ethics-for-humans discussions here. There's no particular reason that a conceptual apparatus suited for the former discussion should also be suited for the latter discussion.
For practical purposes I agree that it does not help a lot to talk about utility functions. As the "We Don't Have a Utility Function" article points out, we simply do not know our utility functions but only vague terminal values. However, as you pointed out yourself, that does not mean that we do not "have" a utility function at all.
The soft (and hard) failure seems to be a tempting but unnecessary case of pseudo-rationalization. Still, the concept of an agent "having" (maybe in the sense of "acting in a complex way towards optimizing...
On the one hand, you are correct regarding philosophy for humans: we do ethics and meta-ethics to reduce our uncertainty about our utility functions, not as a kind of game-tree planning based on already knowing those functions.
On the other hand, the Von-Neumann-Morgenstern Theorem says blah blah blah blah.
On the third hand, if you have a mathematical structure we can use to make no-Dutch-book decisions that better models the kinds of uncertainty we deal with as embodied human beings in real life, I'm all ears.
I don't think Dutch book arguments matter in practice. An easy way to avoid being Dutch booked is to refuse bets being offered to you by people you don't trust.
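For anyone unfamiliar with the term, here is a minimal sketch of the arithmetic behind a Dutch book, under assumptions made up for illustration: the function name `net_payoff`, the prices, and the stake are all hypothetical. An agent buys one bet on A and one bet on not-A, each paying the stake if it wins. Exactly one of the two bets wins, so if the agent's prices sum to more than 1 it loses money whatever happens.

```python
from fractions import Fraction


def net_payoff(price_a, price_not_a, stake=Fraction(1)):
    """Net result for an agent who buys a bet on A and a bet on not-A.

    Each bet costs price * stake and pays `stake` if it wins.
    Exactly one of A and not-A occurs, so exactly one bet pays out.
    """
    cost = (price_a + price_not_a) * stake
    payout = stake
    return payout - cost


# Hypothetical incoherent agent: prices both A and not-A at 3/5.
incoherent = net_payoff(Fraction(3, 5), Fraction(3, 5))

# Hypothetical coherent agent: prices sum to exactly 1.
coherent = net_payoff(Fraction(3, 5), Fraction(2, 5))

print(incoherent)  # -1/5, a guaranteed loss
print(coherent)    # 0, no guaranteed loss
```

The sketch only covers the case where the agent accepts both bets. An agent that declines to bet at all never reaches this calculation, which is the practical point made above.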
Can you give some specific examples of people misusing utility functions? Or if you don't want to point fingers, can you construct examples similar to those you've seen people use?
To the extent that we care about causing people to become better at reasoning about ethics, it seems like we ought to be able to do better than this.
What would you propose as an alternative?
The example that comes to mind to show how the sex thing isn't a problem is that of a robot car with a goal to drive as many miles as possible. Every day it will burn through all its fuel and fuel up. Right after it fuels up, it will have no desire for further fuel - more fuel simply does not help it go further at this point, and forcing it can be detrimental. Clearly not contradictory.
You could have a similar situation with a couple wanting sex iff they haven't had sex in a day, or wanting an orange if you've just eaten an apple but wanting an apple if you've just eaten an orange.
To strictly show that something violates vNM axioms, you'd have to show that this behavior (in context) can't be fulfilling any preferences better than other options that the agent is aware of - or at least be able to argue that the revealed utility function is contrived and unlikely to hold up in other situations (not what the agent "really wants").
Constantly wanting what one doesn't have can have this defect. If I keep paying you to switch my apple for your orange and back (without actually eating either), then you have a decent case, if you're pretty confident I'm not actually fulfilling my desire to troll you ;)
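The apple/orange swapping case can be sketched as a toy money pump. Everything here is an illustrative assumption, not something from the thread: the function name `run_money_pump`, the fee of 1, and the starting money of 10. The agent always prefers the fruit it does not hold and pays a fee to swap, without ever eating either.

```python
def run_money_pump(rounds, fee=1, start_money=10):
    """Simulate an agent that pays `fee` to swap its fruit every round.

    The agent starts with an apple and always prefers the fruit it
    does not currently hold, so it accepts every swap offered.
    Returns the fruit it ends up holding and its remaining money.
    """
    holding = "apple"
    money = start_money
    for _ in range(rounds):
        # Swap to the other fruit and pay for the privilege.
        holding = "orange" if holding == "apple" else "apple"
        money -= fee
    return holding, money


holding, money = run_money_pump(rounds=4)
print(holding, money)  # apple 6: same fruit as at the start, 4 units poorer
```

After an even number of swaps the agent holds exactly what it started with and has only lost money, which is what makes this pattern look like a real inconsistency rather than a state-dependent preference like the robot car's.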
The "wants a relationship when single" and "wants to be single when not" thing does look like such a violation to me. If you let him flip-flop as often as he desires, he's not going to end up happily endorsing his past actions. If you offered him a pill that would prevent him from flip-flopping, he very well may take it. So there's a contradiction there.
To bring human-specific psychology into it, it's not that his inherent desires are contradictory, but that he wants something like "freedom", which he doesn't know how to get in a relationship, and something like "intimacy", which he doesn't know how to get while single. It's not that he wants intimacy when single and freedom when not, it's that he wants both always, but the unfulfilled need is the salient one.
Picture me standing on your left foot. "Oww! Get off my left foot!". Then I switch to the right "Ahh! Get off my right foot!". If you're not very quick and/or the pain is overwhelming, it might take you a few iterations to realize the situation you're in and to put the pain aside while you think of a way to get me off both feet (intimacy when single/freedom in a relationship). Or if you can't have that, it's another challenge to figure out what you want to do about it.
I wouldn't model you as "just VNM-irrational", even if your external behaviors are ineffective for everything you might want. I'd model you as "not knowing how to be VNM-rational in the presence of strong pain(s)", and would expect you to start behaving more effectively when shown how.
(And that is what I find, although showing someone how to be more rational is not trivial, and "here's a proof of the inconsistency of your actions, now pick a side and stop feeling the desire for the other side" is almost never sufficient. You have to be able to model the specific way that they're stuck and meet them there.)
tl;dr: We're not VNM-rational because we don't know how to be, not because it's not something we're trying to do.
How do you distinguish his preferences being irrationally inconsistent (he is worse off from entering and leaving relationships repeatedly) from him truly wanting to be in relationships periodically (like how it's rational to alternate between sleeping and waking rather than always doing one or the other)?
If there's a pill that can make him stop switching (but doesn't change his preferences), one of two things will happen: either he'll never be in a relationship (prevented from entering), or he'll stay in his current relationship forever (prevented from le...
I think we should stop talking about utility functions.
In the context of ethics for humans, anyway. In practice I find utility functions to be, at best, an occasionally useful metaphor for discussions about ethics but, at worst, an idea that some people start taking too seriously and which actively makes them worse at reasoning about ethics. To the extent that we care about causing people to become better at reasoning about ethics, it seems like we ought to be able to do better than this.
The funny part is that the failure mode I worry the most about is already an entrenched part of the Sequences: it's fake utility functions. The soft failure is people who think they know what their utility function is and say bizarre things about what this implies that they, or perhaps all people, ought to do. The hard failure is people who think they know what their utility function is and then do bizarre things. I hope the hard failure is not very common.
It seems worth reflecting on the fact that the point of the foundational LW material discussing utility functions was to make people better at reasoning about AI behavior and not about human behavior.