Related: Pinpointing Utility
If I ever say "my utility function", you could reasonably accuse me of cargo-cult rationality; trying to become more rational by superficially imitating the abstract rationalists we study makes about as much sense as building an air traffic control station out of grass to summon cargo planes.
There are two ways an agent could be said to have a utility function:
- It could behave in accordance with the VNM axioms, always choosing in a sane and consistent manner, such that "there exists a U". The agent need not have an explicit representation of U.
- It could have an explicit utility function that it tries to maximize in expectation. The agent need not perfectly follow the VNM axioms all the time. (Real bounded decision systems will take shortcuts for efficiency and may not achieve perfect rationality, much as real floating-point arithmetic isn't associative; see the sketch below.)
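To make the second sense concrete (and the floating-point aside along with it), here is a minimal sketch of an agent with an explicit utility function that it maximizes in expectation. The actions, outcomes, probabilities, and utility values are all invented for illustration.

```python
# Toy agent with an explicit utility function, maximized in expectation.
# Outcomes, probabilities, and utilities are invented for illustration.

utility = {"paperclip": 0.1, "friendship": 10.0, "nothing": 0.0}

# Each action is a lottery: a list of (probability, outcome) pairs.
actions = {
    "safe":   [(1.0, "friendship")],
    "gamble": [(0.6, "nothing"), (0.4, "paperclip")],
}

def expected_utility(lottery):
    return sum(p * utility[outcome] for p, outcome in lottery)

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # -> safe

# The aside from the text: floating-point arithmetic isn't associative,
# so even an explicit maximizer is only approximately rational in practice.
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))  # False
```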
Neither of these is true of humans. Our behaviour and preferences are not consistent and sane enough to be VNM, and we are generally quite confused about what we even want, never mind having reduced it to a utility function. Nevertheless, you still see the occasional reference to "my utility function".
Sometimes "my" refers to "abstract me who has solved moral philosophy and or become perfectly rational", which at least doesn't run afoul of the math, but is probably still wrong about the particulars of what such an abstract idealized self would actually want. But other times it's a more glaring error like using "utility function" as shorthand for "entire self-reflective moral system", which may not even be VNMish.
But this post isn't really about all the ways people misuse terminology, it's about where we're actually at on the whole problem for which a utility function might be the solution.
As above, I don't think any of us have a utility function in either sense; we are not VNM, and we haven't worked out what we want well enough to make a convincing attempt at writing it down. Maybe someone out there has a utility function in the second sense, but I doubt that it actually represents what they would want.
Perhaps then we should speak of what we want in terms of "terminal values"? For example, I might say that it is a terminal value of mine that I should not murder, or that freedom from authority is good.
But what does "terminal value" mean? Usually, it means that the value of something is not contingent on or derived from other facts or situations, like for example, I may value beautiful things in a way that is not derived from what they get me. The recursive chain of valuableness terminates at some set of values.
There's another connotation, though, which is that your terminal values are akin to axioms; not subject to argument or evidence or derivation, and simply given, that there's no point in trying to reconcile them with people who don't share them. This is the meaning people are sometimes getting at when they explain failure to agree with someone as "terminal value differences" or "different set of moral axioms". This is completely reasonable, if and only if that is in fact the nature of the beliefs in question.
About two years ago, it very much felt like freedom from authority was a terminal value for me. Those hated authoritarians and fascists were simply wrong, probably due to some fundamental neurological fault that could not be reasoned with. The very prototype of "terminal value differences".
And yet here I am today, having been reasoned out of that "terminal value", such that I even appreciate a certain aesthetic in bowing to a strong leader.
If that was a terminal value, I'm afraid the term has lost much of its meaning to me. If it was not, if even the most fundamental-seeming moral feelings are subject to argument, I wonder if there is any coherent sense in which I could be said to have terminal values at all.
The situation here with "terminal values" is a lot like the situation with "beliefs" in other circles. Ask someone what they believe in most confidently, and they will take the opportunity to differentiate themselves from the opposing tribe on uncertain, controversial issues: god exists, god does not exist, racial traits are genetic, race is a social construct. The pedantic answer, of course, is that the sky is probably blue, and that that box over there is about a meter long.
Likewise, ask someone for their terminal values, and they will take the opportunity to declare that those hated greens are utterly wrong on morality, and blueness is wired into their very core, rather than the obvious things like beauty and friendship being valuable, and paperclips not.
So besides not having a utility function, those aren't your terminal values. I'd be surprised if even the most pedantic answer weren't subject to argument; I don't seem to have anything like a stable and non-negotiable value system at all, and I don't think that I am even especially confused relative to the rest of you.
Instead of a nice consistent value system, we have a mess of intuitions and heuristics and beliefs that often contradict, fail to give an answer, and change with time and mood and memes. And that's all we have. One of the intuitions is that we want to fix this mess.
People have tried to do this "Moral Philosophy" thing before, myself included, but it hasn't generally turned out well. We've made all kinds of overconfident leaps to what turn out to be unjustified conclusions (utilitarianism, egoism, hedonism, etc.), or just ended up wallowing in confused despair.
The zeroth step in solving a problem is to notice that we have a problem.
The problem here, in my humble opinion, is that we have no idea what we are doing when we try to do Moral Philosophy. We need to go up a meta-level and get a handle on Moral MetaPhilosophy. What's the problem? What are the relevant knowns? What are the unknowns? What's the solution process?
Ideally, we could do for Moral Philosophy approximately what Bayesian probability theory has done for Epistemology. My moral intuitions are a horrible mess, but so are my epistemic intuitions, and yet we more-or-less know what we are doing in epistemology. A problem like this has been solved before, and this one seems solvable too, if a bit harder.
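To make the analogy concrete: the epistemic case at least has a core update rule we can point at, even though our raw intuitions routinely violate it:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)}$$

Nothing in moral philosophy currently plays the role that Bayes' theorem plays here.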
It might be that when we figure this problem out to the point where we can be said to have a consistent moral system with real terminal values, we will end up with a utility function, but on the other hand, we might not. Either way, let's keep in mind that we are still on rather shaky ground, and at least refrain from believing the confident declarations of moral wisdom that we so like to make.
Moral Philosophy is an important problem, but the way is not clear yet.
You do have a utility function (though it may be stochastic). You just don't know what it is. "Utility function" means the same thing as "decision function"; it just has different connotations. Something determines how you act; that something is your utility function, even if it can be described only as a physics problem plus random numbers generated by your free will and adjustments made by God. (God must be encapsulated in an oracle function.) We call it a utility function to clue people into our purposes and the literature that we're going to draw on for our analysis. If we wished to regard a thing as deterministic rather than as an agent with free will, we would call its decision function a probability density function instead of a utility function.
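A small sketch of the distinction being drawn here, with invented actions and scores: something determines how the agent acts, whether we read that something as a utility function it maximizes or as a distribution it samples from.

```python
import random

# Invented actions and scores; the point is only that *something* determines the choice.
actions = ["cooperate", "defect"]
score = {"cooperate": 2.0, "defect": 1.0}

def deterministic_decision():
    # Read the scores as a utility function: always pick the maximizer.
    return max(actions, key=score.get)

def stochastic_decision():
    # Read the same scores as an (unnormalized) distribution over actions:
    # closer to a probability density over behaviour than to a maximized utility.
    return random.choices(actions, weights=[score[a] for a in actions])[0]

print(deterministic_decision())  # always "cooperate"
print(stochastic_decision())     # "cooperate" about two thirds of the time
```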
If you truly have terminal values, they are mainly described by a large matrix of synaptic connections and weights.
When you say "I don't have a utility function" or "I don't have terminal values", you are mostly complaining that approximations are only approximations. You are thinking about some approximation of your utility function or your terminal values, expressed in language or logic, using symbols that conveniently but inaccurately cluster all possible sense-experience vectors into categories, and logical operations that throw away all information but the symbols (and perhaps some statistics, such as a probability or typicality for each symbol).
When we use the words "utility function", the level of abstraction we use to describe it, and hence its accuracy, depends on the purpose we have in mind. What's incoherent is talking about "my utility function" absent any such purpose. It's just like asking "What is the length of the coast of England?"
Whether you have terminal values is a more complicated question, for uninteresting reasons such as quantum-mechanical considerations. The short answer is probably this: any level of abstraction that is simple enough for you to think about is too simple to capture values that are guaranteed not to change.
Underneath both these questions is the tricky question, "Which me is me?" Are you asking about the utility function enacted by the set of SNPs in your DNA, by your body, or by your conscious mind? These are not the same utility functions. (Whether your conscious mind has a utility function is a tricky question because we would have to separate actions controlled by your conscious mind from actions your body takes not controlled by your conscious mind. If consciousness is epiphenomenal, your mind does not have a useful utility function.)
One common use of terminal values on LW is to try to divine a set of terminal values for humans that can be used to guide an AI. So a specific, meaningful, useful question would be, "Can I discover and describe my terminal values in enough detail that I can be confident that an AI, controlled by these values, will enact the coherent extrapolated volition of these values?" ("Coherent extrapolated volition" may be meaningless, but that's a separate issue.) I believe the answer is no, which is one reason why I don't support MIRI's efforts toward FAI.
Eliezer spent a lot of time years ago explaining in detail why giving an AI goals like "Make humans happy" is problematic, and began to search for the appropriate level of description of goals/values. Unfortunately, he didn't pursue this to its conclusion. He chose to focus on errors caused by drift from the original utility function, or by logics that fail to achieve rationality, to the exclusion of two things: the changes caused by the inevitable inexactness of any representation of a utility function and by its random component, and the tricky ontological questions that crop up when you ask, "Whose utility function?"
What? Not having terminal values means that either you don't care about anything at all, or that “the recursive chain of valuableness” is infinitely deep. Neither of these seems likely to me.