Stanovich's paper on why humans are apparently worse at following the VNM axioms than some animals has some interesting things to say, although I don't like the way it says them. I quit halfway through the paper out of frustration, but what I got out of the paper (which may not be what the paper itself was trying to say) is more or less the following: humans model the world at different levels of complexity at different times, and at each of those levels different considerations come into play for making decisions. An agent behaving in this way can appear to be behaving VNM-irrationally when really it is just trying to efficiently use cognitive resources by not modeling the world at the maximum level of complexity all the time. Non-human animals may model the world at more similar levels of complexity over time, so they behave more VNM-rationally even if they have less overall optimization power than humans.
A related consideration, which is more about the methodology of studies claiming to measure human irrationality, is that the problem you think a test subject is solving is not necessarily the problem they're actually solving. I guess a well-known example is when you ask people ...
(Which reminds me: we don't talk anywhere near enough about computational complexity on LW for my tastes. What's up with that? An agent can't do anything right if it can't compute what "right" means before the Sun explodes.)
I agree with this concern (and my professional life is primarily focused on heuristic optimization methods, where computational complexity is a huge concern).
I suspect it doesn't get talked about much here because of the emphasis on intelligence explosion, missing AI insights, provable friendliness, and normative rationality, and because there's not much to say. (The following are not positions I necessarily endorse.) An arbitrarily powerful intelligence might not care much about computational complexity (though it's obviously important if you still care about marginal benefit and marginal cost at that level of power). Until we understand what's necessary for AGI, the engineering details separating polynomial, exponential, and totally intractable algorithms might not be very important. It's really hard to prove how well heuristics do at optimization, let alone how robust they are. The Heuristics and Biases literature focuses on areas where it's easy to show humans aren't using the right math, rather than on how best to think given the hardware you have, and some of that emphasis may be deeply embedded in the LW culture.
I think that there's a strong interest in prescriptive rationality, though, and if you have something to say on that topic or computational complexity, I'm interested in hearing it.
I once spent a very entertaining day with a friend wandering around art exhibits, with both of us doing a lot of "OK, you really like that and that and that and you hate that and that" prediction and subsequent correction.
One thing that quickly became clear was that I could make decent guesses about her judgments long before I could articulate the general rules I was applying to do so, which gave me a really strong sense of having modeled her really well.
One thing that became clear much more slowly was that the general rules I was applying, once I became able to articulate them, were not nearly as complex as they seemed to be when I was simply engaging with them as these ineffable chunks of knowledge.
I concluded from this that that strong ineffable sense of complex modeling is no more evidence of complex modeling than the similar strong ineffable sense of "being on someone's wavelength" is evidence of telepathy. It's just the way my brain feels when it's applying rules it can't articulate to predict the behavior of complex systems.
What you need to estimate for maximizing utility is not utility itself but the sign of the difference in expected utilities. "More accurate" estimation of utility on one side of the comparison can lead to less accurate estimation of the sign of the difference. Which is what Pascal muggers exploit.
The main implication is that actions based on comparisons between the most complete available estimates of utility do not maximize utility. It is similar to evaluating sums: when evaluating 1 - 1/2 + 1/3 - 1/4 and so on, the partial sum 1 + 1/3 + 1/5 + 1/7 is more complete than 1 alone - you have processed more terms (and can pat yourself on the head for doing more arithmetic) - but less accurate. In practice one obtains highly biased "estimates" when someone puts a lot more effort into finding terms of the sign that benefits them the most, and sometimes because some terms are simply easier to find.
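The series analogy is easy to check directly: the alternating series converges to ln 2, and four cherry-picked same-sign terms give a "more complete" but far worse estimate than stopping at the single term 1. A quick sketch:

```python
import math

# 1 - 1/2 + 1/3 - 1/4 + ... converges to ln 2 (~0.693)
TRUE_VALUE = math.log(2)

# Cherry-picking four terms of one sign: "more complete" than the
# single term 1, but much further from the true value.
cherry_picked = sum(1 / n for n in (1, 3, 5, 7))  # 1 + 1/3 + 1/5 + 1/7

error_one_term = abs(1.0 - TRUE_VALUE)          # ~0.31
error_cherry = abs(cherry_picked - TRUE_VALUE)  # ~0.98

print(error_one_term, error_cherry)
```

Processing more terms only helps if you take them in an unbiased order; a motivated selection of terms gets monotonically worse.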
Confidence in moral judgments is never a sound criterion for their being "terminal", it seems to me.
To see why, consider that one's working values are unavoidably a function of two related things: one's picture of oneself, and of the social world. Thus, confident judgments are likely to reflect confidence in relevant parts of these pictures, rather than the shape of the function. To take your example, your adverse judgement of authority could have been a reflection of a confident picture of your ideal self as not being submissive, and of human soc...
Can you describe the process by which you changed your view on authority? I suspect that could be important.
Read lots, think lots, do lots.
More specifically, become convinced of consequentialism so that pragmatic concerns and exceptions can be handled in a principled way, realize that rule by Friendly AI would be acceptable, attempt to actually run a LW meetup and learn of the pragmatic effectiveness of central decision making, notice major inconsistencies in and highly suspect origin of my non-authoritarian beliefs, notice aesthetic value of leadership and following, etc.
I hadn't come across the von Neumann-Morgenstern utility theorem before reading this post, thanks for drawing it to my attention.
Looking at Moral Philosophy through the lens of agents working with utility/value functions is an interesting exercise; it's something I'm still working on. In the long run, I think some deep thinking needs to be done about what we end up selecting as terminal values, and how we incorporate them into a utility function. (I hope that isn't stating something that is blindingly obvious.)
I guess where you might be headed is into Met...
One day we're going to have to unpack "aesthetic" a bit. I think it's more than just 'oh it feels really nice and fun', but after we used it as applied to HPMOR and Atlas Shrugged - or parable fiction in general - I've been giving it a similar meaning as 'mindset' or 'way of viewing'. It's becoming less clear to me as to how to use the term.
I've been using it in justifications of reading (certain) fiction now, but I want to be careful that I'm not talking about something else, or something that doesn't exist, so my rationality can aim true.
About two years ago, it very much felt like freedom from authority was a terminal value for me. Those hated authoritarians and fascists were simply wrong, probably due to some fundamental neurological fault that could not be reasoned with. The very prototype of "terminal value differences".
And yet here I am today, having been reasoned out of that "terminal value", such that I even appreciate a certain aesthetic in bowing to a strong leader.
On what basis do you assert you were "reasoned out" of that position? For example, ...
I wonder if, and when, we should behave as if we were VNM-rational. It seems vital to act VNM-rational if we're interacting with Omega or for that matter anyone else who is aware of VNM-rationality and capable of creating money pumps. But as you point out we don't have VNM-utility functions. Therefore, there exist some VNM-rational decisions that will make us unhappy. The big question is whether we can be happy about a plan to change all of our actual preferences so that we become VNM-rational, and if not, is there a way to remain happy while strategic...
I have not yet come to terms with how constructs of personal identity fit in with having or not having a utility function. What if it makes most sense to model my agency as a continuous limit of a series of ever more divided discrete agents who bring subsequent, very similar, future agents into existence? Maybe each of those tiny-in-time-extent agents have a utility function, and maybe that's significant?
I think your definition of terminal value is a little vague. The definition I prefer is as follows. A value is instrumental if it derives its value from its ability to make other values possible. To the degree that a value is not instrumental, it is a terminal value. Values may be fully instrumental (money), partially instrumental (health [we like being healthy, but it also lets us do other things we like]) or fully terminal (beauty).
Terminal values do not have the warm fuzzy glow of high concepts. Beauty, truth, justice, and freedom may be terminal valu...
First off, I think your observations about terminal values are spot-on, and I was always confused by how little we actually talk about these queer entities known as terminal values.
This discussion reminds me a bit of Scanlon's What We Owe To Each Other. His formulation of moral discourse strikes me as a piece of Meta-Moral philosophy: 'An act is wrong if and only if any principle that permitted it would be one that could reasonably be rejected by people moved to find principles for the general regulation of behaviour that others, similarly motivated, could...
You do have a utility function (though it may be stochastic). You just don't know what it is. "Utility function" means the same thing as "decision function"; it just has different connotations. Something determines how you act; that something is your utility function, even if it can be described only as a physics problem plus random numbers generated by your free will and adjustments made by God. (God must be encapsulated in an oracle function.) We call it a utility function to clue people into our purposes and the literature that ...
The problem here, in my humble opinion, is that we have no idea what we are doing when we try to do Moral Philosophy. We need to go up a meta-level and get a handle on Moral MetaPhilosophy. What's the problem? What are the relevant knowns? What are the unknowns? What's the solution process?
Wasn't there a whole sequence on this?
Related: Pinpointing Utility
If I ever say "my utility function", you could reasonably accuse me of cargo-cult rationality; trying to become more rational by superficially imitating the abstract rationalists we study makes about as much sense as building an air traffic control station out of grass to summon cargo planes.
There are two ways an agent could be said to have a utility function:
It could behave in accordance with the VNM axioms; always choosing in a sane and consistent manner, such that "there exists a U". The agent need not have an explicit representation of U.
It could have an explicit utility function that it tries to expected-maximize. The agent need not perfectly follow the VNM axioms all the time. (Real bounded decision systems will take shortcuts for efficiency and may not achieve perfect rationality, like how real floating point arithmetic isn't associative).
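The second sense can be sketched in a few lines: an agent with an explicit U that it expected-maximizes over lotteries. The outcomes and utilities below are invented for illustration:

```python
# A toy agent in the second sense: an explicit U, expected-maximized
# over lotteries. Utilities here are made up for the example.
U = {"apple": 1.0, "banana": 0.6, "nothing": 0.0}

def expected_utility(lottery):
    """A lottery is a list of (probability, outcome) pairs."""
    return sum(p * U[outcome] for p, outcome in lottery)

sure_banana = [(1.0, "banana")]
risky_apple = [(0.5, "apple"), (0.5, "nothing")]

# The agent takes whichever lottery has higher expected utility.
choice = max([sure_banana, risky_apple], key=expected_utility)
```

The hard part is not this arithmetic, of course; it's filling in U.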
Neither of these is true of humans. Our behaviour and preferences are not consistent and sane enough to be VNM, and we are generally quite confused about what we even want, never mind having reduced it to a utility function. Nevertheless, you still see the occasional reference to "my utility function".
Sometimes "my" refers to "abstract me who has solved moral philosophy and/or become perfectly rational", which at least doesn't run afoul of the math, but is probably still wrong about the particulars of what such an abstract idealized self would actually want. But other times it's a more glaring error, like using "utility function" as shorthand for "entire self-reflective moral system", which may not even be VNMish.
But this post isn't really about all the ways people misuse terminology, it's about where we're actually at on the whole problem for which a utility function might be the solution.
As above, I don't think any of us have a utility function in either sense; we are not VNM, and we haven't worked out what we want enough to make a convincing attempt at trying. Maybe someone out there has a utility function in the second sense, but I doubt that it actually represents what they would want.
Perhaps then we should speak of what we want in terms of "terminal values"? For example, I might say that it is a terminal value of mine that I should not murder, or that freedom from authority is good.
But what does "terminal value" mean? Usually, it means that the value of something is not contingent on or derived from other facts or situations, like for example, I may value beautiful things in a way that is not derived from what they get me. The recursive chain of valuableness terminates at some set of values.
There's another connotation, though, which is that your terminal values are akin to axioms; not subject to argument or evidence or derivation, and simply given, that there's no point in trying to reconcile them with people who don't share them. This is the meaning people are sometimes getting at when they explain failure to agree with someone as "terminal value differences" or "different set of moral axioms". This is completely reasonable, if and only if that is in fact the nature of the beliefs in question.
About two years ago, it very much felt like freedom from authority was a terminal value for me. Those hated authoritarians and fascists were simply wrong, probably due to some fundamental neurological fault that could not be reasoned with. The very prototype of "terminal value differences".
And yet here I am today, having been reasoned out of that "terminal value", such that I even appreciate a certain aesthetic in bowing to a strong leader.
If that was a terminal value, I'm afraid the term has lost much of its meaning to me. If it was not, if even the most fundamental-seeming moral feelings are subject to argument, I wonder if there is any coherent sense in which I could be said to have terminal values at all.
The situation here with "terminal values" is a lot like the situation with "beliefs" in other circles. Ask someone what they believe in most confidently, and they will take the opportunity to differentiate themselves from the opposing tribe on uncertain controversial issues; god exists, god does not exist, racial traits are genetic, race is a social construct. The pedant answer of course is that the sky is probably blue, and that that box over there is about a meter long.
Likewise, ask someone for their terminal values, and they will take the opportunity to declare that those hated greens are utterly wrong on morality, and blueness is wired into their very core, rather than the obvious things like beauty and friendship being valuable, and paperclips not.
So besides not having a utility function, those aren't your terminal values. I'd be surprised if even the most pedantic answer weren't subject to argument; I don't seem to have anything like a stable and non-negotiable value system at all, and I don't think that I am even especially confused relative to the rest of you.
Instead of a nice consistent value system, we have a mess of intuitions and heuristics and beliefs that often contradict, fail to give an answer, and change with time and mood and memes. And that's all we have. One of the intuitions is that we want to fix this mess.
People have tried to do this "Moral Philosophy" thing before, myself included, but it hasn't generally turned out well. We've made all kinds of overconfident leaps to what turn out to be unjustified conclusions (utilitarianism, egoism, hedonism, etc), or just ended up wallowing in confused despair.
The zeroth step in solving a problem is to notice that we have a problem.
The problem here, in my humble opinion, is that we have no idea what we are doing when we try to do Moral Philosophy. We need to go up a meta-level and get a handle on Moral MetaPhilosophy. What's the problem? What are the relevant knowns? What are the unknowns? What's the solution process?
Ideally, we could do for Moral Philosophy approximately what Bayesian probability theory has done for Epistemology. My moral intuitions are a horrible mess, but so are my epistemic intuitions, and yet we more-or-less know what we are doing in epistemology. A problem like this has been solved before, and this one seems solvable too, if a bit harder.
It might be that when we figure this problem out to the point where we can be said to have a consistent moral system with real terminal values, we will end up with a utility function, but on the other hand, we might not. Either way, let's keep in mind that we are still on rather shaky ground, and at least refrain from believing the confident declarations of moral wisdom that we so like to make.
Moral Philosophy is an important problem, but the way is not clear yet.