Gavin comments on On Terminal Goals and Virtue Ethics - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (205)
I'm pretty confident that I have a strong terminal goal of "have the physiological experience of eating delicious barbecue." I have it in both near and far mode, and it remains even when it is disadvantageous in many other ways. Furthermore, I have it much more strongly than anyone I know personally, so it's unlikely to be a function of peer pressure.
That said, my longer-term goals seem to be a web of both terminal and instrumental values. Many things are terminal goals while also having instrumental value. Sex is a good in itself but also feeds other big-picture psychological and social needs.
So who would you kill if they stood between you and a good barbecue?
(It's almost like you guys haven't thought about what "terminal" means.)
It's almost like you haven't read the multiple comments explaining what "terminal" means.
It simply means "not instrumental". It has nothing to do with the degree of importance assigned relative to other goals, except in that, obviously, instrumental goals deriving from terminal goal X are always less important than X itself. If your utility function is U = A + B then A and B can be sensibly described as terminal, and the fact that A is terminal does not mean you'd destroy all B just to have A.
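To make that concrete, here's a toy sketch (the bundles and numbers are invented for illustration): both A and B are terminal, yet a maximizer of U = A + B happily trades some of one for the other.

```python
# Toy sketch (invented numbers): U = A + B, with A and B both terminal.
# Neither term derives from the other, and neither automatically dominates.

def utility(a: float, b: float) -> float:
    return a + b  # both A and B enter the utility directly: both are terminal

# Feasible (A, B) bundles under some fixed budget.
options = [(10, 0), (7, 5), (0, 10)]

# The agent doesn't destroy all B just to have A; it picks the best sum.
best = max(options, key=lambda ab: utility(*ab))
print(best)  # (7, 5): the mixed bundle beats going all-in on either goal
```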
Yes, "terminal" means final. Terminal goals are final in that your interest in them derives not from any argument but from axiom (ie. built-in behaviours). This doesn't mean you can't have more than one.
OK, well, your first link is to Lumifer's account of TGs as cognitively inaccessible, since rescinded.
What? It doesn't say any such thing. It says they're inexplicable in terms of the goal system being examined, but that doesn't mean they're inaccessible, in the same way that you can access the parallel postulate within Euclidean geometry but can't justify it in terms of the other Euclidean axioms.
That said, I think we're probably good enough at rationalization that inexplicability isn't a particularly good way to model terminal goals for human purposes, insofar as humans have well-defined terminal goals.
Sorry, what is that "rescinded" part?
"It has nothing to do with comprehensibility"
Consider an agent trying to maximize its Pacman score. 'Getting a high Pacman score' is a terminal goal for this agent - it doesn't want a high score because that would make it easier for it to get something else, it simply wants a high score. On the other hand, 'eating fruit' is an instrumental goal for this agent - it only wants to eat fruit because that increases its expected score, and if eating fruit didn't increase its expected score then it wouldn't care about eating fruit.
That is the only difference between the two types of goals. Knowing that one of an agent's goals is instrumental and another terminal doesn't tell you which goal the agent values more.
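A toy rendering of that Pacman agent (the score numbers and the small detour cost are my own invention) makes the distinction mechanical: fruit is pursued exactly as long as it raises expected score, and abandoned the moment it doesn't.

```python
# Toy Pacman agent (invented numbers). Terminal goal: expected score.
# Instrumental goal: eating fruit, valued only via its effect on the score.

def expected_score(action: str, fruit_raises_score: bool) -> float:
    base = 100.0
    if action == "eat_fruit":
        bonus = 50.0 if fruit_raises_score else 0.0
        detour_cost = 1.0  # reaching the fruit takes a little time/risk
        return base + bonus - detour_cost
    return base

def choose(actions, fruit_raises_score: bool) -> str:
    # The agent ranks actions purely by its one terminal goal: the score.
    return max(actions, key=lambda a: expected_score(a, fruit_raises_score))

actions = ["eat_fruit", "ignore_fruit"]
print(choose(actions, fruit_raises_score=True))   # eat_fruit (instrumentally useful)
print(choose(actions, fruit_raises_score=False))  # ignore_fruit (no longer cares)
```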
Since you seem to be purposefully unwilling to understand my posts, could you please refrain from declaring that I have "rescinded" my opinions on the matter?
So you have a thing which is like an axiom in that it can't be explained in more basic terms...
..but is unlike an axiom in that you can ignore its implications where they don't suit.. you don't have to savage galaxies to obtain bacon...
..unless you're an AI and it's paperclips instead of bacon, because in that case these axiom-like things actually are axiom-like.
Terminal values can be seen as value axioms in that they're the root nodes in a graph of values, just as logical axioms can be seen as the root nodes of a graph of theorems.
They are unlike logical axioms in that we're using them to derive the utility consequent on certain choices (given consequentialist assumptions; it's possible to have analogs of terminal values in non-consequentialist ethical systems, but it's somewhat more complicated) rather than the boolean validity of a theorem. Different terminal values may have different consequential effects, and they may conflict without contradiction. This does not make them any less terminal.
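The graph analogy can be sketched directly (the particular values here are invented): terminal values are the parentless roots, instrumental values point at whatever they serve, and two roots can pull a single choice in opposite directions without any logical contradiction between them.

```python
# Sketch of the value graph (invented example). Terminal values are root
# nodes (no parents); instrumental values list the values they serve.

value_graph = {
    "bacon": [],                            # terminal: valued for itself
    "avoid_anguish": [],                    # terminal: another root
    "money": ["avoid_anguish"],             # instrumental: serves a root
    "health": ["avoid_anguish", "bacon"],   # can serve several roots at once
}

def is_terminal(value: str) -> bool:
    return not value_graph[value]

print([v for v in value_graph if is_terminal(v)])  # ['bacon', 'avoid_anguish']
# "bacon" and "avoid_anguish" can conflict on a given choice (eat the slice
# or not) without either one being a contradiction of the other.
```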
Clippy has only one terminal value which doesn't take into account the integrity of anything that isn't a paperclip, which is why it's perfectly happy to convert the mass of galaxies into said paperclips. Humans' values are more complicated, insofar as they're well modeled by this concept, and involve things like "life" and "natural beauty" (I take no position on whether these are terminal or instrumental values w.r.t. humans), which is why they generally aren't.
Locally (i.e. on LW), human values usually are modelled with TGs.
What's conflict without contradiction?
If acquiring bacon was your ONLY terminal goal, then yes, it would be irrational not to do absolutely everything you could to maximize your expected bacon. However, most people have more than just one terminal goal. You seem to be using 'terminal goal' to mean 'a goal more important than any other'. Trouble is, no one else is using it this way.
EDIT: Actually, it seems to me that you're using 'terminal goal' to mean something analogous to a terminal node in a tree search (if you can reach that node, you're done). No one else is using it that way either.
Feel free to offer the correct definition. But note that you can't define it as overridable, since non-terminal goals are already defined that way.
There is no evidence that people have one or more terminal goals. At the least, you need to offer a definition such that multiple TGs don't collide, and are distinguishable from non-TGs.
It looks to me (am I misunderstanding?) as if you take "X is a terminal goal" to mean "X is of higher priority than anything else". That isn't how I use the term, and isn't how I think most people here use it.
I take "X is a terminal goal" to mean "X is something I value for its own sake and not merely because of other things it leads to". Something can be a terminal goal but not a very important one. And something can be a non-terminal goal but very important because the terminal goals it leads to are of high priority.
So it seems perfectly possible for eating barbecue to be a terminal goal even if one would not generally kill to achieve it.
[EDITED to add the following.]
On looking at the rest of this thread, I see that others have pointed this out to you and you've responded in ways I find baffling. One possibility is that there's a misunderstanding on one or other side that might be helped by being more explicit, so I'll try that.
The following is of course an idealized thought experiment; it is not intended to be very realistic, merely to illustrate the distinction between "terminal" and "important".
Consider someone who, at bottom, cares about two things (and no others). (1) She cares a lot about people (herself or others) not experiencing extreme physical or mental anguish. (2) She likes eating bacon. These are (in my terminology, and I think that of most people here) her "terminal values". It happens that #1 is much more important to her than #2. This doesn't (in my terminology, and I think that of most people here) make #2 any less terminal; just less important.
She has found that simply attending to these two things and nothing else is not very effective in minimizing anguish and maximizing bacon. For instance, she's found that a diet of lots of bacon and nothing else tends to result in intestinal anguish, and what she's read leads her to think that it's also likely to result in heart attacks (which are very painful, and sometimes lead to death, which causes mental anguish to others). And she's found that people are more likely to suffer anguish of various kinds if they're desperately poor, if they have no friends, etc. And so she comes to value other things, not for their own sake, but for their tendency to lead to less anguish and more bacon later: health, friends, money, etc.
So, one day she has the opportunity to eat an extra slice of bacon, but for some complicated reason which this comment is too short to contain doing so will result in hundreds of randomly selected people becoming thousands of dollars poorer. Eating bacon is terminally valuable for her; the states of other people's bank accounts are not. But poorer people are (all else being equal) more likely to find themselves in situations that make them miserable, and so keeping people out of poverty is a (not terminal, but important) goal she has. So she doesn't grab the extra slice of bacon.
(She could in principle attempt an explicit calculation, considering only anguish and bacon, of the effects of each choice. But in practice that would be terribly complicated, and no one has the time to be doing such calculations whenever they have a decision to make. So what actually happens is that she internalizes those non-terminal values, and for most purposes treats them in much the same way as the terminal ones. So she isn't weighing bacon against indirect hard-to-predict anguish, but against more-direct easier-to-predict financial loss for the victims.)
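A toy calculation of her choice (all weights invented; the point is only that "terminal" and "heavily weighted" come apart):

```python
# Toy version of the bacon-vs-poverty choice (invented weights). Both terms
# are terminal for her; the anguish term simply carries far more weight.

BACON_WEIGHT = 1.0
ANGUISH_WEIGHT = 1000.0  # terminal value (1) matters much more than (2)

def utility(bacon_slices: float, expected_anguish: float) -> float:
    return BACON_WEIGHT * bacon_slices - ANGUISH_WEIGHT * expected_anguish

# Eating the slice: +1 bacon, but hundreds of people get poorer, slightly
# raising expected anguish. Skipping it leaves anguish unchanged.
eat = utility(bacon_slices=1.0, expected_anguish=0.01)
skip = utility(bacon_slices=0.0, expected_anguish=0.0)
print("eat" if eat > skip else "skip")  # skip: bacon is terminal, just not weighty
```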
Do you see some fundamental incoherence in this? Or do you think it's wrong to use the word "terminal" in the way I've described?
There's no incoherence in defining "terminal" as "not lowest priority", which is basically what you are saying.
It's just not what the word means.
Literally, etymologically, that is not what terminal means. It means maximal, or final. A terminal illness is not an illness that is a bit more serious than some other illness.
It's not even what it usually means on LW. If Clippy's goals were terminal in your sense, they would be overridable... you would be able to talk Clippy out of paperclipping.
What you are talking about is valid, is a thing. If you have any hierarchy of goals, there are some at the bottom, some in the middle, and some at the top. But you need to invent a new word for the middle ones, because, "terminal" doesn't mean "intermediate".
OK, that makes the source of disagreement clearer.
I agree that "terminal" means "final" (but not that it means "maximal"; that's a different concept). But it doesn't (to me, and I think to others on LW) mean "final" in the sense I think you have in mind (i.e., so supremely important that once you notice it applies you can stop thinking), but in a different sense (when analysing goals or values, asking "so why do I want X?", this is a point at which you can go no further: "well, I just do").
So we're agreed on the etymology: a "terminal" goal or value is one-than-which-one-can-go-no-further. But you want it to mean "no further in the direction of increasing importance" and I want it to mean "no further in the direction of increasing fundamental-ness". I think the latter usage has at least the following two advantages:
The trouble with Clippy isn't that his paperclip-maximizing goal is terminal, it's that that's his only goal.
I'm not sure whether in your last paragraph you're suggesting that I'm using "terminal" to mean "intermediate in importance", but for the avoidance of doubt I am not doing anything at all like that. There are two separate things here that you could call hierarchies, one in terms of importance and one in terms of explanation, and "terminal" refers (in my usage, which I think is also the LW-usual one) only to the latter.
We can go a step further, actually: "terminal value" and various synonyms are well-established within philosophy, where they usually carry the familiar LW meaning of "something that has value in itself, not as a means to an end".
No. Clippy cannot be persuaded away from paperclipping because maximizing paperclips is its only terminal goal.
I feel like there's not much of a distinction being made here between terminal values and terminal goals. I think they're importantly different things.
Huh?
A goal I set is a state of the world I am actively trying to bring about, whereas a value is something which . . . has value to me. The things I value dictate which world states I prefer, but for either lack of resources or conflict, I only pursue the world states resulting from a subset of my values.
So not everything I value ends up being a goal. This includes terminal goals. For instance, I think that it is true that I terminally value being a talented artist - greatly skilled in creative expression - being so would make me happy in and of itself, but it's not a goal of mine because I can't prioritise it with the resources I have. Values like eliminating suffering and misery are ones which matter to me more, and get translated into corresponding goals to change the world via action.
I haven't seen a definition provided, but if I had to provide one for 'terminal goal' it would be that it's a goal whose attainment constitutes fulfilment of a terminal value. Possessing money is rarely a terminal value, and so accruing money isn't a terminal goal, even if it is intermediary to achieving a world state desired for its own sake. Accomplishing the goal of having all the hungry people fed is the world state which lines up with the value of no suffering, hence it's terminal. They're close, but not quite the same thing.
I think it makes sense to possibly not work with terminal goals on a motivational/decision making level, but it doesn't seem possible (or at least likely) that someone wouldn't have terminal values, in the sense of not having states of the world which they prefer over others. [These world-state-preferences might not be completely stable or consistent, but if you prefer the world be one way than another, that's a value.]
I don't think that terminal goal means that it's the highest priority here, just that there is no particular reason to achieve it other than the experience of attaining that goal. So eating barbecue isn't about nutrition or socializing, it's just about eating barbecue.
I think the 'terminal' in terminal goal means 'end of that thread of goals', as in a train terminus. Something that is wanted for the sake of itself.
It does not imply that you will terminate someone to achieve it.
If g1 is your bacon-eating goal, and g2 is your not-killing-people goal, and g2 overrides g1, then g2 is the end of the thread.
Hmm. I guess I would describe that as more of an urge than as a terminal goal. (I think "terminal goal" is supposed to activate a certain concept of deliberate and goal-directed behavior and what I'm mostly skeptical of is whether that concept is an accurate model of human preferences.) Do you, for example, make long-term plans based on calculations about which of various life options will cause you to eat the most delicious barbecue?
It's hard to judge just how important it is, because I have fairly regular access to it. However, food options definitely figure into long-term plans. For instance, the number of good food options around my office is a small but very real benefit that helps keep me in my current job. Similarly, while plenty of things can trump food, I would see the lack of quality food as a major downside to volunteering to live in the first colony on Mars. Which doesn't mean it would be decisive, of course.
I will suppress urges to eat in order to have the optimal experience at a good meal. I like to build up a real amount of hunger before I eat, as I find that a more pleasant experience than grazing frequently.
I try to respect the hedonist inside me, without allowing him to be in control. But I think I'm starting to lean pro-wireheading, so feel free to discount me on that account.