pjeby comments on Post Your Utility Function - Less Wrong

28 Post author: taw 04 June 2009 05:05AM




Comment author: pjeby 05 June 2009 06:24:10PM -2 points [-]

If I place value on e.g. "my friends actually liking and respecting me", rather than just "the subjective sense that my friends like and respect me" then my utility function seems to be responding directly to the territory rather than the map.

In practice, your definition of what "liking and respecting me" means -- i.e., what evidence you expect to see in the world of that -- is part of the map, not the territory.

Suppose, for example, that your friends really and truly like and respect you... but they have to beat you up and call you names, for some other reason. Does that match what you actually value? It's out there in the territory, after all.

That is, is merely knowing that they "like and respect you" enough? Or is that phrase really just a shorthand in your map for a set of behaviors and non-behaviors that you actually value?

Note that if you argue that, "if they really liked and respected me, they wouldn't do that", then you are now back to talking about your map of what that phrase means, as opposed to what someone else's map is.

System 2 thinking is very tricky this way -- it's prone to manipulating symbols as if they were the things they're merely pointing at, as though the map were the territory... when the only things that exist in its perceptual sphere are the labels on the map.

Most of the time, when we think we're talking about the territory, we're talking about the shapes on the map, but words aren't even the shapes on the map!

Comment author: orthonormal 09 June 2009 05:54:52AM *  7 points [-]

We have only access to our current map to tell us about the territory, yes. But we have strong intuitions about how we would act if we could explicitly choose that our future map permanently diverge from our current map (which we currently see as the territory). If we (again by our current map) believe that this divergence would conform less to the territory (as opposed to a new map created by learning information), many of us would oppose that change even against pretty high stakes.

I mean, if Omega told me that I had to choose between

  • (A) my sister on Mars being well but cut off from all contact with me, or
  • (B) my sister being killed but a nonsentient chatbot impersonating her to me in happy weekly chats,

and that in either case my memory of this choice would be wiped when I made it, I would choose (A) without hesitation.

I understand that calling our current map "the territory" looks like a categorical error, but rejecting conchis' point entirely is the wrong response. There's a very real and valid sense in which our minds oppose what they calculate (by the current map) to be divergences between the future map and the territory.

Comment author: pjeby 10 June 2009 03:14:40AM *  1 point [-]

I would choose (A) without hesitation.

Of course, because the immediate pain of the thought of choosing B would outweigh the longer-term lesser pain of the thought of losing contact with your sister.

This has nothing to do with whether the events actually occur, and everything to do with your mapping of the experience of the conditions, as you imagine them for purposes of making a decision.

That is, the model you make of the future may refer to a hypothetical reality, but the thing you actually evaluate is not that reality, but your own reaction to that reality -- your present-tense experience in response to a constructed fiction made of previous experiences.

It so happens that there is some correspondence between this (real) process and the way we would prefer to think we establish and evaluate our preferences. Specifically, both models will generate similar results, most of the time. It's just that the reasons we end up with for the responses are quite different.

There's a very real and valid sense in which our minds oppose what they calculate (by the current map) to be divergences between the future map and the territory.

But calling that latter concept "territory" is still a category error, because what you are using to evaluate it is still your perception of how you would experience the change.

We do not have preferences that are not about experience or our emotional labeling thereof; to the extent that we have "rational" preferences it is because they will ultimately lead to some desired emotion or sensation.

However, our brains are constructed in such a way so as to allow us to plausibly overlook and deny this fact, so that we can be honestly "sincere" in our altruism... specifically by claiming that our responses are "really" about things outside ourselves.

For example, your choice of "A" allows you to self-signal altruism, even if your sister would actually prefer death to being imprisoned on Mars for the rest of her life! Your choice isn't about making her life better, it's about you feeling better for the brief moment that you're aware you did something.

(That is, if you cared about something closer to the reality of what happens to your sister, rather than your experience of it, you'd have hesitated in that choice long enough to ask Omega whether she would prefer death to being imprisoned on Mars.)

Comment author: Cyan 10 June 2009 03:49:29AM *  2 points [-]

That is, the model you make of the future may refer to a hypothetical reality, but the thing you actually evaluate is not that reality, but your own reaction to that reality -- your present-tense experience in response to a constructed fiction made of previous experiences.

I affirm this, but it does not follow that:

This has nothing to do with whether the events actually occur...

Just because the events that occur are not the proximate cause of an experience or preference does not mean that these things have nothing to do with external reality. This whole line of argument ignores the fact that our experience of life is entangled with the territory, albeit as mediated by our maps.

Comment author: pjeby 10 June 2009 05:25:17PM 2 points [-]

Just because the events that occur are not the proximate cause of an experience or preference does not mean that these things have nothing to do with external reality. This whole line of argument ignores the fact that our experience of life is entangled with the territory, albeit as mediated by our maps.

And a thermostat's map is also "entangled" with the territory, but as loqi pointed out, what it really prefers is only that its input sensor match its temperature setting!

I am not saying there are no isomorphisms between the shape of our preferences and the shape of reality, I am saying that assuming this isomorphism means the preferences are therefore "about" the territory is mind projection.

If you look at a thermostat, you can project that it was made by an optimizing process that "wanted" it to do certain things by responding to the territory, and that thus, the thermostat's map is "about" the territory. And in the same way, you can look at a human and project that it was made by an optimizing process (evolution) that "wanted" it to do certain things by responding to the territory.

However, the "aboutness" of the thermostat does not reside in the thermostat; it resides in the maker of the thermostat, if it can be said to exist at all! (In fact, this "aboutness" cannot exist, because it is not a material entity; it's a mental entity - the idea of aboutness.)

So despite the existence of inputs and outputs, both the human and the thermostat do their "preference" calculations inside the closed box of their respective models of the world.
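The thermostat's "closed box" can be made concrete. A minimal sketch (hypothetical, for illustration only, not a claim about any particular device): every quantity the decision touches is internal state, and the true room temperature never appears in the calculation.

```python
def thermostat_action(sensor_reading: float, setpoint: float) -> str:
    """Decide purely from internal state (the 'map'); the true room
    temperature is not an input to this function."""
    if sensor_reading < setpoint:
        return "heat"
    if sensor_reading > setpoint:
        return "cool"
    return "idle"

# Same territory (say, a 20-degree room), different behavior,
# because only the internal reading matters:
print(thermostat_action(sensor_reading=20.0, setpoint=22.0))  # "heat"
print(thermostat_action(sensor_reading=25.0, setpoint=22.0))  # "cool" -- same room, miscalibrated sensor
```

The second call shows the point: a sensor error changes the "preference" calculation while the territory stays fixed.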

It just so happens that humans' model of the world also includes a Mind Projection device, that causes humans to see intention and purpose everywhere they look. And when they look through this lens at themselves, they imagine that their preferences are about the territory... which then keeps them from noticing various kinds of erroneous reasoning and subgoal stomps.

For that matter, it keeps them from noticing things like the idea that if you practice being a pessimist, nothing good can last for you, because you've trained yourself to find bad things about anything. (And vice versa for optimists.)

Ostensibly, optimism and pessimism are "about" the outside world, but in fact, they're simply mechanical, homeostatic processes very much like a thermostat.

I am not a solipsist nor do I believe people "create your own reality", with respect to the actual territory. What I'm saying is that people are deluded about the degree of isomorphism between their preferences and reality, because they confuse the map with the territory. And even with maximal isomorphism between preference and reality, they are still living in the closed box of their model.

It is reasonable to assume that existence actually exists, but all we can actually reason about is our experience of it, "inside the box".

Comment author: saturn 10 June 2009 04:21:49AM 1 point [-]

That is, if you cared about something closer to the reality of what happens to your sister, rather than your experience of it, you'd have hesitated in that choice long enough to ask Omega whether she would prefer death to being imprisoned on Mars.

And what if he did ask?

Comment author: pjeby 10 June 2009 05:02:08PM 0 points [-]

And what if he did ask?

Then, as I said, he cares about something closer to the reality.

The major point I've been trying to make in this thread is that because human preferences are not just in the map but of the map, people can persist in delusions about their motivations. And not asking the question is a perfect example of the sort of decision error this can produce!

However, asking the question doesn't magically make the preference about the territory either; in order to prefer that the future include his sister's best interests, he must first have an experience of the sister and a reason to wish well of her. But it's still better than not asking, which is basically wireheading.

The irony I find in this discussion is that people seem to think I'm in favor of wireheading because I point out that we're all doing it, all the time. In fact, the usefulness of being aware that it's all wireheading is that it makes you better at noticing when you're doing it less usefully.

The fact that he hadn't asked his sister, or about his sister's actual well-being, instantly jumped off the screen at me, because it was (to me) obvious wireheading.

So, you could say that I'm biased by my belief to notice wireheading more, but that's an advantage for a rationalist, not a disadvantage.

Comment author: Vladimir_Nesov 10 June 2009 05:07:32PM *  0 points [-]

The major point I've been trying to make in this thread is that because human preferences are not just in the map but of the map, people can persist in delusions about their motivations.

Is human knowledge also not just in the map, but exclusively of the map? If not, what's the difference?

Comment author: pjeby 10 June 2009 06:00:00PM 1 point [-]

Is human knowledge also not just in the map, but exclusively of the map? If not, what's the difference?

Any knowledge about the actual territory can in principle be reduced to mechanical form without the presence of a human being in the system.

To put it another way, a preference is not a procedure, process, or product. The very use of the word "preference" is a mind projection - mechanical systems do not have "preferences" - they just have behavior.

The only reason we even think we have preferences in the first place (let alone that they're about the territory!) is because we have inbuilt mind projection. The very idea of having preferences is hardwired into the model we use for thinking about other animals and people.

Comment author: Vladimir_Nesov 10 June 2009 06:14:01PM 0 points [-]

You never answered my question.

Comment author: pjeby 11 June 2009 01:42:35AM 0 points [-]

You said, "if not, what's the difference", and I gave you the difference: i.e., we can have "knowledge" of the territory.

Comment author: Vladimir_Nesov 11 June 2009 07:33:09AM *  0 points [-]

So, knowledge exists in the structure of the map and is about the territory, while preference can't be implemented in natural artifacts. Preference is a magical property of subjective experience, and it is over maps, or about subjective experience, but not, for example, about the brain. Saying that preference exists in the structure of the map or that it is about the territory is a confusion, which you call "mind projection". Does that summarize your position? What are the specific errors in this account?

Comment author: orthonormal 10 June 2009 03:53:31PM 0 points [-]

(That is, if you cared about something closer to the reality of what happens to your sister, rather than your experience of it, you'd have hesitated in that choice long enough to ask Omega whether she would prefer death to being imprisoned on Mars.)

Be charitable in your interpretation, and remember the Least Convenient Possible World principle. I was presuming that the setup was such that being alive on Mars wouldn't be a 'fate worse than death' for her; if it were, I'd choose differently. If you prefer, take the same hypothetical but with me on Mars, choosing whether she stayed alive on Earth; or let choice B include subjecting her to an awful fate rather than death.

That is, the model you make of the future may refer to a hypothetical reality, but the thing you actually evaluate is not that reality, but your own reaction to that reality -- your present-tense experience in response to a constructed fiction made of previous experiences.

I would say rather that my reaction is my evaluation of an imagined future world. The essence of many decision algorithms is to model possible futures and compare them to some criteria. In this case, I have complicated unconscious affective criteria for imagined futures (which dovetail well with my affective criteria for states of affairs I directly experience), and my affective reaction generally determines my actions.
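That decision-algorithm essence can be sketched in code. The following is a hypothetical illustration (the names and scores are made up): the criteria are internal and affective, but they are applied to facets of modeled world-states, corresponding to the (A) and (B) choices above, not to predicted future experiences.

```python
def imagine(action: str) -> dict:
    """Construct a model of the world-state that would follow `action`."""
    futures = {
        "A": {"sister_alive": True, "contact_with_me": False},
        "B": {"sister_alive": False, "contact_with_me": True},  # chatbot impersonation
    }
    return futures[action]

def affective_score(world: dict) -> int:
    # Internal, possibly unconscious criteria -- but applied to
    # facets of the modeled *world*, not to a future experience.
    return 10 if world["sister_alive"] else 0

# Choose the action whose imagined future scores best:
best = max(["A", "B"], key=lambda a: affective_score(imagine(a)))
print(best)  # "A"
```

The evaluation happens entirely inside the agent, yet the thing being scored is a representation of how the world would be.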

We do not have preferences that are not about experience or our emotional labeling thereof; to the extent that we have "rational" preferences it is because they will ultimately lead to some desired emotion or sensation.

To the extent this is true (as in the sense of my previous sentence), it is a tautology. I understand what you're arguing against: the notion that what we actually execute matches a rational consequentialist calculus of our conscious ideals. I am not asserting this; I believe that our affective algorithms do often operate under more selfish and basic criteria, and that they fixate on the most salient possibilities instead of weighing probabilities properly, among other things.

However, these affective algorithms do appear to respond more strongly to certain facets of "how I expect the world to be" than to facets of "how I expect to think the world is" when the two conflict (with an added penalty for the expectation of being deceived), and I don't find that problematic on any level.

Comment author: pjeby 10 June 2009 05:31:20PM *  -1 points [-]

If you prefer, take the same hypothetical but with me on Mars, choosing whether she stayed alive on Earth; or let choice B include subjecting her to an awful fate rather than death.

As I said, it's still going to be about your experience during the moments until your memory is erased.

I understand what you're arguing against: the notion that what we actually execute matches a rational consequentialist calculus of our conscious ideals.

I took that as a given, actually. ;-) What I'm really arguing against is the naive self-applied mind projection fallacy that causes people to see themselves as decision-making agents -- i.e., beings with "souls", if you will. Asserting that your preferences are "about" the territory is the same sort of error as saying that the thermostat "wants" it to be a certain temperature. The "wanting" is not in the thermostat, it's in the thermostat's maker.

Of course, it makes for convenient language to say it wants, but we should not confuse this with thinking the thermostat can really "want" anything but for its input and setting to match. And the same goes for humans.

(This is not a mere fine point of tautological philosophy; human preferences in general suffer from high degrees of subgoal stomp, chaotic loops, and other undesirable consequences arising as a direct result of this erroneous projection. Understanding the actual nature of preferences makes it easier to dissolve these confusions.)

Comment author: Alicorn 09 June 2009 06:56:41AM 0 points [-]

I wish I could upvote this two or three times. Thank you.

Comment author: Vladimir_Nesov 09 June 2009 12:19:16PM 0 points [-]

What features of that comment made it communicate something new to you? What was it that got communicated?

The comment restated a claim that a certain relationship is desirable as a claim that given that it's desirable, there is a process that establishes it to be true. It's interesting how this restatement could pierce inferential distance: is preference less trustworthy than a fact, and so demonstrating the conversion of preference into a fact strengthens the case?

Comment author: Alicorn 09 June 2009 04:22:12PM *  0 points [-]

I'd been following this topic and getting frustrated with my inability to put my opinion on the whole preferences-about-the-territory thing into words, and I thought that orthonormal's comment accomplished it very nicely. I don't think I understand your other question.

Comment author: orthonormal 09 June 2009 11:50:00PM 0 points [-]

Given the length of the thread I branched from, it looks like you and P.J. Eby ended up talking past each other to some extent, and I think that you both failed to distinguish explicitly between the current map (which is what you calculate the territory to be) and a hypothetical future map.

P.J. Eby was (correctly) insisting that your utility function is only in contact with your current map, not the territory directly. You were (correctly) insisting that your utility function cares about (what it calculates to be) the future territory, and not just the future map.

Is that a fair statement of the key points?

Comment author: Vladimir_Nesov 10 June 2009 12:36:56AM 0 points [-]

A utility function is no more "in contact" with your current map than the actual truth of 2+2=4 is in contact with the display of a calculator showing that statement. A utility function may care about past territory (and even counterfactual territory) as well as future territory, with the map being part of it. Keeping a map in good health is instrumentally a very strong move: just by injecting an agent with your preferences somewhere in the territory you improve it immensely.

Comment author: orthonormal 10 June 2009 03:38:12PM 0 points [-]

While there might exist some abstracted idealized dynamic that is a mathematical object independent of your map, any feasible heuristic for calculating your utility function (including, of course, any calculation you actually do) will depend on your map.

If Omega came through tomorrow and made all pigs conscious with human-like thoughts and emotions, my moral views on pig farming wouldn't be instantly changed; only when information about this development gets to me and my map gets altered will I start assigning a much higher disutility to factory farming of pigs.

Or, to put it another way, a decision algorithm refers directly to the possible worlds in the territory (and their probabilities, etc), but it evaluates these referents by looking at the corresponding objects in its current map. I think that, since we're talking about practical purposes, this is a relevant point.
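The pig-farming example above can be put in code. This is a hypothetical sketch (the beliefs and numbers are invented): the disutility calculation refers to a fact about the territory, but every value it actually reads comes from the agent's current map, so nothing changes until the map updates.

```python
# The agent's current map: a belief that may lag the territory.
current_map = {"pigs_conscious": False}

def disutility_of_pig_farming(map_state: dict) -> float:
    # The referent is a world-fact, but the calculation can only
    # consult the map.
    return 100.0 if map_state["pigs_conscious"] else 1.0

# Omega changes the territory; the evaluation is unchanged until
# the news reaches the agent and the map is altered:
before = disutility_of_pig_farming(current_map)   # 1.0
current_map["pigs_conscious"] = True              # information arrives
after = disutility_of_pig_farming(current_map)    # 100.0
```

The function is "about" pigs in the territory, yet every invocation is mediated by the map, which is exactly the practical point at issue.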

Keeping a map in good health is instrumentally a very strong move: just by injecting an agent with your preferences somewhere in the territory you improve it immensely.

Agree completely. Of the worlds where my future map looks to diverge from the territory, though, I'm generally more repulsed by the ones in which my map says it's fine where it's not than by the opposite.

Comment author: conchis 10 June 2009 03:51:37PM *  1 point [-]

any feasible heuristic for calculating your utility function (including, of course, any calculation you actually do) will depend on your map.

This is something of a nitpick, but this isn't strictly true. If others are trying to calculate your utility function (in order to help you), this will depend on their maps rather than yours (though probably including their map of your map). The difference becomes important if their maps are more accurate than yours in some respect (or if they can affect how accurate your map is).

For example, if you know that I value not being deceived (and not merely the subjective experience of not being deceived), and you care about my welfare, then I think that you should not deceive me, even if you know that I might perceive my welfare to be higher if you did.

Comment author: orthonormal 10 June 2009 03:55:06PM 0 points [-]

Oh, good point. I should have restricted it to "any calculation you personally do", in which case I believe it holds.

Comment author: Vladimir_Nesov 10 June 2009 04:24:28PM *  0 points [-]

At which point it becomes trivial: any calculation that is done on your map is done using your map; that's just the Markovity of computation.

A related point is that you can create tools that make decisions themselves, in situations of which you were aware only as possibilities.

Comment author: saturn 06 June 2009 07:48:47AM *  5 points [-]

Utility is about the territory in the same sense that the map is about the territory; the map tells us the way the territory is, utility tells us the way we want the territory to be. We non-wireheaders want an accurate map because it's the territory we care about.

Supposing utility is not about the territory but about the map, we get people who want nothing more than to sabotage their own mapmaking capabilities. If the territory is not what we care about, maintaining the correspondence of map to territory would be a pointless waste of effort. Wireheading would look like an unambiguously good idea, not just to some people but to everyone.

Conchis' example of wanting his friends to really like and respect him is correct. He may have failed to explicitly point out that he also has an unrelated preference for not being beaten up. He's also in the unfortunate position of valuing something he can't know about without using long chains of messy inductive inferences. But his values are still about the territory, and he wants his map to accurately reflect the territory because it's the territory he cares about.

Comment author: pjeby 06 June 2009 04:06:41PM -1 points [-]

Utility is about the territory in the same sense that the map is about the territory; the map tells us the way the territory is, utility tells us the way we want the territory to be. We non-wireheaders want an accurate map because it's the territory we care about.

I am only saying that the entire stack of concepts you have just mentioned exists only in your map.

Supposing utility is not about the territory but about the map, we get people who want nothing more than to sabotage their own mapmaking capabilities.

Permit me to translate: supposing utility is not about the (portion of map labeled) territory but about the (portion of map labeled) map, we get people who want nothing more than to sabotage their own mapmaking capabilities.

Does that make it any clearer what I'm saying?

This is a "does the tree make a sound" argument, and I'm on the, "no it doesn't" side, due to using a definition of "sound" that means "the representation of audio waves within a human nervous system". You are on the "of course it makes a sound" side, because your definition of sound is "pressure waves in the air."

Make sense?

Comment author: saturn 07 June 2009 12:17:08AM 1 point [-]

I am only saying that the entire stack of concepts you have just mentioned exists only in your map.

As far as I can tell, you're saying that there is no territory, or that the territory is irrelevant. In other words, solipsism. You've overcome the naive map/territory confusion, but only to wind up with a more sophisticated form of confusion.

This isn't a "does the tree make a sound" argument. It's more like a "dude... how do we even really know reality is really real" argument. Rationality is entirely pointless if all we're doing is manipulating completely arbitrary map-symbols. But in that case, why not leave us poor, deluded believers in reality to define the words "map", "territory", and "utility" the way we have always done?

Comment author: pjeby 07 June 2009 03:58:22AM 0 points [-]

In other words, solipsism.

No, general semantics. There's a difference.

Comment author: saturn 07 June 2009 10:49:45PM *  2 points [-]

Can you point out the difference?

Even though "this is not a pipe", the form of a depiction of a pipe is nevertheless highly constrained by the physical properties of actual pipes. Do you deny that? If not, how do you explain it?

Comment author: conchis 06 June 2009 04:19:20PM *  1 point [-]

This is a "does the tree make a sound" argument, and I'm on the, "no it doesn't" side, due to using a definition of "sound" that means "the representation of audio waves within a human nervous system". You are on the "of course it makes a sound" side, because your definition of sound is "pressure waves in the air."

I've been trying to be on the "it depends on your definition and my definition sits within the realm of acceptable definitions" side. Unfortunately, whether this is what you intend or not, most of your comments come across as though you're on the "it depends on the definition, and my (PJ's) definition is right and yours is wrong" side, which is what seems to be getting people's backs up.

Comment author: Vladimir_Nesov 06 June 2009 04:29:58PM 0 points [-]

This confusion is dissolved in the post Disputing Definitions.

Comment author: conchis 06 June 2009 04:32:50PM *  0 points [-]

Which confusion? I didn't think I was confused. Now I'm confused about whether I'm confused. ;)

Comment author: Vladimir_Nesov 06 June 2009 04:40:14PM 0 points [-]

You mentioned this confusion as possibly playing a role in you and Eby talking past each other: the ambiguous use of the word "utility".

Comment author: conchis 06 June 2009 04:59:44PM *  1 point [-]

OK, cool. Now, given that we've already identified that, what does Disputing Definitions tell us that we don't already know?

Comment author: Cyan 05 June 2009 06:40:30PM *  2 points [-]

There are two things to say in response to this: first, I can define "liking and respecting me" as "experiencing analogous brain states to mine when I like and respect someone else". That's in the territory (modulo some assumptions about the cognitive unity of humankind): I could verify it in principle, although not in practice.

The second thing is that even if we grant that the example was poor, the point was still valid. For example, one might prefer that one's spouse never cheat to one's spouse cheating but never being aware of that fact. (ETA: but maybe you weren't arguing against the point, only the example.)

Comment author: pjeby 05 June 2009 08:10:08PM -1 points [-]

There are two things to say in response to this: first, I can define "liking and respecting me" as "experiencing analogous brain states to mine when I like and respect someone else". That's in the territory (modulo some assumptions about the cognitive unity of humankind): I could verify it in principle, although not in practice.

But what if they experience that state, and still, say, beat you up and treat you like jerks, because that's what their map says you should do when you feel that way?

This isn't about the example being poor, it's about people thinking things in the map actually exist in the territory. Everything you perceive is mediated by your maps, even if only in the minimal sense of being reduced to human sensory-pattern recognition symbols first, let alone all the judgments about the symbols that we add on top.

For example, one might prefer that one's spouse never cheat to one's spouse cheating but never being aware of that fact. (ETA: but maybe you weren't arguing against the point, only the example.)

How about the case where you absolutely believe the spouse is cheating, but they really aren't?

This is certainly a better example, in that it's easier to show that it's not reality that you value, but the part of your map that you label "reality". If you really truly believe the spouse is cheating, then you will feel exactly the same as if they really are.

IOW, when you say that you value something "in the territory", all you are ever really talking about is the part of your map that you label "the territory", whether that portion of the map actually corresponds to the territory or not.

This is not some sort of hypothetical word-argument, btw. (I have no use for them, which is why I mostly avoid the Omega discussions.) This is a practical point for minimizing one's suffering and unwanted automatic responses to events in the world. To the extent you believe that your map is the territory, you will suffer when it is out-of-sync.

Comment author: Cyan 05 June 2009 08:27:17PM *  2 points [-]

But what if they experience that state, and still, say, beat you up and treat you like jerks, because that's what their map says you should do when you feel that way?

It's still possible to prefer this state of affairs to one where they are beating me because they are contemptuous of me. Remember, we're talking about a function from some set X to the real numbers, and we're trying to figure out what sorts of things are members of X. In general, people do have preferences about the way things actually are.

How about the case where you absolutely believe the spouse is cheating, but they really aren't?... If you really truly believe the spouse is cheating, then you will feel exactly the same as if they really are.

But my spouse won't, and I have preferences about that fact. All other things being equal, my preference ordering is "my spouse never cheats and I believe my spouse never cheats" > "my spouse cheats and I find out" > "my spouse cheats and I believe my spouse never cheats" > "my spouse never cheats but I believe she does". If a utility function exists that captures this preference, it will be a function that takes both reality and my map as arguments.
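That four-way preference ordering can be written down directly. A hypothetical sketch (the numeric utilities are invented, chosen only to reproduce the stated ranking): the function takes both reality and the map as arguments, which is exactly what's claimed.

```python
def utility(spouse_cheats: bool, i_believe_cheating: bool) -> int:
    """Toy utility over (reality, belief) pairs, matching the
    stated ordering."""
    ordering = {
        (False, False): 3,  # never cheats, and I believe that
        (True, True): 2,    # cheats, and I find out
        (True, False): 1,   # cheats, and I'm deceived
        (False, True): 0,   # never cheats, but I believe she does
    }
    return ordering[(spouse_cheats, i_believe_cheating)]

# Holding the belief fixed, changing reality changes the value --
# so this is not a function of the map alone:
print(utility(False, False), utility(True, False))  # 3 1
```

Such a function cannot be rewritten to depend only on `i_believe_cheating`, since the first and third entries share a belief but differ in value.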