orthonormal comments on Post Your Utility Function - Less Wrong
We have only access to our current map to tell us about the territory, yes. But we have strong intuitions about how we would act if we could explicitly choose that our future map permanently diverge from our current map (which we currently see as the territory). If we (again by our current map) believe that this divergence would conform less to the territory (as opposed to a new map created by learning information), many of us would oppose that change even against pretty high stakes.
I mean, if Omega told me that I had to choose between
and that in either case my memory of this choice would be wiped when I made it, I would choose (A) without hesitation.
I understand that calling our current map "the territory" looks like a categorical error, but rejecting conchis' point entirely is the wrong response. There's a very real and valid sense in which our minds oppose what they calculate (by the current map) to be divergences between the future map and the territory.
Of course, because the immediate pain of the thought of choosing B would outweigh the longer-term lesser pain of the thought of losing contact with your sister.
This has nothing to do with whether the events actually occur, and everything to do with your mapping of the experience of the conditions, as you imagine them for purposes of making a decision.
That is, the model you make of the future may refer to a hypothetical reality, but the thing you actually evaluate is not that reality, but your own reaction to that reality -- your present-tense experience in response to a constructed fiction made of previous experiences.
It so happens that there is some correspondence between this (real) process and the way we would prefer to think we establish and evaluate our preferences. Specifically, both models will generate similar results, most of the time. It's just that the reasons we end up with for the responses are quite different.
But calling that latter concept "territory" is still a category error, because what you are using to evaluate it is still your perception of how you would experience the change.
We do not have preferences that are not about experience or our emotional labeling thereof; to the extent that we have "rational" preferences it is because they will ultimately lead to some desired emotion or sensation.
However, our brains are constructed in such a way so as to allow us to plausibly overlook and deny this fact, so that we can be honestly "sincere" in our altruism... specifically by claiming that our responses are "really" about things outside ourselves.
For example, your choice of "A" allows you to self-signal altruism, even if your sister would actually prefer death to being imprisoned on Mars for the rest of her life! Your choice isn't about making her life better, it's about you feeling better for the brief moment that you're aware you did something.
(That is, if you cared about something closer to the reality of what happens to your sister, rather than your experience of it, you'd have hesitated in that choice long enough to ask Omega whether she would prefer death to being imprisoned on Mars.)
I affirm this, but it does not follow that:
Just because the events that occur are not the proximate cause of an experience or preference does not mean that these things have nothing to do with external reality. This whole line of argument ignores the fact that our experience of life is entangled with the territory, albeit as mediated by our maps.
And a thermostat's map is also "entangled" with the territory, but as loqi pointed out, what it really prefers is only that its input sensor match its temperature setting!
I am not saying there are no isomorphisms between the shape of our preferences and the shape of reality, I am saying that assuming this isomorphism means the preferences are therefore "about" the territory is mind projection.
If you look at a thermostat, you can project that it was made by an optimizing process that "wanted" it to do certain things by responding to the territory, and that thus, the thermostat's map is "about" the territory. And in the same way, you can look at a human and project that it was made by an optimizing process (evolution) that "wanted" it to do certain things by responding to the territory.
However, the "aboutness" of the thermostat does not reside in the thermostat; it resides in the maker of the thermostat, if it can be said to exist at all! (In fact, this "aboutness" cannot exist, because it is not a material entity; it's a mental entity - the idea of aboutness.)
So despite the existence of inputs and outputs, both the human and the thermostat do their "preference" calculations inside the closed box of their respective models of the world.
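To make the thermostat half of this concrete, here is a minimal Python sketch. The `Thermostat` class, the `read_sensor` helper, the deadband, and all the numbers are assumptions made up for illustration; nothing here comes from a real device. The only point it shows is that the decision depends on the sensor reading and the setting, never on the room itself.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    setpoint: float        # target temperature (illustrative units: degrees Celsius)
    deadband: float = 0.5  # tolerance around the setpoint (assumed value)

    def decide(self, sensor_reading: float) -> str:
        """Choose an action from the sensor reading and the setpoint only."""
        if sensor_reading < self.setpoint - self.deadband:
            return "heat"
        if sensor_reading > self.setpoint + self.deadband:
            return "cool"
        return "idle"


def read_sensor(room_temperature: float, sensor_offset: float = 0.0) -> float:
    """Hypothetical sensor: its report can differ from the actual room temperature."""
    return room_temperature + sensor_offset


thermostat = Thermostat(setpoint=20.0)

# Accurate sensor in a cold room: the thermostat heats.
accurate = thermostat.decide(read_sensor(15.0))

# Same cold room, but a miscalibrated sensor reports 20.0: the thermostat idles.
miscalibrated = thermostat.decide(read_sensor(15.0, sensor_offset=5.0))

print(accurate, miscalibrated)
```

In both calls the room is equally cold; only the input differs, and the behavior follows the input.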
It just so happens that humans' model of the world also includes a Mind Projection device, that causes humans to see intention and purpose everywhere they look. And when they look through this lens at themselves, they imagine that their preferences are about the territory... which then keeps them from noticing various kinds of erroneous reasoning and subgoal stomps.
For that matter, it keeps them from noticing things like the idea that if you practice being a pessimist, nothing good can last for you, because you've trained yourself to find bad things about anything. (And vice versa for optimists.)
Ostensibly, optimism and pessimism are "about" the outside world, but in fact, they're simply mechanical, homeostatic processes very much like a thermostat.
I am not a solipsist nor do I believe people "create your own reality", with respect to the actual territory. What I'm saying is that people are deluded about the degree of isomorphism between their preferences and reality, because they confuse the map with the territory. And even with maximal isomorphism between preference and reality, they are still living in the closed box of their model.
It is reasonable to assume that existence actually exists, but all we can actually reason about is our experience of it, "inside the box".
And what if he did ask?
Then, as I said, he cares about something closer to the reality.
The major point I've been trying to make in this thread is that, because human preferences are not just in the map but of the map, people are able to persist in delusions about their motivations. And not asking the question is a perfect example of the sort of decision error this can produce!
However, asking the question doesn't magically make the preference about the territory either; in order to prefer that the future include his sister's best interests, he must first have an experience of the sister and a reason to wish her well. But it's still better than not asking, which is basically wireheading.
The irony I find in this discussion is that people seem to think I'm in favor of wireheading because I point out that we're all doing it, all the time. When in fact, the usefulness of being aware that it's all wireheading, is that it makes you better at noticing when you're doing it less-usefully.
The fact that he hadn't asked his sister, or asked about his sister's actual well-being, instantly jumped off the screen at me, because it was (to me) obvious wireheading.
So, you could say that I'm biased by my belief to notice wireheading more, but that's an advantage for a rationalist, not a disadvantage.
Is human knowledge also not just in the map, but exclusively of the map? If not, what's the difference?
Any knowledge about the actual territory can in principle be reduced to mechanical form without the presence of a human being in the system.
To put it another way, a preference is not a procedure, process, or product. The very use of the word "preference" is a mind projection - mechanical systems do not have "preferences" - they just have behavior.
The only reason we even think we have preferences in the first place (let alone that they're about the territory!) is because we have inbuilt mind projection. The very idea of having preferences is hardwired into the model we use for thinking about other animals and people.
You never answered my question.
You said, "if not, what's the difference", and I gave you the difference, i.e., we can have "knowledge" of the territory.
So, knowledge exists in the structure of map and is about the territory, while preference can't be implemented in natural artifacts. Preference is a magical property of subjective experience, and it is over maps, or about subjective experience, but not, for example, about the brain. Saying that preference exists in the structure of map or that it is about the territory is a confusion, that you call "mind projection". Does that summarize your position? What are the specific errors in this account?
No, "preference" is an illusory magical property projected by brains onto reality, which contains only behaviors.
Our brains infer "preferences" as a way of modeling expected behaviors of other agents: humans, animals, and anything else we perceive as having agency (e.g. gods, spirits, monsters). When a thing has a behavior, our brains conclude that the thing "prefers" to have either the behavior or the outcome of the behavior, in a particular circumstance. In other words, "preference" is a label attached to a clump of behavior-tendency observations and predictions in the brain -- not a statement about the nature of the thing being observed.
Thus, presuming that these "preferences" actually exist in the territory is supernaturalism, i.e., acting as though basic mental entities exist.
My original point had more to do with the types of delusion that occur when we reason on the basis of preferences actually existing, rather than the idea simply being a projection of our own minds. However, the above will do for a start, as I believe my other conclusions can be easily reached from this point.
I feel your frustration, but throwing the word "magical" in there is just picking a fight, IMO. Anyway, I too would like to see P.J. Eby summarize his position in this format.
Be charitable in your interpretation, and remember the Least Convenient Possible World principle. I was presuming that the setup was such that being alive on Mars wouldn't be a 'fate worse than death' for her; if it were, I'd choose differently. If you prefer, take the same hypothetical but with me on Mars, choosing whether she stayed alive on Earth; or let choice B include subjecting her to an awful fate rather than death.
I would say rather that my reaction is my evaluation of an imagined future world. The essence of many decision algorithms is to model possible futures and compare them to some criteria. In this case, I have complicated unconscious affective criteria for imagined futures (which dovetail well with my affective criteria for states of affairs I directly experience), and my affective reaction generally determines my actions.
To the extent this is true (as in the sense of my previous sentence), it is a tautology. I understand what you're arguing against: the notion that what we actually execute matches a rational consequentialist calculus of our conscious ideals. I am not asserting this; I believe that our affective algorithms do often operate under more selfish and basic criteria, and that they fixate on the most salient possibilities instead of weighing probabilities properly, among other things.
However, these affective algorithms do appear to respond more strongly to certain facets of "how I expect the world to be" than to facets of "how I expect to think the world is" when the two conflict (with an added penalty for the expectation of being deceived), and I don't find that problematic on any level.
As I said, it's still going to be about your experience during the moments until your memory is erased.
I took that as a given, actually. ;-) What I'm really arguing against is the naive self-applied mind projection fallacy that causes people to see themselves as decision-making agents -- i.e., beings with "souls", if you will. Asserting that your preferences are "about" the territory is the same sort of error as saying that the thermostat "wants" it to be a certain temperature. The "wanting" is not in the thermostat, it's in the thermostat's maker.
Of course, it makes for convenient language to say it wants, but we should not confuse this with thinking the thermostat can really "want" anything but for its input and setting to match. And the same goes for humans.
(This is not a mere fine point of tautological philosophy; human preferences in general suffer from high degrees of subgoal stomp, chaotic loops, and other undesirable consequences arising as a direct result of this erroneous projection. Understanding the actual nature of preferences makes it easier to dissolve these confusions.)
I wish I could upvote this two or three times. Thank you.
What features of that comment made it communicate something new to you? What was it that got communicated?
The comment restated a claim that a certain relationship is desirable as a claim that given that it's desirable, there is a process that establishes it to be true. It's interesting how this restatement could pierce inferential distance: is preference less trustworthy than a fact, and so demonstrating the conversion of preference into a fact strengthens the case?
I'd been following this topic and getting frustrated with my inability to put my opinion on the whole preferences-about-the-territory thing into words, and I thought that orthonormal's comment accomplished it very nicely. I don't think I understand your other question.
Given the length of the thread I branched from, it looks like you and P.J. Eby ended up talking past each other to some extent, and I think that you both failed to distinguish explicitly between the current map (which is what you calculate the territory to be) and a hypothetical future map.
P.J. Eby was (correctly) insisting that your utility function is only in contact with your current map, not the territory directly. You were (correctly) insisting that your utility function cares about (what it calculates to be) the future territory, and not just the future map.
Is that a fair statement of the key points?
Utility function is no more "in contact" with your current map than the actual truth of 2+2=4 is in contact with display of a calculator that displays the statement. Utility function may care about past territory (and even counterfactual territory) as well as future territory, with map being its part. Keeping a map in good health is instrumentally a very strong move: just by injecting an agent with your preferences somewhere in the territory you improve it immensely.
While there might exist some abstracted idealized dynamic that is a mathematical object independent of your map, any feasible heuristic for calculating your utility function (including, of course, any calculation you actually do) will depend on your map.
If Omega came through tomorrow and made all pigs conscious with human-like thoughts and emotions, my moral views on pig farming wouldn't be instantly changed; only when information about this development gets to me and my map gets altered will I start assigning a much higher disutility to factory farming of pigs.
Or, to put it another way, a decision algorithm refers directly to the possible worlds in the territory (and their probabilities, etc), but it evaluates these referents by looking at the corresponding objects in its current map. I think that, since we're talking about practical purposes, this is a relevant point.
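As a rough illustration of the pig example, here is a small Python sketch. The dictionaries, the `pigs_conscious` flag, and the disutility numbers are all hypothetical stand-ins chosen for the example; this is a toy under those assumptions, not a model of any actual decision theory. It shows the evaluation reading from the agent's map, so a change in the territory has no effect until the map is updated.

```python
from typing import Dict

# Territory: the actual state of the world (hypothetical flag for illustration).
territory: Dict[str, bool] = {"pigs_conscious": False}

# Map: the agent's current beliefs about that same state.
agent_map: Dict[str, bool] = {"pigs_conscious": False}


def disutility_of_pig_farming(beliefs: Dict[str, bool]) -> float:
    """Evaluate factory farming from beliefs only; the numbers are arbitrary."""
    return 100.0 if beliefs["pigs_conscious"] else 1.0


before = disutility_of_pig_farming(agent_map)

# Omega changes the territory, but no information reaches the agent yet.
territory["pigs_conscious"] = True
after_change = disutility_of_pig_farming(agent_map)

# The information arrives and the map is updated to match the territory.
agent_map["pigs_conscious"] = territory["pigs_conscious"]
after_update = disutility_of_pig_farming(agent_map)

print(before, after_change, after_update)
```

The evaluation function never takes `territory` as an argument, which is the whole point of the sketch: the referent is the world, but the calculation runs on the map.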
Agree completely. Of the worlds where my future map looks to diverge from the territory, though, I'm generally more repulsed by the ones in which my map says it's fine where it's not than by the opposite.
This is something of a nitpick, but this isn't strictly true. If others are trying to calculate your utility function (in order to help you), this will depend on their maps rather than yours (though probably including their map of your map). The difference becomes important if their maps are more accurate than yours in some respect (or if they can affect how accurate your map is).
For example, if you know that I value not being deceived (and not merely the subjective experience of not being deceived), and you care about my welfare, then I think that you should not deceive me, even if you know that I might perceive my welfare to be higher if you did.
Oh, good point. I should have restricted it to "any calculation you personally do", in which case I believe it holds.
At which point it becomes trivial: any calculation that is done on your map is done using your map, just Markovity of computation...
A related point is that you can create tools that make decisions themselves, in situations only of possibility of which you are aware.
Right. It's trivial, but relevant when discussing in what sense our decision algorithms refer to territory versus map.
I can't parse this. What do you mean?