loqi comments on Post Your Utility Function - Less Wrong
Wow, -5! People here don't seem to appreciate this sort of challenge to their conceptual framework.
I agree, but I wonder if I failed to communicate the distinction I was attempting to make. The human-sensory experience of being embedded in a concrete, indifferent reality is (drugs, fantasies, and dreams aside) basically constant. It's a fundamental thread underlying our entire history of experience.
It's this indifference to our mental state that makes it special. A preference expressed in terms of "reality" has subjective properties that it would otherwise lack. Maybe I want the sky to be blue so that other people will possess a similar experience of it that we can share. "Blueness" may still be a red herring, but my preference now demands some kind of invariant between minds that seemingly cannot be mediated except through a shared external reality. You might argue that I really just prefer shared experiences, but this ignores the implied consistency between such experiences and all other experiences involving the external reality, something I claim to value above and beyond any particular experience.
This is where the massive implicit context enters the scene. "Even if I don't know it" is modeled after experience only in the degenerate sense that it's modeled after experience of indifferent causality. A translation might look like "I prefer to experience a reality with the sorts of consequences I would predict from the sky being blue, even if I don't consciously perceive blue skies". That's still an oversimplification, but it's definitely more complex than just invoking a generic memory of "not having known something" and applying it to blue skies.
I don't see how any of that is true. I can easily think of different concrete realizations of "real" and "virtual" that would interact differently with my experience of reality, thus provoking different labellings of "good" and "bad". If your point is merely that "real" is technically underspecified, then I agree. But I don't see how you can draw inferences from this underspecification.
I'm going to have to turn your own argument against you here. To the extent that you have a concept of reality that is remotely consistent with your everyday experience, I claim that "in reality, blue skies are bad because they provoke suffering" is a preference stated in terms of an extremely similar reality-concept, plus a suffering-concept blended together from first-hand experience and compassion (itself also formed in terms of reality-as-connected-to-other-minds). For you to say it has "nothing to do with reality" is pure semantic hogwash. What definition of "reality" can you possibly be using to make this statement, except the one formed by your lifetime's-worth of experience with indifferent causality? You seem to be denying the use of the term to relate your concept of reality to mine, despite their apparent similarity.
This doesn't make sense to me. Whether or not an experience is "negative" is a function of our preferences. If a preference "produces" negative experiences, then either they're still better than the alternative (in which case it's a reasonable preference, and it's probably worthwhile to change your perception of the experience) or they're not (in which case it's not a true preference, just delusion).
That's a property of pain and pleasure, not preference. I may well decide not to feel pain due to preference X being thwarted, but I still prefer X, and I still prefer pleasure to the absence of pleasure.
This is where I think your oversimplification of "experience vs reality" produces invalid conclusions. Those labels don't just apply to one experience or another, they apply to a massively complicated network of experience that I can't even begin to hold in my mind at once. Given that, your logic doesn't follow at all, because I really don't know what I'm relabeling.
This relates to a general reservation I have with cavalier attitudes toward mind-hacks: I know full well that my preferences are complex, difficult to understand, and grossly underspecified in any conscious realization, so it's not at all obvious to me that optimizing a simple preference concerning one particular scenario doesn't carry loads of unintended consequences for the rest of them. I've had direct experience with my subconsciously directed behavior "making decisions for me" that I had conscious reasons to optimize against, only later to find out that my conscious understanding of the situation was flawed and incomplete. I think that ignoring the intuitive implications of an external reality leads to similar contradictions.
You seem to mostly be arguing against a strawman; as I said, I'm not saying reality doesn't exist or that it's not relevant to our experiences. What I'm saying is that the preferences are composed of map, and while there are connections between that map and external reality, we are essentially deluded to think our preferences refer to actual reality, and that this delusion leads us to believing that changing external reality will change our internal experience, when more often the reverse is more likely true. (That is, changing our internal experience will more likely result in our taking actions that will actually change external reality.)
Note, however, that: "Whether or not an experience is 'negative' is a function of our preferences."
Here you seem to be arguing my point. The experience is a function of preferences, the preferences are a product of, and point to, other experiences, in a self-sustaining loop that sometimes might as well not be connected to outside reality at all, for all that it has anything to do with what's actually going on.
Lucky people live in a perpetual perception of good things happening, unlucky people the opposite, even when the same events are happening to both.
How can we say, then, that either person's perceptions are "about" reality, if they are essentially unconditional? Clearly, something else is going on.
If we disagree at this point, I'd have to say it can only be because we disagree on what "about" means. When I say preferences are not "about" reality, it is in the same sense that Robin Hanson is always saying that politics is not "about" policy, etc.
Clearly, preferences are "about" reality in the same sense that politics are "about" policy. That is, reality is the subject of a preference, in the same way that a policy might be the subject of a political dispute.
However, in both cases, the point of the ostensible activity is not where it appears to be. In order for politics to function, people must sincerely believe that it is "about" policy, in precisely the same way as we must sincerely believe our preferences are "about" reality, in order to make them function -- and for similar reasons.
But in neither case does either the sincerity or the necessity of the delusion change the fact that it's nonetheless a delusion.
I don't think I disagree with any of the above, except to dispute its full universality (which I'm not sure you're even arguing). To attempt to rephrase your point: Our interactions with reality create experiences filtered through our particular way of characterizing such interactions. It's these necessarily subjective characterizations (among other things), rather than the substance of the interaction itself, which generate our preferences. When reflecting on our preferences, we're likely to look right past the interpretive layer we've introduced and attribute them to the external stimulus that produced the response, rather than the response itself.
Robin's "X is not about Y" has the flavor of general, but not universal, rules. Would you extend your analogy to include this property?
Here's an interesting question for you: why is it important that you consider this non-universal? What value does it provide you for me to concede an exception, or what difference will it make in your thinking if I say "yes" or "no"? I am most curious.
(Meanwhile, I agree with your summation as an accurate, if incomplete, restatement of the bulk of my point.)
Because I'm trying to make sense of your position, but I don't think I can with such a strict conclusion. I don't see any fundamental reason why someone couldn't form preferences more or less directly mediated by reality; it just seems that, in practice, we don't.
If you're asking why I'm bringing up universality, it seemed clear that your claims about preferences were universal in scope until you brought up "X is not about Y". "Must logically be" and "tends to be in practice" are pretty different types of statement.
You didn't answer my questions.
I mean, you said some things that sound like answers, but they're not answers to the questions I asked. Here they are again:

"Why is it important that you consider this non-universal?"

and

"What value does it provide you for me to concede an exception, or what difference will it make in your thinking if I say 'yes' or 'no'?"
Your almost-answer was that you don't think you can "make sense" of my position with a strict conclusion. Why is that? What would it mean for there to be a strict conclusion? How, specifically, would that be a problem?
I didn't answer this because it's predicated on an assumption that has no origin in the conversation. I never claimed that it was "important" for me to consider this non-universal. As per being "liberal in what I accept" in the realm of communication, I tried to answer the nearest meaningful question I thought you might actually be asking. I thought the phrase "If you're asking why I'm bringing up universality" made my confusion sufficiently clear.
If you really do mean to ask me why I think it's important that I believe in some property of preference formation, then either I've said something fairly obvious to that end that I'm not remembering (or finding), or you're asserting your own inferences as the basis of a question, instead of its substance. I try to give people the benefit of the doubt that I've misunderstood them in such cases, rather than just assume they're speaking manipulatively.
No particular value in mind. I suppose the greatest value would be in you solidly refuting such exceptions in a way that made sense to me, as that would be a more surprising (therefore more informative) outcome. If you concede the exception, I don't gain any additional insight, so that's of fairly neutral value.
Not really sure yet, especially in the "no" case (since in that case you may have reasons I haven't yet thought of or understood). I suppose in the "yes" case I'd have greater confidence that I knew what you were talking about if I encountered similar concepts in your comments elsewhere. This discussion has made some difference to my thinking: I don't think I understood the thrust of your point when I originally complained that your distinction lacked relevance.
By strict conclusion, I mean "preferences are modeled strictly in terms of the map: it is logically impossible to hold a preference expressed in terms of anything other than what is expressed in the map". This seems very nearly true, but vulnerable to counterexamples when taken as a general principle or a logical result of some other general principle. I'll elaborate if you'd like, but I thought I'd first check whether you meant it that way. If you didn't, theoretical or speculative counterexamples aren't particularly relevant.
I can imagine that, in principle, some other sort of mind than a human's might be capable of being a counterexample, apart from, say, the trivial example of a thermostat, which shows a "preference" for reality being a certain way. An AI could presumably be built so that its preferences were based on properties of the world, rather than properties of its experience, or deduction from other properties based on experience. However, at some point that would need to be rooted in the goal system provided by its programmers... who presumably based it off of their own preferences.... ;-) (Nonetheless, if the AI didn't have anything we'd label "experience", then I'd have to agree that it has a preference about reality, rather than its experience of reality.)
I could also consider an argument that, say, hunger is about the state of one's stomach, and that it therefore is "about" the territory, except that I'm not sure hunger qualifies as a preference, rather than an appetite or a drive. A person on a hunger strike or with anorexia still experiences hunger, yet prefers not to eat.
If you think you have other counterexamples, I'd like to hear them. I will be very surprised if they don't involve some rather tortured reasoning and hypotheticals, though, or non-human minds. The only reason I even hedge my bets regarding humans is that (contrary to popular belief) I'm not under the mistaken impression that I have anything remotely approaching a complete theory of mind for human brains, versus a few crude maps that just happen to cover certain important chunks of "territory". ;-)
I don't actually consider this a good counterexample. It can be trivially shown that the thermostat's "preference" is not in terms of the "reality" of temperature: just sabotage the sensor. The thermostat "prefers" its sensor reading to correspond to its set point. Wouldn't you agree this is fairly analogous to plenty of human desires?
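A minimal sketch of the sensor-sabotage argument (in Python; all the names here are hypothetical, not from the thread): the thermostat's control rule is written entirely over its sensor reading, so any intervention that pins the reading at the set point "satisfies" it regardless of the actual temperature.

```python
# Minimal thermostat sketch: the control rule is defined entirely over the
# sensor reading, never over the actual room temperature.
# All names (Thermostat, the sensor functions, SET_POINT) are illustrative.

SET_POINT = 20.0  # target reading, in degrees C


class Thermostat:
    def __init__(self, sensor):
        self.sensor = sensor  # a callable returning a temperature reading

    def step(self):
        reading = self.sensor()
        # The only quantity this rule ever consults is `reading`.
        return "heater_on" if reading < SET_POINT else "heater_off"


actual_temperature = 5.0  # the room is actually cold


def honest_sensor():
    return actual_temperature


def sabotaged_sensor():
    return SET_POINT  # sensor pinned at the set point


print(Thermostat(honest_sensor).step())     # heater_on
print(Thermostat(sabotaged_sensor).step())  # heater_off: "satisfied" while the room stays cold
```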
Agreed. The closest it seems you could come is to prefer satiation of said appetites, which is a subjective state.
Actually, human minds are the primary source of my reservations. I don't think my reasoning is particularly tortured, but it certainly seems incomplete. Like you, I really have no idea what a mind is.
That said, I do seem to have preferences that concern other minds. These don't seem reducible to experiences of inter-personal behavior... they seem largely rooted in the empathic impulse, the "mirror neurons". Of course, on its face, this is still just built from subjective experience, right? It's the experience of sympathetic response when modeling another mind. And there's no question that this involves substituting my own experiences for theirs as part of the modeling process.
But when I reflect on a simple inter-personal preference like "I'd love for my friend to experience this", I can't see how it really reduces to pure experience, except as mediated by my concept of invariant reality. I don't have a full anticipation of their reaction, and it doesn't seem to be my experience of modeling their interaction that I'm after either.
Feel free to come up with a better explanation, but I find it difficult to deconstruct my desire to reproduce internally significant experiences in an external environment in a way that dismisses the role of "hard" reality. I can guess at the pre-reflective biological origin of this sort of preference, just like we can point at the biological origin of domesticated turkeys, but, just as turkeys can't function without humans, I don't know how it would function without some reasonable concept of a reality that implements things intrinsically inaccessible and indifferent to my own experience.
I chose to instantiate this particular example, but the general rule seems to be: The very fabric of what "another mind" means to me involves the concept of an objective but shared reality. The very fabric of what "another's experiences" means to me involves the notion of an external system giving rise to external subjective experiences that bear some relation to my own.
You could claim my reasoning is tortured in that it resembles Russell's paradox: one could talk about the set of all subjective preferences explicitly involving objective phenomena (i.e., not containing themselves). But it seems to me that I can in a sense relate to a very restricted class of objective preferences: those constructed from the vocabulary of my experience, reflected back into the world, and reinstantiated in the form of another mind.
Another simple example: Do you think a preference for honest communication is at all plausible? Doesn't it involve something beyond "I hope the environment doesn't trick me"?
Right. And don't forget the mind-projection machinery, that causes us to have, e.g. different inbuilt intuitions about things that are passively moved, move by themselves, or have faces that appear to express emotion. These are all inbuilt maps in humans.
Most of us learn by experience that sharing positive experiences with others results in positive attention. That's all that would be needed, but it's also likely that humans have an evolved appetite to communicate and share positive experiences with their allies.
It just means you prefer one class of experiences to another: a class you have come to associate with other experiences or actions that come before, or coincide with, them.
The reason, btw, that I asked why it made a difference whether this is an absolute concept or a "mostly" concept, is that AFAICT, the idea that "some preferences are really about the territory" leads directly to "therefore, all of MY preferences are really about the territory".
In contrast, thinking of all preferences as essentially delusional is a much better approach, especially if 99.999999999% of all human preferences are entirely about the map, even allowing that there are some enlightened Zen masters or Beisutsukai out there who've successfully managed, against all odds, to win the epistemic lottery and hold an actual "about the territory" preference.
Even if the probability of having such a preference were much higher, viewing it as still delusional with respect to "invariant reality" (as you call it) does not introduce any error. So the consequences of erring on the side of delusion are negligible, and there is a significant upside to being more able to notice when you're looping, subgoal stomping, or just plain deluded.
That's why it's of little interest to me how many .9's there are on the end of that %, or whether in fact it's 100% - the difference is inconsequential for any practical purpose involving human beings. (Of course, if you're doing FAI, you probably want to do some deeper thinking than this, since you want the AI to be just as deluded as humans are, in one sense, but not as deluded in another.)
Consider the difference between a thermostat connected to a heater and a human maintaining the same temperature by looking at a thermometer and switching the heater on and off. Obviously there is a lot more going on inside the human's brain, but I still don't understand how the thermostat has any particular kind of connection to reality that the human lacks. The same applies whether the thermostat was built by humans with preferences or somehow formed without human design.
edit: I'm not trying to antagonize you, but I genuinely can't tell whether you are trying to communicate something that I'm not understanding, or you've just read The Secret one too many times.
The thermostat lacks the ability to reflect on itself, as well as the mind-projection machinery that deludes human beings into thinking that their preferences are "about" the reality they influence and are influenced by.
You're definitely rounding to a cliche. The Secret folks think that our preferences create the universe, which is just as delusional as thinking our preferences are about the universe.
Doesn't it rather have a preference for its sensors showing a certain reading? (This doesn't lead to thermostat wireheading, because nothing the thermostat does can alter its own sensor's mechanism.)
Really, only a system that can model a scenario where its sensors say X but the situation is actually Y could possibly have preferences that go beyond the future readings of its sensors. If you assert that a thermostat can have preferences about the territory but a human can't, then you are twisting language to an unhelpful degree.
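To make the contrast concrete, here is a hedged sketch (hypothetical names again, not anything proposed in the thread): a controller that maintains a belief about the actual state, distinct from its raw reading, can represent "sensor says X but the situation is Y" in a way a bare thermostat cannot.

```python
# Illustrative contrast (hypothetical names): a bare controller conditions
# only on its reading; a model-based one can represent "the sensor says X
# but the actual state is Y" and act on its belief instead.

def bare_controller(reading, set_point=20.0):
    # Has no vocabulary for "actual temperature"; the reading is all there is.
    return reading < set_point  # True means: run the heater


def model_based_controller(reading, sensor_trusted, prior_estimate, set_point=20.0):
    # Keeps a belief about the actual temperature that can diverge from the
    # reading, e.g. when the sensor is believed to be faulty.
    believed_temperature = reading if sensor_trusted else prior_estimate
    return believed_temperature < set_point


# Sensor pinned at the set point while the room is actually at 5 C:
print(bare_controller(20.0))                     # False: the bare loop is "satisfied"
print(model_based_controller(20.0, False, 5.0))  # True: it keeps heating anyway
```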