AnnaSalamon comments on Off Topic Thread: May 2009 - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (71)
People use the word "preference" to mean many things, including:
I take it you mean "preference" in senses 3 and 4, but not in sense 1 or 2?
Anna, you are incorrect in guessing that my statement of preference is anything less than extremely useful to an outside observer trying to predict my actual behavior.
In other words, the part of me that is loyal to the intellectual framework is very good at getting the rest of me to serve the framework.
The rest of this comment consists of more than most readers probably want to know about my unusual way of valuing things.
I am indifferent to impermanent effects. Internal experiences, mine and yours, certainly qualify as impermanent effects. Note though that internal experiences correlate with things I assign high instrumental value to.
OK, so I care only about permanent effects. I still have not said which permanent effects I prefer. Well, I value the ability to predict and control reality. Whose ability to predict and control? I am indifferent about that: what I want to maximize is reality's ability to predict and control reality: if maximizing my own ability is the best way to achieve that, then that is what I do. If maximizing my friend's ability or my hostile annoying neighbor's ability is the best way, then I do that. When do I want it? Well, my discount rate is zero.
That is the most informative 130 words I can write for improving the ability of someone who does not know me to predict the global effects of my actual behavior.
Since I am in a tiny, tiny minority in wanting this, I might choose to ally myself with people with significantly different preferences. And it is probably impossible in the long term to be allies or colleagues or coworkers with a group of people who all roughly share the same preferences without in a real sense adopting those preferences as my own.
But the preferences I just outlined are the criteria I'd use to decide who to ally with. The single criterion that is most informative in predicting who I might ally with, BTW, is whether the discount rate of the prospective ally's intrinsic values is low.
I understand that your stated goal system has effects on your external behavior.
Still, I was trying to understand your claim that "If... there really is no way for me or my friends to have a permanent effect on reality, then I have no preference for what happens" (emphasis mine). Imagine that you were somehow shown a magically 100% sound, 100% persuasive proof that you could not have any permanent effect on reality, and that the entire multiverse would eventually end. In this circumstance, I doubt very much that the concept “Hollerith’s aims” would cease to be predictively useful. Whether you ate breakfast, or sought to end your life, or took up a new trade, or whatever, I suspect that your actions would have a purposive structure unlike the random bouncing about of inanimate systems. If you maintain that you would have no "preferences" under these circumstances (despite a model of "Hollerith's preferences" being useful to predict your behavior under these circumstances), this suggests you're using the term "preferences" in an interesting way.
The reason I’m trying to pursue this line of inquiry is that I am not clear what “preference” does and should mean, as any of us discuss ethics and meta-ethics. No doubt you feel some desire to realize goals that are valued by goal system zero, and no doubt you act partially on that desire as well. No doubt you also feel and act partially on other desires or preferences that a particular aspect of you does not endorse. The thing I’m confused about is... well, I don’t know how to say what I’m confused about; I’m confused. But something like:
My confusion is not specific to you, and maybe I shouldn’t have responded to you with it. But your example is particularly interesting in that the preferences you verbally endorse are unusually far from the ordinary, felt, behaviorally enacted preferences that we mostly start out with as humans. And given that distance, it is natural to ask, “Why, and in what sense, should we call these preferences ‘Hollerith’s preferences’ / ‘Hollerith’s ethics’ / ‘the right thing to do’?” Psychologically, is “right” just functioning as a floating XML tag of apparent justified-ness?
I agree with you, Anna, that in that case the concept of my aims does not cease to be predictively useful. (Consequently, I take back my "then I have no preferences".) It is just that I have not devoted any serious brain time to what my aims might be if I knew for sure I cannot have a permanent effect. (Nor does it bother me that I am bad at predicting what I might do if I knew for sure I cannot have a permanent effect.)
Most of the people who say they are loyal to goal system zero seem to have only a superficial commitment to goal system zero. In contrast, Garcia clearly had a very strong deep commitment to goal system zero. Another way of saying what I said above: like Garcia's, my commitment to goal system zero is strong and deep. But that is probably not helping you.
One of the ways I have approached CEV is to think of the superintelligence as implementing what would have happened if the superintelligence had not come into being -- with certain modifications. An example of a modification you and I will agree is desirable: if Joe suffers brain damage the day before the superintelligence comes into being, the superintelligence arranges things the way that Joe would have arranged them if he had not suffered the brain damage. The intelligence might learn that by e.g. reading what Joe posted on the internet before his injury. In summary, one line of investigation that seems worthwhile to me is to get away from this slippery concept of preference or volition and think instead of what the superintelligence predicts would happen if the superintelligence did not act. Note that e.g. the human sense of right and wrong is predicted by any competent agent to have huge effects on what will happen.
My adoption of goal system zero in 1992 helped me to resolve an emotional problem of mine. I severely doubt it would help your professional goals and concerns for me to describe that, though.
Would you go into why you only care about permanent effects? It seems highly bizarre to me (especially since, as Eliezer has pointed out, everything that happens is permanent insofar as it occupies volume in 4D spacetime).
A system of valuing things is a definition. I have defined a system and said, "Oh, by the way, this system has my loyalty."
It is possible that the system is ill-defined, that is, that my definition contradicts itself, does not apply to the reality we find ourselves in, or differs in some significant way from what I think it means. But your appeal to general relativity does not show the ill-definedness of my system because it is possible to pick the time dimension out of spacetime: the time dimension is treated quite specially in general relativity.
Eliezer's response to my definition appeals not to general relativity but rather to Julian Barbour's timeless physics and Eliezer's refinements and additions to it, but his response does not establish the ill-definedness of my system any more than your argument does. If anyone wants the URLs of Eliezer's comments (on Overcoming Bias) that respond to my definition, write me and say a few words about why it is important to you that I make this minor effort.
If Eliezer has a non-flimsy argument that my definition contradicts itself, does not apply to the reality we find ourselves in, or differs significantly from what I think it means, he has not shared it with me.
When I am being careful, I use Judea Pearl's language of causality in my definition rather than the concept of time. The reason I used the concept of time in yesterday's description is succinctness: "I am indifferent to impermanent effects" is shorter than "I care only about terminal effects where a terminal effect is defined as an effect that is not itself a cause" plus sufficient explanation of Judea Pearl's framework to avoid the most common ways in which those words would be misunderstood.
So if I had to, I could use Judea Pearl's language of causality to remove the reliance of my definition on the concept of time. But again, nothing you or Eliezer has written requires me to retreat from my use of the concept of time.
So there is my response to the parts of your comment that can be interpreted as implying that my system is ill-defined.
But what you were probably after when you asked, "Would you go into why you only care about permanent effects?" is why I am loyal to this system I have defined -- or more to the point why you should give it any of your loyalty. Well, I used to try to persuade people to become loyal to the system, but that had negative effects, including the effect of causing me to tend to hijack conversations on Overcoming Bias, so now I try only to explain and inform. I no longer try to promote or persuade.
My main advice to you, dclayh, is to chalk this up to the fact that the internet gives a voice to people whose values are very different from yours. For example, you will probably find the values implied by the Voluntary Human Extinction Movement or by anti-natalism just as unconventional as my values. Peace, dclayh.