Wei_Dai comments on Welcome to Heaven - Less Wrong

Post author: denisbider 25 January 2010 11:22PM

Comment author: Wei_Dai 26 January 2010 01:02:21AM 11 points

denis, most utilitarians here are preference utilitarians, who believe in satisfying people's preferences, rather than maximizing happiness or pleasure.

To those who say they don't want to be wireheaded, how do you really know that, when you haven't tried wireheading? An FAI might reason the same way, and try to extrapolate what your preferences would be if you knew what it felt like to be wireheaded, in which case it might conclude that your true preferences are in favor of being wireheaded.

Comment author: ciphergoth 26 January 2010 01:06:06AM 6 points

To those who say they don't want to be wireheaded, how do you really know that, when you haven't tried wireheading?

But it's not because I think there's some downside to the experience that I don't want it. The experience is as good as can possibly be. I want to continue to be someone who thinks things and does stuff, even at a cost in happiness.

Comment author: Wei_Dai 26 January 2010 03:37:55AM 10 points

The experience is as good as can possibly be.

You don't know how good "as good as can possibly be" is yet.

I want to continue to be someone who thinks things and does stuff, even at a cost in happiness.

But surely the cost in happiness that you're willing to accept isn't infinite. For example, presumably you're not willing to be tortured for a year in exchange for a year of thinking and doing stuff. Someone who has never experienced much pain might think that torture is no big deal, and accept this exchange, but he would be mistaken, right?

How do you know you're not similarly mistaken about wireheading?

Comment author: Kaj_Sotala 26 January 2010 10:11:34AM *  7 points

How do you know you're not similarly mistaken about wireheading?

I'm a bit skeptical of how well you can use the term "mistaken" when talking about technology that would allow us to modify our minds to an arbitrary degree. One could easily fathom a mind that (say) wants to be wireheaded for as long as the wireheading goes on, but ceases to want it the moment the wireheading stops. (I.e. in both states the mind prefers its current state of wireheadedness/non-wireheadedness and wouldn't want to change it.) Can we really say that one of them is "mistaken", or wouldn't it be more accurate to say that they simply have different preferences?

EDIT: Expanded this to a top-level post.

Comment author: CannibalSmith 26 January 2010 10:01:40AM 1 point

The maximum amount of pleasure is finite too.

Comment author: ciphergoth 27 January 2010 08:40:49AM 0 points

Interesting problem! Perhaps there's a maximum utility I can get from happiness, which increasing happiness approaches asymptotically?

Comment author: Wei_Dai 30 January 2010 03:00:19AM *  0 points

Perhaps there's a maximum utility I can get from happiness, which increasing happiness approaches asymptotically?

Yes, I think that's quite possible, but I don't know whether it's actually the case or not. A big question I have is whether any of our values scales up to the size of the universe, in other words, doesn't asymptotically approach an upper bound well before we've used up the resources in the universe. See also my latest post http://lesswrong.com/lw/1oj/complexity_of_value_complexity_of_outcome/ where I talk about some related ideas.
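For concreteness, here is a minimal sketch of what such a bounded, asymptotically-approached utility of happiness might look like. The exponential-saturation form, the constants U_MAX and K, and the function utility_of_happiness are purely illustrative assumptions, not anything proposed in this thread:

```python
import math

# A minimal sketch of a bounded utility-of-happiness curve, assuming an
# exponential saturation form. U_MAX and K are illustrative constants,
# not values anyone in the discussion proposed.
U_MAX = 1.0   # hypothetical upper bound on utility from happiness
K = 0.5       # hypothetical rate at which that bound is approached

def utility_of_happiness(h: float) -> float:
    """Utility rises with happiness h but only approaches U_MAX asymptotically."""
    return U_MAX * (1 - math.exp(-K * h))

# Increasing happiness yields steeply diminishing returns in utility:
for h in (1, 2, 4, 8, 16):
    print(h, round(utility_of_happiness(h), 3))
```

Under a curve like this, ever-larger amounts of happiness buy less and less additional utility, which is one way of reading ciphergoth's suggestion that happiness alone may not be worth an unbounded sacrifice of everything else he values.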

Comment author: byrnema 26 January 2010 01:26:39AM *  5 points

I want to continue to be someone who thinks things and does stuff, even at a cost in happiness.

The FAI can make you feel as though you "think things and do stuff", just by changing your preferences. I don't think any reason beginning with "I want" is going to work, because your preferences aren't fixed or immutable in this hypothetical.

Anyway, can you explain why you are attached to your preferences? That "it's better to value this than value that" is incoherent, and the FAI will see that. The FAI will have no objective, logical reason to distinguish between values you currently have and are attached to and values that you could have and be attached to, and might as well modify you rather than modify the universe. (Because the universe has exactly the same value either way.)

Comment author: LucasSloan 26 January 2010 01:30:48AM 3 points

If any possible goal is considered to have the same value (by what standard?), then the "FAI" is not friendly. If preferences don't matter, then why does them not mattering matter? Why change one's utility function at all, if anything is as good as anything else?

Comment author: byrnema 26 January 2010 02:21:56AM *  2 points

Well I understand I owe money to the Singularity Institute now for speculating on what the output of the CEV would be. (Dire Warnings #3)

Comment author: timtyler 26 January 2010 10:22:37AM *  2 points

That page said:

"None may argue on the SL4 mailing list about the output of CEV".

A different place, with different rules.

Comment author: Kutta 26 January 2010 11:08:23AM *  2 points

The FAI can make you feel as though you "think things and do stuff", just by changing your preferences.

I can't see how a true FAI can change my preferences if I prefer them not being changed.

Anyway, can you explain why you are attached to your preferences? That "it's better to value this than value that" is incoherent, and the FAI will see that. The FAI will have no objective, logical reason to distinguish between values you currently have and are attached to and values that you could have and be attached to, and might as well modify you rather than modify the universe. (Because the universe has exactly the same value either way.)

It does not work this way. We want to do what is right, not what would conform to the utility function we'd have if we were petunias or paperclip AIs or randomly chosen expected utility maximizers; the whole point of Friendliness is to find out and implement what we care about and not anything else.

I'm not only attached to my preferences; I am, in great part, my preferences. I even have a preference such that I don't want my preferences to be forcibly changed. Thinking about changing meta-preferences quickly leads to a strange loop, but if I look at a specific outcome (like being turned into orgasmium) I can still make a moral judgement and reject that outcome.

The FAI will have no objective, logical reason to distinguish between values you currently have and are attached to and values that you could have and be attached to, and might as well modify you rather than modify the universe. (Because the universe has exactly the same value either way.)

The FAI has a perfectly objective, logical reason to do what's right and nothing else; its existence and utility function are causally traceable to the humans who designed it. An AI that verges on nihilism and contemplates switching humanity's utility function to something else, partly because the universe has "exactly the same value" either way, is definitely NOT a Friendly AI.

Comment author: byrnema 26 January 2010 05:00:12PM *  1 point

OK, I agree with this comment and this one that if you program an FAI to satisfy our actual preferences with no compromise, then that is what it is going to do. If people have a preference for their values being satisfied in reality, rather than them just being satisfied virtually, then no wire-heading for them.

However, if you do allow compromise so that the FAI should modify preferences that contradict each other, then we might be on our way to wire-heading. Eliezer observes there is a significant 'objective component to human moral intuition'. We also value truth and meaning. (This comment strikes me as relevant.) If the FAI finds that these three are incompatible, which preference should it modify?

(Background for this comment in case you're not familiar with my obsession -- how could you have missed it? -- is that objective meaning, from any kind of subjective/objective angle, is incoherent.)

Comment author: Kutta 26 January 2010 06:11:05PM *  5 points

If you do allow compromise so that the FAI should modify preferences that contradict each other, then we might be on our way to wire-heading.

First, I just note that this is full-blown speculation about Friendliness content, which should only be done while wearing a gas mask or a clown suit, or after donating to SIAI.

Quoting CEV:

"In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted."

Also:

"Do we want our coherent extrapolated volition to satisfice, or maximize? My guess is that we want our coherent extrapolated volition to satisfice - to apply emergency first aid to human civilization, but not do humanity's work on our behalf, or decide our futures for us. If so, rather than trying to guess the optimal decision of a specific individual, the CEV would pick a solution that satisficed the spread of possibilities for the extrapolated statistical aggregate of humankind."

This should address your question. CEV would not typically modify humans to resolve contradictions. But I repeat, this is all speculation.

It's not clear to me from your recent posts whether you've read the metaethics sequence and/or CEV; if you haven't, I recommend them whole-heartedly, as they're the most detailed discussion of morality available. Regarding your obsession, I'm aware of it and I think I'm able to understand your history and vantage point that enable such distress to arise, although my current self finds the topic utterly trivial and essentially a non-problem.

Comment author: tut 26 January 2010 11:33:00AM 0 points

...a perfectly objective, ... reason ...

How do you define this term?

Comment author: Kutta 26 January 2010 11:50:40AM *  0 points

"Reason" here: a normal, unexceptional instance of cause and effect. It should be understood in a prosaic way, e.g. reason in a causal sense.

As for "objective", I borrowed it from the parent post to illustrate my point. To expand on "objective" a bit: everything that exists in physical reality is, and our morality is as physical and extant as a brick (via our physical brains), so what sense does it make to distinguish between "subjective" and "objective," or to refer to any phenomena as "objective" when in reality it is not a salient distinguishing feature.

If anything is "objective", then I see no reason why human morality is not, that's why I included the word in my post. But probably the best would be to simply refrain from generating further confusion by the objective/subjective distinction.

Comment author: tut 26 January 2010 12:25:14PM *  1 point

Reason is not the same as cause. A cause is whatever brings something about in the physical world. A reason is a special kind of cause for intentional actions. Specifically, a reason for an action is a thought which convinces the actor that the action is good. So an objective reason would need an objective basis for something being called good. I don't know of such a basis, and a bit more than a week ago half of the LW readers were beating up on Byrnema because she kept talking about objective reasons.

Comment author: Kutta 26 January 2010 05:46:11PM 0 points

OK then, it was a misuse of the word on my part. Anyway, I'd never intend a teleological meaning for reasons discussed here before.

Comment author: ciphergoth 26 January 2010 08:43:00AM 0 points

The FAI can make you feel as though you "think things and do stuff", just by changing your preferences.

Please read Not for the Sake of Happiness (Alone), which addresses this point.

Comment author: Stuart_Armstrong 26 January 2010 12:55:50PM 4 points

To those who say they don't want to be wireheaded, how do you really know that, when you haven't tried wireheading?

Same reason I don't try heroin. Wireheading (as generally conceived) imposes a predictable change on the user's utility function: huge and irreversible. Gathering this information is not without cost.

Comment author: Wei_Dai 26 January 2010 01:20:46PM 4 points

I'm not suggesting that you try wireheading now; I'm saying that an FAI can obtain this information without a high cost, and when it does, it may turn out that you actually do prefer to be wireheaded.

Comment author: Stuart_Armstrong 26 January 2010 02:05:44PM 3 points

That's possible (especially the non-addictive type of wireheading).

Though this does touch upon issues of autonomy: I'd like the AI to run it by me, even though it will have correctly predicted that I'd accept.