gRR comments on Holden's Objection 1: Friendliness is dangerous - LessWrong

Post author: PhilGoetz, 18 May 2012 12:48AM




Comment author: gRR 19 May 2012 05:43:53PM 0 points

I'd think someone's playing a practical joke on me.

Comment author: JoshuaZ 19 May 2012 05:49:24PM 0 points

And suppose we develop such brain-scanning technology, scan someone else who claims to want the destruction of all life, and it says "yep, he does". How would you respond?

Comment author: gRR 19 May 2012 05:56:37PM 0 points

Dunno... propose to kill them quickly and painlessly, maybe? But why do you ask? As I said, I don't expect this to happen.

Comment author: JoshuaZ 19 May 2012 05:59:12PM 0 points

That you don't expect it to happen shouldn't by itself be a reason not to consider it. I'm asking because it seems you are avoiding the hard questions by more or less saying you don't think they will happen. And there are many more conflicting value sets which are less extreme (and apparently more common) than this one.

Comment author: gRR 19 May 2012 06:11:09PM 0 points

Errr. This is a question of simple fact, which is either true or false. I believe it is true, and I build my plans accordingly. We can certainly think about contingency plans for what to do if the belief turns out to be false, but so far no one has agreed that the plan is good even in the case where the belief is true.

Comment author: TheOtherDave 19 May 2012 09:39:25PM 0 points

You've lost me. Can you restate the question of simple fact to which you refer here, which you believe is true? Can you restate the plan that you consider good if that question is true?

Comment author: gRR 19 May 2012 10:33:10PM 0 points

I believe there exist (extrapolated) wishes universal for humans (meaning, true for literally everyone). Among these wishes, I think there is the wish for humans to continue existing. I would like for AI to fulfil this wish (and other universal wishes if there are any), while letting people decide everything else for themselves.

Comment author: TheOtherDave 20 May 2012 04:15:27AM 1 point

OK, cool.

To answer your question: sure. If I assume (as you seem to) that the extrapolation process is such that I would in fact endorse its results, that when given all humans as input it will produce at least one desire endorsed by all humans (even if they themselves don't know it in their current form), and further that it has no negative side-effects, then I'd agree that's a good plan.

But the assumptions strike me as implausible, and that matters.

I mean, if I assume that everyone being thrown into a sufficiently properly designed blender and turned into stew is a process I would endorse, and I also assume that the blending process has no negative side-effects, then I'd agree that that's a good plan, too. I just don't think any such blender is ever going to exist.

Comment author: gRR 20 May 2012 10:44:33AM 1 point

Ok, but do you grant that running a FAI with "unanimous CEV" is at least (1) safe, and (2) uncontroversial? That the worst problem with it, if I'm wrong about my hypothesis, is that it may just stand there doing nothing?

Comment author: TheOtherDave 20 May 2012 04:07:27PM 0 points

I don't know how to answer that question. Again, it seems that you're trying to get an answer given a whole bunch of assumptions, but that you resist the effort to make those assumptions clear as part of the answer.

  • It is not clear to me that there exists such a thing as a "unanimous CEV" at all, even in the hypothetical sense of something we might be able to articulate some day with the right tools.

  • If I nevertheless assume that a unanimous CEV exists in that hypothetical sense, it is not clear to me that only one exists; presumably modifications to the CEV-extraction algorithm would result in different CEVs from the same input minds, and I don't see any principled grounds for choosing among that cohort of algorithms that don't in effect involve selecting a desired output first. (In which case CEV extraction is a complete red herring, since the output was a "bottom line" written in advance of CEV's extraction, and we should be asking how that output was actually arrived at and whether we endorse that process.)

  • If I nevertheless assume that a single CEV-extraction algorithm is superior to all the others, and further assume that we select that algorithm via some process I cannot currently imagine and run it, and that we then run a superhuman environment-optimizer with its output as a target, it is not clear to me that I would endorse that state change as an individual. So, no, I don't agree that running it is uncontroversial. (Although everyone might agree afterwards that it was a good idea.)

  • If the state change nevertheless gets implemented, I agree (given all of those assumptions) that the resulting state-change improves the world by the standards of all humanity. "Safe" is an OK word for that, I guess, though it's not the usual meaning of "safe."

  • I don't agree that the worst that happens, if those assumptions turn out to be wrong, is that it stands there and does nothing. The worst that happens is that the superhuman environment-optimizer runs with a target that makes the world worse by the standards of all humanity.

(Yes, I understand that the CEV-extraction algorithm is supposed to prevent that, and I've agreed that if I assume that's true, then this doesn't happen. But now you're asking me to consider what happens if the "hypothesis" is false, so I am no longer just assuming that's true. You're putting a lot of faith in a mysterious extraction algorithm, and it is not clear to me that a non-mysterious algorithm that satisfies that faith is likely, or that the process of coming up with one won't come up with a different algorithm that antisatisfies that faith instead.)

Comment author: Desrtopa 19 May 2012 10:36:06PM 0 points

I believe there exist (extrapolated) wishes universal for humans (meaning, true for literally everyone). Among these wishes, I think there is the wish for humans to continue existing.

Does the existence of the Voluntary Human Extinction Movement affect your belief in this proposition?

Comment author: gRR 19 May 2012 10:40:15PM 0 points

VHEMT supports human extinction primarily because, in the group's view, it would prevent environmental degradation. The group states that a decrease in the human population would prevent a significant amount of man-made human suffering.

Obviously, human extinction is not their terminal value.

Comment author: Desrtopa 19 May 2012 10:44:03PM 0 points

Or at least, not officially. I have known at least one person who professed to desire that the human race go extinct because he thought the universe as a whole would simply be better if humans did not exist. It's possible that he was stating such an extreme position for shock value (he did display some fairly pronounced antisocial tendencies), and that he had other values that conflicted with this position on some level. But considering the diversity of viewpoints and values I've observed people to hold, I would bet quite heavily against nobody in the world actually desiring the end of human existence.