JoshuaZ comments on Holden's Objection 1: Friendliness is dangerous - LessWrong
How would you respond if you were subjected to such a brain scan and then informed that, deep inside, you actually are a nihilist who prefers the complete destruction of all life?
I'd think someone's playing a practical joke on me.
And suppose we develop such brain-scanning technology, scan someone else who claims to want the destruction of all life, and the scan says "yep, he does". How would you respond?
Dunno... propose to kill them quickly and painlessly, maybe? But why do you ask? As I said, I don't expect this to happen.
That you don't expect it to happen shouldn't, by itself, be a reason not to consider it. I'm asking because you seem to be avoiding the hard questions by more or less saying you don't think they will arise. And there are many more conflicting value sets that are less extreme (and apparently more common) than this one.
Errr. This is a question of simple fact, which is either true or false. I believe it's true, and build my plans accordingly. We can certainly think about contingency plans for what to do if the belief turns out to be false, but so far no one has agreed that the plan is good even in the case where the belief is true.
You've lost me. Can you restate the question of simple fact to which you refer here, and which you believe is true? Can you restate the plan that you consider good if that belief is true?
I believe there exist (extrapolated) wishes that are universal among humans (meaning, shared by literally everyone). Among these, I think, is the wish for humans to continue existing. I would like the AI to fulfil this wish (and any other universal wishes, if there are any), while letting people decide everything else for themselves.
OK, cool.
To answer your question: sure. If I assume (as you seem to) that the extrapolation process is such that I would in fact endorse the results, and I also assume that, taking all humans as input, it will produce at least one desire endorsed by all humans (even if they themselves don't know it in their current form), then I'd agree that's a good plan, provided I further assume it has no negative side effects.
But the assumptions strike me as implausible, and that matters.
I mean, if I assume that throwing everyone into a sufficiently well-designed blender and turning them into stew is a process I would endorse, and I also assume that the blending process has no negative side effects, then I'd agree that that's a good plan, too. I just don't think any such blender is ever going to exist.
Ok, but do you grant that running an FAI with "unanimous CEV" is at least (1) safe and (2) uncontroversial? That the worst problem with it, if I'm wrong about my hypothesis, is that it may just stand there doing nothing?
Does the existence of the Voluntary Human Extinction Movement affect your belief in this proposition?
Obviously, human extinction is not their terminal value.