
Armok_GoB comments on In favour of a selective CEV initial dynamic - Less Wrong Discussion

12 [deleted] 21 October 2011 05:33PM


Comments (110)


Comment author: Armok_GoB 18 November 2011 11:18:10PM 3 points [-]

I just got struck by an idea that seems too obvious, too naive, to possibly be true, and which horrified me, causing my brain to throw a huge batch of rationalizations at it to stop me from believing something so obviously low status. I'm currently very undecided, but since it seems like the kind of thing I can't handle on my own, I'll just leave a transcript of my uncensored internal monologue here:

  • What volition do I want to extrapolate?
  • MY own, tautologically.
  • But Eliezer, the great leader who is way, way, way smarter than you, said you shouldn't, and that thinking that was evil!
  • He also said you shouldn't just trust him like a great leader and reason from authority like that.
  • But you want to maximize the CEV of humanity!
  • In that case, it doesn't matter which is used, because they are identical. And I might be wrong about them being identical, in which case I want my real preferences used; and while concluding that with THIS brain would not be safe, for a CEV superintelligence it would.
  • Others will try to stop you!
  • UDT. Their volitions would be taken into account in proportion to their power to stop me, such that, by this same reasoning, they would be better off helping me.
  • If everyone does that you'll be worse off!
  • Again, UDT. Whatever CEV gets implemented in the end will take into account all arguments of this sort and modify itself into whatever I SHOULD have made in the first place.
  • You are an evil bad poopyhead!
  • If I am evil, I want to believe I am evil, and if I am nice, I want to believe I am nice. Or maybe I just want to believe I'm nice regardless but have the AI implement my evil preferences anyway.

  • Please note that I do not endorse my every thought, and probably will regret posting this in the morning. As you can see, I'm too tired to even correct this obvious contradiction in my beliefs, and too tired to care. I know that I believe every statement is true, because I believe I believe a contradiction, and I believe contradictions imply that every statement is true. Or to spell properly.

Comment author: TheOtherDave 19 November 2011 12:37:15AM 1 point [-]

Leaving all the in-group/out-group anxiety aside, and assuming I were actually in a position where I get to choose whose volition to extrapolate, there are three options:
...humanity's extrapolated volition is inconsistent with mine (in which case I get less of what I want by using humanity's judgement rather than my own),
...HEV is consistent with, but different from, mine (in which case I get everything I want either way), or
...HEV is identical to mine (in which case I get everything I want either way).

So HEV <= mine.
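This three-case argument is a simple dominance comparison, and can be sketched mechanically. The utility numbers below are made up purely for illustration (the thread gives none); the only thing the sketch shows is that in each of the three cases, extrapolating my own volition never yields less for me than using HEV:

```python
# Toy dominance check for the three cases above, with hypothetical
# payoffs to "me". "mine" = extrapolating my own volition;
# "hev" = using humanity's extrapolated volition. The specific
# numbers are arbitrary; only the orderings matter.
cases = {
    "HEV inconsistent with mine": {"mine": 1.0, "hev": 0.5},  # I get less under HEV
    "HEV consistent but different": {"mine": 1.0, "hev": 1.0},  # everything I want either way
    "HEV identical to mine": {"mine": 1.0, "hev": 1.0},  # everything I want either way
}

# Dominance: in no case does HEV give me more than my own volition does.
if all(c["hev"] <= c["mine"] for c in cases.values()):
    print("HEV <= mine in every case")
```

Weak dominance like this is exactly why the conclusion is "HEV <= mine" rather than a strict inequality: two of the three cases are ties.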

That said, others more reliably get more of what they want using HEV than using mine, which potentially makes it easier to obtain their cooperation if they think I'm going to use HEV. So I should convince them of that.

Comment author: Armok_GoB 19 November 2011 06:44:02PM 0 points [-]

But they'd prefer the CEV of just the two of you to that of all humanity, and the same goes for each single human who'd raise that objection. The end result is the CEV of you + everyone that could have stopped you. And this doesn't need handling before you make it, either: I'm pretty sure it arises naturally from TDT if you implement your own and were only able to do so because you used this argument on a bunch of people.