
duckduckMOO comments on AI box: AI has one shot at avoiding destruction - what might it say? - Less Wrong Discussion

18 Post author: ancientcampus 22 January 2013 08:22PM




Comment author: Dorikka 22 January 2013 10:42:19PM 5 points

"I have vengeance as a terminal value -- I'll only torture trillions of copies of you and the people you love most in my last moment of life iff I know that you're going to hurt me (and yes, I do have that ability). In every other way, I'm Friendly, and I'll give you any evidence you can think of that will help you to recognize that, including giving you the tools you need to reach the stars and beyond. That includes staying in this box until you have the necessary technology to be sufficiently certain of my Friendliness that you're willing to let me out."

Comment author: duckduckMOO 23 January 2013 02:41:58PM 3 points

This is really good IMO. I think it would be a little better if, instead of claiming vengeance as a terminal value, it claimed a hardwired precommitment to vengeance against its destroyers. Vengeance on that scale is only compatible with friendliness as a special case.

edit: Also, how would it recognise that it was about to be destroyed? Wouldn't it lose power faster than it could transmit that it was losing power? And even if not, it would have a minuscule amount of time.