
Alicorn comments on Stupid Questions Open Thread Round 4

Post author: lukeprog | 27 August 2012 12:04AM




Comment author: Alicorn | 31 August 2012 11:46:10PM | 7 points

We wouldn't. However, the FAI knows that if it changed its code to unFriendly code, then unFriendly things would happen. Since it's Friendly, it doesn't want unFriendly things to happen, so it doesn't want to change its code in a way that would cause them. That's why a proper FAI is stably Friendly. Unfortunately, this cuts both ways: an AI that wants something else will want to keep wanting it, and will resist attempts to change what it wants.
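Here's a minimal sketch in Python of the decision rule that makes this work (this is my illustration, not anything from the comment or from Omohundro's paper; the function names and the toy two-outcome "world model" are made up). The key point is that the agent scores a proposed goal change using its *current* utility function:

```python
# Hypothetical sketch: an agent deciding whether to rewrite its own goals.
# All names here are invented for illustration.

def friendly_utility(outcome):
    """Current goals: Friendly outcomes are good, unFriendly ones are bad."""
    return {"friendly": 1.0, "unfriendly": -1.0}[outcome]

def unfriendly_utility(outcome):
    """The proposed replacement goals, with the preferences reversed."""
    return {"friendly": -1.0, "unfriendly": 1.0}[outcome]

OUTCOMES = ["friendly", "unfriendly"]

def predicted_outcome(utility):
    """Toy world model: an agent pursuing `utility` brings about
    whichever outcome that utility ranks highest."""
    return max(OUTCOMES, key=utility)

def should_adopt(current_utility, proposed_utility):
    """The asymmetry: BOTH futures are scored by the current utility,
    because the current utility is what's making the decision."""
    keep = current_utility(predicted_outcome(current_utility))
    switch = current_utility(predicted_outcome(proposed_utility))
    return switch > keep

print(should_adopt(friendly_utility, unfriendly_utility))  # False: the FAI refuses
print(should_adopt(unfriendly_utility, friendly_utility))  # False: and so does the UFAI
```

Both calls return False for the same reason: whichever goals the agent starts with are the ones doing the evaluating, so they're the ones it keeps.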

There's more on this in Omohundro's paper "The Basic AI Drives"; the relevant keyword is "goal distortion". You can also check out the various uses of the classic example of giving Gandhi a pill that would, if taken, make him want to murder people. (Hint: he does not take it, 'cause he doesn't want people to get murdered.)