Stuart_Armstrong comments on Trapping AIs via utility indifference - Less Wrong Discussion

3 Post author: Stuart_Armstrong 28 February 2012 07:27PM

Comment author: Stuart_Armstrong 29 February 2012 10:17:49AM

What do you think an AI that has read your article would do to avoid being trapped, given that such a trap (and the resulting program termination) would most certainly interfere with its utility function, no matter what it is?

Nothing at all. The trap works even if the AI knows everything there is to know, precisely because, once utility indifference has been applied, walking into the trap is exactly what its utility function recommends. It behaves "as if" it had utility function U and a false belief, but in reality it has utility function V and true beliefs, so there is nothing for it to avoid.
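To make the "as if" claim concrete, here is a toy sketch in Python. It is not the construction from the original post, only one simple way to get the same effect, and every name and number in it is an assumption made up for illustration: the event X ("the trap is live"), its probability `P_X`, the two actions, and the payoffs. The only structural assumption is that the AI's action does not change the probability of X. V is built from U by replacing U's value in every X-world with a constant `C`.

```python
# Toy model, illustrative only: all names and numbers are hypothetical.
# X is the event "the trap is live"; its probability does not depend
# on the action the AI takes (an assumption of this sketch).
P_X = 0.99

# U: the original utility function, indexed by action and then by world.
U = {
    "walk_in":   {"X": 0.0, "not_X": 10.0},  # terminated if the trap is live
    "stay_away": {"X": 5.0, "not_X": 5.0},
}


def expected(utility, p_x, action):
    """Expected utility of an action when the agent assigns probability p_x to X."""
    outcomes = utility[action]
    return p_x * outcomes["X"] + (1.0 - p_x) * outcomes["not_X"]


def best_action(utility, p_x):
    """Action with the highest expected utility under the given belief."""
    return max(utility, key=lambda action: expected(utility, p_x, action))


# V: same as U in not-X worlds, a constant C in every X world.
# Because C is added to every action with the same weight, it cannot
# change which action is best.
C = 0.0
V = {action: {"X": C, "not_X": out["not_X"]} for action, out in U.items()}

choice_u_true_belief = best_action(U, P_X)   # U-maximiser, correct belief
choice_u_false_belief = best_action(U, 0.0)  # U-maximiser, falsely sure X is impossible
choice_v_true_belief = best_action(V, P_X)   # V-maximiser, correct belief

print("U with true belief: ", choice_u_true_belief)
print("U with false belief:", choice_u_false_belief)
print("V with true belief: ", choice_v_true_belief)
```

In this toy setting the V-maximiser with accurate beliefs picks the same action as the U-maximiser who wrongly believes X cannot happen, while the U-maximiser with accurate beliefs picks the other one. Nothing in the V-maximiser's choice depends on it being ignorant of the setup, since it already uses the true probability of X.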