Stuart_Armstrong comments on Trapping AIs via utility indifference - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Nothing at all. The trap works even if the AI knows everything there is to know, precisely because after utility indifference, its behaviour is exactly compatible with its utility function. It behaves "as if" it had utility function U and a false belief, but in reality it has utility function V and true beliefs.
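A toy sketch of the "as if" claim, in Python. Everything in it is an illustrative assumption rather than the construction from the original post: the event X, its probability P_X (taken to be independent of the AI's action), the two actions, the payoff numbers, and the simplification that V equals U when X does not occur and a constant C when it does.

```python
# Toy model only: all names and numbers here are hypothetical.
P_X = 0.9  # true probability of event X, assumed independent of the action

# U_TABLE[action] = (utility if X occurs, utility if X does not occur)
U_TABLE = {
    "disarm_trap": (5.0, 5.0),
    "walk_into_trap": (-100.0, 10.0),
}
C = 0.0  # constant payoff that V assigns to every world where X occurs


def u(action, x_occurs):
    """Original utility function U."""
    return U_TABLE[action][0 if x_occurs else 1]


def v(action, x_occurs):
    """Indifference-adjusted utility V: equal to U unless X occurs."""
    return C if x_occurs else U_TABLE[action][1]


def expected(util, p_x, action):
    """Expected utility of an action under belief p_x about X."""
    return p_x * util(action, True) + (1 - p_x) * util(action, False)


def best(util, p_x):
    """Action that maximises expected utility under belief p_x."""
    return max(U_TABLE, key=lambda a: expected(util, p_x, a))


u_true = best(u, P_X)   # U-maximiser with true beliefs
u_false = best(u, 0.0)  # U-maximiser falsely certain that X will not happen
v_true = best(v, P_X)   # V-maximiser with true beliefs

print(u_true, u_false, v_true)
```

In this toy setup the V-maximiser with accurate beliefs picks the same action as the U-maximiser who wrongly thinks X cannot happen, while the U-maximiser with accurate beliefs picks differently. The V-agent is not deceived about anything; the X-worlds simply contribute the same constant to every action's expected value, so they drop out of the comparison.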