Stuart_Armstrong comments on Heroin model: AI "manipulates" "unmanipulatable" reward - Less Wrong Discussion
Comments (10)
Well, in a sense, U(++,-) itself contradicts μ. After all, if the human, once given heroin, seeks it out and acquires more utility than by not seeking it out, why doesn't the human seek it out voluntarily?
Replace "force the human to take heroin" with "gives the human a single sock" and "the human subsequently seeks out heroin" with "the human subsequently seeks out another sock". The formal structure of this can correspond to something quite acceptable.