Stuart_Armstrong comments on Heroin model: AI "manipulates" "unmanipulatable" reward - All
You are viewing a comment permalink. View the original post to see all comments and the full post content.
We feel that this is true, but the hypotheses "heroin replaces the human's utility" and "humans have composite utility where heroin is concerned" lead to identical predictions. So you can't deduce the human's utility from observation alone; you also need priors over what is irrational and what isn't.
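To make the point concrete, here is a minimal toy sketch, not a model from the original post. The action names, the utility values, and the function names (`replaced_utility`, `composite_utility`, `predicted_action`) are all assumptions chosen for illustration. It encodes the two hypotheses as different utility functions and shows that a utility-maximizing agent behaves the same way under both.

```python
# Hypothetical toy model: every name and number here is an illustrative assumption.
ACTIONS = ("abstain", "seek_heroin")


def replaced_utility(addicted, action):
    """Hypothesis A: heroin overwrites the human's original utility with a new one."""
    original = {"abstain": 1.0, "seek_heroin": 0.0}
    overwritten = {"abstain": 0.0, "seek_heroin": 1.0}
    return (overwritten if addicted else original)[action]


def composite_utility(addicted, action):
    """Hypothesis B: one fixed utility whose value depends on the addiction state."""
    wants_heroin = 1.0 if addicted else 0.0
    return wants_heroin if action == "seek_heroin" else 1.0 - wants_heroin


def predicted_action(utility, addicted):
    """Action chosen by an agent that maximizes the given utility."""
    return max(ACTIONS, key=lambda action: utility(addicted, action))


# Predicted behavior before and after exposure, under each hypothesis.
predictions = {
    name: [predicted_action(utility, addicted) for addicted in (False, True)]
    for name, utility in (
        ("replaced", replaced_utility),
        ("composite", composite_utility),
    )
}
print(predictions)
```

In this sketch both hypotheses predict abstaining before exposure and seeking heroin afterwards, so no record of behavior can tell them apart. Choosing between them takes something extra, such as a prior over which changes in behavior count as irrational.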