Wei_Dai comments on A Master-Slave Model of Human Preferences - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (80)
I don't hear differently... I even suspect that preference is introspective, that is depends on a way the system works "internally", not just on how it interacts with environment. That is, two agents with different preferences may do exactly the same thing in all contexts. Even if not, it's a long way between how the agent (in its craziness and stupidity) actually changes the environment, and how it would prefer (on reflection, if it was smarter and saner) the environment to change.
I want to point out that in the interpretation of prior as weights on possible universes, specifically as how much one cares about different universes, we can't just replace "incorrect" beliefs with "the truth". In this interpretation, there can still be errors in one's beliefs caused by things like past computational mistakes, and I think fixing those errors would constitute helping, but the prior perhaps needs to be preserved as part of preference.