Wei_Dai comments on Proper value learning through indifference - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
It seems to me that there are natural ways to implement value loading with UDT agents that have the properties you're looking for. For example, if the agent values eating cookies in universes where its creator wants it to eat cookies, and values not eating cookies in universes where its creator doesn't want it to eat cookies (glossing over how to define "creator wants" for now), then I don't see any problem with the agent manipulating its own moral changes or avoiding asking whether eating cookies is bad. So I'm not seeing the motivation for coming up with another decision theory framework here...
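The conditional-utility idea above can be sketched in a few lines. This is only an illustration of the structure being described, not an implementation of value loading: the dictionary key `creator_wants_cookies` stands in for the glossed-over notion of what the creator wants in a given universe, and the payoff values are arbitrary.

```python
# Hypothetical sketch of a utility function that conditions on the creator's
# preference in each universe. The field name and payoffs are illustrative.

def utility(universe, eats_cookie):
    """Payoff for eating (or not eating) a cookie in a given universe."""
    if universe["creator_wants_cookies"]:
        return 1.0 if eats_cookie else 0.0
    else:
        return 1.0 if not eats_cookie else 0.0

# In either kind of universe, the best available action scores the same,
# so the agent gains nothing by steering toward (or manipulating) one kind
# of universe over the other -- the indifference the comment points at.
worlds = [{"creator_wants_cookies": True}, {"creator_wants_cookies": False}]
best = [max(utility(w, a) for a in (True, False)) for w in worlds]
print(best)  # -> [1.0, 1.0]
```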