TheOtherDave comments on The Human's Hidden Utility Function (Maybe) - Less Wrong

44 Post author: lukeprog 23 January 2012 07:39PM

Comments (87)

Comment author: TheOtherDave 23 January 2012 11:40:02PM 0 points

At a glance, it seems that upon reflection I might embrace an extrapolation of the model-based system's preferences as representing "my values," and I would reject the outputs of the model-free and Pavlovian systems as the outputs of dumb systems that evolved for their computational simplicity, and can be seen as ways of trying to approximate the full power of a model-based system responsible for goal-directed behavior.

At a glance, I might be more comfortable embracing an extrapolation of the combination of the model-based system's preferences and the Pavlovian system's preferences.

Admittedly, a first step in extrapolating the Pavlovian system's preferences might be to represent its various targets as goals in a model, thereby leaving the extrapolator with a single system to extrapolate, but given that 99% of the work takes place after this point, I'm not sure how much I care. Much more important is not to lose track of the Pavlovian system's targets accidentally.