conchis comments on Post Your Utility Function - Less Wrong

28 points · Post author: taw 04 June 2009 05:05AM

Comment author: pjeby 04 June 2009 06:00:49AM 1 point [-]

Utility functions are a really bad match for human preferences, and one of the major premises we accept here is wrong.

Human utility functions are relative, contextual, and include semi-independent positive-negative axes. You can't model all that crap with one number.

The study of affective asynchrony shows that humans have simultaneously-active positive and negative affect systems. At extreme levels of activation in either system, the other is shut down, but the rest of the time they can support or oppose each other. (And when they oppose each other, we experience conflict and indecision.)

Meanwhile, the activation of these systems is influenced by current state/context/priming, as well as the envisioned future. So unless your attempt at modeling a utility function includes terms for all these things, you're sunk.
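
To make that concrete, here's a toy sketch (the names and numbers are invented for illustration, not taken from any of the research above) of what tracking two semi-independent, context-dependent axes looks like, and what gets lost when you collapse them into one number:

```python
from dataclasses import dataclass

@dataclass
class AffectState:
    """Two semi-independent axes, tracked separately rather than as one net score."""
    positive: float  # activation of the positive/appetitive system
    negative: float  # activation of the negative/aversive system

def collapse_to_utility(state: AffectState) -> float:
    """What a single-number utility model forces you to do."""
    return state.positive - state.negative

# Indifference and intense approach-avoidance conflict collapse to the same "utility":
calm     = AffectState(positive=0.1, negative=0.1)
conflict = AffectState(positive=0.9, negative=0.9)
assert collapse_to_utility(calm) == collapse_to_utility(conflict)  # both 0.0

def appraise(option: str, context: dict) -> AffectState:
    """Toy appraisal: the same option activates the two systems differently
    depending on current state/context/priming (numbers are made up)."""
    base = {"cake": AffectState(0.8, 0.25), "salad": AffectState(0.3, 0.1)}[option]
    if context.get("dieting"):
        # Same option, different context: the negative system gets extra activation.
        base = AffectState(base.positive, min(1.0, base.negative + 0.5))
    return base

print(appraise("cake", {}))              # AffectState(positive=0.8, negative=0.25)
print(appraise("cake", {"dieting": 1}))  # AffectState(positive=0.8, negative=0.75)
```

The point of the assert is that indifference and strong approach-avoidance conflict look identical once squashed into a single utility value, even though they predict very different behavior.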

(Personally, this is where I think the idea of CEV has its biggest challenge: I know of no theoretical reason why humans must have convergent or consistent utility functions as individuals, let alone as a species.)

Comment author: timtyler 05 June 2009 05:10:46PM 0 points [-]

Re: Human utility functions are relative, contextual, and include semi-independent positive-negative axes. You can't model all that crap with one number.

That is not a coherent criticism of utilitarianism. Do you understand what it is that you are criticising?

Comment author: pjeby 05 June 2009 05:23:38PM 1 point [-]

Re: That is not a coherent criticism of utilitarianism. Do you understand what it is that you are criticising?

Yes, I do... and it's not utilitarianism. ;-)

What I'm criticizing is the brain's built-in System 2 model of motivation: its function is to predict the actions of others, but it usually fails when applied to the self, because it doesn't model all of the relevant System 1 features.

If you try to build a human-values-friendly AI, or to decide what would benefit a person (or people), and you base it on System 2's model, you will make mistakes, because System 2's map of System 1 is flawed in the same way that Newtonian physics is flawed for predicting near-light-speed mechanics: it leaves out important terms.