Manfred comments on Value Stability and Aggregation - Less Wrong

Post author: jimrandomh 06 February 2011 06:30PM




Comment author: Manfred 06 February 2011 06:59:47PM 7 points

Linear aggregation works just fine for HapMax: it maximizes the complex neurological definition it wants to maximize, just as it wants. All the "catastrophes" you point out are due not to the agent having an internal error, but to a conflict between what it wants and what humans want. Additionally, the aggregation procedure is completely defined if you have a complete utility function.

So the problem is really "what does a human-like utility function look like?" Because, as you argue, anything that tries to take a shortcut to human-like behavior can lead to catastrophes.

Comment author: jimrandomh 06 February 2011 10:04:49PM 4 points

Yes, linear aggregation in HapMax produces a result that agrees with HapMax. But it does not agree with the intuitions that HapMax's creators had, and I think that utility functions which use linear aggregation will in general tend to produce results that are counterintuitive in dramatic (and therefore likely very bad) ways.

Comment author: Manfred 07 February 2011 06:04:09AM 2 points

Rather than "counterintuitive," I'd prefer "inhuman" or "unfriendly." If the creators had linear utility functions on the same stuff, HapMax would fit in just fine. If humans have a near-linear utility function on something, then an AI that has a linear utility function there will cause no catastrophes. I can't think of any problems unique to linear weighting - the problem is really when the weighting isn't like ours.
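This point can be illustrated with a toy sketch (all numbers, weights, and functions here are hypothetical, invented for illustration, and not part of the original discussion). A linear aggregator sums per-person happiness directly, so it is indifferent to how happiness is distributed; an evaluator with a concave (diminishing-returns) weighting on the same scores can rank the same two worlds the opposite way. The disagreement comes from the difference in weighting, not from linearity as such:

```python
import math

def linear_aggregate(happiness, weights=None):
    """Linearly aggregate per-person happiness scores (toy model)."""
    if weights is None:
        weights = [1.0] * len(happiness)
    return sum(w * h for w, h in zip(weights, happiness))

def concave_aggregate(happiness):
    """A human-like evaluator with diminishing returns per person (toy model)."""
    return sum(math.sqrt(h) for h in happiness)

# Two hypothetical world-states: per-person happiness levels.
world_a = [5, 5, 5, 5]         # moderate happiness, evenly spread
world_b = [19, 0.5, 0.5, 0.5]  # one person very happy, the rest miserable

# Linear aggregation slightly prefers the lopsided world:
print(linear_aggregate(world_a))   # 20.0
print(linear_aggregate(world_b))   # 20.5

# The concave evaluator prefers the even world:
print(concave_aggregate(world_a))  # ~8.94
print(concave_aggregate(world_b))  # ~6.48
```

If the two evaluators used the same (near-linear) weighting, their rankings would nearly always agree, matching the claim that the catastrophes come from mismatched weightings rather than from linear aggregation itself.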