
Tim_Tyler comments on Sympathetic Minds - Less Wrong

Post author: Eliezer_Yudkowsky 19 January 2009 09:31AM



Comment author: Tim_Tyler 20 January 2009 09:37:08PM 1 point [-]

The core of most of my disagreements with this article finds its most concentrated expression in:

"Happiness" is an idiom of policy reinforcement learning, not expected utility maximization.

Under Omohundro's model of intelligent systems, these two approaches converge. As they do so, the reward signal of reinforcement learning and the concept of expected utility also converge. In other words, it is rather inappropriate to emphasize the difference between these two approaches as though it were a fundamental one.

There are differences - but they are rather superficial. For example, there is often a happiness "set point", whereas that concept is typically more elusive for an expected utility maximizer. However, the analogies between the concepts are deep and fundamental: an agent maximising its happiness is doing something deeply and fundamentally similar to an agent maximising its expected utility. That becomes obvious if you substitute "happiness" for "expected utility".
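A minimal toy sketch of that structural similarity (all names, actions, and numbers here are my own, purely illustrative - none come from the comment): both agents choose actions with the same argmax rule, and only the scoring function differs. The reinforcement learner's learned value estimate converges toward the utility maximizer's expected utility.

```python
import random

ACTIONS = ["forage", "rest"]

# Known outcome model for the expected-utility maximiser:
# action -> list of (probability, utility) pairs.
OUTCOMES = {
    "forage": [(0.7, 10.0), (0.3, -5.0)],
    "rest": [(1.0, 1.0)],
}

def expected_utility(action):
    return sum(p * u for p, u in OUTCOMES[action])

def eu_choose():
    return max(ACTIONS, key=expected_utility)

# The reinforcement learner maximises a learned estimate of reward
# ("happiness") instead; the decision rule is structurally identical.
value_estimate = {a: 0.0 for a in ACTIONS}

def rl_update(action, reward, lr=0.01):
    # Exponential running average of observed reward.
    value_estimate[action] += lr * (reward - value_estimate[action])

def rl_choose():
    return max(ACTIONS, key=lambda a: value_estimate[a])

# With enough experience, value_estimate[a] approaches
# expected_utility(a), so the two decision rules converge.
random.seed(0)
for _ in range(5000):
    action = random.choice(ACTIONS)
    probs, utils = zip(*OUTCOMES[action])
    reward = random.choices(utils, weights=probs)[0]
    rl_update(action, reward)
```

After enough samples, `rl_choose()` and `eu_choose()` agree - the happiness-maximising decision rule and the expected-utility decision rule pick the same action.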

In the case of real organisms, that substitution is doubly appropriate - because of evolution. The "happiness" function is not an arbitrarily chosen one - it is created in such a way that it converges closely on a function that favours behaviour resulting in increased expected ancestral representation. So, happiness gets an "expectation" of future events built into it automatically by the evolutionary process.