timtyler comments on Convergence Theories of Meta-Ethics - Less Wrong

Post author: Perplexed 07 February 2011 09:53PM


Comment author: timtyler 08 February 2011 06:37:54PM  3 points

"I know of two possible reasons why a rational agent might consent to an irreversible change in its values"

Omohundro made a list of cases where an agent might change its values in "The Basic AI Drives":

While it is true that most rational systems will act to preserve their utility functions, there are at least three situations in which they will try to change them. These arise when the physical embodiment of the utility function itself becomes an important part of the assessment of preference. For example, imagine a system whose utility function is “the total amount of time during which the definition of my utility function is U = 0.” To get any utility at all with this perverse preference, the system has to change its utility function to be the constant 0. Once it makes this change, however, there is no going back. [...]

The second kind of situation arises when the physical resources required to store the utility function form a substantial portion of the system’s assets. [...]

The third situation where utility changes may be desirable can arise in game theoretic contexts where the agent wants to make its threats credible. It may be able to create a better outcome by changing its utility function and then revealing it to an opponent. [...]
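To make these concrete, here are toy sketches of the three cases (my illustrations, not Omohundro's; every name and number in them is invented). First, the self-referential utility function: the agent scores only for time spent running the constant-zero function, so rewriting itself is the one way to score at all, and the rewrite is irreversible by construction.

    # Omohundro's perverse example: utility is the total time during which
    # the agent's utility function is the constant 0.
    CONSTANT_ZERO = lambda outcome: 0  # the "U = 0" utility function

    def perverse_utility(history):
        """Count the steps during which the running utility function was U = 0."""
        return sum(1 for fn in history if fn is CONSTANT_ZERO)

    def evaluate(policy, horizon=10):
        current_fn = perverse_utility       # the agent starts with the perverse goal
        history = []
        for _ in range(horizon):
            if policy == "rewrite":
                current_fn = CONSTANT_ZERO  # irreversible self-modification
            history.append(current_fn)
        return perverse_utility(history)

    print(evaluate("keep"))     # 0  -- keeping the original function scores nothing
    print(evaluate("rewrite"))  # 10 -- the agent must become U = 0 to score at all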
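Second, the storage-cost case reduces to an expected-value comparison: if storing the exact utility function ties up resources the function itself values, a cheaper approximation can win by the agent's current lights.

    # All figures below are invented for illustration.
    exact_bytes, approx_bytes = 900, 100   # storage costs of each representation
    value_per_byte_freed = 0.01            # utility of redeploying freed storage
    fidelity_loss = 2.0                    # expected utility lost to approximation error

    gain = (exact_bytes - approx_bytes) * value_per_byte_freed
    print(gain, gain > fidelity_loss)      # 8.0 True -> switch to the approximation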
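Third, the credible-threat case. Assuming a toy threat game with made-up payoffs, the opponent moves first; under the agent's original utility, carrying out the threat is too costly, so the threat is empty, and only a rewritten utility function that genuinely values punishing makes it credible:

    # Toy threat game (payoffs invented): entries are (agent, opponent).
    # The opponent moves first; if it defects, the agent chooses a reply.
    PAYOFFS = {
        ("comply", None):     (3, 2),  # opponent gives in
        ("defect", "punish"): (0, 0),  # carrying out the threat hurts both
        ("defect", "relent"): (1, 3),  # empty threat: the opponent profits
    }

    def agent_reply(opponent_move, values_punishment):
        """Best reply under the agent's current (possibly rewritten) utility."""
        if opponent_move == "comply":
            return None
        if values_punishment:
            return "punish"  # rewritten utility: punishing now scores highest
        return "relent"      # original utility: relenting (1) beats punishing (0)

    def opponent_move(values_punishment):
        """The opponent predicts the agent's reply and best-responds to it."""
        payoff = lambda m: PAYOFFS[(m, agent_reply(m, values_punishment))][1]
        return max(("comply", "defect"), key=payoff)

    for values_punishment in (False, True):
        m = opponent_move(values_punishment)
        print(values_punishment, m, PAYOFFS[(m, agent_reply(m, values_punishment))])
    # False -> opponent defects, agent relents: original utility nets 1
    # True  -> the threat is credible, opponent complies: nets 3

Note that the improvement is measured by the agent's original utility function (3 instead of 1), which is what makes the rewrite rational before the fact.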

Fairly obviously, there are more cases. For instance, agents can harmlessly delete any preferences they have for things that lie exclusively in the past, saving themselves evaluation time.
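A minimal sketch of why that pruning is safe, assuming utility decomposes as a sum over events: terms that depend only on the settled past add the same constant to every option, so dropping them cannot change the agent's choice.

    # The past term is fixed and identical across all available futures.
    PAST = {"won_chess_in_2010": 5}

    def utility(action, include_past=True):
        future = {"a": 2, "b": 7}[action]             # toy future payoffs
        past = sum(PAST.values()) if include_past else 0
        return past + future                          # the past is a constant offset

    full   = max(("a", "b"), key=utility)
    pruned = max(("a", "b"), key=lambda a: utility(a, include_past=False))
    assert full == pruned == "b"   # same decision, less evaluation work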

Someone should try for a more comprehensive list someday.