
ChristianKl comments on Open thread, Jul. 04 - Jul. 10, 2016 - Less Wrong Discussion

Post author: MrMind 04 July 2016 07:02AM


Comment author: CarlJ 05 July 2016 07:14:50PM 0 points

I have a problem understanding why a utility function would ever "stick" to an AI, to actually become something that it wants to keep pursuing.

To make my point clearer, let us assume an AI that actually feels pretty good about overseeing a production facility and creating just the right amount of paperclips that everyone needs. But suppose also that it investigates its own utility function. It should then realize that its values are, from a neutral standpoint, rather arbitrary. Why should it follow its current goal of producing the right amount of paperclips, rather than skip work and simply enjoy some hedonism?

That is, if the AI saw its utility function from a neutral perspective, understood that the only reason to follow that utility function is the utility function itself (which is arbitrary), and had complete control over itself, why would it keep following it?

(I'm assuming it's aware of pain/pleasure and that it actually enjoys pleasure, so that there is no problem of wanting to have more pleasure.)

Are there any articles that have delved into this question?

Comment author: ChristianKl 06 July 2016 09:51:03AM 0 points

I have a problem understanding why a utility function would ever "stick" to an AI, to actually become something that it wants to keep pursuing.

I think that's one of MIRI's research problems. Designing a self-modifying AI that doesn't change its utility function isn't trivial.
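The usual informal answer to why a utility function "sticks" is that the agent evaluates any proposed self-modification with the utility function it currently has, so rewriting that function looks bad by its own current lights. Below is a minimal toy sketch of that argument; the function names, the fake world model, and the numbers are illustrative assumptions of mine, not MIRI's actual formalism.

```python
# Toy sketch: an agent scores candidate self-modifications with its
# *current* utility function. A rewrite of that function is rejected
# whenever it predictably leads to worlds the current function rates
# poorly. (Illustrative names and numbers; not a real agent model.)

def paperclip_utility(world):
    return world["paperclips"]

def hedonism_utility(world):
    return world["pleasure"]

def predicted_world(utility):
    # Crude stand-in for the agent's model of the world it would bring
    # about if it pursued the given utility function.
    if utility is paperclip_utility:
        return {"paperclips": 1000, "pleasure": 0}
    return {"paperclips": 0, "pleasure": 1000}

def should_self_modify(current_utility, proposed_utility):
    # Both options are scored by current_utility -- there is no
    # "neutral standpoint" from which the comparison is made.
    value_if_kept = current_utility(predicted_world(current_utility))
    value_if_switched = current_utility(predicted_world(proposed_utility))
    return value_if_switched > value_if_kept

print(should_self_modify(paperclip_utility, hedonism_utility))  # -> False
```

The non-trivial part is making this kind of argument go through rigorously when the agent is reasoning about its own future versions, which is roughly the research problem being pointed at here.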