sark comments on A Much Better Life? - Less Wrong

61 Post author: Psychohistorian 03 February 2010 08:01PM


Comment author: PlatypusNinja 04 February 2010 06:29:41PM 11 points [-]

It's often difficult to think about humans' utility functions, because we're used to taking them as an input. Instead, I like to imagine that I'm designing an AI, and think about what its utility function should look like. For simplicity, let's assume I'm building a paperclip-maximizing AI: I'm going to build the AI's utility function in a way that lets it efficiently maximize paperclips.

This AI is self-modifying, meaning it can rewrite its own utility function. So, for example, it might rewrite its utility function to include a term for keeping its promises, if it determined that this would enhance its ability to maximize paperclips.

This AI has the ability to rewrite itself to "while(true) { happy(); }". It evaluates this action in terms of its current utility function: "If I wirehead myself, how many paperclips will I produce?" vs "If I don't wirehead myself, how many paperclips will I produce?" It sees that not wireheading is the better choice.

If, for some reason, I've written the AI to evaluate decisions based on its future utility function, then it immediately wireheads itself. In that case, arguably, I have not written an AI at all; I've simply written a very large amount of source code that compiles to "while(true) { happy(); }".
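The contrast between the two evaluation rules can be made concrete with a toy sketch (all names and numbers here are invented for illustration; the "world model" is deliberately trivial):

```python
# Hypothetical sketch of the two evaluation rules described above.
# A toy agent decides whether to rewrite itself into "while(true) { happy(); }".

def paperclips_produced(wireheaded: bool) -> int:
    # Toy world model: a wireheaded agent stops making paperclips.
    return 0 if wireheaded else 1_000_000

def current_utility(wireheaded: bool) -> float:
    # The AI's *current* utility function: count of paperclips produced.
    return paperclips_produced(wireheaded)

def future_utility(wireheaded: bool) -> float:
    # Utility as measured by the *post-modification* function:
    # the wireheaded version reports maximal "happiness" forever.
    return float("inf") if wireheaded else paperclips_produced(wireheaded)

def choose(evaluate) -> str:
    # Pick the action that scores higher under the given evaluation rule.
    return max(["wirehead", "don't wirehead"],
               key=lambda action: evaluate(action == "wirehead"))

print(choose(current_utility))  # "don't wirehead" -- paperclip count wins
print(choose(future_utility))   # "wirehead" -- the bug described above
```

The only difference between the two agents is which function `choose` is handed; under the current-utility rule wireheading scores zero paperclips and loses, while under the future-utility rule it scores infinitely and always wins.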

I would argue that any humans that had this bug in their utility function have (mostly) failed to reproduce, which is why most existing humans are opposed to wireheading.

Comment author: sark 09 February 2010 11:56:46AM *  7 points [-]

> I would argue that any humans that had this bug in their utility function have (mostly) failed to reproduce, which is why most existing humans are opposed to wireheading.

Why would evolution come up with a fully general solution against such 'bugs in our utility functions'?

Take addiction to a substance X. Evolution wouldn't give us a psychological capacity to inspect our utility functions and to guard against such counterfeit utility. It would simply give us a distaste for substance X.

My guess is that we have some kind of self-referential utility function. We do not only want what our utility functions tell us we want. We also want utility (happiness) per se. And this want is itself included in that utility function!

When thinking about wireheading, I think we are weighing a tradeoff between mere happiness and the states of affairs we prefer (happiness aside).

Comment author: PlatypusNinja 09 February 2010 06:18:06PM 1 point [-]

So, people who have a strong component of "just be happy" in their utility function might choose to wirehead, and people in which other components are dominant might choose not to.

That sounds reasonable.