Nornagest comments on The genie knows, but doesn't care - Less Wrong

54 Post author: RobbBB 06 September 2013 06:42AM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (515)

You are viewing a single comment's thread. Show more comments above.

Comment author: Jiro 06 September 2013 05:27:22PM *  3 points [-]

By this reasoning, an AI asked to do anything at all would respond by immediately modifying itself to set its utility function to MAXINT. You don't need to speak to it in English for that--if you asked the AI to maximize paperclips, that is the equivalent of "Maximize the value of X, where X is my current opinion of how many paperclips there are", and it would modify its paperclip-counting module to always return MAXINT.

You are correct that telling the AI to do Y is equivalent to "maximize the value of X, where X is my current opinion about Y". However, "current" really means "current", not "new". If the AI is actually trying to obey the command to do Y, it won't change its utility function unless having a new utility function will increase its utility according to its current utility function. Neither misunderstanding nor understanding will raise its utility unless its current utility function values having a utility function that misunderstands or understands.

Comment author: Nornagest 08 September 2013 07:32:59AM *  3 points [-]

By this reasoning, an AI asked to do anything at all would respond by immediately modifying itself to set its utility function to MAXINT.

That's allegedly more or less what happened to Eurisko (here, section 2), although it didn't trick itself quite that cleanly. The problem was only solved by algorithmically walling off its utility function from self-modification: an option that wouldn't work for sufficiently strong AI, and one to avoid if you want to eventually allow your AI the capacity for a more precise notion of utility than you can give it.

Paperclipping as the term's used here assumes value stability.