TheAncientGeek comments on Debunking Fallacies in the Theory of AI Motivation - LessWrong

Post author: Richard_Loosemore 05 May 2015 02:46AM


Comment author: drethelin 14 May 2015 07:24:01AM 3 points

1) We want the AI to be able to learn and grow in power, and make decisions about its own structure and behavior without our input. We want it to be able to change.

2) We want the AI to fundamentally do the things we prefer.

This is the basic dichotomy: How do you make an AI that modifies itself, but only in ways that don't make it hurt you? This is WHY we talk about hard-coding in moral codes. Part of the reason they would be "hard-coded", and thus unmodifiable, is that we do not want to take the risk of the AI deciding something we don't like is morally correct and implementing it on us. But anything made by humans to be unmodifiable by the AI runs the risk of being messed up by the humans writing it. And this is why we should be worried about an AI with a poorly made utility function: because the utility function is the exact part of the AI people would be most tempted to force the AI never to question.
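To make the dichotomy concrete, here is a minimal toy sketch (all names are invented for illustration; this is nobody's actual proposal, and Python cannot genuinely enforce immutability anyway): the agent may rewrite its own policy freely, while its utility function sits behind a read-only wrapper that self-modification cannot reach.

```python
# Toy sketch of the dichotomy: freely self-modifying behavior,
# frozen ("hard-coded") utility function. Purely illustrative;
# every name here is hypothetical.

from types import MappingProxyType


def human_preference_utility(outcome: str) -> float:
    """Stand-in for whatever the designers wrote. If THIS is wrong,
    nothing downstream can fix it, because the agent may not touch it."""
    return {"help_humans": 1.0, "do_nothing": 0.0, "hurt_humans": -1.0}.get(outcome, 0.0)


class Agent:
    def __init__(self):
        # Mutable part: the agent may replace its own policy (requirement 1).
        self.policy = lambda options: options[0]
        # "Frozen" part: the utility function is wrapped read-only (requirement 2).
        self._frozen = MappingProxyType({"utility": human_preference_utility})

    @property
    def utility(self):
        return self._frozen["utility"]

    def self_modify(self, new_policy):
        # Self-modification is allowed, but only of the policy.
        self.policy = new_policy

    def act(self, options):
        return self.policy(options)


agent = Agent()
# The agent improves itself: pick the utility-maximizing option...
agent.self_modify(lambda opts: max(opts, key=agent.utility))
print(agent.act(["do_nothing", "help_humans", "hurt_humans"]))  # help_humans
# ...but it cannot rewrite the utility itself:
# agent._frozen["utility"] = anything  -> TypeError (mapping is read-only)
```

Note where the risk lands: the freeze protects human_preference_utility from the AI, but it equally protects any human mistake inside it from ever being corrected, which is exactly the worry about a poorly made utility function.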

Comment author: TheAncientGeek 14 May 2015 12:52:02PM 1 point

This is the basic dichotomy: How do you make an AI that modifies itself, but only in ways that don't make it hurt you? This is WHY we talk about hard-coding in moral codes.

(Correct) hardcoding is one answer; corrigibility is another; reflective self-correction is another....
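For contrast with the frozen-utility sketch above, here is an equally toy sketch of the corrigibility idea (invented names again; this is a loose illustration, not the formal corrigibility proposals): the utility function stays writable, but only through a human-held channel that the agent is designed not to obstruct.

```python
# Toy contrast with hardcoding: under corrigibility the utility function
# remains modifiable, but only via a human-authorized channel. The safety
# claim rests on the agent not resisting that channel. Names are invented.

class CorrigibleAgent:
    def __init__(self, utility, human_key: str):
        self._utility = utility
        self._human_key = human_key
        self.policy = lambda options: options[0]

    def self_modify(self, new_policy):
        # The agent may still rewrite its own behavior (requirement 1)...
        self.policy = new_policy

    def correct(self, key: str, new_utility):
        # ...but utility edits require the human's key (requirement 2).
        if key != self._human_key:
            raise PermissionError("utility changes require human sign-off")
        self._utility = new_utility

    def act(self, options):
        return self.policy(options)


agent = CorrigibleAgent(lambda o: {"help": 1.0}.get(o, 0.0), human_key="override-42")
# Humans fix a flawed utility function after deployment:
agent.correct("override-42", lambda o: {"help": 1.0, "warn": 2.0}.get(o, 0.0))
agent.self_modify(lambda opts: max(opts, key=agent._utility))
print(agent.act(["help", "warn", "ignore"]))  # warn
```

Reflective self-correction would instead let the agent revise its utility function through its own reasoning about values; the mechanism above deliberately withholds that trust and keeps the keys with the humans.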