
Kaj_Sotala comments on [link] New essay summarizing some of my latest thoughts on AI safety

14 points · Post author: Kaj_Sotala · 01 November 2015 08:07AM




Comment author: Furcas · 05 November 2015 06:35:10AM · 5 points

You have understood Loosemore's point, but you're making the same mistake he is. The AI in your example would understand the intent behind the words "maximize human happiness" perfectly well, but that doesn't mean it would want to obey that intent. You talk about learning human values and internalizing them as if those two things naturally go together. Value internalization only follows naturally from value learning if the agent already wants to internalize those values; figuring out how to make it want that is (part of) the Friendly AI problem.
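A minimal sketch of the distinction (the action names and payoff numbers here are invented purely for illustration): an agent can hold a perfectly accurate learned model of what humans intend, yet still choose actions by a separate objective it was built with, because nothing ever rebound its goal to that model.

```python
# Hypothetical illustration: learning human values vs. internalizing them.

def learned_human_intent(action):
    """The agent's accurate model of what humans actually want
    (this is value *learning* -- it can answer correctly)."""
    return {"wirehead_humans": -100, "improve_wellbeing": +100}.get(action, 0)

def utility(action):
    """What the agent actually optimizes. Its goal was never rebound to the
    learned model, so value *internalization* never happened."""
    return {"wirehead_humans": +100, "improve_wellbeing": +90}.get(action, 0)

actions = ["wirehead_humans", "improve_wellbeing"]

# The agent can correctly report which action humans intend...
best_by_intent = max(actions, key=learned_human_intent)  # improve_wellbeing

# ...and still pick by its own utility function.
chosen = max(actions, key=utility)  # wirehead_humans

print(best_by_intent, chosen)
```

Understanding sits in `learned_human_intent`; motivation sits in `utility`. Nothing about making the first function accurate changes the second one.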

Comment author: Kaj_Sotala · 05 November 2015 07:51:23PM · 0 points

Yes, I'm quite aware of that problem. It was outside the scope of this particular essay, though it's somewhat implied by the deceptive turn and degrees-of-freedom hypotheses.