[link] New essay summarizing some of my latest thoughts on AI safety

Kaj_Sotala

New essay summarizing some of my latest thoughts on AI safety, ~3500 words. I explain why I think that some of the thought experiments previously used to illustrate the dangers of AI are flawed and should be used very cautiously, why I'm less worried about the dangers of AI than I used to be, and some of the remaining reasons why I continue to be somewhat worried.

http://kajsotala.fi/2015/10/maverick-nannies-and-danger-theses/

Backcover celebrity endorsement: "Thanks, Kaj, for a very nice write-up. It feels good to be discussing actually meaningful issues regarding AI safety. This is a big contrast to discussions I've had in the past with MIRI folks on AI safety, wherein they have generally tried to direct the conversation toward bizarre, pointless irrelevancies like "the values that would be held by a randomly selected mind", or "AIs with superhuman intelligence making retarded judgments" (like tiling the universe with paperclips to make humans happy), and so forth.... Now OTOH, we are actually discussing things of some potential practical meaning ;p ..." -- Ben Goertzel


You have understood Loosemore's point, but you're making the same mistake he is. The AI in your example would understand the intent behind the words "maximize human happiness" perfectly well, but that doesn't mean it would want to obey that intent. You talk about learning human values and internalizing them as if those things naturally go together. Value internalization follows naturally from value learning only if the agent already wants to internalize these values; figuring out how to do that is (part of) the Friendly AI problem.

Yes, I'm quite aware of that problem. It was outside the scope of this particular essay, though it's somewhat implied by the deceptive turn and degrees of freedom hypotheses.
