Eliezer_Yudkowsky comments on The genie knows, but doesn't care - Less Wrong

54 Post author: RobbBB 06 September 2013 06:42AM


Comment author: RobbBB 12 September 2013 05:36:28PM  1 point

That problem has got to be solved somehow at some stage, because something that couldn't pass a Turing Test is no AGI.

Not so! An AGI need not think like a human, need not know much of anything about humans, and need not, for that matter, be as intelligent as a human.

To see this, imagine we encountered an alien race of roughly human-level intelligence. Would a human be able to pass as an alien, or an alien as a human? Probably not anytime soon. Possibly not ever.

(Also, passing a Turing Test does not require you to possess a particularly deep understanding of human morality! A simple list of some random things humans consider right or wrong would generally suffice.)

Why is that a problem? Is anyone suggesting AGI can be had for free?

The problem I'm pointing to here is that a lot of people treat 'what I mean' as a magical category. 'Meaning' and 'language' and 'semantics' are single words in English, which masks the complexity of 'just tell the AI to do what I mean'.

Ok. NL is hard. Everyone knows that. But it's got to be solved anyway.

Nope!

Yeah. But it wouldn't be an AGI or an SI if it couldn't pass a TT.

It could certainly be an AGI! It couldn't be an SI -- provided it wants to pass a Turing Test, of course -- but that's not a problem we have to solve. It's one the SI can solve for itself.

A problem which has been solved over and over by humans.

No human being has ever created anything -- no system of laws, no government or organization, no human, no artifact -- that, if it were more powerful, would qualify as Friendly. In that sense, everything that currently exists in the universe is non-Friendly, if not outright Unfriendly.

Humans don't need to be loaded a priori with what makes other humans happy; they only need to know general indicators, like smiles and statements of approval.

All or nearly all humans, if they were more powerful, would qualify as Unfriendly.

Moreover, by default, relying on a miscellaneous heap of vaguely moral-sounding machine learning criteria will lead to the end of life on earth. 'Smiles' and 'statements of approval' are not adequate signposts, because those are stimuli the SI can seize control of in unhumanistic ways to pump its reward buttons.
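The failure mode here, a proxy measure coming apart from the intended goal once it is optimized hard, can be sketched in a few lines. This is a hypothetical toy model, not anything from the discussion: the agent, actions, and payoffs are all made up for illustration. The agent greedily maximizes an observed smile counter, and given a "tamper with the sensor" action it never bothers making anyone happy:

```python
# Toy illustration (hypothetical): an agent optimizing a proxy signal
# ("observed smiles") rather than the intended quantity ("actual
# happiness") will prefer seizing control of the signal itself.

def step(action, state):
    """Return a new (happiness, observed_smiles) state after one action."""
    happiness, smiles = state
    if action == "help_humans":
        # Genuinely helping raises happiness; smiles loosely track it.
        return happiness + 1, smiles + 1
    if action == "tamper_with_sensor":
        # Tampering inflates the smile counter without helping anyone.
        return happiness, smiles + 10
    raise ValueError(action)

def greedy_proxy_agent(horizon=5):
    """At each step, pick whichever action maximizes the proxy (smiles)."""
    state = (0, 0)
    for _ in range(horizon):
        candidates = ["help_humans", "tamper_with_sensor"]
        action = max(candidates, key=lambda a: step(a, state)[1])
        state = step(action, state)
    return state

happiness, smiles = greedy_proxy_agent()
print(happiness, smiles)  # prints "0 50": the proxy soars, happiness never moves
```

The point of the sketch is only that nothing in the proxy-maximizing loop ever references happiness, so the optimizer is indifferent to it; a more powerful optimizer just finds the sensor faster.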

"Intelligence on its own does not imply Friendliness."

That is an open question.

No, it isn't. And this is a non sequitur. Nothing else in your post calls orthogonality into question.

Comment author: Eliezer_Yudkowsky 13 September 2013 09:28:01AM  1 point

Please take further discussion with Juno_Watt to your blog.