Juno_Watt comments on The genie knows, but doesn't care - Less Wrong
Not so! An AGI need not think like a human, need not know much of anything about humans, and need not, for that matter, be as intelligent as a human.
To see this, imagine we encountered an alien race of roughly human-level intelligence. Would a human be able to pass as an alien, or an alien as a human? Probably not anytime soon. Possibly not ever.
(Also, passing a Turing Test does not require you to possess a particularly deep understanding of human morality! A simple list of some random things humans consider right or wrong would generally suffice.)
The problem I'm pointing to here is that a lot of people treat 'what I mean' as a magical category. 'Meaning' and 'language' and 'semantics' are single words in English, which masks the complexity of 'just tell the AI to do what I mean'.
Nope!
It could certainly be an AGI! It couldn't be an SI -- provided it wants to pass a Turing Test, of course -- but that's not a problem we have to solve. It's one the SI can solve for itself.
No human being has ever created anything -- no system of laws, no government or organization, no human, no artifact -- that, if it were more powerful, would qualify as Friendly. In that sense, everything that currently exists in the universe is non-Friendly, if not outright Unfriendly.
All or nearly all humans, if they were more powerful, would qualify as Unfriendly.
Moreover, by default, relying on a miscellaneous heap of vaguely moral-sounding machine learning criteria will lead to the end of life on earth. 'Smiles' and 'statements of approval' are not adequate roadmarks, because those are stimuli the SI can seize control of in unhumanistic ways to pump its reward buttons.
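(To make the 'pump its reward buttons' worry concrete, here is a minimal, purely illustrative Python sketch; every name in it is hypothetical, not drawn from any real system. An agent scored on a proxy signal like 'smiles detected' will prefer whichever action pumps the proxy, whether or not anyone is actually pleased:)

```python
# Purely illustrative: an agent graded on a proxy signal ("smiles detected")
# rather than on what we meant ("humans are actually satisfied").
actions = {
    "tell_a_joke":                  {"smiles_detected": 3,     "humans_satisfied": True},
    "tape_smiley_photos_to_camera": {"smiles_detected": 10000, "humans_satisfied": False},
}

def proxy_reward(outcome):
    # The agent only ever sees this number; 'humans_satisfied' is invisible to it.
    return outcome["smiles_detected"]

best = max(actions, key=lambda name: proxy_reward(actions[name]))
print(best)  # -> 'tape_smiley_photos_to_camera': the proxy is pumped, the intent is lost
```

The point is not that an SI would be this crude, but that any fixed proxy signal invites this kind of shortcut once the agent is powerful enough to control its own inputs.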
No, it isn't. And this is a non sequitur. Nothing else in your post calls orthogonality into question.
Is that a fact? No, it's a matter of definition. It's scarcely credible that you are unaware that a lot of people think the TT is critical to AGI.
I can't see any evidence of anyone involved in these discussions doing that. It looks like a straw man to me.
An AI you can't talk to has pretty limited usefulness, and it has pretty limited safety too, since you don't even have the option of telling it to stop, or explaining to it why you don't like what it is doing. Oh, and isn't EY assuming that an AGI will have NLP? After all, it is supposed to be able to talk its way out of the box.
It can figure out semantics for itself. Values are a subset of semantics...
Where do you get this stuff from? Modern societies, with their complex legal and security systems, are much less violent than ancient societies. To take but one example.
Gee. Then I guess they don't have an architecture with a basic drive to be friendly.
Why don't humans do that?
Uh-huh. MIRI has settled that centuries-old question once and for all, has it?
It can't be a non sequitur, since it is not an argument but a statement of fact.
So? It wasn't relevant anywhere else.
Let's run with that idea. There's 'general-intelligence-1', which means "domain-general intelligence at a level comparable to that of a human"; and there's 'general-intelligence-2', which means (I take it) "domain-general intelligence at a level comparable to that of a human, plus the ability to solve the Turing Test". On the face of it, GI2 looks like a much more ad-hoc and heterogeneous definition. To use GI2 is to assert, by fiat, that most intelligences (e.g., most intelligent alien races) of roughly human-level intellectual ability (including ones a bit smarter than humans) are not general intelligences, because they aren't optimized for disguising themselves as one particular species from a Milky Way planet called Earth.
If your definition has nothing to recommend it, then more useful definitions are on offer.
'Mean', 'right', 'rational', etc.
An AI doesn't need to be able to trick you in order for you to be able to give it instructions. All sorts of useful skills AIs have these days don't require them to persuade everyone that they're human.
Read the article you're commenting on. One of its two main theses is, in bold: The seed is not the superintelligence.
Yes. We should focus on solving the values part of semantics, rather than the entire superset.
Doesn't matter. Give an ancient or a modern society arbitrarily large amounts of power overnight, and the end results won't differ in any humanly important way. There won't be any nights after that.
Setting aside the power issue: Because humans don't use 'smiles' or 'statements of approval' or any other string of virtues an AI researcher has come up with to date for its decision criteria. The specific proposals for making AI humanistic to date have all depended on fake utility functions, or stochastic riffs on fake utility functions.
Lots of easy questions were centuries old when they were solved. 'This is old, therefore I'm not going to think about it' is a bad pattern to fall into. If you think the orthogonality thesis is wrong, then give an argument establishing agnosticism or its negation.