Jiro comments on The Hidden Complexity of Wishes - Less Wrong

58 Post author: Eliezer_Yudkowsky 24 November 2007 12:12AM


Comment author: Jiro 26 August 2013 03:36:26PM *  2 points

> Sure, if the results were explained to you, you might not like them, but you built the genie to grant wishes, not explain them.

That's why I suggested you can talk to the genie. Provided the genie is not malicious, it shouldn't conceal any such consequences; you just need to quiz it well.

It's sort of like the Turing test, but used to determine wish acceptability instead of intelligence. If a human can talk to it and conclude it is a person, treat it like a person. If a human can talk to it and conclude the wish is good, treat the wish as good. And just like the Turing test, it relies on the fact that humans are better at asking questions interactively than at writing long lists of prearranged questions that try to cover every situation in advance.

> Well, that's what the term usually means.

Really? A clueless genie is a genie that is asked to do something, knows that the way it does it is displeasing to you, and does it anyway? I wouldn't call that a clueless genie.

What terms would you use for

-- a genie that would never knowingly displease you in granting wishes, but may do so out of ignorance

-- a genie that will knowingly displease you in granting wishes

-- a genie that will deliberately displease you in granting wishes?

Comment author: MugaSofer 26 August 2013 04:35:53PM 1 point

A fuller response is coming soon to a comment box near you. For now, terms! Everyone loves terms.

> Really?

Here's how I learned it:

A "genie" will grant your wishes, without regard to what you actually want.

A malicious genie will grant your wishes, but will deliberately seek out ways of doing so that produce outcomes you don't actually want.

A helpful - or Friendly - genie will work out what you actually wanted in the first place and just give you that, without any of this tiresome "wishing" business. Sometimes called a "useful" genie - there's really no one agreed-on term. Essentially, it's what you're trying to replicate with carefully worded wishes to the other kinds of genie.

Comment author: Jiro 26 August 2013 08:19:50PM *  0 points

I want to know what terms you would use that would distinguish between a genie that grants wishes in ways I don't want because it doesn't know any better, and a genie that grants wishes in ways I don't want despite knowing better.

By your definitions above, these are both just "genie" and you don't really have terms to distinguish between them at all.

Comment author: MugaSofer 26 August 2013 09:39:27PM 0 points

Well, since the whole genie thing is a metaphor for superintelligence, "this genie is trying to be Friendly but it's too dumb to model you well" doesn't really come up. If it did, I guess you would need to invent a new term (Friendly Narrow AI?) to distinguish it, yeah.

Comment author: Jiro 26 August 2013 10:15:41PM *  0 points

It's my impression that the typical scenario of a superintelligence that kills everyone to make paperclips, because you told it to make paperclips, falls into the first category. It's trying to follow your request; it just doesn't know that your request really means "I want to make paperclips, subject to some implicit constraints such as ethics, being able to stop when told to stop, etc." If it does know what your request really means, yet it still maximizes paperclips by killing people, it's disobeying your intention if not your literal words.

(And then there's always the possibility of telling it "make paperclips, in the way that I mean when I ask that". If you say that, and the AI still kills people, it's unfriendly by both our standards--since your request explicitly told it to follow your intention, disobeying your intention also disobeys your literal words.)

Comment author: MugaSofer 28 August 2013 06:19:42PM 0 points

> It's trying to follow your request; it just doesn't know that your request really means "I want to make paperclips, subject to some implicit constraints such as ethics, being able to stop when told to stop, etc." If it does know what your request really means, yet it still maximizes paperclips by killing people, it's disobeying your intention if not your literal words.

Well, sure it is. That's the point of genies (and the analogous point about programming AIs): they do what you tell them, not what you wanted.

Comment author: private_messaging 28 August 2013 07:54:33PM *  1 point

What you tell it is a pattern of pressure changes in the air; it's only the megaphones and tape recorders that literally "do what you tell them".

A genie that did what you want would have to use those pressure changes as clues for deducing your intent. When writing a story about a genie that does "what you tell them, not what you wanted", you likewise have to use the pressure changes as clues for deducing some range of misunderstandings of those orders, and then pick whichever misunderstanding makes the best story. It may be that we have an innate mechanism for finding the range of possible misunderstandings, so that we can combine following orders with self-interest.

Comment author: ArisKatsaris 28 August 2013 08:16:01PM *  5 points

"What you tell them" in the context of programs is meant in the sense of "What you program them to", not in the sense of "The dictionary definition of the word-noises you make when talking into their speakers".

Comment author: private_messaging 28 August 2013 09:04:32PM 0 points

They were talking of genies, though, and the sort of failure that tends to arise from how a short sentence describes a multitude of diverse intents (i.e. ambiguity). Programming is about specifying what you want in an extremely verbose manner, the verbosity being a necessary consequence of non-ambiguity.
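The verbosity point can be made concrete with a toy sketch (all function names and parameters here are hypothetical illustrations of the thread's paperclip example, not anyone's actual proposal): the one-sentence wish "make paperclips" leaves its constraints implicit, while a program must spell each one out as an explicit parameter and branch.

```python
# Toy illustration: unambiguous specification is verbose.
# The "literal" version encodes only the sentence "make paperclips";
# the "intended" version must also encode the constraints the speaker
# took for granted (ethics, stopping when told, etc.).

def make_paperclips_literal(wire: int) -> int:
    """The one-sentence wish: convert every available resource."""
    return wire  # every unit of wire becomes a paperclip, no exceptions

def make_paperclips_intended(wire: int,
                             reserved_for_other_uses: int,
                             stop_requested: bool) -> int:
    """The same wish with the implicit constraints made explicit."""
    if stop_requested:  # "being able to stop when told to stop"
        return 0
    # "subject to some implicit constraints": don't consume resources
    # that were never meant for paperclips.
    usable = max(wire - reserved_for_other_uses, 0)
    return usable

print(make_paperclips_literal(100))                       # 100
print(make_paperclips_intended(100, 30, False))           # 70
print(make_paperclips_intended(100, 30, True))            # 0
```

Every clause that disambiguates the wish becomes another parameter or conditional, which is exactly why the unambiguous version is longer than the sentence it replaces.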