MugaSofer comments on The Hidden Complexity of Wishes - Less Wrong

Post author: Eliezer_Yudkowsky 24 November 2007 12:12AM


Comment author: MugaSofer 26 August 2013 04:35:53PM 1 point

Fuller response coming soon to a comment box near you. For now, terms! Everyone loves terms.

Really?

Here's how I learned it:

A "genie" will grant your wishes, without regard to what you actually want.

A malicious genie will grant your wishes, but deliberately seek out ways to do so that will do things you don't actually want.

A helpful - or Friendly - genie will work out what you actually wanted in the first place, and just give you that, without any of this tiresome "wishing" business. Sometimes called a "useful" genie - there's really no single agreed-on term. Essentially, it's what you're trying to replicate with carefully-worded wishes to the other kinds of genie.

Comment author: Jiro 26 August 2013 08:19:50PM *  0 points

I want to know what terms you would use to distinguish between a genie that grants wishes in ways I don't want because it doesn't know any better, and a genie that grants wishes in ways I don't want despite knowing better.

By your definitions above, these are both just "genie" and you don't really have terms to distinguish between them at all.

Comment author: MugaSofer 26 August 2013 09:39:27PM 0 points

Well, since the whole genie thing is a metaphor for superintelligence, "this genie is trying to be Friendly but it's too dumb to model you well" doesn't really come up. If it did, I guess you would need to invent a new term (Friendly Narrow AI?) to distinguish it, yeah.

Comment author: Jiro 26 August 2013 10:15:41PM *  0 points

It's my impression that the typical scenario of a superintelligence that kills everyone to make paperclips, because you told it to make paperclips, falls into the first category. It's trying to follow your request; it just doesn't know that your request really means "I want to make paperclips, subject to some implicit constraints such as ethics, being able to stop when told to stop, etc." If it does know what your request really means, yet it still maximizes paperclips by killing people, it's disobeying your intention if not your literal words.

(And then there's always the possibility of telling it "make paperclips, in the way that I mean when I ask that". If you say that, and the AI still kills people, it's unfriendly by both our standards--since your request explicitly told it to follow your intention, disobeying your intention also disobeys your literal words.)

Comment author: MugaSofer 28 August 2013 06:19:42PM 0 points

It's trying to follow your request; it just doesn't know that your request really means "I want to make paperclips, subject to some implicit constraints such as ethics, being able to stop when told to stop, etc." If it does know what your request really means, yet it still maximizes paperclips by killing people, it's disobeying your intention if not your literal words.

Well, sure it is. That's the point of genies (and the analogous point about programming AIs): they do what you tell them, not what you wanted.

Comment author: private_messaging 28 August 2013 07:54:33PM *  1 point

What you tell it is a pattern of pressure changes in the air; only megaphones and tape recorders literally "do what you tell them".

A genie that did what you want would have to use those pressure changes as clues for deducing your intent. When writing a story about a genie that does "what you tell them, not what you wanted", you likewise have to use the pressure changes as clues for deducing some range of possible misunderstandings of the order, and then pick whichever understanding you think makes the best story. It may be that we have an innate mechanism for finding that range of possible misunderstandings, so as to be able to combine following orders with self-interest.

Comment author: ArisKatsaris 28 August 2013 08:16:01PM *  5 points

"What you tell them" in the context of programs is meant in the sense of "What you program them to", not in the sense of "The dictionary definition of the word-noises you make when talking into their speakers".

Comment author: private_messaging 28 August 2013 09:04:32PM 0 points

They were talking of genies, though, and the sort of failure that tends to arise from how a short sentence can describe a multitude of diverse intents (i.e. ambiguity). Programming is about specifying what you want in an extremely verbose manner, the verbosity being a necessary consequence of non-ambiguity.
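To make the contrast concrete, here's a minimal sketch (all names hypothetical, not from any real codebase): the one-sentence wish "make paperclips" compresses away every constraint the speaker takes for granted, whereas even a toy program has to spell each one out as an explicit parameter.

```python
# Toy illustration: the wish "make paperclips" made verbose and unambiguous.
# Every parameter pins down an intent the natural-language order left implicit.
def make_paperclips(count, max_resources, stop_signal):
    """Make at most `count` paperclips, consuming at most `max_resources`
    units of material, and stop immediately if `stop_signal()` is true."""
    made = 0
    resources_used = 0
    while made < count and resources_used < max_resources and not stop_signal():
        resources_used += 1  # one unit of material per clip
        made += 1
    return made

# The resource cap binds before the requested count does:
print(make_paperclips(count=10, max_resources=5, stop_signal=lambda: False))  # 5
# An active stop signal halts production before a single clip is made:
print(make_paperclips(count=10, max_resources=5, stop_signal=lambda: True))   # 0
```

Nothing here is hidden in interpretation: the limits on count, material, and interruptibility exist only because the programmer wrote them down, which is exactly the verbosity-for-non-ambiguity trade being described.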