Manfred comments on Open thread, Oct. 03 - Oct. 09, 2016 - Less Wrong

Post author: MrMind 03 October 2016 06:59AM




Comment author: skeptical_lurker 04 October 2016 04:18:16PM 0 points

If all it takes to ensure FAI is to instruct "henceforth, always do what humans mean, not what they say," then FAI is trivial.

Comment author: Manfred 04 October 2016 07:06:49PM 3 points

The AI already has to do what humans mean (rather than, e.g., ignoring your orders and just calculating more digits of pi) before you start talking at it, because you are relying on it interpreting that very sentence how you meant it.

The hard part is not figuring out good-sounding words to say to an AI. The hard part is figuring out how to make an actual, genuine computer program that will do what you mean.

Comment author: username2 04 October 2016 08:33:17PM 0 points

Maybe? But consider that the opposite of what you just claimed sounds just as plausible to an outside observer. "Do what I mean" doesn't sound all that complicated -- even to someone with a background in computer science or AI specifically. "Do what I mean" translates as "accurately determine the principles which constrain my own actions and use those to constrain the AI's, or otherwise build a model of my thinking which the AI can use to evaluate options." Sub-goals such as verifying that the model matches reality fall easily out of this definition.

It's not at all clear, even to a practitioner within the field, that this expansion fails, if in fact it does.

Comment author: philh 05 October 2016 09:25:15AM 0 points

It's not necessarily that the AI would have difficulty understanding what "do what humans mean" means, even before being told to do what humans mean.

It just has no reason to obey "do what humans mean" unless we program it to do what humans mean.

"Do what humans mean" is telling the AI to do something that we can currently only specify vaguely. "Figure out what we intend by "do what humans mean", and then do that" is also vaguely specified. It doesn't solve the problem.

Comment author: skeptical_lurker 05 October 2016 12:54:21PM 0 points

It just has no reason to obey "do what humans mean" unless we program it to do what humans mean.

I'm not disputing that this is also a problem, and indeed perhaps a harder problem than figuring out what humans mean. In fact there are many failure modes; I was just wondering why people seem to focus on the fickle-genie failure mode specifically, to the exclusion of the others.

Comment author: hairyfigment 07 October 2016 11:48:44PM 0 points

You're assuming that "what humans mean" is well-defined. I've seen people criticize the example of an AI putting humans on a dopamine drip, on the grounds that "making people happy" clearly doesn't mean that. But if your boss tells you to 'make everyone happy,' you will probably get paid for making everyone stop complaining. Parents in the real world used to give their babies opium and cocaine; advertisers today have probably convinced themselves that the foods and drugs they push genuinely make people happy. There is no existing mind that is provably Friendly.

So, this criticism is implying that simply understanding human speech will (at a minimum) let the AI understand moral philosophy, which is not trivial.

Comment author: username2 09 October 2016 09:00:43PM 0 points

So, this criticism is implying that simply understanding human speech will (at a minimum) let the AI understand moral philosophy, which is not trivial.

I don't disagree with the other stuff you said. But I interpreted the criticism as: an AI told to "do what humans mean, not what they say" will behave approximately the same as a perfectly rational human being given the same instruction. So in the same way that I can instruct people with some success to "do what I mean", the same will work for an AI too. It's just also true that this isn't a solution to FAI any more than it is with humans -- because morality is inconsistent, human beings are inherently unfriendly, etc.

Comment author: hairyfigment 10 October 2016 01:46:54AM 0 points

I think you're eliding the question of motive (which may be more alien for an AI). But I'm glad we agree on the main point.