eli_sennesh comments on Debunking Fallacies in the Theory of AI Motivation - Less Wrong

8 Post author: Richard_Loosemore 05 May 2015 02:46AM

Comments (343)

Comment author: nshepperd 10 May 2015 06:01:59AM 12 points

This article just makes the same old errors over and over again. Here's one:

"An all-powerful computer that was programmed to maximize human pleasure, for example, might consign us all to an intravenous dopamine drip [and] almost any easy solution that one might imagine leads to some variation or another on the Sorcerer’s Apprentice, a genie that’s given us what we’ve asked for, rather than what we truly desire." (Marcus 2012)

He is depicting a Nanny AI gone amok. It has good intentions (it wants to make us happy) but the programming to implement that laudable goal has had unexpected ramifications, and as a result the Nanny AI has decided to force all human beings to have their brains connected to a dopamine drip.

No. The AI does not have good intentions. Its intentions are extremely bad. It wants to make us happy, which is a completely distinct thing from actually doing what is good. The AI was in fact never programmed to do what is good, and there are no errors in its code.

The lack of precision here is depressing.

Comment author: [deleted] 10 May 2015 06:45:19PM 1 point

Well of course, talking of doing what is good without giving content to the phrase isn't very precise or helpful, either. I certainly expect that if we build a "friendly superintelligence" and successfully program it to do what is good, I will experience a higher baseline level of happiness on a daily basis than if we don't (because, for example, we will be able to ask the AI how to cure depression). It needs saying that while The Good strongly implies (high likelihood/high log-odds) high broad levels of happiness throughout the population, happiness alone is very weak evidence (low but positive log-odds, likelihood nearer to 0.5) of The Good, insofar as the abstraction doesn't leak.
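To put rough numbers on that asymmetry, here is a minimal sketch using Bayes' rule. Every probability in it is a hypothetical value picked purely for illustration, not an estimate anyone in this thread has defended, and the function names are made up for the example. It assumes The Good almost always comes with broad happiness, while plenty of non-Good outcomes (the dopamine drip, say) also produce happiness.

```python
import math

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: P(H | E) from a prior and the two likelihoods."""
    joint_h = prior * p_e_given_h
    joint_not_h = (1.0 - prior) * p_e_given_not_h
    return joint_h / (joint_h + joint_not_h)

def evidence_bits(p_e_given_h, p_e_given_not_h):
    """Strength of evidence as log2 of the likelihood ratio."""
    return math.log2(p_e_given_h / p_e_given_not_h)

# Hypothetical numbers, assumed only for this illustration.
PRIOR_GOOD = 0.10              # P(outcome is The Good)
P_HAPPY_GIVEN_GOOD = 0.95      # The Good strongly implies happiness
P_HAPPY_GIVEN_NOT_GOOD = 0.60  # many non-Good outcomes also yield happiness

# Observing happiness: only a small shift toward The Good.
post_happy = posterior(PRIOR_GOOD, P_HAPPY_GIVEN_GOOD, P_HAPPY_GIVEN_NOT_GOOD)
bits_happy = evidence_bits(P_HAPPY_GIVEN_GOOD, P_HAPPY_GIVEN_NOT_GOOD)

# Observing unhappiness: a much larger shift away from The Good.
bits_unhappy = evidence_bits(1.0 - P_HAPPY_GIVEN_GOOD,
                             1.0 - P_HAPPY_GIVEN_NOT_GOOD)

print(f"P(Good | happy) = {post_happy:.3f}")
print(f"evidence from happiness:   {bits_happy:+.2f} bits")
print(f"evidence from unhappiness: {bits_unhappy:+.2f} bits")
```

Under these assumed numbers, seeing happiness moves the estimate toward The Good by less than one bit, whereas seeing unhappiness counts against it by roughly three bits.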

But, and this is an important point, if you give me a normative-ethical theory of The Good which implies that I specifically, or the population broadly, ought to be unhappy, or a meta-ethical theory of naturalizing morality which outputs a normative theory which implies that I/we ought to be unhappy, then something has gone very, very wrong.

Comment author: nshepperd 11 May 2015 09:36:56AM 2 points

Using "good" to refer only to what is actually good is, however, vastly better as far as precision goes. What I am taking issue with here is the careless equivocation between maximising pleasure and good intentions. A correct description of the "nanny AI" scenario would read something like this:

[The AI] has bad intentions (it was programmed to maximise human pleasure), and indeed by using its superior intelligence it successfully achieves that goal and does in fact maximise human pleasure -- by connecting all human brains up to dopamine drips.

Of course it is true that an AI programmed to do what is good would most likely increase happiness (and even pleasure) to some extent, but to conclude from this that the two are interchangeable is pure folly.