eli_sennesh comments on Debunking Fallacies in the Theory of AI Motivation - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (343)
This article just makes the same old errors over and over again. Here's one:
No. The AI does not have good intentions. Its intentions are extremely bad. It wants to make us happy, which is a completely distinct thing from actually doing what is good. The AI was in fact never programmed to do what is good, and there are no errors in its code.
The lack of precision here is depressing.
Well of course, talking of doing what is good without giving content to the phrase isn't very precise or helpful, either. I certainly expect that if we build a "friendly superintelligence" and successfully program it to do what is good, I will experience a higher baseline level of happiness on a daily basis than if we don't (because, for example, we will be able to ask the AI how to cure depression). It needs saying that while The Good strongly implies (high likelihood/high log-odds) high broad levels of happiness throughout the population, happiness alone is very weak evidence (low but positive log-odds, likelihood nearer to 0.5) of The Good, insofar as the abstraction doesn't leak.
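To make the asymmetry in the previous paragraph concrete, here is a minimal Bayes sketch. Every number in it is invented purely for illustration (the prior, and both conditional probabilities of widespread happiness), and the names are hypothetical; nothing here is an estimate of anything real.

```python
import math

def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' rule: P(H | E) from P(H), P(E | H) and P(E | not H)."""
    joint_h = prior * p_e_given_h
    joint_not_h = (1 - prior) * p_e_given_not_h
    return joint_h / (joint_h + joint_not_h)

def log_odds_shift(p_e_given_h, p_e_given_not_h):
    """Natural-log likelihood ratio: how far E moves the log-odds of H."""
    return math.log(p_e_given_h / p_e_given_not_h)

# Hypothetical, made-up inputs (assumptions, not claims from the thread):
PRIOR_GOOD = 0.10              # prior that the outcome is actually good
P_HAPPY_GIVEN_GOOD = 0.95      # the Good strongly implies broad happiness
P_HAPPY_GIVEN_NOT_GOOD = 0.50  # many non-good outcomes also produce happiness

p_good_given_happy = posterior(
    PRIOR_GOOD, P_HAPPY_GIVEN_GOOD, P_HAPPY_GIVEN_NOT_GOOD
)
shift = log_odds_shift(P_HAPPY_GIVEN_GOOD, P_HAPPY_GIVEN_NOT_GOOD)

print(f"P(happy | good) = {P_HAPPY_GIVEN_GOOD:.2f}")
print(f"P(good | happy) = {p_good_given_happy:.3f}")
print(f"log-odds shift from observing happiness = {shift:.2f} nats")
```

Under these assumed numbers, the Good almost guarantees happiness, yet observing happiness moves the probability of the Good only modestly above its prior: the evidence is positive but weak.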
But, and this is an important point, if you give me a normative-ethical theory of The Good which implies that I specifically, or the population broadly, ought to be unhappy, or a meta-ethical theory of naturalizing morality which outputs a normative theory which implies that I/we ought to be unhappy, then something has gone very, very wrong.
Using "good" to refer only to what is actually good is, however, vastly better as far as precision goes. What I am taking issue with here is the careless equivocation between maximising pleasure and good intentions. A correct description of the "nanny AI" scenario would read something like this:
Of course it is true that an AI programmed to do what is good would most likely increase happiness (and even pleasure) to some extent, but to conclude from this that these things are interchangeable is pure folly.