[SEQ RERUN] Mirrors and Paintings

MinibearRex

Today's post, Mirrors and Paintings, was originally published on 23 August 2008. A summary (taken from the LW wiki):

There is a proposal for programming a friendly AI, called CEV. Essentially, this strategy consists of teaching a computer to look at human brains and deduce, from that, morality. This should work better than trying to program morality "by hand", since we really aren't smart enough to solve that problem with an acceptable degree of accuracy.

Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Invisible Frameworks, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.

So you build the AI with a kind of forward reference: "You see those humans over there? ..."

Building on a prior comment of mine: "right" in the moral sense is not evenly distributed. The "there" in "over there" matters a great deal. Leaving it unstated amounts to telling an AI, 'Pretend you know what I mean when I talk about moral right.' Since the Coherent Extrapolated Volition model is built on human behavior, it matters that human behavior is... diverse.

For this human, whom I have decided not to emulate (including my past self) has mattered perhaps more than whom I have decided to emulate (including imagined future selves). Refraining from doing harm works right away; trying to do good includes missteps and takes time. I am starting to understand that what works for me, or for humans, might not be optimal or even functional in an AI. But starting with what I know, I'd tell a potentially friendly AI, 'You see those humans over there? Tend toward not doing what they do.' I'd aim it where physical violence is more common, for starters. I tend toward Popper's piecemeal social change rather than utopian social change. Again, what works well for people may not work well for AIs. To that end, aim AIs away from utopia and they'll bumble their way (in nanoseconds) towards something more humane.

For further study: a 37 second documentary clip on the pursuit of utopia, described here as "what is best in life."

"Tend toward not doing what they do differently from them."

Every person who committed deliberate genocide was breathing at the time they made the decision. That does not make breathing evil.
