[Policy makers & ML researchers]
Expecting AI to automatically care about humanity is like expecting a man to automatically care about a rock. Just as the man only cares about the rock insofar as it can help him achieve his goals, the AI only cares about humanity insofar as it can help it achieve its goals. If we want an AI to care about humanity, we must program it to do so. AI safety is about making sure we get this programming right. We may only get one chance.
[Policy makers & ML researchers]
Our goal is human flourishing. AI’s job is to stop at nothing to accomplish its understanding of our goal. AI safety is about making sure we’re really good at explaining ourselves.
[Policy makers & ML researchers]
AI safety is about developing an AI that understands not what we say, but what we mean. And it’s about doing so without relying on the things that we take for granted in inter-human communication: shared evolutionary history, shared experiences, and shared values. If we fail, a powerful AI could decide to maximize the number of people that see an ad by ensuring that ad is all that people see. AI could decide to reduce deaths by reducing births. AI could decide to end world hunger by ending the world.
(The first line is a slightly tweaked version of a different post by Linda Linsefors, so credit to her for that part.)
Thanks Trevor - appreciate the support! Right back at you.
[Policy makers & ML researchers]
"There isn’t any spark of compassion that automatically imbues computers with respect for other sentients once they cross a certain capability threshold. If you want compassion, you have to program it in" (Nate Soares). Given that we can't agree on whether a straw has two holes or one...We should probably start thinking about how program compassion into a computer.
[Policy makers & ML researchers]
Expecting AI to know what is best for humans is like expecting your microwave to know how to ride a bike.
[Insert call to action]
-
Expecting AI to want what is best for humans is like expecting your calculator to have a preference for jazz.
[Insert call to action]
(I could imagine a series riffing on this structure / theme.)
Good idea to check Tim Urban's article, Trevor. It seems like he has thought hard about how to make this stuff visual, intuitive, and compelling.
Good idea! I could imagine doing something similar with images generated by DALL-E.
[Policy makers]
We don't let companies use toxic chemicals without oversight.
Why let companies use AI without oversight?
[Insert call to action on support / funding for AI governance or regulation]
[Policy makers]
"If we imagine a space in which all possible minds can be represented, we must imagine all human minds as constituting a small and fairly tight cluster within that space. The personality differences between Hannah Arendt and Benny Hill might seem vast to us, but this is because the scale bar in our intuitive judgment is calibrated on the existing human distribution. In the wider space of all logical possibilities, these two personalities are close neighbors. In terms of neural architecture, at least, Ms. Arendt and Mr. Hill are nearly identical. Imagine their brains laying side by side in quiet repose. The differences would appear minor and you would quite readily recognize them as two of a kind; you might even be unable to tell which brain was whose.
There is a common tendency to anthropomorphize the motivations of intelligent systems in which there is really no ground for expecting human-like drives and passions (“My car really didn’t want to start this morning”). Eliezer Yudkowsky gives a nice illustration of this phenomenon:
Back in the era of pulp science fiction, magazine covers occasionally depicted a sentient monstrous alien—colloquially known as a bug-eyed monster (BEM)—carrying off an attractive human female in a torn dress. It would seem the artist believed that a nonhumanoid alien, with a wholly different evolutionary history, would sexually desire human females … Probably the artist did not ask whether a giant bug perceives human females as attractive. Rather, a human female in a torn dress is sexy—inherently so, as an intrinsic property. They who made this mistake did not think about the insectoid’s mind: they focused on the woman’s torn dress. If the dress were not torn, the woman would be less sexy; the BEM does not enter into it. (Yudkowsky 2008)
An artificial intelligence can be far less human-like in its motivations than a space alien. The extraterrestrial (let us assume) is a biological creature who has arisen through a process of evolution and may therefore be expected to have the kinds of motivation typical of evolved creatures. For example, it would not be hugely surprising to find that some random intelligent alien would have motives related to the attaining or avoiding of food, air, temperature, energy expenditure, the threat or occurrence of bodily injury, disease, predators, reproduction, or protection of offspring. A member of an intelligent social species might also have motivations related to cooperation and competition: like us, it might show in-group loyalty, a resentment of free-riders, perhaps even a concern with reputation and appearance.
By contrast, an artificial mind need not care intrinsically about any of those things, not even to the slightest degree. One can easily conceive of an artificial intelligence whose sole fundamental goal is to count the grains of sand on Boracay, or to calculate decimal places of pi indefinitely, or to maximize the total number of paperclips in its future lightcone. In fact, it would be easier to create an AI with simple goals like these, than to build one that has a humanlike set of values and dispositions."
[Taken from Nick Bostrom's 2012 paper, The Superintelligent Will]