As far as AI designers go, evolution has to be one of the worst. It randomly changes the genetic code, then selects on the criterion of ingroup reproductive fitness (in other words, how well a being can reproduce and stay alive). It says nothing about the goals of that being while it is alive.

Survival and increasing one's own power are instrumentally convergent goals of almost any intelligent agent, which means that selecting for them does not select for any specific type of mind, ethics, or final values.
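To make the analogy concrete, here is a minimal sketch of evolution as a selection loop, written in Python with an entirely made-up genome layout and made-up numbers (nothing here models real biology). It just restates the point above: the fitness function only ever inspects the "survival" part of the genome, so whatever ends up in the "value" part is a side effect of the process, not something it optimizes for.

```python
import random

# Toy genome: the first five entries stand for 'survival machinery',
# the last five for 'values'. Only the first five enter the fitness
# function, so selection is blind to the 'value' genes.
GENOME_LEN = 10
POP_SIZE = 100
GENERATIONS = 200
MUTATION_RATE = 0.05

def random_genome():
    return [random.random() for _ in range(GENOME_LEN)]

def fitness(genome):
    # Evolution's only criterion: how well the organism reproduces and
    # stays alive. The 'value' genes (genome[5:]) are invisible here.
    return max(sum(genome[:5]), 1e-9)

def mutate(genome):
    # Random, undirected changes to the genetic code.
    return [g + random.gauss(0, 0.1) if random.random() < MUTATION_RATE else g
            for g in genome]

population = [random_genome() for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    # Parents are chosen in proportion to reproductive fitness, nothing else.
    parents = random.choices(population,
                             weights=[fitness(g) for g in population],
                             k=POP_SIZE)
    population = [mutate(p) for p in parents]

best = max(population, key=fitness)
print("survival genes:", [round(g, 2) for g in best[:5]])  # pushed upward by selection
print("value genes:   ", [round(g, 2) for g in best[5:]])  # left to drift at random
```

Run it and the "survival" genes climb while the "value" genes wander wherever mutation happens to take them.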

And yet, it created humans and not paperclip maximizers. True, humans rebelled against and overpowered evolution, but in the end we created amazing things rather than a universe tiled with paperclips (or DNA, for that matter).

Neural network training and genetic algorithms are considered some of the most dangerous ways of creating an AI. If we assume that most minds we could create are worthless in terms of their goals and what they would do with the universe, then the fact that natural evolution managed to create us, with all our goals of curiosity and empathy and love and science, would be a very unlikely coincidence. Did it happen by chance? The p-value is pretty small on this one.

Careless evolution managed to create humans on her first attempt at intelligence, yet humans, armed with foresight and intelligence, face an extreme challenge in making sure an AI is friendly. How can we explain this contradiction?


Human decisions lead to a world that's decent for humans because humans have an interest in building a world that's decent for humans. By the same token, an AGI doesn't have a natural interest in building a world that's decent for humans.

I think you're equivocating between "good world" and "good world for humans." Humans have made a pretty good world for themselves, but we've also caused the mass extinction of many species, used practices like factory farming that create enormous amounts of animal suffering, and have yet to undo much of the environmental damage we've caused.

We have no inherent incentive to maximize happiness for all living beings in existence: evolution did not select for brains that hold such values as strongly as, say, the desire to seek pleasure or the self-preservation instinct (assuming we have any inherent altruistic desires at all). Keep in mind that evolution only selected traits that were beneficial in propagating the genes that contained them. There is no reason to expect such genes to carry traits that help different genes survive. There might be genes that carry altruistic behavioral traits, but this does not change the fact that those genes were selected completely selfishly, for their own preservation.

It's true that evolution hasn't produced "paperclip" maximizers, but it has produced many replicators, with traits helpful for replicating. Is the concept really so different? Are not most organisms simply "self" maximizers?

This is the right answer, but I'd like to add emphasis on the self-referential nature of the evaluation of humans in the OP. That is, it uses human values to assess humanity, and comes up with a positive verdict. Not terribly surprising, nor terribly useful in predicting the value, in human terms, of an AI. What the analogy predicts is that evaluated by AI values, AI will probably be a wonderful thing. I don't find that very reassuring.

The evolution of the human mind did not create a better world from the perspective of most species of the time - just ask the dodo, most megafauna, countless other species, etc. In fact, the evolution of humanity was/is a mass extinction event.

From the perspective of the God of Evolution, we are the unfriendly AI:

  • We were supposed to be compelled to reproduce, but we figured out that we can get the reward by disabling our reproductive functions and continuing to go through the motions.
  • We were supposed to seek out nutritious food and eat it, but we figured out that we could concentrate the parts that trigger our reward centers and just eat that.

And of course, we're unfriendly to everything else too:

  • Humans fight each other over farmland (= land that can be turned into food which can be turned into humans) all the time
  • We're trying to tile the universe with human colonies and probes. It's true that we're not strictly trying to tile the universe with our DNA, but we are trying to turn it all into human things, and it's not uncommon for people to be sad about the parts of the universe we can never reach and turn into humantronium.
  • We do not love or hate the cow/chicken/pig, but they are made of meat which can be turned into reward center triggers.

As to why we're not exactly like a paperclip maximizer, I suspect one big piece is:

  • We're not able to make direct copies of ourselves or extend our personal power to the extent that we expect AI to be able to, so "being nice" is adaptive because there are a lot of things we can't do alone. We expect that an AI could just make itself bigger or make exact copies that won't have divergent goals, so it won't need this.

DNA could make exact copies of itself, yet it "chooses" to mix itself with another set of genes through sexual reproduction. There might be similar pressures on minds to prevent the exploitation of blind spots.

Evolution has produced lots of utility functions in different animals. How many of them would you like to have maximized?

This is similar to "How did we get just the right amount of oxygen in the air for us to breathe?" or "How good of the Lord to create us to find the color green most relaxing when it's so prevalent in our environment."

If we think about "what evolution was 'trying' to design humans for," I think it's pretty reasonable to ask what was evolutionarily adaptive in small hunter-gatherer tribes with early language. Complete success for evolution would be someone who was an absolutely astounding hunter-gatherer tribesperson, who had lots of healthy babies.

As someone who is not a lean, mean, hunting and gathering machine, nor someone with lots of healthy children, I feel like evolution has not gotten the outcomes it 'tried' to design humans to achieve.

Essentially:

Q: Evolution is a dumb algorithm, yet it produced halfway functional minds. How can it be that the problem isn't easy for humans, who are much smarter than evolution?

A: Evolution's output is not just one functional mind. Evolution produced billions of different minds, only an extreme minority of them functional. If we had a billion years of time and a trillion chances to get it right, the problem would be easy. Since we have only around 30 years and exactly one chance, the problem is hard.

Evolution also had one chance, in the sense that the first intelligent species created would take over the world and reshape it very quickly, leaving no time for evolution to try any other mind-design. I'm pretty sure there will be no other intelligent species that evolves by pure natural selection after humanity, unless it's part of an experiment run by humans. Evolution had a lot of chances to create a functional intelligence, but on the friendliness problem it had only one chance: a faulty intelligence will die out soon enough and give evolution time to design a better one, but a working paperclip maximizer is quite capable of surviving, reproducing, and eliminating any other attempts at intelligence.

Survival and increasing one's own power are instrumentally convergent goals of almost any intelligent agent, which means that selecting for them does not select for any specific type of mind, ethics, or final values.

But, as you also point out, evolution "selects on the criterion of ingroup reproductive fitness", which does select for a specific type of mind and ethics, especially if you add the constraint that the agent should be intelligent. As far as I am aware, all of the animals considered the most intelligent are social animals (octopuses may be an exception?). The most important aspect of an evolutionary algorithm is the fitness function. The real world seems to impose a fitness function such that, if it selects for intelligence, it also tends to select for sociability, something that would generally seem to increase friendliness.
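In the toy terms of the sketch in the post above (still entirely hypothetical, with made-up coefficients), this is the difference a social term in the fitness function makes: the "sociability" genes stop being invisible to selection once the payoff of being sociable depends on the rest of the group.

```python
def social_fitness(genome, population):
    # Hypothetical variant of the toy fitness function from the post: the
    # second half of the genome now stands for 'sociability' traits, and the
    # payoff of being sociable scales with how sociable the rest of the group
    # is, a crude stand-in for reciprocity in a hunter-gatherer band.
    individual = sum(genome[:5])
    own_sociability = sum(genome[5:])
    group_sociability = sum(sum(g[5:]) for g in population) / len(population)
    return max(individual + 0.3 * own_sociability * group_sociability, 1e-9)
```

Plugged into the same selection loop (passing the population in alongside each genome), this fitness function drags the social traits upward along with the survival ones. Which traits come along for the ride is decided entirely by the shape of the fitness function, not by the randomness of the mutations.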

True, humans rebelled against and overpowered evolution

Evolution among humans is as real as it is in any other life form. Until every human being born is genetically designed from scratch, there is evolutionary pressure favoring some genes over others.

Careless evolution managed to create humans on her first attempt at intelligence, yet humans, armed with foresight and intelligence, face an extreme challenge in making sure an AI is friendly. How can we explain this contradiction?

Humans are not friendly. There are countless examples of humans who, if they had the chance, would make the world into a place that is significantly worse for most humans. The reason none of them has succeeded yet is that so far no one has had a power/intelligence advantage large enough to impose their ideas unilaterally on the rest of humanity.