Eliezer Yudkowsky and Scott Aaronson - Percontations: Artificial Intelligence and Quantum Mechanics
Sections of the diavlog:
- When will we build the first superintelligence?
- Why quantum computing isn’t a recipe for robot apocalypse
- How to guilt-trip a machine
- The evolutionary psychology of artificial intelligence
- Eliezer contends many-worlds is obviously correct
- Scott contends many-worlds is ridiculous (but might still be true)
The discussion got a bit sidetracked at around the 27-minute mark, when EY asked something like:
If you are assuming that you can give the machine one value and have it stable, why assume that there are all these other values coming into it which you can't control?
Scott said something about that being how humans work. That could be expanded on a bit:
In biology, it's hard to build values in explicitly, because the genes have only limited control over the brain - the brain is a big self-organising system. It's as though the genes can set the initial developmental trajectory - but then there's the wind to deal with.
If machine intelligence turns out to work much like that, then we may face similar difficulties building in machine values. If we can find a way of getting machines to absorb values from the surrounding agents, that might save a lot of trouble.
Humans get many of their values from the surrounding humans - via human culture. Were it not for that, we would be like our cannibal ancestors of a million years ago. Conscience and guilt are among the mechanisms used to absorb those values. Evolution built those mechanisms in - rather than all the details of the values of human society. Building in the details would have been technically difficult - and the result would have been inflexible. Instead, evolution built a learning machine - and let the details of society's values be one of the things learned.
Machine intelligence is quite likely to work along those lines if it is built on a connectionist model - where the brain grows from a simple initial state. There, we can't easily wire in the details of particular values - since it is so hard to understand the details of what is going on inside. However, we can wire in some gross values - pain, suffering, irritation, and so on. Guilt is basically a way of applying negative reinforcement to past actions. It is a fairly primitive value - the kind that is easier to build in.
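The idea of "negative reinforcement applied to past actions" can be made concrete with a toy sketch. Nothing here is from the diavlog - all the names and numbers are illustrative assumptions. Each action leaves an eligibility trace that fades over time; a single punishment signal (the "guilt") then weakens the agent's preference for whichever recent actions contributed, in proportion to their traces:

```python
# Toy sketch of negative reinforcement via eligibility traces.
# All names and constants are illustrative assumptions.

# Action preferences: how strongly the agent favours each action.
prefs = {"share": 0.5, "steal": 0.5}

# Eligibility traces: how recently (and how often) each action was taken.
traces = {"share": 0.0, "steal": 0.0}

DECAY = 0.8       # traces fade, so older actions get less of the blame
LEARN_RATE = 0.5  # how strongly punishment shifts preferences

def take_action(action):
    """Record an action: decay all traces, then bump the chosen one."""
    for a in traces:
        traces[a] *= DECAY
    traces[action] += 1.0

def punish(penalty):
    """A 'guilt' signal: weaken preferences for recent actions,
    in proportion to their eligibility traces."""
    for a in prefs:
        prefs[a] -= LEARN_RATE * penalty * traces[a]

take_action("steal")
take_action("steal")
take_action("share")
punish(0.2)  # "steal", taken twice recently, is weakened more than "share"
```

The point of the sketch is that the built-in machinery is crude - a decay constant and a punishment hook - while the question of *which* actions end up discouraged is left to experience, matching the idea that guilt is a primitive, easy-to-build-in value.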