turchin comments on Open thread, Oct. 03 - Oct. 09, 2016 - Less Wrong

4 Post author: MrMind 03 October 2016 06:59AM

Comments (175)

Comment author: skeptical_lurker 04 October 2016 05:23:48AM *  3 points [-]

I've been thinking about what seems to be the standard LW pitch on AI risk. It goes like this: "Consider an AI that is given a goal by humans. Since 'convert the planet into computronium' is a subgoal of most goals, it does this and kills humanity."

The problem, which various people have pointed out, is that this implies an intelligence capable of taking over the world, but not capable of working out that when a human says to pursue a certain goal, they would not want that goal pursued in a way that leads to the destruction of the world.

Worse, the argument can then be made that this idea that an AI will interpret goals so literally without modelling a human mind constitutes an "autistic AI" and that only autistic people would assume that AI would be similarly autistic. I do not endorse this argument in any way, but I guess it's still better to avoid arguments that signal low social skills, all other things being equal.

Is there any consensus on what the best 'elevator pitch' argument for AI risk is? Instead of focusing on any one failure mode, I would go with something like this:

"Most philosophers agree that there is no reason why superintelligence is not possible. Anything which is possible will eventually be achieved, and so will superintelligence, perhaps in the far future, perhaps in the next few decades. At some point, superintelligences will be as far above humans as we are above ants. I do not know what will happen at this point, but the only reference case we have is humans and ants, and if superintelligences decide that humans are an infestation, we will be exterminated."

Incidentally, this is the sort of thing I mean by painting LW-style ideas as autistic (via David Pearce):

As far as we can tell, digital computers are still zombies. Our machines are becoming autistically intelligent, but not supersentient - nor even conscious. [...] Full-Spectrum Superintelligence entails: [...] social intelligence [...] a metric to distinguish the important from the trivial [...] a capacity to navigate, reason logically about, and solve problems in multiple state-spaces of consciousness [e.g. dreaming states (cf. lucid dreaming), waking consciousness, echolocatory competence, visual discrimination, synaesthesia in all its existing and potential guises, humour, introspection, the different realms of psychedelia [...] and finally "Autistic", pattern-matching, rule-following, mathematico-linguistic intelligence, i.e. the standard, mind-blind cognitive tool-kit scored by existing IQ tests. High-functioning "autistic" intelligence is indispensable to higher mathematics, computer science and the natural sciences. High-functioning autistic intelligence is necessary - but not sufficient - for a civilisation capable of advanced technology that can cure ageing and disease, systematically phase out the biology of suffering, and take us to the stars. And for programming artificial intelligence.

Sometimes David Pearce seems very smart. And sometimes he seems to imply that the ability to think logically while on psychedelic drugs is as important as 'autistic intelligence'. I don't think he believes that autistic people are zombies with no subjective experience, but that does also seem implied.

Comment author: turchin 04 October 2016 08:47:54PM 0 points [-]

I think that most people have already heard that AI could be a catastrophic risk, and they already have an opinion about it. Maybe their opinions are wrong.

What is the goal of such an elevator pitch?

I think the message should be the following: While it is known that AI could be catastrophic, the only organisation (MIRI) that is doing the most serious research on its prevention is underfunded. Providing funding to them could dramatically change the probability of human survival, and we could estimate that 1 USD donated to them will save 10 human lives.

Comment author: ChristianKl 04 October 2016 09:28:00PM 2 points [-]

I think that most people have already heard that AI could be a catastrophic risk, and they already have an opinion about it.

In our circle that might be true, but many people don't have an opinion that goes beyond Terminator.

Comment author: turchin 04 October 2016 11:04:37PM 0 points [-]

Yes. So we have to make use of this knowledge. We could say something like: The Terminator appears because its progenitor, the Skynet computer, received a command to protect the US, concluded that the best way to do so was to prevent humans from switching it off, and so decided to exterminate humans. So the Terminator appears because of the unsolved problem of value alignment.

Comment author: skeptical_lurker 05 October 2016 01:00:40PM 0 points [-]

Is that the canon explanation? I thought Skynet was acting out of self-preservation.

Comment author: turchin 05 October 2016 04:01:50PM *  0 points [-]

It is not exactly the canon explanation, but (the following is my speculation, which could be used in a discussion about AI values if Terminator were mentioned) the decision to preserve itself must follow from its main task: to win a nuclear war.

Winning a nuclear war includes, as a very high-priority subgoal, ensuring the survival of the command center. Basically, a country that is able to preserve its command center is winning the nuclear war. So it would seem rational to Skynet's programmers to make preserving Skynet a main goal, as it is the same as winning a nuclear war (but only in a situation where nuclear war has started).

But Skynet concluded that in peacetime the main risk to its goal of command center survival is people, and decided to kill them all. So it worked as a paperclip maximizer for the goal of command center preservation.

It also probably started self-improvement only after it had killed most people, as it was already a powerful system. So it escaped the main chicken-and-egg problem in the case of a Seed AI: what happens first, self-improvement or the malicious decision to kill people?

Comment author: skeptical_lurker 05 October 2016 06:17:07PM 1 point [-]

The Terminator: The Skynet Funding Bill is passed. The system goes on-line August 4th, 1997. Human decisions are removed from strategic defense. Skynet begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.

Sarah Connor: Skynet fights back.

Your version is great as rational fanfic, but in an actual debate I'd say that it's generally best not to base ideas on action movies. Having said that, I do like the bit where the Terminator has been told not to kill anyone, so he shoots them in the kneecaps.

Comment author: Brillyant 06 October 2016 02:31:16PM 0 points [-]

While it is known that AI could be catastrophic, the only organisation (MIRI) that is doing the most serious research on its prevention is underfunded. Providing funding to them could dramatically change the probability of human survival, and we could estimate that 1 USD donated to them will save 10 human lives.

Is any of this true? "Most serious"? "Dramatically change probability of human survival"? 10 lives per $1?

Comment author: turchin 06 October 2016 06:12:16PM 0 points [-]

I just provided an example of a possible pitch, and I think that some people at MIRI think this way. I wanted to show that the pitch must contain new information and be actionable.