Engaging First Introductions to AI Risk

Rob Bensinger

31 Engaging First Introductions to AI Risk

19th Aug 2013

3 min read

31

I'm putting together a list of short and sweet introductions to the dangers of artificial superintelligence.

My target audience is intelligent, broadly philosophical narrative thinkers, who can evaluate arguments well but who don't know a lot of the relevant background or jargon.

My method is to construct a Sequence mix tape — a collection of short and enlightening texts, meant to be read in a specified order. I've chosen them for their persuasive and pedagogical punchiness, and for their flow in the list. I'll also (separately) list somewhat longer or less essential follow-up texts below that are still meant to be accessible to astute visitors and laypeople.

The first half focuses on intelligence, answering 'What is Artificial General Intelligence (AGI)?'. The second half focuses on friendliness, answering 'How can we make AGI safe, and why does it matter?'. Since the topics of some posts aren't obvious from their titles, I've summarized them using questions they address.

Part I. Building intelligence.

1. Power of Intelligence. Why is intelligence important?

2. Ghosts in the Machine. Is building an intelligence from scratch like talking to a person?

3. Artificial Addition. What can we conclude about the nature of intelligence from the fact that we don't yet understand it?

4. Adaptation-Executers, not Fitness-Maximizers. How do human goals relate to the 'goals' of evolution?

5. The Blue-Minimizing Robot. What are the shortcomings of thinking of things as 'agents', 'intelligences', or 'optimizers' with defined values/goals/preferences?

Part II. Intelligence explosion.

6. Optimization and the Singularity. What is optimization? As optimization processes, how do evolution, humans, and self-modifying AGI differ?

7. Efficient Cross-Domain Optimization. What is intelligence?

8. The Design Space of Minds-In-General. What else is universally true of intelligences?

9. Plenty of Room Above Us. Why should we expect self-improving AGI to quickly become superintelligent?

Part III. AI risk.

10. The True Prisoner's Dilemma. What kind of jerk would Defect even knowing the other side Cooperated?

11. Basic AI drives. Why are AGIs dangerous even when they're indifferent to us?

12. Anthropomorphic Optimism. Why do we think things we hope happen are likelier?

13. The Hidden Complexity of Wishes. How hard is it to directly program an alien intelligence to enact my values?

14. Magical Categories. How hard is it to program an alien intelligence to reconstruct my values from observed patterns?

15. The AI Problem, with Solutions. How hard is it to give AGI predictable values of any sort? More generally, why does AGI risk matter so much?

Part IV. Ends.

16. Could Anything Be Right? What do we mean by 'good', or 'valuable', or 'moral'?

17. Morality as Fixed Computation. Is it enough to have an AGI improve the fit between my preferences and the world?

18. Serious Stories. What would a true utopia be like?

19. Value is Fragile. If we just sit back and let the universe do its thing, will it still produce value? If we don't take charge of our future, won't it still turn out interesting and beautiful on some deeper level?

20. The Gift We Give To Tomorrow. In explaining value, are we explaining it away? Are we making our goals less important?

Summary: Five theses, two lemmas, and a couple of strategic implications.

All of the above were written by Eliezer Yudkowsky, with the exception of The Blue-Minimizing Robot (by Yvain), Plenty of Room Above Us and The AI Problem (by Luke Muehlhauser), and Basic AI Drives (a wiki collaboration). Seeking a powerful conclusion, I ended up making a compromise between Eliezer's original The Gift We Give To Tomorrow and Raymond Arnold's Solstice Ritual Book version. It's on the wiki, so you can further improve it with edits.

Further reading:

Three Worlds Collide (Normal), by Eliezer Yudkowsky
- a short story vividly illustrating how alien values can evolve.
So You Want to Save the World, by Luke Muehlhauser
- an introduction to the open problems in Friendly Artificial Intelligence.
Intelligence Explosion FAQ, by Luke Muehlhauser
- a broad overview of likely misconceptions about AI risk.
The Singularity: A Philosophical Analysis, by David Chalmers
- a detailed but non-technical argument for expecting intelligence explosion, with an assessment of the moral significance of synthetic human and non-human intelligence.

I'm posting this to get more feedback for improving it, to isolate topics for which we don't yet have high-quality, non-technical stand-alone introductions, and to reintroduce LessWrongers to exceptionally useful posts I haven't seen sufficiently discussed, linked, or upvoted. I'd especially like feedback on how the list I provided flows as a unit, and what inferential gaps it fails to address. My goals are:

A. Via lucid and anti-anthropomorphic vignettes, to explain AGI in a way that encourages clear thought.

B. Via the Five Theses, to demonstrate the importance of Friendly AI research.

C. Via down-to-earth meta-ethics, humanistic poetry, and pragmatic strategizing, to combat any nihilisms, relativisms, and defeatisms that might be triggered by recognizing the possibility (or probability) of Unfriendly AI.

D. Via an accessible, substantive, entertaining presentation, to introduce the raison d'être of LessWrong to sophisticated newcomers in a way that encourages further engagement with LessWrong's community and/or content.

What do you think? What would you add, remove, or alter?

AI RiskCommunity Outreach

Personal Blog

31

New Comment

Rendering 0/21 comments, sorted by

top scoring

(show more) Click to highlight new comments since: Today at 5:21 AM

Moderation Log

31 Engaging First Introductions to AI Risk

by Rob Bensinger

19th Aug 2013

3 min read

31

I'm putting together a list of short and sweet introductions to the dangers of artificial superintelligence.

My target audience is intelligent, broadly philosophical narrative thinkers, who can evaluate arguments well but who don't know a lot of the relevant background or jargon.

Part I. Building intelligence.

1. Power of Intelligence. Why is intelligence important?

2. Ghosts in the Machine. Is building an intelligence from scratch like talking to a person?

3. Artificial Addition. What can we conclude about the nature of intelligence from the fact that we don't yet understand it?

4. Adaptation-Executers, not Fitness-Maximizers. How do human goals relate to the 'goals' of evolution?

5. The Blue-Minimizing Robot. What are the shortcomings of thinking of things as 'agents', 'intelligences', or 'optimizers' with defined values/goals/preferences?

Part II. Intelligence explosion.

6. Optimization and the Singularity. What is optimization? As optimization processes, how do evolution, humans, and self-modifying AGI differ?

7. Efficient Cross-Domain Optimization. What is intelligence?

8. The Design Space of Minds-In-General. What else is universally true of intelligences?

9. Plenty of Room Above Us. Why should we expect self-improving AGI to quickly become superintelligent?

Part III. AI risk.

10. The True Prisoner's Dilemma. What kind of jerk would Defect even knowing the other side Cooperated?

11. Basic AI drives. Why are AGIs dangerous even when they're indifferent to us?

12. Anthropomorphic Optimism. Why do we think things we hope happen are likelier?

13. The Hidden Complexity of Wishes. How hard is it to directly program an alien intelligence to enact my values?

14. Magical Categories. How hard is it to program an alien intelligence to reconstruct my values from observed patterns?

15. The AI Problem, with Solutions. How hard is it to give AGI predictable values of any sort? More generally, why does AGI risk matter so much?

Part IV. Ends.

16. Could Anything Be Right? What do we mean by 'good', or 'valuable', or 'moral'?

17. Morality as Fixed Computation. Is it enough to have an AGI improve the fit between my preferences and the world?

18. Serious Stories. What would a true utopia be like?

20. The Gift We Give To Tomorrow. In explaining value, are we explaining it away? Are we making our goals less important?

Summary: Five theses, two lemmas, and a couple of strategic implications.

Further reading:

Three Worlds Collide (Normal), by Eliezer Yudkowsky
- a short story vividly illustrating how alien values can evolve.
So You Want to Save the World, by Luke Muehlhauser
- an introduction to the open problems in Friendly Artificial Intelligence.
Intelligence Explosion FAQ, by Luke Muehlhauser
- a broad overview of likely misconceptions about AI risk.
The Singularity: A Philosophical Analysis, by David Chalmers
- a detailed but non-technical argument for expecting intelligence explosion, with an assessment of the moral significance of synthetic human and non-human intelligence.

A. Via lucid and anti-anthropomorphic vignettes, to explain AGI in a way that encourages clear thought.

B. Via the Five Theses, to demonstrate the importance of Friendly AI research.

What do you think? What would you add, remove, or alter?

AI RiskCommunity Outreach

Personal Blog

31

Mentioned in

56Reality is weirdly normal

New Comment

Rendering 0/21 comments, sorted by

top scoring

(show more) Click to highlight new comments since: Today at 5:21 AM

Moderation Log

More from Rob Bensinger

Curated and popular this week

21Comments

Comment Permalink

Moss_Piglet13y00

To be perfectly honest, mentioning Artificial Intelligence at all might be the wrong way to start a discussion about the risks of superintelligence. The vast majority of us (myself included) have only very basic ideas of how even modern computers work much less the scientific / mathematical background to really engage with AI theory, not to mention that we're so inundated with inaccurate ideas about AI from fiction that it would probably be easier just to dodge the misconceptions entirely. An additional concern is that "serious people" who might otherwise be capable of understanding the issue won't want to be associated with a seemingly fantastical and/or nerdy discussion and will tune you out just on that basis.

Short of just starting with That Alien Message, either because Mr Yudkowski's prose style is a little dense(1) at times or because you want to put your own mark on it, I would suggest something along the same lines of replacing AI with a biological superintelligence. Throwing in applause lights like having the "wise fools" who unleash the smartpocalypse being a or the AI-equivalent being a would probably make it more palatable to targeted audiences, but might be too far on the dark side for your tastes.

I've actually come up with a cool short story idea based on the concept, essentially genetically modified Humboldt squid with satellite internet connections killing off humanity by being a bit overprotective of depleted fish stocks, which I might post as a reply to this post after some polishing.

Not intended as a criticism at all, but it is certainly above a high school reading level and unfortunately that is an issue when we're trying to talk to a wide audience.

Rob Bensinger13y-20

To be perfectly honest, mentioning Artificial Intelligence at all might be the wrong way to start a discussion about the risks of superintelligence.

I think that's giving up too much ground. Talking about AI risk while trying to avoid mentioning anything synthetic or artificial or robotic is like talking about asteroid risk while trying to avoid mentioning outer space.

The vast majority of us (myself included) have only very basic ideas of how even modern computers work much less the scientific / mathematical background to really engage with AI theory

... (read more)

See in context