Engaging First Introductions to AI Risk
I'm putting together a list of short and sweet introductions to the dangers of artificial superintelligence.
My target audience is intelligent, broadly philosophical narrative thinkers, who can evaluate arguments well but who don't know a lot of the relevant background or jargon.
My method is to construct a Sequence mix tape — a collection of short and enlightening texts, meant to be read in a specified order. I've chosen them for their persuasive and pedagogical punchiness, and for their flow in the list. I'll also (separately) list somewhat longer or less essential follow-up texts below that are still meant to be accessible to astute visitors and laypeople.
The first half focuses on intelligence, answering 'What is Artificial General Intelligence (AGI)?'. The second half focuses on friendliness, answering 'How can we make AGI safe, and why does it matter?'. Since the topics of some posts aren't obvious from their titles, I've summarized them using questions they address.
Part I. Building intelligence.
1. Power of Intelligence. Why is intelligence important?
2. Ghosts in the Machine. Is building an intelligence from scratch like talking to a person?
3. Artificial Addition. What can we conclude about the nature of intelligence from the fact that we don't yet understand it?
4. Adaptation-Executers, not Fitness-Maximizers. How do human goals relate to the 'goals' of evolution?
5. The Blue-Minimizing Robot. What are the shortcomings of thinking of things as 'agents', 'intelligences', or 'optimizers' with defined values/goals/preferences?
Part II. Intelligence explosion.
6. Optimization and the Singularity. What is optimization? As optimization processes, how do evolution, humans, and self-modifying AGI differ?
7. Efficient Cross-Domain Optimization. What is intelligence?
8. The Design Space of Minds-In-General. What else is universally true of intelligences?
9. Plenty of Room Above Us. Why should we expect self-improving AGI to quickly become superintelligent?
Part III. AI risk.
10. The True Prisoner's Dilemma. What kind of jerk would Defect even knowing the other side Cooperated?
11. Basic AI drives. Why are AGIs dangerous even when they're indifferent to us?
12. Anthropomorphic Optimism. Why do we think things we hope happen are likelier?
13. The Hidden Complexity of Wishes. How hard is it to directly program an alien intelligence to enact my values?
14. Magical Categories. How hard is it to program an alien intelligence to reconstruct my values from observed patterns?
15. The AI Problem, with Solutions. How hard is it to give AGI predictable values of any sort? More generally, why does AGI risk matter so much?
Part IV. Ends.
16. Could Anything Be Right? What do we mean by 'good', or 'valuable', or 'moral'?
17. Morality as Fixed Computation. Is it enough to have an AGI improve the fit between my preferences and the world?
18. Serious Stories. What would a true utopia be like?
19. Value is Fragile. If we just sit back and let the universe do its thing, will it still produce value? If we don't take charge of our future, won't it still turn out interesting and beautiful on some deeper level?
20. The Gift We Give To Tomorrow. In explaining value, are we explaining it away? Are we making our goals less important?
Summary: Five theses, two lemmas, and a couple of strategic implications.
All of the above were written by Eliezer Yudkowsky, with the exception of The Blue-Minimizing Robot (by Yvain), Plenty of Room Above Us and The AI Problem (by Luke Muehlhauser), and Basic AI Drives (a wiki collaboration). Seeking a powerful conclusion, I ended up making a compromise between Eliezer's original The Gift We Give To Tomorrow and Raymond Arnold's Solstice Ritual Book version. It's on the wiki, so you can further improve it with edits.
Further reading:
- Three Worlds Collide (Normal), by Eliezer Yudkowsky
- a short story vividly illustrating how alien values can evolve.
- So You Want to Save the World, by Luke Muehlhauser
- an introduction to the open problems in Friendly Artificial Intelligence.
- Intelligence Explosion FAQ, by Luke Muehlhauser
- a broad overview of likely misconceptions about AI risk.
- The Singularity: A Philosophical Analysis, by David Chalmers
- a detailed but non-technical argument for expecting intelligence explosion, with an assessment of the moral significance of synthetic human and non-human intelligence.
I'm posting this to get more feedback for improving it, to isolate topics for which we don't yet have high-quality, non-technical stand-alone introductions, and to reintroduce LessWrongers to exceptionally useful posts I haven't seen sufficiently discussed, linked, or upvoted. I'd especially like feedback on how the list I provided flows as a unit, and what inferential gaps it fails to address. My goals are:
A. Via lucid and anti-anthropomorphic vignettes, to explain AGI in a way that encourages clear thought.
B. Via the Five Theses, to demonstrate the importance of Friendly AI research.
C. Via down-to-earth meta-ethics, humanistic poetry, and pragmatic strategizing, to combat any nihilisms, relativisms, and defeatisms that might be triggered by recognizing the possibility (or probability) of Unfriendly AI.
D. Via an accessible, substantive, entertaining presentation, to introduce the raison d'être of LessWrong to sophisticated newcomers in a way that encourages further engagement with LessWrong's community and/or content.
What do you think? What would you add, remove, or alter?
Intelligence explosion in organizations, or why I'm not worried about the singularity
If I understand the Singularitarian argument espoused by many members of this community (e.g., Muehlhauser and Salamon), it goes something like this:
- Machine intelligence is getting smarter.
- Once an intelligence becomes sufficiently supra-human, its instrumental rationality will drive it towards cognitive self-enhancement (Bostrom), thereby making it a super-powerful, resource-hungry superintelligence.
- If a superintelligence isn't sufficiently human-like or 'friendly', that could be disastrous for humanity.
- Machine intelligence is unlikely to be human-like or friendly unless we take precautions.
I'm in danger of getting into politics. Since I understand that political arguments are not welcome here, I will refer to these potentially unfriendly human intelligences broadly as organizations.
Smart organizations
By "organization" I mean something commonplace, with a twist. It's commonplace because I'm talking about a bunch of people coordinated somehow. The twist is that I want to include the information technology infrastructure used by that bunch of people within the extension of "organization".
Do organizations have intelligence? I think so. Here are some of the reasons why:
- We can model human organizations as having preference functions. (Economists do this all the time)
- Human organizations have a lot of optimization power.
I talked with Mr. Muehlhauser about this specifically. I gather that at least at the time he thought human organizations should not be counted as intelligences (or at least as intelligences with the potential to become superintelligences) because they are not as versatile as human beings.
So when I am talking about super-human intelligence, I specifically mean an agent that is as good or better at humans at just about every skill set that humans possess for achieving their goals. So that would include things like not just mathematical ability or theorem proving and playing chess, but also things like social manipulation and composing music and so on, which are all functions of the brain not the kidneys
...and then...
It would be a kind of weird [organization] that was better than the best human or even the median human at all the things that humans do. [Organizations] aren’t usually the best in music and AI research and theory proving and stock markets and composing novels. And so there certainly are [Organizations] that are better than median humans at certain things, like digging oil wells, but I don’t think there are [Organizations] as good or better than humans at all things. More to the point, there is an interesting difference here because [Organizations] are made of lots of humans and so they have the sorts of limitations on activities and intelligence that humans have. For example, they are not particularly rational in the sense defined by cognitive science. And the brains of the people that make up organizations are limited to the size of skulls, whereas you can have an AI that is the size of a warehouse.
I think that Muehlhauser is slightly mistaken on a few subtle but important points. I'm going to assert my position on them without much argument because I think they are fairly sensible, but if any reader disagrees I will try to defend them in the comments.
- When judging whether an entity has intelligence, we should consider only the skills relevant to the entity's goals.
- So, if organizations are not as good as a human being at composing music, that shouldn't disqualify them from being considered broadly intelligent if that has nothing to do with their goals.
- Many organizations are quite good at AI research, or outsource their AI research to other organizations with which they are intertwined.
- The cognitive power of an organization is not limited to the size of skulls. The computational power of many organizations comprises both the skulls of their members and possibly "warehouses" of digital computers.
- With the ubiquity of cloud computing, it's hard to say that a particular computational process has a static spatial bound at all.
Mean organizations
* My preferred standard of rationality is communicative rationality, a Habermasian ideal of a rationality aimed at consensus through principled communication. As a consequence, when I believe a position to be rational, I believe that it is possible and desirable to convince other rational agents of it.
A Primer On Risks From AI
The Power of Algorithms
Evolutionary processes are the most evident example of the power of simple algorithms [1][2][3][4][5].
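To make the claim concrete, here is a minimal sketch of an evolutionary search in Python. It is an illustration only, not taken from the cited references: the function name `evolve`, the alphabet, the target phrase, and the mutation scheme are all assumptions chosen for brevity. Unlike natural selection, it scores candidates against an explicit target, so it shows only how blind variation plus selective retention can find a solution that random guessing would practically never hit.

```python
import random

def evolve(target, seed=0, max_generations=100_000):
    """Toy (1+1) evolutionary search toward a target string.

    Each generation mutates one random character of the parent and keeps
    the child only if its fitness is at least as high as the parent's.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    alphabet = "abcdefghijklmnopqrstuvwxyz "  # hypothetical alphabet

    def fitness(candidate):
        # Number of positions that already match the target.
        return sum(a == b for a, b in zip(candidate, target))

    # Start from a completely random string of the right length.
    parent = "".join(rng.choice(alphabet) for _ in target)
    for generation in range(max_generations):
        if parent == target:
            return parent, generation
        # Blind variation: change one position to a random character.
        i = rng.randrange(len(target))
        child = parent[:i] + rng.choice(alphabet) + parent[i + 1:]
        # Selection: keep the child unless it is strictly worse.
        if fitness(child) >= fitness(parent):
            parent = child
    return parent, max_generations

if __name__ == "__main__":
    best, generations = evolve("blind variation and selection")
    print(f"reached {best!r} after {generations} generations")
```

A uniformly random guess matches this 29-character target with probability 27^-29, whereas the loop above typically needs only a few thousand mutations, because selection preserves every partial match it stumbles on.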
The field of evolutionary biology gathered a vast amount of evidence [6] that established evolution as the process that explains the local decrease in entropy [7], the complexity of life.
Since it can be conclusively shown that all life is an effect of an evolutionary process, it is implied that everything we do not understand about living beings is also an effect of evolution.
We might not understand the nature of intelligence [8] and consciousness [9] but we do know that they are the result of an optimization process that is neither intelligent nor conscious.
Therefore we know that it is possible for a physical optimization process to culminate in the creation of more advanced processes that feature superior qualities.
One of these qualities is the human ability to observe and improve the optimization process that created us. The most obvious example is science [10].
Science can be thought of as a civilization-level self-improvement method. It allows us to work together in a systematic and efficient way and accelerate the rate at which further improvements are made.
The Automation of Science
We know that optimization processes that can create improved versions of themselves are possible, even without an explicit understanding of their own workings, as exemplified by natural selection.
We know that optimization processes can lead to self-reinforcing improvements, as exemplified by the adoption of the scientific method [11] as an improved evolutionary process and successor of natural selection.
This raises questions about the continuation of this self-reinforcing feedback cycle and its possible implications.
One possibility is to automate science [12][13] and apply it to itself and its improvement.
But science is a tool, and its bottleneck is its users: humans, the biased [14] product of the blind idiot god that is evolution.
Therefore the next logical step is to use science to figure out how to replace humans with a better version of themselves: artificial general intelligence.
Artificial general intelligence that can recursively optimize itself [15] is the logical endpoint of various converging and self-reinforcing feedback cycles.
Risks from AI
Will we be able to build an artificial general intelligence? Yes, sooner or later.
Even the unintelligent, unconscious and aimless process of natural selection was capable of creating goal-oriented, intelligent and conscious agents that can think ahead, jump fitness gaps and improve upon the process that created them to engage in prediction and direct experimentation.
The question is, what are the possible implications of the invention of an artificial, fully autonomous, intelligent and goal-oriented optimization process?
One good bet is that such an agent will recursively improve its most versatile, and therefore most instrumentally useful, resource: its general intelligence, or in other words its cross-domain optimization power.
Since it is unlikely that human intelligence is the optimum, the positive feedback that results from using intelligence amplification to amplify intelligence is likely to lead to a level of intelligence that is generally more capable than the human level.
Humans are unlikely to be the most efficient thinkers because evolution is mindless and has no goals. Evolution did not actively try to create the smartest thing possible.
Furthermore, evolution is not limitlessly creative: each step of an evolutionary design must increase the fitness of its host. This makes it probable that there are artificial mind designs that can do what no product of natural selection could accomplish, since an intelligent artificer does not rely on the incremental fitness of each step in the development process.
It is actually possible that human general intelligence is the bare minimum, because the human level of intelligence might have been sufficient to both survive and reproduce, so that no further evolutionary pressure existed to select for even higher levels of general intelligence.
One implication of this possibility is that we might create an intelligent agent that is more capable than humans in every sense. Maybe it will directly employ superior approximations of our best formal methods, which tell us how to update based on evidence and how to choose between various actions. Or maybe it will simply think faster. It doesn’t matter.
What matters is that a superior intellect is probable and that it will be better than us at discovering knowledge and inventing new technology, technology that will make it even more powerful and likely invincible.
And that is the problem. We might be unable to control such a superior being, just as a group of chimpanzees is unable to stop a company from clearing its forest [16].
But even if such a being is only slightly more capable than us, we might find ourselves at its mercy nonetheless.
Human history provides us with many examples [17][18][19] that make it abundantly clear that even the slightest advance can enable one group to dominate others.
What happens is that the dominant group imposes its values on the others. This in turn raises the question of what values an artificial general intelligence might have and what those values imply for us.
Due to our evolutionary origins, our struggle for survival and the necessity to cooperate with other agents, we are equipped with many values and a concern for the welfare of others [20].
The information-theoretic complexity [21][22] of our values is very high. This means that it is highly unlikely for similar values to arise automatically in agents that are the product of intelligent design, agents that never underwent the millions of years of competition with other agents that equipped humans with altruism and general compassion.
But that does not mean that an artificial intelligence won’t have any goals [23][24], just that those goals will be simple and their realization remorseless [25].
An artificial general intelligence will do whatever is implied by its initial design, and we will be helpless to stop it from achieving its goals, goals that won’t automatically respect our values [26].
A likely implication is the total extinction of all of humanity [27].
Further Reading
- What should a reasonable person believe about the Singularity?
- The Singularity: A Philosophical Analysis
- Intelligence Explosion: Evidence and Import
- Why an Intelligence Explosion is Probable
- Artificial Intelligence as a Positive and Negative Factor in Global Risk
- From mostly harmless to civilization-threatening: pathways to dangerous artificial general intelligences
- The Hanson-Yudkowsky AI-Foom Debate
- Facing The Singularity
References
[1] Genetic Algorithms and Evolutionary Computation, talkorigins.org/faqs/genalg/genalg.html
[2] Fixing software bugs in 10 minutes or less using evolutionary computation, genetic-programming.org/hc2009/1-Forrest/Forrest-Presentation.pdf
[3] Automatically Finding Patches Using Genetic Programming, genetic-programming.org/hc2009/1-Forrest/Forrest-Paper-on-Patches.pdf
[4] A Genetic Programming Approach to Automated Software Repair, genetic-programming.org/hc2009/1-Forrest/Forrest-Paper-on-Repair.pdf
[5] GenProg: A Generic Method for Automatic Software Repair, virginia.edu/~weimer/p/weimer-tse2012-genprog.pdf
[6] 29+ Evidences for Macroevolution (The Scientific Case for Common Descent), talkorigins.org/faqs/comdesc/
[7] Thermodynamics, Evolution and Creationism, talkorigins.org/faqs/thermo.html
[8] A Collection of Definitions of Intelligence, vetta.org/documents/A-Collection-of-Definitions-of-Intelligence.pdf
[9] plato.stanford.edu/entries/consciousness/
[10] en.wikipedia.org/wiki/Science
[11] en.wikipedia.org/wiki/Scientific_method
[12] The Automation of Science, sciencemag.org/content/324/5923/85.abstract
[13] Computer Program Self-Discovers Laws of Physics, wired.com/wiredscience/2009/04/newtonai/
[14] List of cognitive biases, en.wikipedia.org/wiki/List_of_cognitive_biases
[15] Intelligence explosion, wiki.lesswrong.com/wiki/Intelligence_explosion
[16] 1% with Neil deGrasse Tyson, youtu.be/9nR9XEqrCvw
[17] Mongol military tactics and organization, en.wikipedia.org/wiki/Mongol_military_tactics_and_organization
[18] Wars of Alexander the Great, en.wikipedia.org/wiki/Wars_of_Alexander_the_Great
[19] Spanish colonization of the Americas, en.wikipedia.org/wiki/Spanish_colonization_of_the_Americas
[20] A Quantitative Test of Hamilton's Rule for the Evolution of Altruism, plosbiology.org/article/info:doi/10.1371/journal.pbio.1000615
[21] Algorithmic information theory, scholarpedia.org/article/Algorithmic_information_theory
[22] Algorithmic probability, scholarpedia.org/article/Algorithmic_probability
[23] The Nature of Self-Improving Artificial Intelligence, selfawaresystems.files.wordpress.com/2008/01/nature_of_self_improving_ai.pdf
[24] The Basic AI Drives, selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf
[25] Paperclip maximizer, wiki.lesswrong.com/wiki/Paperclip_maximizer
[26] Friendly artificial intelligence, wiki.lesswrong.com/wiki/Friendly_artificial_intelligence
[27] Existential Risk, existential-risk.org
LINK: Can intelligence explode?
I thought many of you would be interested to know that the following paper just appeared in Journal of Consciousness Studies:
"Can Intelligence Explode?", by Marcus Hutter. (LINK HERE)
Abstract: The technological singularity refers to a hypothetical scenario in which technological advances virtually explode. The most popular scenario is the creation of super-intelligent algorithms that recursively create ever higher intelligences. It took many decades for these ideas to spread from science fiction to popular science magazines and finally to attract the attention of serious philosophers. David Chalmers' (JCS 2010) article is the first comprehensive philosophical analysis of the singularity in a respected philosophy journal. The motivation of my article is to augment Chalmers' and to discuss some issues not addressed by him, in particular what it could mean for intelligence to explode. In this course, I will (have to) provide a more careful treatment of what intelligence actually is, separate speed from intelligence explosion, compare what super-intelligent participants and classical human observers might experience and do, discuss immediate implications for the diversity and value of life, consider possible bounds on intelligence, and contemplate intelligences right at the singularity.
I have only just seen the paper and have not yet read through it myself, but I thought we could use this thread for discussion.