Part of the series AI Risk and Opportunity: A Strategic Analysis.
(You can leave anonymous feedback on posts in this series here. I alone will read the comments, and may use them to improve past and forthcoming posts in this series.)
Building on the previous post on AI risk history, this post provides an incomplete timeline (up to 1993) of significant novel ideas and arguments related to AI as a potential catastrophic risk. I do not include ideas and arguments concerning only, for example, the possibility of AI (Turing 1950) or attempts to predict its arrival (Bostrom 1998).
As is usually the case, we find that when we look closely at a cluster of ideas, it turns out these ideas did not appear all at once in the minds of a Few Great Men. Instead, they grew and mutated and gave birth to new ideas gradually as they passed from mind to mind over the course of many decades.
1863: Machine intelligence as an existential risk to humanity; relinquishment of machine technology recommended. Samuel Butler in Darwin among the machines worries that as we build increasingly sophisticated and autonomous machines, they will achieve greater capability than humans and replace humans as the dominant agents on the planet:
...we are ourselves creating our own successors; we are daily adding to the beauty and delicacy of their physical organisation; we are daily giving them greater power and supplying by all sorts of ingenious contrivances that self-regulating, self-acting power which will be to them what intellect has been to the human race. In the course of ages we shall find ourselves the inferior race... the time will come when the machines will hold the real supremacy over the world and its inhabitants...
Our opinion is that war to the death should be instantly proclaimed against them. Every machine of every sort should be destroyed by the well-wisher of his species. Let there be no exceptions made, no quarter shown...
(See also Butler 1872; Campbell 1932.)
1921: Robots as an existential risk. The Czech play R.U.R. by Karel Capek tells the story of robots which grow in power and intelligence and destroy the entire human race (except for a single survivor).
1947: Fragility & complexity of human values (in the context of machine goal systems); perverse instantiation. Jack Williamson's novelette With Folded Hands (1947) tells the story of a race of machines bound by the Prime Directive: "to serve and obey and guard men from harm." To obey this rule, the machines interfere with every aspect of human life, and humans who resist are lobotomized. Due to the fragility and complexity of human values (Yudkowsky 2008; Muehlhauser and Helm 2012), the machines' rules of behavior had unintended consequences, manifesting a "perverse instantiation" in the language of Bostrom (forthcoming).
(Also see Asimov 1950, 1957, 1983; Versenyi 1974; Minsky 1984; Yudkowsky 2001, 2011.)
1948-1949: Precursor idea to intelligence explosion. Von Neumann (1948) wrote:
..."complication" on its lower levels is probably degenerative, that is, that every automaton that can produce other automata will only be able to produce less complicated ones. There is, however, a certain minimum level where this degenerative characteristic ceases to be universal. At this point automata which can reproduce themselves, or even construct higher entities, become possible.
Von Neumann (1949) came very close to articulating the idea of intelligence explosion:
There is thus this completely decisive property of complexity, that there exists a critical size below which the process of synthesis is degenerative, but above which the phenomenon of synthesis, if properly arranged, can become explosive, in other words, where syntheses of automata can proceed in such a manner that each automaton will produce other automata which are more complex and of higher potentialities than itself.
1951: Potentially rapid transition from machine intelligence to machine takeover. Turing (1951) described ways that intelligent computers might learn and improve their capabilities, concluding that:
...it seems probable that once the machine thinking method has started, it would not take long to outstrip our feeble powers... At some stage therefore we should have to expect the machines to take control...
1959: Intelligence explosion; the need for human-friendly goals for machine superintelligence. Good (1959) describes what he later (1965) called an "intelligence explosion," a particular mechanism for rapid transition from artificial general intelligence to dangerous machine takeover:
Once a machine is designed that is good enough… it can be put to work designing an even better machine. At this point an "explosion" will clearly occur; all the problems of science and technology will be handed over to machines and it will no longer be necessary for people to work. Whether this will lead to a Utopia or to the extermination of the human race will depend on how the problem is handled by the machines. The important thing will be to give them the aim of serving human beings.
(Also see Good 1962, 1965, 1970; Vinge 1992, 1993; Yudkowsky 2008.)
1966: A military arms race for machine superintelligence could accelerate machine takeover; convergence toward a singleton is likely. Dennis Feltham Jones' 1966 novel Colossus depicted what may be a particularly likely scenario: two world superpowers (the USA and USSR) are in an arms race to develop superintelligent computers, one of which self-improves enough to take control of the planet.
In the same year, Cade (1966) argued the same thing:
political leaders on Earth will slowly come to realize... that intelligent machines having superhuman thinking ability can be built. The construction of such machines, even taking into account all the latest developments in computer technology, would call for a major national effort. It is only to be expected that any nation which did put forth the financial and physical effort needed to build and programme such a machine, would also attempt to utilize it to its maximum capacity, which implies that it would be used to make major decisions of national policy. Here is where the awful dilemma arises. Any restriction to the range of data supplied to the machine would limit its ability to make effective political and economic decisions, yet if no such restrictions are placed upon the machine's command of information, then the entire control of the nation would virtually be surrendered to the judgment of the robot.
On the other hand, any major nation which was led by a superior, unemotional intelligence of any kind, would quickly rise to a position of world domination. This by itself is sufficient to guarantee that, sooner or later, the effort to build such an intelligence will be made — if not in the Western world, then elsewhere, where people are more accustomed to iron dictatorships.
...It seems that, in the forseeable future, the major nations of the world will have to face the alternative of surrendering national control to mechanical ministers, or being dominated by other nations which have already done this. Such a process will eventually lead to the domination of the whole Earth by a dictatorship of an unparalleled type — a single supreme central authority.
(This last paragraph also argues for convergence toward what Bostrom later called a "singleton.")
(Also see Ellison 1967.)
1970: Proposal for an association that analyzes the implications of machine superintelligence; naive control solutions like "switch off the power" may not work because the superintelligence will outsmart us, thus we must focus on its motivations; possibility of "pointless" optimization by machine superintelligence. Good (1970) argues:
Even if the chance that the ultraintelligent machine will be available [soon] is small, the repercussions would be so enormous, good or bad, that it is not too early to entertain the possibility. In any case by 1980 I hope that the implications and the safeguards will have been thoroughly discussed, and this is my main reason for airing the matter: an association for considering it should be started.
(Also see Bostrom 1997.)
On the idea that naive control solutions like "switch off the power" may not work because the superintelligence will find a way to outsmart us, and thus we must focus our efforts on the superintelligence's motivations, Good writes:
Some people have suggested that in order to prevent the [ultraintelligent machine] from taking over we should be ready to switch off its power supply. But it is not as simple as that because the machine could recommend the appointment of its own operators, it could recommend that they be paid well and it could select older men who would not be worried about losing their jobs. Then it could replace its operators by robots in order to make sure that it is not switched off. Next it could have the neo-Luddites ridiculed by calling them Ludditeniks, and if necessary it would later have them imprisoned or executed. This shows how careful we must be to keep our eye on the "motivation" of the machines, if possible, just as we should with politicians.
(Also see Yudkowsky 2008.)
Good also outlines one possibility for "pointless" goal-optimization by machine superintelligence:
If the machines took over and men became redundant and ultimately extinct, the society of machines would continue in a complex and interesting manner, but it would all apparently be pointless because there would be no one there to be interested. If machines cannot be conscious there would be only a zombie world. This would perhaps not be as bad as in many human societies where most people have lived in misery and degradation while a few have lived in pomp and luxury. It seems to me that the utility of such societies has been negative (while in the condition described) whereas the utility of a zombie society would be zero and hence preferable.
(Also see Bostrom 2004; Yudkowsky 2008.)
1974: We can't much predict what will happen after the creation of machine superintelligence. Julius Lukasiewicz (1974) writes:
The survival of man may depend on the early construction of an ultraintelligent machine-or the ultraintelligent machine may take over and render the human race redundant or develop another form of life. The prospect that a merely intelligent man could ever attempt to predict the impact of an ultraintelligent device is of course unlikely but the temptation to speculate seems irresistible.
(Also see Vinge 1993.)
1977: Self-improving AI could stealthily take over the internet; convergent instrumental goals in AI; the treacherous turn. Though the concept of a self-propagating computer worm was introduced by John Brunner's The Shockwave Rider (1975), Thomas J. Ryan's novel The Adolescence of P-1 (1977) tells the story of an intelligent worm that at first is merely able to learn to hack novel computer systems and use them to propagate itself, but later (1) has novel insights on how to improve its own intelligence, (2) develops convergent instrumental subgoals (see Bostrom 2012) for self-preservation and resource acquisition, and (3) learns the ability to fake its own death so that it can grow its powers in secret and later engage in a "treacherous turn" (see Bostrom forthcoming) against humans.
1982: To design ethical machine superintelligence, we may need to design superintelligence first and then ask it to solve philosophical problems (including ethics). Good (1982) writes:
Unfortunately, after 2500 years, the philosophical problems are nowhere near solution. Do we need to solve these philosophical problems before we can design an adequate ethical machine, or is there another approach? One approach that cannot be ruled out is first to produce an ultra-intelligent machine and then ask it to solve philosophical problems.
1988: Even though AI poses an existential threat, we may need to rush toward it so we can use it to mitigate other existential threats. Moravec (1988, pp. 100-101) writes:
...intelligent machines... threaten our existence... Machines merely as clever as human beings will have enormous advantages in competitive situations... So why rush headlong into an era of intelligent machines? The answer, I believe, is that we have very little choice, if our culture is to remain viable... The universe is one random event after another. Sooner or later an unstoppable virus deadly to humans will evolve, or a major asteroid will collide with the earth, or the sun will expand, or we will be invaded from the stars, or a black hole will swallow the galaxy. The bigger, more diverse, and competent a culture is, the better it can detect and deal with external dangers. The larger events happen less frequently. By growing rapidly enough, a culture has a finite chance of surviving forever.
1993: Physical confinement is unlikely to constrain superintelligences, for superintelligences will outsmart us. Vinge (1993) writes:
I argue that confinement [of superintelligent machines] is intrinsically impractical. For the case of physical confinement: Imagine yourself confined to your house with only limited data access to the outside, to your masters. If those masters thought at a rate — say — one million times slower than you, there is little doubt that over a period of years (your time) you could come up with "helpful advice" that would incidentally set you free...
After 1993. The extropians mailing list was launched in 1991, and was home to hundreds of discussions in which many important new ideas were proposed — ideas later developed in the public writings of Bostrom, Yudkowsky, Goertzel, and others. Unfortunately, the discussions from before 1998 were private, by agreement among subscribers. The early years of the archive cannot be made public without getting permission from everyone involved — a nearly impossible task. I have, however, collected all posts I could find from 1998 onward and uploaded them here (link fixed 04-03-2012).
I will end this post here. Perhaps in a future post I will extend the timeline past 1993, when interest in the subject became greater and thus the number of new ideas generated per decade rapidly increased.
References
- Asimov (1950). The Evitable Conflict.
- Asimov (1957). The Naked Sun.
- Asimov (1983). The Robots of Dawn.
- Bostrom (1997). Predictions from Philosophy? How philosophers could make themselves useful.
- Bostrom (1998). How Long before Superintelligence?
- Bostrom (2004). The Future of Human Evolution.
- Bostrom (2012). The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents.
- Bostrom (forthcoming). Superintelligence.
- Brunner (1975). The Shockwave Rider.
- Butler (1863). Darwin among the machines.
- Butler (1872). Erewhon.
- Campbell (1932). The Last Evolution.
- Capek (1921). R.U.R.
- Ellison (1967). I Have No Mouth, and I Must Scream.
- Good (1959). Speculations on perceptrons and other automata.
- Good (1962). The social implications of artificial intelligence.
- Good (1965). Speculations Concerning the First Ultraintelligent Machine.
- Good (1970). Some future social repercussions of computers.
- Jones (1966). Colossus.
- Lukasiewicz (1974). The Ignorance Explosion.
- Minsky (1984). Afterword to Vinge's 'True Names'.
- Moravec (1988). Mind Children: The Future of Robot and Human Intelligence.
- Muehlhauser & Helm (2012). The Singularity and Machine Ethics.
- Ryan (1977). The Adolescence of P-1.
- Turing (1950). Computing Machinery and Intelligence.
- Turing (1951). Intelligent machinery, a heretical theory.
- Versenyi (1974). Can robots be moral?
- Vinge (1992). A Fire Upon The Deep.
- Vinge (1993). The Coming Technological Singularity.
- Von Neumann (1948). The general and logical theory of automata.
- Von Neumann (1949). Theory and Organization of Complicated Automata. (Five lectures delivered at the University of Illinois in December, 1949. Reprinted in Papers of John Von Neumann on Computers and Computing Theory.)
- Williamson (1947). With Folded Hands.
- Yudkowsky (2001). Creating Friendly AI.
- Yudkowsky (2008). Artificial Intelligence as a Positive and Negative Factor in Global Risk.
- Yudkowsky (2011). Complex value systems are required to realize valuable futures.
Since you are including works of fiction, I think Terminator (1984) is worth mentioning. This is what most people think of when it comes to AI risk.
By the way, my personal favorite, when it comes to AI doing what it wasn't intended to, would have to be Eagle Eye (2008). It's got everything: hard take-off and wireheading of sorts, second-guessing humans, decent acting.
Which new important ideas were contributed by Terminator or Eagle Eye that were not previously contributed?