Re: "to illustrate all the ways in which such simple rules for governing robot behavior could go wrong."
I'd suggest "to illustrate all the ways in which such well-meaning and seemingly comprehensive rules for governing robot behavior could go wrong."
Re: "noted that machines will one day be capable of genuine thought"
I'd suggest "noted that there is no good reason to believe machines will not one day be capable of everything currently only achievable with human intelligence."
Re: "Perhaps our role on this planet is not to worship God but to create Him."
Ewwww.
Re: "a field of inquiry variously known as ..."
I think the citations here clutter the text and should be footnotes.
Re: "Leading philosopher"
I'd suggest "Prominent philosopher".
Re: "Friendly AI researchers do not regularly cite the machine ethics literature"
...because (fill in the blank)
...because Friendly AI researchers seek to make ethics a dynamic part of the AI system, whereas the machine ethics literature mostly concerns attaching a separate algorithmic module to a functioning, non-Friendly AI.
Maybe treat The Last Evolution more like you treated Runaround. Just condense the relevant points.
Arthur C. Clarke is better known as an author, so I'd prefer to see him listed as "futurist and author." The last sentence of Clarke's quote is just going to feed the dreaded fourth definition of the singularity, and should probably be dropped.
The Vinge quote seems unnecessary, since you've quoted Lukasiewicz with a much more directly relevant quote about unpredictability.
I then want to see a little more logical structure, more than just saying "FAI is AI that has a positive impact." Maybe frame FAI in response to Lukasiewicz's quote, in terms of being rigorously able to predict that some AI will have a positive impact.
Was FAI or machine ethics mentioned in Chalmers' paper? Will these topics be discussed in the follow-up issue? If so, say so; if not, say less, or say why this is still important for the Friendly AI concept.
The last paragraph then suddenly jumps. Maybe start with a "despite their parallel yada yada." Does the machine ethics literature cite the Friendly AI literature?
Because CEV predates the stuff you were talking about just above, I'd rather see a short mention of it at the end of the "Eliezer Yudkowsky paragraph." Maybe just call it (Yudkowsky 2004); the important part isn't the idea of CEV, it's that one of the prongs of FAI is goal systems that can be predicted to have a positive impact.
Stanislaw Lem anticipates Friendly AI in one of his tales about star explorer Ijon Tichy, in The Star Diaries, Voyage 24 (this particular story was originally published in 1953). The citizens of the planet Tichy visits in this voyage have decided to entrust their fate to a machine of their own creation, more intelligent than they are. They try to safeguard its conduct with axioms, but do not get Friendly AI right:
"...Great danger threatens our state, for rebellious, criminal ideas are arising among the masses of Drudgelings. They strive to abolish our splendid freedoms and the law of Civic Initiative! We must make every effort to defend our liberty. After careful consideration of the whole problem, we have reached the conclusion that we are unequal to the task. Even the most virtuous, capable, and model Phool can be swayed by feelings, and is often vacillating, biased, and fallible, and thus unfit to reach a decision in so complicated and important a matter. Therefore, within six months you are to build us a purely rational, strictly logical, and completely objective Governing Machine that does not know the hesitation, emotion, and fear that befuddle living minds. Let this machine be as impartial as the light of the Sun and stars. When you have built and activated it, we shall hand over to it the burden of power, which grows too heavy for our weary shoulders."
" 'So be it,' said the constructor, 'but what is to be the machine's basic motivation?'
" 'Obviously, the freedom of Civic Initiative. The machine must not command or forbid the citizens anything; it may, of course, change the conditions of our existence, but it must do so always in the form of a proposal, leaving us alternatives between which we can freely choose.' "
'So be it,' replied the constructor, 'but this injunction concerns mainly the mode of operation. What of the ultimate goal? What is this machine's purpose?' "
'Our state is threatened by chaos; disorder and disregard for the law are spreading. Let the Machine bring supreme harmony to the planet, let it institute, consolidate, and establish perfect and absolute order.'
The machine so created ends up turning all the citizens into pleasant geometrical figures - triangles, rectangles and so forth - and arranging them in an aesthetically pleasing manner on lawns throughout the land.
Of course, there are many places where Lem considers possible consequences of letting constructed entities run the world.
In "Observation on the Spot", he describes the "ethicosphere" and how it came to be. The ethicosphere is stated to be a non-person because its creators wanted it only to enforce the laws as given. The whole premise is that it was built as well as anyone could hope. Still, its ability to maintain life in the body after brain death (either because the ethicosphere was still too weak or because it had hard limits on interventions into the brain) and its apparent practice of uncontrollable embryo selection (among other things) make the creating race nervous.
I don't remember whether the short story where robots decide that they are the true humans was included in "I, Robot"; I do remember that the novel where one robot goes into deadlock right after harming most currently existing humans to benefit future humanity was written later than "I, Robot".
An interesting point to note: Asimov started writing about robots with the idea of going between "Robots as Menace" and "Robots as Pathos"; he ended up having a single robot determine the key events in human history for a few thousand years...
I invite your feedback on this snippet from the forthcoming Friendly AI FAQ. This one is an answer to the question "What is the history of the Friendly AI concept?"
_____
Late in the Industrial Revolution, Samuel Butler (1863) worried about what might happen when machines become more capable than the humans who designed them:
This basic idea was picked up by science fiction authors, for example in John W. Campbell’s (1932) short story The Last Evolution. In the story, humans live lives of leisure because machines are smart enough to do all the work. One day, aliens invade:
Earth’s machines, protecting humans, defeat the alien invaders. The aliens’ machines survive long enough to render humans extinct before they, too, are defeated. Earth’s machines inherit the solar system, eventually moving to run on substrates of pure “Force.”
The concerns of machine ethics are most popularly identified with Isaac Asimov’s Three Laws of Robotics, introduced in his short story Runaround. Asimov used his stories, including those collected in the popular I, Robot book, to illustrate all the ways in which such simple rules for governing robot behavior could go wrong.
In the year of I, Robot’s release, mathematician Alan Turing (1950) noted that machines will one day be capable of genuine thought:
Turing (1951/2004) concluded:
Bayesian statistician I.J. Good (1965), who had worked with Turing to crack Nazi codes in World War II, made the crucial leap to the ‘intelligence explosion’ concept:
Futurist Arthur C. Clarke (1968) agreed:
Julius Lukasiewicz (1974) noted that human intelligence may be unable to predict what a superintelligent machine would do:
Even critics of AI like Jack Schwartz (1987) saw the implications:
Novelist Vernor Vinge (1981) called this 'event horizon' in our ability to predict the future a 'singularity':
Eliezer Yudkowsky (1996) used the term 'singularity' to refer instead to Good's 'intelligence explosion', and began work on the task of figuring out how to build a self-improving AI that had a positive rather than negative effect on the world (Yudkowsky 2000) — a project he eventually called 'Friendly AI' (Yudkowsky 2001).
Meanwhile, philosophers and AI researchers were considering whether or not machines could have moral value, and how to ensure ethical behavior from less powerful machines or 'narrow AIs', a field of inquiry variously known as 'artificial morality' (Danielson 1992; Floridi & Sanders 2004; Allen et al. 2000), 'machine ethics' (Hall 2000; McLaren 2005; Anderson & Anderson 2006), 'computational ethics' (Allen 2002), 'computational metaethics' (Lokhorst 2011), and 'robo-ethics' or 'robot ethics' (Capurro et al. 2006; Sawyer 2007). This vein of research — what we'll call the 'machine ethics' literature — was recently summarized in two books: Wallach & Allen (2009) and Anderson & Anderson (2011).
Leading philosopher of mind David Chalmers brought the concepts of intelligence explosion and Friendly AI to mainstream academic attention with his 2010 paper, ‘The Singularity: A Philosophical Analysis’, published in Journal of Consciousness Studies. That journal’s January 2012 issue will be devoted to responses to Chalmers’ article, as will an edited volume from Springer (Eden et al. 2012).
Friendly AI researchers do not regularly cite the machine ethics literature (e.g. see Bostrom & Yudkowsky 2011). These researchers have put forward preliminary proposals for ensuring ethical behavior in superintelligent or self-improving machines, for example 'Coherent Extrapolated Volition' (Yudkowsky 2004).