I'm researching and writing a book on meta-ethics and the technological singularity. I plan to post the first draft of the book, in tiny parts, to the Less Wrong discussion area. Your comments and constructive criticisms are much appreciated.
This is not a book for a mainstream audience. Its style is that of contemporary Anglophone philosophy. Compare to, for example, Chalmers' survey article on the singularity.
Bibliographic references are provided here.
Part 1 is below...
Chapter 1: The technological singularity is coming soon.
The Wright Brothers flew their spruce-wood plane for 200 feet in 1903. Only 66 years later, Neil Armstrong walked on the moon, more than 240,000 miles from Earth.
The rapid pace of progress in the physical sciences drives many philosophers to science envy. Philosophers have been researching the core problems of metaphysics, epistemology, and ethics for millennia, yet they have not reached the kind of consensus about them that scientists have reached on so many core problems in physics, chemistry, and biology.
I won’t argue about why this is so. Instead, I will argue that if philosophy maintains its slow pace and fails to solve certain philosophical problems within the next two centuries, the result may be the extinction of the human species.
This extinction could result from a “technological singularity” in which an artificial intelligence (AI) with human-level general intelligence uses that intelligence to improve its own intelligence, enabling it to improve its intelligence further still, and so on in an “intelligence explosion” feedback loop that would give the AI inestimable power to accomplish its goals. If such an intelligence explosion occurs, it will be critically important to have programmed the AI’s goal system wisely. This project could mean the difference between a utopian solar system of unprecedented harmony and happiness, and a solar system in which all available matter is converted into parts for a planet-sized computer built to solve difficult mathematical problems.
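To make the feedback-loop intuition concrete, here is a toy numerical sketch (my own illustration, not part of the book’s argument): suppose each self-improvement cycle lets the AI increase its capability by a fixed fraction of whatever capability it already has. The `improvement_rate` and cycle structure below are invented assumptions for exposition only, not claims about how real AI development would proceed.

```python
# Toy model of the "intelligence explosion" feedback loop described above.
# Purely illustrative: assumes each cycle adds a fixed fraction of the
# AI's current capability, yielding geometric growth.

def intelligence_explosion(initial=1.0, improvement_rate=0.1, cycles=100):
    """Return capability levels over successive self-improvement cycles."""
    levels = [initial]
    for _ in range(cycles):
        current = levels[-1]
        # Each cycle, the AI improves itself in proportion to how capable it already is.
        levels.append(current + improvement_rate * current)
    return levels

if __name__ == "__main__":
    trajectory = intelligence_explosion()
    print(f"After 100 cycles: {trajectory[-1]:.1f}x the starting capability")
    # Geometric growth: roughly 1.1**100, or about 13,780x, under these toy assumptions.
```

The point of the sketch is only that a process whose rate of improvement scales with its current level compounds very quickly; whether real AI self-improvement would behave this way is exactly the kind of empirical question the next chapter takes up.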
The technical challenges of designing the goal system of such a superintelligence are daunting.[1] But even if we can solve those problems, the question of which goal system to give the superintelligence remains. It is a question of philosophy; it is a question of ethics.
Philosophy has impacted billions of humans through religion, culture, and government. But now the stakes are even higher. When the technological singularity occurs, the philosophy behind the goal system of a superintelligent machine will determine the fate of the species, the solar system, and perhaps the galaxy.
***
Now that I have laid my positions on the table, I must argue for them. In this chapter I argue that the technological singularity is likely to occur within the next 200 years unless a worldwide catastrophe drastically impedes scientific progress. In chapter two I survey the philosophical problems involved in designing the goal system of a singular superintelligence, which I call the “singleton.”
In chapter three I show how the singleton will produce very different future worlds depending on which normative theory is used to design its goal system. In chapter four I describe what is perhaps the most developed plan for the design of the singleton’s goal system: Eliezer Yudkowsky’s “Coherent Extrapolated Volition.” In chapter five, I present some objections to Coherent Extrapolated Volition.
In chapter six I argue that we cannot decide how to design the singleton’s goal system without considering meta-ethics, because normative theory depends on meta-ethics. In chapter seven I argue that we should invest little effort in meta-ethical theories that do not fit well with our emerging reductionist picture of the world, just as we quickly abandon scientific theories that don’t fit the available scientific data. I also specify several meta-ethical positions that I think are good candidates for abandonment.
But the looming problem of the technological singularity requires us to have a positive theory, too. In chapter eight I propose some meta-ethical claims about which I think naturalists should come to agree. In chapter nine I consider the implications of these plausible meta-ethical claims for the design of the singleton’s goal system.
***
[1] These technical challenges are discussed in the literature on artificial agents in general and Artificial General Intelligence (AGI) in particular. Russell and Norvig (2009) provide a good overview of the challenges involved in the design of artificial agents. Goertzel and Pennachin (2010) provide a collection of recent papers on the challenges of AGI. Yudkowsky (2010) proposes a new extension of causal decision theory to suit the needs of a self-modifying AI. Yudkowsky (2001) discusses other technical (and philosophical) problems related to designing the goal system of a superintelligence.