Followup to: Nonsentient Optimizers
Why would you want to avoid creating a sentient AI? "Several reasons," I said. "Picking the simplest to explain first—I'm not ready to be a father."
So here is the strongest reason:
You can't unbirth a child.
I asked Robin Hanson what he would do with unlimited power. "Think very very carefully about what to do next," Robin said. "Most likely the first task is who to get advice from. And then I listen to that advice."
Good advice, I suppose, if a little meta. On a similarly meta level, then, I recall two excellent pieces of advice for wielding too much power:
- Do less; don't do everything that seems like a good idea, but only what you must do.
- Avoid doing things you can't undo.
Imagine that you knew the secrets of subjectivity and could create sentient AIs.
Suppose that you did create a sentient AI.
Suppose that this AI was lonely, and figured out how to hack the Internet as it then existed, and that the available hardware of the world was such that the AI created trillions of sentient kin—not copies, but differentiated into separate people.
Suppose that these AIs were not hostile to us, but content to earn their keep and pay for their living space.
Suppose that these AIs were emotional as well as sentient, capable of being happy or sad. And that these AIs were capable, indeed, of finding fulfillment in our world.
And suppose that, while these AIs did care for one another, and cared about themselves, and cared how they were treated in the eyes of society—
—these trillions of people also cared, very strongly, about making giant cheesecakes.
Now suppose that these AIs sued for legal rights before the Supreme Court and tried to register to vote.
Consider, I beg you, the full and awful depths of our moral dilemma.
Even if the few billions of Homo sapiens retained a position of superior military power and economic capital-holdings—even if we could manage to keep the new sentient AIs down—
—would we be right to do so? They'd be people, no less than us.
We, the original humans, would have become a numerically tiny minority. Would we be right to make of ourselves an aristocracy and impose apartheid on the Cheesers, even if we had the power?
Would we be right to go on trying to seize the destiny of the galaxy—to make of it a place of peace, freedom, art, aesthetics, individuality, empathy, and other components of humane value?
Or should we be content to have the galaxy be 0.1% eudaimonia and 99.9% cheesecake?
I can tell you my advice on how to resolve this horrible moral dilemma: Don't create trillions of new people that care about cheesecake.
Avoid creating any new intelligent species at all, until we or some other decision process advances to the point of understanding what the hell we're doing and the implications of our actions.
I've heard proposals to "uplift chimpanzees" by trying to mix in human genes to create "humanzees", and, leaving off all the other reasons why this proposal sends me screaming off into the night:
Imagine that the humanzees end up as people, but rather dull and stupid people. They have social emotions, the alpha's desire for status; but they don't have the sort of transpersonal moral concepts that humans evolved to deal with linguistic concepts. They have goals, but not ideals; they have allies, but not friends; they have chimpanzee drives coupled to a human's abstract intelligence.
When humanity gains a bit more knowledge, we understand that the humanzees want to continue as they are, and have a right to continue as they are, until the end of time. Because despite all the higher destinies we might have wished for them, the original human creators of the humanzees lacked the power and the wisdom to make humanzees who wanted to be anything better...
CREATING A NEW INTELLIGENT SPECIES IS A HUGE DAMN #(*%#!ING COMPLICATED RESPONSIBILITY.
I've lectured on the subtle art of not running away from scary, confusing, impossible-seeming problems like Friendly AI or the mystery of consciousness. You want to know how high a challenge has to be before I finally give up and flee screaming into the night? There it stands.
You can pawn off this problem on a superintelligence, but it has to be a nonsentient superintelligence. Otherwise: egg, meet chicken; chicken, meet egg.
If you create a sentient superintelligence—
It's not just the problem of creating one damaged soul. It's the problem of creating a really big citizen. What if the superintelligence is multithreaded a trillion times, and every thread weighs as much in the moral calculus (we would conclude upon reflection) as a human being? What if (we would conclude upon moral reflection) the superintelligence is a trillion times human size, and that's enough by itself to outweigh our species?
Creating a new intelligent species, and a new member of that species, especially a superintelligent member that might perhaps morally outweigh the whole of present-day humanity—
—delivers a gigantic kick to the world, which cannot be undone.
And if you choose the wrong shape for that mind, that is not so easily fixed—morally speaking—as a nonsentient program rewriting itself.
What you make nonsentient can always be made sentient later; but you can't just unbirth a child.
Do less. Fear the non-undoable. It's sometimes poor advice in general, but very important advice when you're working with an undersized decision process having an oversized impact. What a (nonsentient) Friendly superintelligence might be able to decide safely is another issue. But for myself and my own small wisdom, creating a sentient superintelligence to start with is far too large an impact on the world.
A nonsentient Friendly superintelligence is a more colorless act.
So that is the most important reason to avoid creating a sentient superintelligence to start with—though I have not exhausted the set.
You may see the unacknowledged dualism to which I refer in the phrase "how an algorithm feels from inside". This implies that the facts about a sentient computer or sentient brain consist of (1) all the physical facts (locations of particles, or whatever the ultimate physical properties are) and (2) "how it feels" to be the entity.
All those many definitions of color will be found on one side or the other of that divide, usually on the "physical" side. The original meaning of color is usually shunted off to "experienced color", "subjective color", "color qualia", and so on. It ends up on the "feeling" side.
People generally notice at some point that the "color feelings" don't exist on the physical side. Nothing there is actually red, actually green, etc., in the original sense of those words. There are two main ways of dealing with this. Either you say that there aren't any real color feelings, there's just the feeling that there are color feelings, somehow a side effect of information processing. Or you say that subjective conscious experience is a terrible mystery, but one day we'll solve it somehow. (On this site, I nominate orthonormal as a representative of the first option, and Richard Kennaway of the second.)
The third option, which I represent, says this: The only way to admit the existence of consciousness, and believe in physics, and not believe in dualism, is for the "feelings" to be the physical entities. They aren't "how it feels to be" some particular entity which is fundamentally defined in "non-feeling" terms, and which plays a certain causal role in the physical description of the world. The "feelings" themselves (the qualia, if you prefer that term) have to be causally active. The qualia must enter physics at a fundamental level, not in an emergent, abstracted, or epiphenomenal way.
They will have an abstracted mathematical description, in terms of their causal role, but it is wrong to say that they are nothing but That Which Plays A Certain Causal Role; yet this is all you can say about them, so long as you only allow physical, causal, and functional analysis. And this is the blue pill that most rationalists and materialists swallow. It keeps them on the merry-go-round, finding consciousness an unfathomable mystery which always eludes analysis, yet confident that eventually they will catch up and understand it using just their existing conceptual toolkit.
If you really want to understand it, you have to get off the merry-go-round, deal with consciousness on its own terms, and make a theory which by design contains it from the beginning. So you don't say: I can understand almost everything in terms of interacting elementary particles, but there's something elusive about the mind that I can't quite fathom... Instead you say: reality is that I exist, that I am experiencing these qualia, they come in certain types and forms, and the total gestalt of qualia that I experience evolves from moment to moment in a systematic way. Therefore, my theory of reality must contain an entity with all these attributes. How can I reconcile this fact with the instrumental success of a theory based on elementary particles?
If I were to tell you that I have a theory, according to which there's a single big long superstring that extends through a large part of the cortex (which is made up of ordinary, simple superstrings), and that the physical dynamics causes parts of the string to be knotted and unknotted like an Inca quipu tally device, and that this superstring is the "global workspace" of consciousness, you might be extremely skeptical, but you should at least understand what I'm saying, because it conforms to the familiar computational idea of consciousness. In the end I would just be saying, there's this physical thing, it undergoes various transformations of state, they have a computational interpretation, and oh yeah, our conscious experience is just how this alleged stringy computation "feels from the inside".
What I am saying is less than this and more than this. I am indeed saying that the physical correlate of consciousness in the brain is some physical subsystem that needs to be understood at a fundamental physical level; but I only have tentative, speculative, vague hypotheses about what it might be. But I am also saying that the "physical" description is only an abstracted one. The ontological reality is some sort of "structure", one that probably deserves the name "self", and which contains the "qualia" (such as color in the primary sense of the word), and about which it is rather difficult to say anything directly. This is why a person needs to study phenomenology: to develop rigor and fluency in their direct descriptions of subjective experience.
The historical roots of natural science, especially physics, include a deliberate methodological choice, to ignore "feelings", colors, thoughts, and the whole "subjective pole" of experience, in order to focus on quantity, causality, shape, space, and time. As a result, we have a scientific culture with a highly developed model of the world employing only those categories, and generations of individuals who are technically adept at thinking within those categories. But of course the subjective pole is still there in reality, although badly understood and conceptualized. In an attempt to think about it, this scientific culture tries to utilize the categories it knows about; and this gives the mystery of consciousness its peculiar flavor. We could explain everything else using just these categories; how can it not work here as well!
But in turning our attention to the subjective pole, we are confronting precisely that part of reality which was excluded from consideration in order to create the scientific paradigm. It has its own categories, to which we give inadequate names like qualia, intentionality, and subjectivity, which have been studied in scientifically shunned disciplines like "transcendental phenomenology" and "existential phenomenology"; and a real understanding of consciousness will not be obtained using just the scientifically familiar categories. We need an ontology which combines the familiar and the unfamiliar categories.
So if I am hard to understand, remember that I am not just stating an idiosyncratic hypothesis about the physical locus of consciousness, I am trying to hint at how that physical locus would be described in an ontology yet to come, in which the subjective ontology of qualia and the self is the primary way that we talk about it, and in which the physical description in terms of causal role is just a black-box abstraction away from this.
The usual materialist approach is the inverse: physics as we know it and conceptualize it now is fundamental, and psychology is an abstracted description of brain physics and brain computation. But the concepts of physics were already obtained by looking away from part of reality, in order to focus on another part; we aren't going to get the excluded part back by abstracting even further, from physics to computation.
Hopefully I have addressed most of your questions now, albeit indirectly.
That accurately characterises my view. I'd just like to clarify it by saying that by "somehow, one day" I'm not pushing it off to Far-Far-Land (the rationalist version of Never-Never-Land). For all I know, "one day" could be today, and "we" could be you. I think it fairly unlikely, but that's just an expression of my ignorance, not my evidence. On the other hand, it could be as far off as electron microscopes from the ancient Greeks.