Related to: Kaj Sotala's Posts, Blogs by LWers
By fellow LessWronger Kaj_Sotala on his blog.
Scott recently made two posts [1 2] about some of the dangers of technology, and of becoming too powerful for yourself. Now, I’ll admit that I didn’t entirely understand his concern. As far as I could tell, he was worried that at some point, we might perfectly know the best possible strategy for pursuing all of our desires, and have the willpower to do so. Then, in a sense, one could say that we’d no longer experience having a free will. There would always be only one reasonable action in any situation, and we would always pick that one.
Well, I’m not too concerned about that. But the post highlighted one possible way that technology could damage something that we consider dear and essential, by removing essential constraints. That’s actually a rather major worry, and a far broader one than that one example suggests. (This essay was also influenced by a recent comment by Randal Koene.)
First, though, let’s review a bit of history.
In 1967, the biologist Sol Spiegelman took a strand of viral RNA, and placed it on a dish containing various raw materials that the RNA could use to build new copies of itself. After the RNA strands had replicated on the dish, Spiegelman extracted some of them and put them on another dish, again with raw materials that the strands could use to replicate themselves. He then kept repeating this process.
No longer burdened with the constraints of needing to work for a living, to produce protein coats, or to do anything but reproduce, the RNA evolved to match its new environment. The RNA mutated, and the strands which could copy themselves the fastest won out. Everything in those strands that wasn’t needed for reproduction had become an unnecessary liability. After just 74 generations, the original 4,500 nucleotide bases had been reduced to a mere 220. Useless parts of the genome had been discarded; the viral RNA had now become a pure replicator, dubbed “Spiegelman’s monster”. (Source.)
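To make the selection dynamic concrete, here is a minimal toy simulation of serial transfer, written under my own invented assumptions; it sketches the general dynamic rather than modeling Spiegelman’s actual experiment. The only numbers taken from the text are the 4,500-base starting length and the 74 transfers; everything else (population size, mutation sizes, the minimum viable length) is made up for illustration. If shorter strands replicate faster, the population’s mean genome length collapses over successive transfers, just as the monster’s did.

```python
# Toy simulation (illustrative only): serial-transfer selection for fast replicators.
# Parameters other than GENOME_START and TRANSFERS are invented, not Spiegelman's numbers.
import random

GENOME_START = 4500      # starting strand length in bases (from the text)
MIN_USEFUL = 200         # assumed minimum length the copying machinery still accepts (invented)
TRANSFERS = 74           # number of serial transfers (from the text)
POPULATION = 1000        # strands carried over to each fresh dish (invented)
ROUNDS_PER_DISH = 20     # replication rounds between transfers (invented)

def replicate(population):
    """One replication round: shorter strands copy faster, so they contribute
    proportionally more offspring; each copy may mutate in length."""
    weights = [1.0 / length for length in population]        # copying speed ~ 1 / length
    offspring = random.choices(population, weights=weights, k=len(population))
    mutated = []
    for length in offspring:
        length += random.randint(-30, 10)                    # deletions more common than insertions
        if length >= MIN_USEFUL:                              # too-short strands assumed non-viable
            mutated.append(length)
    return mutated or population

population = [GENOME_START] * POPULATION
for transfer in range(1, TRANSFERS + 1):
    for _ in range(ROUNDS_PER_DISH):
        population = replicate(population)
    population = random.choices(population, k=POPULATION)    # sample survivors onto a fresh dish
    if transfer % 10 == 0 or transfer == TRANSFERS:
        print(f"transfer {transfer:2d}: mean length ~ {sum(population) / len(population):.0f} bases")
```

The precise numbers it prints mean nothing; the point is only the direction of the change, with the mean length falling toward the assumed minimum within a few dozen transfers once nothing but copying speed is rewarded.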
What happens in evolution is that organisms adapt themselves to exploit, and protect themselves from, the various regularities of the environment. Light reflects off distant objects in a predictable manner, so creatures have evolved eyes that they can use to see. If the environment ceases to possess some regularities, it will necessarily change the organisms. Put a fish with eyes in a cave with no light, and it will lose its sight over a few thousand years at most. Even humans have kept evolving as our environment has changed. Sickle-cell disease is more common in people whose ancestors are from regions with malaria. A single sickle-cell gene makes you more resistant to malaria, but two give you the disease. That’s an acceptable tradeoff in an environment with a lot of malaria, but a burden outside that environment.
You could say that the environment constrains the kind of organisms that can exist there. Now, those constraints aren’t immediate: that cave fish won’t lose its eyes right away. But over enough time, as different kinds of fish compete for survival, the ones which don’t waste their energy on growing useless eyes will win out.
Humans, as I was suggesting before, have also evolved to meet some very specific environmental constraints. As our environment has changed – either by our own doing, or due to reasons that have nothing to do with us – those constraints have changed somewhat, and we have changed with them. But many things about our nature, things that we might consider fundamental, have not changed. We still tell stories, enjoy the company of others, and are distinct individuals. Sure, the exact forms that those things take have changed over time. Today we are more likely to watch a story on TV than to hear one over a campfire – but both are still recognizable forms of story-telling. Countless human universals are found in cultures all over the planet:
aesthetics; affection expressed and felt; age grades; body adornments; childhood fears; classification of kin; cooking; cooperation; customary greetings; daily routines; dance; distinguishing right and wrong; dreams; emotions; empathy; envy; family (or household); folklore; generosity admired; gossip; hope; hospitality; imagery; jokes; judging others; leaders; likes and dislikes; manipulating social relations; marriage; meal times; mourning; music …
Individuals may disagree about which of those things really are fundamental – whether losing some specific universal would really be a loss – but most people are likely to say that at least some of those things are important and worth keeping.
But as technology keeps evolving, it will make it easier and easier to overcome various constraints in our environment, our bodies, and in our minds. And then it will become increasingly tempting to become a Spiegelman’s monster: to rid yourself of the things that the loosened constraints have made unnecessary, to become something that is no longer even remotely human. If you don’t do it, then someone else will. With enough time, they may end up ruling the world, outcompeting you like Spiegelman’s monster outcompeted the original, unmutated RNA strands.
Exactly what kinds of constraints am I talking about, here? Well, there are several, in a roughly increasing order of severity:
- Not being too powerful for yourself. Scott’s concern: that at some point, we might perfectly know the best possible strategy for pursuing all of our desires, and have the willpower to do so. Then, in a sense, one could say that we no longer experienced having a free will – there would always only be one reasonable action in any situation, and we would always pick that one.
- Having distinct minds. We might not be too far away from having the ability to directly connect brains with each other. I think about something, and the thought crosses over to your brain, merging with your stream of consciousness. With time, this technology could be perfected so that large groups of people could join together into a single entity, coordinating and doing everything much better than any “traditional” human. Combined with an ability to copy memories, the concept of “personal identity” might cease to have any meaning at all – there would be no persons, just an amorphous mass of consciousnesses all sharing most of the same memories.
- Unmodifiable desires: as desire modification becomes possible, anyone could reprogram their brains to be constantly perfectly satisfied and never do anything else (except possibly the bare minimum needed for survival). Sure, the possibility feels unappealing now… but maybe you’re having a bad day, and you choose to modify your brain to feel just a little better, all the time. And then the thought of being permanently blissed out doesn’t feel so bad after all, and you modify your brain just a little more… how could you not envy the folks who are never unhappy, especially since the option to self-modify is always there?
- Inability to design superintelligent AGIs. We are constantly investing in ever-improving AI, for the obvious economic reasons: it allows ever more work to be automated. It may indeed prove impossible to regulate AI development in order to stop superintelligent AGIs (artificial general intelligences) from arising. If so, then it might also prove impossible to ensure that safe and human-friendly AGIs prevail: as with Spiegelman’s monster, the AGIs not burdened with the constraints of respecting human life and property may end up outcompeting the AGIs that wish to protect humanity, after which they’ll recycle human settlements into raw materials.
- An inability to become mindless outsourcers. Nick Bostrom suggests a scenario where we learn to offload all of our thought to non-conscious external programs. To quote: “Why do I need to know arithmetic when I can buy time on Arithmetic-Modules Inc. whenever I need to do my accounts? Why do I need to be good with language when I can hire a professional language module to articulate my thoughts? Why do I need to bother with making decisions about my personal life when there are certified executive-modules that can scan my goal structure and manage my assets so as best to fulfill my goals?” And so, we give in to the temptation to cut away more and more parts of our brains, letting computer programs run those tasks… until there is no conscious experience left.
- An inability to copy the best workers and select only the ones best fit for their tasks. If we could upload brains to computers, it could also become possible to copy minds. This could be far quicker than ordinary reproduction, making copying the primary method by which humans multiplied – and one’s ability to acquire and retain more hardware to run one’s copies on would become the main criterion that evolution selected for. As Nick Bostrom writes: “Much of human life’s meaning arguably depends on the enjoyment, for its own sake, of humor, love, game-playing, art, sex, dancing, social conversation, philosophy, literature, scientific discovery, food and drink, friendship, parenting, and sport. We have preferences and capabilities that make us engage in such activities, and these predispositions were adaptive in our species’ evolutionary past; but what ground do we have for being confident that these or similar activities will continue to be adaptive in the future? Perhaps what will maximize fitness in the future will be nothing but non-stop high-intensity drudgery, work of a drab and repetitive nature, aimed at improving the eighth decimal of some economic output measure. Even if the workers selected for in this scenario were conscious, the resulting world would still be radically impoverished in terms of the qualities that give value to life.”
To rephrase what I have been saying:
“Humans” inhabit a narrow region in a multidimensional space of possibilities, and various constraints currently keep everyone stuck in that tiny space. If any of those constraints were to be relaxed – the space of possible minds stretched in any direction – then the new kinds of minds, no longer burdened with the constraints that make our fundamental values so adaptive, would be free to expand in entirely new directions. And it seems inevitable that, given a broader space of possible adaptations, evolutionary pressures would eventually lead to the dominance of minds – or at least replicators – which were very different from what most of us would value as “human”.
Get rid of one constraint, and you might still recognize the outcome as having once been human. Get rid of enough constraints, and you’ll get the equivalent of a Spiegelman’s monster, no longer even remotely human.
There have been some suggestions of how to avoid this. Nick Bostrom has suggested [1 2] that we create a “singleton”, a world-order with a single decision-making agency at the highest level, capable of controlling evolution. The singleton could be an appropriately-programmed AGI, the right group of uploads, or something else. Maybe this will work: but I doubt it. I expect all such efforts to fail, and humanity to eventually vanish. Possibly within my lifetime, if we’re unlucky.
I’ll conclude this essay with the immortal words of H.P. Lovecraft:
The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents. We live on a placid island of ignorance in the midst of black seas of infinity, and it was not meant that we should voyage far. The sciences, each straining in its own direction, have hitherto harmed us little; but some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the deadly light into the peace and safety of a new dark age.
Needless to say, Lovecraft was being too optimistic.
This might be clearer once the survey paper we're writing about proposed FAI approaches (as well as other approaches to limiting AI risk) becomes public, but suffice to say, IMO nobody so far has managed to propose an FAI approach that wouldn't be riddled with serious problems. Almost none of them work if we have a hard takeoff, and a soft takeoff might not be any better, since it allows lots of different AGIs to compete, leading to the kind of evolutionary scenarios described in the post. If there's a hard takeoff, you need to devote a lot of time and effort to making the design safe and also be the first one to have your AGI undergo a hard takeoff, which are two mutually incompatible goals. And that's assuming you even have a clue of what kind of design would be safe. Something CEV-like could qualify as safe, but currently it remains so vaguely specified that it reads more like a list of applause lights than an actual design, and even getting to the point where we could call it a design feels like it requires solving numerous difficult problems, some of which have remained unsolved for thousands of years, while our remaining time might be counted in tens of years rather than thousands or even hundreds... and so on and so on.
Not saying that it's impossible, but there are far more failure scenarios than successful ones, and an amazing amount of things would all have to go right in order for us to succeed.
Scary.
What can be done to improve our chances? I assume more funding for SI is a good idea, and I don't know how much I can do beyond that (math/philosophy/AI are not my areas of expertise).
Waterline stuff is important, too.