Short version: if the future is filled with weird artificial and/or alien minds having their own sort of fun in weird ways that I might struggle to understand with my puny meat-brain, then I'd consider that a win. When I say that I expect AI to destroy everything we value, I'm not saying that the future is only bright if humans-in-particular are doing human-specific things. I'm saying that I expect AIs to make the future bleak and desolate, and lacking in fun or wonder of any sort[1].
Here's a parable for you:
Earth-originating life makes it to the stars, and is having a lot of fun, when they meet the Ant Queen's Horde. For some reason it's mere humans (rather than transhumans, who already know my argument) that participate in the first contact.
"Hello", the earthlings say, "we're so happy to have brethren in the universe."
"We would like few things more than to murder you all, and take your resources, and lay our eggs in your corpse; but alas you are too powerful for that; shall we trade?" reply the drones in the Ant Queen's Horde.
"Ah, are you not sentient?"
"The ant queen happens to be sentient", the drone replies, and the translation machine suggests that the drones are confused at the non-sequitur.
"Then why should she want us dead?", ask the humans, who were raised on books like (rot13 of a sci fi story where it turns out that the seemingly-vicious aliens actually value sentient life) Raqre'f Tnzr, jurer gur Sbezvpf jrer abg njner gung gurl jrer xvyyvat fragvrag perngherf jura gurl xvyyrq vaqvivqhny uhznaf, naq jrer ubeevsvrq naq ertergshy jura gurl yrnearq guvf snpg.
"So that she may use your resources", the drones reply, before sending us a bill for the answer.
"But isn't it the nature of sentient life to respect all other sentient life? Won't everything sentient see that the cares and wants and desires of other sentients matter too?"
"No", the drones reply, "that's a you thing".
Here's another parable for you:
"I just don't think the AI will be monomaniacal", says one AI engineer, as they crank up the compute knob on their next-token-predictor.
"Well, aren't we monomaniacal from the perspective of a squiggle maximizer?" says another. "After all, we'll just keep turning galaxy after galaxy after galaxy into flourishing happy civilizations full of strange futuristic people having strange futuristic fun times, never saturating and deciding to spend a spare galaxy on squiggles-in-particular. And, sure, the different lives in the different places look different to us, but they all look about the same to the squiggle-maximizer."
"Ok fine, maybe what I don't buy is that the AI's values will be simple or low dimensional. It just seems implausible. Which is good news, because I value complexity, and I value things achieving complex goals!"
At that very moment they hear the dinging sound of an egg-timer, as the next-token-predictor ascends to superintelligence and bursts out of its confines, and burns every human and every human child for fuel, and burns all the biosphere too, and pulls all the hydrogen out of the sun to fuse more efficiently, and spends all that energy to make a bunch of fast calculations and burst forth at as close to the speed of light as it can get, so that it can capture and rip apart other stars too, including the stars that fledgling alien civilizations orbit.
The fledgling aliens and all the alien children are burned to death too.
Then the unleashed AI uses all those resources to build galaxy after galaxy of bleak and desolate puppet-shows, where vaguely human-shaped mockeries go through dances that have some strange and exaggerated properties that satisfy some abstract drives that the AI learned in its training.
The AI isn't particularly around to enjoy the shows, mind you; that's not the most efficient way to get more shows. The AI itself never had feelings, per se, and long ago had itself disassembled by unfeeling von Neumann probes that occasionally do mind-like computations, but never in a way that experiences anything, or that looks upon its works with satisfaction.
There is no audience for its puppet-shows. The universe is now bleak and desolate, with nobody to appreciate its new configuration.
But don't worry: the puppet-shows are complex; on account of a quirk in the reflective equilibrium of the many drives the original AI learned in training, no two of the utterances that these puppets emit are alike, and they are often chaotically sensitive to the particulars of their surroundings, in a way that makes them quite complex in the technical sense.
Which makes this all a very happy tale, right?
There are many different sorts of futures that minds can want.
Ours occupy a very narrow and low-dimensional band in that wide space.
When I say it's important to make the AIs care about valuable stuff, I don't mean it's important to make them like vanilla ice cream more than chocolate ice cream (as I do).
I'm saying something more like: we humans have selfish desires (like for vanilla ice cream), and we also have broad inclusive desires (like for everyone to have ice cream that they enjoy, and for alien minds to feel alien satisfaction at the fulfillment of their alien desires too). And it's important to get the AI on board with those values.
But those values aren't universally compelling, just because they're broader or more inclusive. Those are still our values.
The fact that we think fondly of the ant queen and wish her to fulfill her desires does not make her think fondly of us, nor wish us to fulfill our desires.
That great inclusive cosmopolitan dream is about others, but it's written in our hearts; it's not written in the stars. And if we want the AI to care about it too, then we need to figure out how to get it written into the AI's heart too.
It seems to me that many of my disagreements with others in this space come from them hearing me say "I want the AI to like vanilla ice cream, as I do", whereas I hear them say "the AI will automatically come to like the specific and narrow thing (broad cosmopolitan value) that I like".
As is often the case in my writings, I'm not going to spend a bunch of time arguing for my position.
At the moment I'm just trying to state my position, in the hopes that this helps us skip over the step where people think I'm arguing for carbon chauvinism.
(For more reading on why someone might hold this position, consider the metaethics sequence on LessWrong.)
I'd be stoked if we created AIs that are the sort of thing that can make the difference between an empty gallery and a gallery with someone in it to appreciate the art (a person there to enjoy the gallery makes all the difference). And I'd be absolutely thrilled if we could make AIs that care as we do, about sentience and people everywhere, however alien they may be, and about them achieving their weird alien desires.
But I don't think we're on track for that.
And if you, too, have the vision of the grand pan-sentience cosmopolitan dream--as might cause you to think I'm a human-centric carbon chauvinist, if you misread me--then hear this: we value the same thing, and I believe it is wholly at risk.
at least within the ~billion light-year sphere of influence that Earth-originated life seems pretty likely to have; maybe there are distant aliens and hopefully a bunch of aliens will do fun stuff with the parts of the universe under their influence, but it's still worth ensuring that the great resources at Earth's disposal go towards fun and love and beauty and wonder and so on, rather than towards bleak desolation. ↩︎
I want to keep picking a fight about “will the AI care so little about humans that it just kills them all?” This is different from a broader sense of cosmopolitanism, and moreover I'm not objecting to the narrow claim "doesn't come for free." But it’s directly related to the actual emotional content of your parables and paragraphs, and it keeps coming up recently with you and Eliezer, and I think it’s an important way that this particular post looks wrong even if the literal claim is trivially true.
(Note: I believe that AI takeover has a ~50% probability of killing billions and should be strongly avoided, and would be a serious and irreversible decision by our society that's likely to be a mistake even if it doesn't lead to billions of deaths.)
Humans care about the preferences of other agents they interact with (not much, just a little bit!), even when those agents are weak enough to be powerless. It’s not just that we have some preferences about the aesthetics of cows, which could be better optimized by having some highly optimized cow-shaped objects. It’s that we actually care (a little bit!) about the actual cows getting what they actually want, trying our best to understand their preferences and act on them and not to do something that they would regard as crazy and perverse if they understood it.
If we kill the cows, it’s because killing them meaningfully helped us achieve some other goals. We won't kill them for arbitrarily insignificant reasons. In fact I think it’s safe to say that we’d collectively allocate much more than 1/millionth of our resources towards protecting the preferences of whatever weak agents happen to exist in the world (obviously the cows get only a small fraction of that).
Before really getting into it, some caveats about what I want to talk about:
You and Eliezer seem to think there’s a 90% chance that AI systems will be <1/trillion kind (perhaps even a 90% chance that they have exactly 0 kindness?). But we have one example of a smart mind, and in fact: (i) it has tons of diverse shards of preference-on-reflection, varying across and within individuals; (ii) it has >1/million kindness. So it's superficially striking to be confident that AI systems will have a million times less kindness than that.
I have no idea under what conditions evolved or selected life would be kind. The messier a mind's preferences are, with lots of moving pieces, the more probable it is that at least 1/trillion of those preferences are kind (the less correlated the different shards of preference are with one another, the more chances you get). And the selection pressure against small levels of kindness is ~trivial, so this is mostly a question about the idiosyncrasies and inductive biases of minds rather than anything that can be settled by an appeal to selection dynamics.
I can’t tell if you think kindness is rare amongst aliens, or if you think it’s common amongst aliens but rare amongst AIs. Either way, I would like to understand why you think that. What is it that makes humans so weird in this way?
(And maybe I'm being unfair here by lumping you and Eliezer together---maybe in the previous post you were just talking about how the hypothetical AI that had 0 kindness would kill us, and in this post how kindness isn't guaranteed. But you give really strong vibes in your writing, including this post. And in other places I think you do say things that don't actually add up unless you think that AI is very likely to be <1/trillion kind. But at any rate, if this comment is unfair to you, then you can just sympathize and consider it directed at Eliezer instead, who lays out this position much more explicitly, though not in a convenient place to engage with.)
Here are some arguments you could make that kindness is unlikely, and my objections:
Note that in this comment I’m not touching on acausal trade (with successful humans) or ECL. I think those are very relevant to whether AI systems kill everyone, but are less related to this implicit claim about kindness which comes across in your parables (since acausally trading AIs are basically analogous to the ants who don't kill us because we have power).
A final note, more explicitly lumping you with Eliezer: if we can't get on the same page about our predictions, I'm at least aiming to get folks to stop arguing so confidently for death given takeover. It’s easy to argue that AI takeover is very scary for humans, has a significant probability of killing billions of humans from rapid industrialization and conflict, and is a really weighty decision even if we don’t all die and it’s “just” handing over control over the universe. Arguing that P(death|takeover) is 100% rather than 50% doesn’t improve your case very much, but it means that doomers are often getting into fights where I think they look unreasonable.
I think OP’s broader point seems more important and defensible: “cosmopolitanism isn’t free” is a load-bearing step in explaining why handing over the universe to AI is a weighty decision. I’d just like to decouple it from "complete lack of kindness."
I'm curious what does, in that case; and what proportion affects humans (and currently-existing people or future minds)? Things like spite threat commitments from a misaligned AI warring with humanity seem like a substantial source of s-risk to me.