Status: Partly in response to We Don't Trade With Ants, partly in response to watching others try to make versions of this point that I didn't like. None of this is particularly new; it feels to me like repeating obvious claims that have regularly been made in comments elsewhere, and are probably found in multiple parts of the LessWrong sequences. But I've been repeating them aloud a bunch recently, and so might as well collect the points into a single post.
This post is an answer to the question of why an AI that was truly indifferent to humanity (and sentient life more generally) would destroy all Earth-originated sentient life.
Might the AGI let us live, not because it cares but because it has no particular reason to go out of its way to kill us?
As Eliezer Yudkowsky once said:
> The AI does not hate you, nor does it love you, but you are made of atoms which it can use for something else.
There's lots of energy in the biosphere! (That's why animals eat plants and animals for fuel.) By consuming it, you can do whatever else you were going to do better or faster.
(Last I checked, you can get about 10x as much energy from burning a square meter of biosphere as you can get by collecting a square meter of sunlight for a day. But I haven't done the calculation for years and years and am pulling that straight out of a cold cache. That energy boost could yield a speedup (in your thinking, or in your technological design, or in your intergalactic probes themselves), which translates into extra galaxies you manage to catch before they cross the cosmic event horizon!)
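To sanity-check that rough "10x" figure, here's a back-of-envelope estimate. All of the input numbers are my own assumed round figures (global dry biomass, biomass energy density, Earth's land area, average daily insolation), not numbers from the post, so treat the output as an order-of-magnitude sketch only:

```python
# Rough assumptions (not from the post):
#   - global dry biomass: ~1.1e15 kg (roughly 550 Gt carbon at ~50% carbon content)
#   - energy density of dry plant matter: ~18 MJ/kg
#   - Earth's land area: ~1.5e14 m^2
#   - average solar energy reaching a square meter of ground: ~18 MJ/day (~5 kWh)

DRY_BIOMASS_KG = 1.1e15        # assumed total dry biomass on Earth
BIOMASS_MJ_PER_KG = 18.0       # assumed energy density of dry biomass
LAND_AREA_M2 = 1.5e14          # assumed land area of Earth
SOLAR_MJ_PER_M2_DAY = 18.0     # assumed average daily insolation at ground level

# One-time energy from burning the biomass standing on an average square meter
biomass_mj_per_m2 = DRY_BIOMASS_KG * BIOMASS_MJ_PER_KG / LAND_AREA_M2

# Compare against one day of collected sunlight on that same square meter
ratio = biomass_mj_per_m2 / SOLAR_MJ_PER_M2_DAY

print(f"burning the biosphere: ~{biomass_mj_per_m2:.0f} MJ/m^2 (one-time)")
print(f"one day of sunlight:   ~{SOLAR_MJ_PER_M2_DAY:.0f} MJ/m^2")
print(f"ratio: ~{ratio:.1f}x")
```

With these assumed inputs the ratio comes out in the high single digits, consistent with the "about 10x" cached figure; note the comparison is a one-time burn against a single day of sunlight, so sunlight wins over any longer horizon.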
But there's so little energy here, compared to the rest of the universe. Why wouldn't it just leave us be, and go mine asteroids or something?
Well, for starters, there's quite a lot of energy in the sun, and if the biosphere isn't burned for fuel then it will freeze over when the AI wraps the sun in a Dyson sphere or otherwise rips it apart. It doesn't need to consume your personal biomass to kill you; consuming the sun works just fine.
And separately, note that if the AI is actually completely indifferent to humanity, the question is not "is there more energy in the biosphere or in the sun?", but rather "is there more energy available in the biosphere than it takes to access that energy?". The AI doesn't have to choose between harvesting the sun and harvesting the biosphere, it can just harvest both, and there's a lot of calories in the biosphere.
I still just think that it might decide to leave us be for some reason.
The answers above are sufficient to argue that the AI kills us (if the AI's goals are orthogonal to ours, and can be better achieved with more resources). But the answer is in fact overdetermined, because there's also the following reason.
A humanity that just finished coughing up a superintelligence has the potential to cough up another superintelligence, if left unchecked. Humanity alone might not stand a chance against a superintelligence, but the next superintelligence humanity builds could in principle be a problem. Disassembling us for parts seems likely to be easier than building all your infrastructure in a manner that's robust to whatever superintelligence humanity coughs up next. Better to nip that problem in the bud.[1]
But we don't kill all the cows.
Sure, but the horse population fell dramatically with the invention of the automobile.
One of the big reasons that humans haven't disassembled cows for spare parts is that we aren't yet skilled enough to reassemble those spare parts into something that is more useful to us than cows. We are trying to culture meat in labs, and when we do, the cow population might also fall off a cliff.
A sufficiently capable AI takes you apart instead of trading with you at the point that it can rearrange your atoms into an even better trading partner.[2] And humans are probably not the optimal trading partners.
But there's still a bunch of horses around! Because we like them!
Yep. The horses that are left around after they stopped being economically useful are around because some humans care about horses, and enjoy having them around.
If you can make the AI care about humans, and enjoy having them around (more than it enjoys having-around whatever plethora of puppets it could build by disassembling your body and rearranging the parts), then you're in the clear! That sort of AI won't kill you.
But getting the AI to care about you in that way is a big alignment problem. We should totally be aiming for it, but that's the sort of problem that we don't know how to solve yet, and that we don't seem on-track to solve (as far as I can tell).
Ok, maybe my objection is that I expect it to care about us at least a tiny bit, enough to leave us be.
This is a common intuition! I won't argue against it in depth here, but I'll leave a couple points in parting:
- my position is that making the AI care a tiny bit (in the limit of capability, under reflection) is almost as hard as the entire alignment problem, and we're not on track to solve it.
- if you want to learn more about why I think that, some relevant search terms are "the orthogonality thesis" and "the fragility of value".
And disassembling us for spare parts sounds much easier than building pervasive monitoring that can successfully detect and shut down human attempts to build a competing superintelligence, even as the humans attempt to subvert those monitoring mechanisms. Why leave clever antagonists at your rear? ↩︎
Or a drone that doesn't even ask for payment, plus extra fuel for the space probes or whatever. Or actually before that, so that we don't create other AIs. But whatever. ↩︎
If you condition on misaligned AI takeover, my current (extremely rough) probabilities are:
Edit: I now think mass death and extinction are notably less likely than these probabilities suggest. Perhaps more like 40% on >50% of people killed and 20% on >99% of people killed.
By 'kill' here I'm not including things like 'the AI cryonically preserves everyone's brains and then revives people later'. I'm also not including cases where the AI lets everyone live a normal human lifespan but fails to grant immortality or continue human civilization beyond this point.
My beliefs here are due to a combination of causal/acausal trade arguments as well as some intuitions that it's likely that AIs will be slightly cooperative/nice for decision theory reasons (ECL mostly) or just moral reasons.
To be clear, it seems totally insane to depend on this or to think that this makes the situation OK. Further, note that I think it's reasonably likely that there is a bloody and horrible conflict between AIs and humanity (it just seems unlikely that this conflict kills >99% of people, so the question does not come down to conflict). Edit: "seems unclear" more than "seems unlikely". I think conflict between AIs and humans killing >99% of people is plausible, but not so plausible that I'd be confident humans die.
Note that the trade and niceness bar might be extremely low, as you discuss here:
Yeah, all of these scenarios with aliens seem sufficiently plausible to me that we should expect the AI to keep humans alive if it's very cheap to do so (which is what I expect).
Note that both common-sense moral views and views like UDASSA imply that you should particularly value currently alive humans over future beings. I find this position somewhat implausible, and none of these views seem stable under reflection. Regardless, it does hint at the idea that future humans or aliens might place a considerable amount of value on keeping human civilization going. If you don't particularly value currently alive humans, then I agree that you just do the thing you'd like in your universe, or you trade for asks other than keeping humans alive right now.
I also think a relatively strong version of acausal trade arguments seems plausible. Specifically it seems plausible that after the dust settles the universe looks basically similar if an AI takes over vs humans keeping control. (Note that this doesn't imply alignment is unimportant: our alignment work is directly logically entangled with total resources. For those worried about currently alive humans, you should possibly be very worried about what happens before the dust settles...).
Overall, I'm confused why you seem so insistent on making such a specific technical point, one that seems insanely sensitive to various hard-to-predict details about the future. Further, it depends on rounding errors in the future resource allocation, which seems to make the situation particularly sensitive to random questions about how aliens behave, etc.
My probabilities are very rough, but I'm feeling more like 1/3 ish today after thinking about it a bit more. Shrug.
As far as reasons for it being this high:
Generally, I'm happy to argue for 'we should be pretty confused and there are a decent number of good reasons why AIs might keep humans alive'. I'm not confident in survival overall though...