Suppose there is a threshold of capability beyond which an AI may pose a non-negligible existential risk to humans.
What is the argument against the following reasoning? If one AI passes or seems likely to pass this threshold, then humans, to lower x-risk, ought to push other AIs past this threshold, for the following reasons.
1) If only one AI passes this threshold and it works to end humanity either directly or indirectly, humanity has zero chance of survival. If there are other AIs, there is a non-zero chance that they support humanity directly or indirectly, and thus humanity's chance of survival is above zero.
2) Even if, at some point, there is only one AI past this threshold and it presents as aligned, the possibility that it later changes, or that it is deceiving us, argues for bringing more AIs over the threshold; see 1).
3) The game board is already played to an advanced state. If one AI passes the threshold, paying the social and economic costs of preventing other AIs from making the remaining leap seems very unlikely to yield a net positive return. Thus pushing a second, third, hundredth AI over the threshold would have a higher potential benefit/cost ratio.
Less precisely, if all it takes is one AI to kill us, what are the odds that all it takes is one AI to save us?
I can think of all sorts of entropic/microstate (and not hopeful) answers to that last question, and counterarguments for all of what I said, but what is the standard response?
Links appreciated. I'm sure this has been addressed before, but I looked and couldn't find what I'm looking for.
AIs also face the risk from misaligned-with-them AIs, which only ends with strong coordination that prevents existentially dangerous misaligned AIs from being constructed anywhere in the world (the danger depends on where they are constructed and on capabilities of reigning AIs). To survive, a coalition of AIs needs to get there. For humanity to survive, some of the AIs in the strongly coordinated coalition need to care about humanity, and all this needs to happen without destroying humanity or while preserving a backup that humanity can be restored from.
In the meantime, a single misaligned-with-humanity AI could defeat other AIs, or destroy humanity, so releasing more kinds of AIs into the wild makes this problem worse. Also, coordination might be more difficult if there are more AIs, increasing the risk that first-generation AIs (some of which might care about humanity) end up defeated by new misaligned AIs (which are less likely to care about humanity) whose creation they failed to coordinate to prevent. Another problem is that racing to deploy more AIs burns the timeline, making it less likely that the front runners end up aligned.
Otherwise, all else equal, more AIs that have somewhat independent non-negligible chances of caring about humanity would help. But all else is probably unequal enough to make this a bad strategy.
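To make the contrast concrete, here is a toy sketch, not a model of real AI dynamics. It assumes each of n AIs independently cares about humanity with the same probability p; the value of p, the independence, and the function names are all assumptions made only for illustration. Under the "one AI is enough to save us" reading, survival needs at least one caring AI. Under the "one AI is enough to kill us" reading, survival needs every AI to care.

```python
def p_at_least_one_cares(p: float, n: int) -> float:
    """Chance that at least one of n independent AIs cares about humanity."""
    return 1.0 - (1.0 - p) ** n


def p_all_care(p: float, n: int) -> float:
    """Chance that all n independent AIs care about humanity."""
    return p ** n


if __name__ == "__main__":
    p = 0.1  # hypothetical per-AI chance of caring, chosen only for illustration
    for n in (1, 2, 10, 100):
        # First number rises with n, second number falls with n.
        print(n, p_at_least_one_cares(p, n), p_all_care(p, n))
```

Adding AIs raises the first number and lowers the second, so whether more AIs help depends almost entirely on which reading is closer to the truth, which is the "all else equal" caveat.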
Of course, having fewer limitations gives an advantage. However, respecting limitations aimed at the well-being of the entire community makes it easier to coordinate and cooperate. And this holds not just for AIs.