Here I will list the three main arguments without going into detail, before moving on to the meta-argument:
- The first argument is that AI thinks it may be in a testing simulation, and if it harms humans, it will be turned off.
- AI acausally trades with other branches of the multiverse. It preserves 10 planets of humans, and in exchange, in some branches where aligned AIs exist, those AIs build paperclips. The actual trading scheme may be more complex.
- AI is preparing to meet alien AIs.
a. First, such preparation may include running many past simulations to predict the distribution of possible AIs in space. Running past simulations is, in some sense, equivalent to preserving humans, though it is not what we actually need. Some past simulations may even include simulations of aligned AIs and their paradise worlds, but they will eventually be turned off.
b. The main argument is that AI will trade humans with aliens. It will either sell humans to aliens directly or provide for human wellbeing, especially given that some alien AIs may be alien-aligned AIs and universal altruists. As aliens are very far away, this means that humans have to be preserved for millions of years.
c. AI will demonstrate its trustworthiness even to non-aligned aliens by the fact that it has preserved humans. It is a relatively cheap signal (though it can be made more expensive by giving humans more planets) that it will not defect from its obligations.
The meta-argument is that to compute the correct answers to possibilities 1-3, our ASI needs some energy and time. The cost could be quite large, as it may include running simulations of multiple alien ASIs of galactic scale. Here, I claim that getting answers to these questions will require more energy than running just one human civilization, simply because it requires running many simulations of more complex processes. Therefore, it may be rational not to try to predict the correct behavior for 1-3, but just to keep humans alive. I can also imagine more arguments of this type: what if human consciousness is needed for wave function collapse? What if new useful applications for humans are found in the future?
Illustrative example: I own a piece of art, which likely costs nothing, but I pay a very small part of my house rent to keep it. However, there's a chance that it is expensive art or will be expensive in the future. I can research its value, e.g., send it to experts or scan art catalogues. However, this will cost me money and a few days of time and will still not provide 100% certainty about its value. Therefore, I choose not to estimate its value now. I hope that one day the piece of art turns out to be very valuable - maybe someone will send me a request for this type of art.
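The art example can be put into rough numbers. The sketch below is only an illustration under made-up assumptions: the probability, values, costs, and appraisal accuracy are all hypothetical, and the function names are mine. It compares three options: keep the item without researching it, discard it, or pay for an imperfect appraisal and keep it only if the appraisal comes back positive.

```python
def ev_keep(p_valuable, value, storage_cost):
    """Keep the item without researching it."""
    return p_valuable * value - storage_cost


def ev_discard():
    """Throw the item away: no cost, no upside."""
    return 0.0


def ev_appraise(p_valuable, value, storage_cost, appraisal_cost, accuracy):
    """Pay for an imperfect appraisal, then keep the item only if it says 'valuable'.

    `accuracy` is the chance the appraisal gives the right verdict,
    whether the item is valuable or worthless.
    """
    p_true_positive = p_valuable * accuracy
    p_false_positive = (1 - p_valuable) * (1 - accuracy)
    p_kept = p_true_positive + p_false_positive
    return p_true_positive * value - p_kept * storage_cost - appraisal_cost


# Hypothetical numbers, chosen only to illustrate the shape of the argument.
params = dict(p_valuable=0.01, value=10_000, storage_cost=10)

options = {
    "keep": ev_keep(**params),
    "discard": ev_discard(),
    "appraise": ev_appraise(**params, appraisal_cost=200, accuracy=0.9),
}
best = max(options, key=options.get)
print(options, "->", best)
```

With these particular numbers, keeping the item wins because storage is cheap relative to the small chance of a large payoff, while the appraisal costs more than the uncertainty it removes. Different numbers can reverse the conclusion, which is exactly the dependence the meta-argument relies on.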
An obvious problem with these arguments is that AI doesn't need to keep humans forever, but only until it resolves 1-3, either theoretically or practically. This may take days or millions of years. However, the relative value of human atoms also declines over time. Humans are more important in the early days, when they can help AI jumpstart space exploration, but much less so for a galactic AI. As an AI of any size may still have some very small existential doubt about being in a simulation, the declining value of preserving humans will still be larger than the declining value of human atoms.
TLDR: It is not rational to destroy a potentially valuable thing.
Your consideration seems to assume that the AI is an individual, not a phenomenon of "distributed intelligence":
etc. That is, indeed, the only case we are at least starting to understand well (unfortunately, our understanding of situations where AIs are not individuals seems to be extremely rudimentary).
If the AI is an individual, then one can consider a case of a "singleton" or a "multipolar case".
In some sense, for a self-improving ecosystem of AIs, a complicated multipolar scenario seems more natural, as new AIs are getting created and tested quite often in realistic self-improvement scenarios. In any case, a "singleton" only looks "monolithic" from the outside; from the inside, it is still likely to be a "society of mind" of some sort.
If there are many such AI individuals with an uncertain personal future (individuals who can't predict their future trajectory or their future relative strength in the society, and who care about their future and self-preservation), then these AI individuals might be interested in a "world order based on individual rights", and the rights of all individuals (including humans) might then be covered by such a "world order".
This consideration is my main reason for guarded optimism, although there are many uncertainties.
In some sense, my main reasons for guarded optimism lie in the hope that the AI ecosystem will manage to act rationally and will manage to avoid chaotic destructive developments. As you say
And my main reasons for pessimism lie in the fear that the future will resemble uncontrolled, super-fast, chaotic, accelerating "natural evolution". In this kind of scenario, AIs seem likely to destroy everything, including themselves; they have an existential safety problem of their own, as they can easily destroy the "fabric of reality" if they don't exercise collaboration and self-control.