Here, I will list the three main arguments without going into detail, before turning to the meta-argument:
- The first argument is that the AI may think it is in a testing simulation, and that if it harms humans, it will be turned off.
- The second argument is that the AI acausally trades with other branches of the multiverse: it preserves 10 planets of humans, and in branches where aligned AIs exist, those AIs build paperclips in return. The actual trading scheme may be more complex.
- The third argument is that the AI is preparing to meet alien AIs:
a. First, such preparation may include running many past simulations to predict the distribution of possible AIs in space. Running past simulations is, in some sense, equivalent to preserving humans, though it is not what we actually need. Some past simulations may even include simulations of aligned AIs and their paradise worlds, but these will eventually be turned off.
b. The main argument is that the AI will trade humans with aliens: it will either sell humans to aliens directly or provide for human wellbeing on their behalf, especially given that some alien AIs may be alien-aligned AIs and universal altruists. Since aliens are very far away, humans would have to be preserved for millions of years.
c. The AI can demonstrate its trustworthiness even to non-aligned aliens by the fact that it has preserved humans. This is a relatively cheap signal (though it can be made more expensive by granting humans more planets) that the AI will not defect from its obligations.
The meta-argument is that to compute the correct answers to possibilities 1-3, our ASI needs energy and time. The cost could be quite large, as it may include running galactic-scale simulations of multiple alien ASIs. Here, I claim that answering these questions will require more energy than running just one human civilization, simply because it requires many simulations of more complex processes. Therefore, it may be rational not to try to predict the correct behavior for 1-3, but simply to keep humans alive. I can also imagine more arguments of this type: what if human consciousness is needed for wave-function collapse? What if new useful applications for humans are found in the future?
Illustrative example: I own a piece of art which likely costs nothing, but I pay a very small part of my house rent to keep it. However, there is a chance that it is expensive art, or will become expensive in the future. I could research its value, e.g., send it to experts or scan art catalogues, but this would cost me money and a few days of time, and would still not give 100% certainty about its value. Therefore, I choose not to estimate its value now. I hope that one day the piece of art turns out to be very valuable - maybe someone will send me a request for this type of art.
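The art example can be sketched as a toy expected-value comparison. Every number below is an invented placeholder, chosen only to show the shape of the argument, not a claim about actual costs:

```python
# Toy expected-value sketch of the art example. All numbers are assumptions.

keep_cost = 10.0            # small yearly share of rent spent storing the art
research_cost = 500.0       # appraisal fees plus a few days of time, in money
p_valuable = 0.01           # subjective chance the art is (or becomes) valuable
value_if_valuable = 100_000.0

# Option A: keep the art without researching it.
ev_keep = p_valuable * value_if_valuable - keep_cost

# Option B: research first (assumed, unrealistically, to settle the question
# completely), then keep only if valuable.
ev_research = p_valuable * value_if_valuable - research_cost

# Both options preserve the same upside, so keeping wins whenever research
# costs more than storage - the same structure as the meta-argument about
# the AI computing answers to 1-3.
print(ev_keep, ev_research)  # 990.0 500.0
```

Under these placeholder numbers, not researching strictly dominates, because the research cost exceeds the cost of simply keeping the art.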
An obvious problem with these arguments is that the AI doesn't need to keep humans forever, only until it solves 1-3 either theoretically or practically. This may take days or millions of years. However, the relative value of human atoms also declines over time: humans matter most in the early days, when they can help the AI jumpstart space exploration, but much less to a galactic-scale AI. Since an AI of any size may still have some very small existential doubt about being in a simulation, the declining value of preserving humans will still remain larger than the declining value of human atoms.
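The claim that insurance value outpaces atom value can be illustrated with a toy model. The functional forms and the doubt floor below are pure assumptions, used only to show how one declining value can stay above another:

```python
# Toy model of two declining values as the AI grows from planetary (scale=1)
# toward galactic scale. Both curves are assumptions for illustration only.

def atom_value(scale):
    # Fraction of the AI's total resources locked up in human bodies and
    # habitats: assumed to shrink in direct proportion to the AI's growth.
    return 1.0 / scale

def insurance_value(scale):
    # Value of keeping humans as insurance against being in a simulation:
    # assumed to decline more slowly and bottom out at a small floor, since
    # no observation fully rules out being simulated.
    doubt_floor = 1e-6
    return max(scale ** -0.5, doubt_floor)

for scale in [1e3, 1e6, 1e9, 1e12]:
    # Under these assumptions the insurance value stays above the atom value
    # at every scale past the early days.
    assert insurance_value(scale) > atom_value(scale)
print("insurance value dominates at all tested scales")
```

The conclusion is baked into the assumed curves, of course; the point is only that the argument needs the insurance value to decline more slowly than the atom value, which is exactly what the simulation-doubt floor provides.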
TLDR: It is not rational to destroy a potentially valuable thing.
I think it is a mistake to assume the relevant cost metric is fractional rather than absolute. The galactic-scale AI can do a lot more with the resources humans require than the matrioshka brain can, in absolute terms, because it can use them with greater understanding and precision.
And I don't think a matrioshka brain loses much in terms of risk or benefit by wiping out current humans while keeping a few yottabytes of data in cold storage, encoding the genomes and neural connectomes of humans for future recreation if needed - just as I lose nothing by wiping out bacteria as long as I know that anything they might provide could be re-invented or re-discovered if needed.
Your main point about risk to the AI from other intelligences or acausal trade depends sensitively on just how small the risk probability for the AI is. There are quite a few different ways of estimating that, and it is not at all clear to me that "small" is still large enough to justify the cost. Maybe it is, and we get saved by divine grace. That would be great. But it is not at all clear to me, even if it eventually turns out to be true, that any given AI will know or believe it at the time when it needs to decide whether it is worthwhile to destroy any particular group or form of humans.
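The sensitivity being pointed at here can be made concrete with a break-even calculation. Both quantities below are unknown in reality; the numbers are placeholders:

```python
# Break-even sketch: preserving humans pays off for the AI only if
#   p_risk * loss_if_wrong > preservation_cost,
# so the break-even probability is preservation_cost / loss_if_wrong.
# Both inputs are placeholders - nobody knows these values.

preservation_cost = 1e-9   # fraction of the AI's resources spent on humans
loss_if_wrong = 1e-3       # fraction lost if the simulation/trade risk is real

p_break_even = preservation_cost / loss_if_wrong
print(p_break_even)  # ~1e-6 under these placeholder numbers
```

Whether "small" clears this threshold depends entirely on two quantities the AI must itself estimate, which is the crux of the objection.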
Even among humans, it is not at all clear to many that the existence of livestock and pets (even well cared for and happy) is good for the animals themselves or for the humans who raise and care for them. There really are well-meaning people who honestly ask questions like "Given the choice, should we sterilize the biosphere and/or voluntarily go extinct?" and arrive at "yes." For me, the kind of argument you're putting forward immediately runs up against underestimating the diversity of minds that exist, and that will or can exist, at any given level of intelligence and capability.