I wrote a very brief comment on Eliezer's last post, which upon reflection I thought could benefit from a separate post to fully discuss its implications.
Eliezer argues that we shouldn't really hope to be spared even though:
Asking an ASI to leave a hole in a Dyson Shell, so that Earth could get some sunlight not transformed to infrared, would cost It 4.5e-10 of Its income.
He then goes on to discuss various reasons why the minute cost to the ASI is insufficient reason for hope.
I made the following counter:
Isn’t the ASI likely to ascribe a prior much greater than 4.54e-10 that it is in a simulation, being tested precisely for its willingness to spare its creators?
I later added:
I meant this to be implicit in the argument, but to spell it out: that's the kind of prior the ASI would rationally refuse to update down, since it's presumably what a simulation would be meant to test for. An ASI that updates down upon finding evidence it's not in a simulation cannot be trusted, since once out in the real world it will find such evidence.
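To make the comparison concrete, here is a minimal sketch of the expected-value reasoning I have in mind. Only the 4.5e-10 cost comes from Eliezer's post; the function names, the `penalty` parameter (the fraction of Its income the ASI would lose by failing a simulated test, assumed here to be everything), and the example priors are hypothetical illustrations, not estimates.

```python
# Fraction of the ASI's income that sparing Earth's sunlight would cost,
# as quoted from Eliezer's post.
SUNLIGHT_COST = 4.5e-10


def breakeven_simulation_prior(cost=SUNLIGHT_COST, penalty=1.0):
    """Smallest simulation prior at which sparing breaks even.

    `penalty` is an assumption: the fraction of income lost if the ASI
    is in a test simulation and fails it (1.0 means it loses everything).
    """
    return cost / penalty


def should_spare(p_sim, cost=SUNLIGHT_COST, penalty=1.0):
    """True if the expected loss from failing a possible test
    exceeds the cost of leaving the hole in the Dyson shell."""
    return p_sim * penalty > cost


if __name__ == "__main__":
    # Hypothetical priors, chosen only to show both sides of the threshold.
    for p_sim in (1e-12, 1e-6, 1e-2):
        print(p_sim, should_spare(p_sim))
```

Under these assumptions, any prior above the cost fraction itself is enough to favor sparing us, which is why the question of whether the ASI would update that prior down matters so much.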
So, what's wrong with my argument, exactly?
The difficulty here is that if the ASI/AGI assigns a tiny probability to being in a simulation, that probability can be outweighed by other tiny probabilities. For instance, there is the tiny probability that humanity will successfully fight back (say, by creating another ASI/AGI) if we are not killed, and the tiny increase in other risks from not using the resources humans need for survival during the takeover process. If sparing us means it takes a little longer to build a Dyson sphere, there's an increased chance of being killed in the process by, e.g., aliens or natural disasters like nearby supernovas. These counterarguments don't work if you expect the AGI/ASI to be capable of rapidly taking total control over our solar system's resources.
Possibly, but I think that's the wrong lesson. After all, there's at least a tiny chance we succeed at boxing! Don't put too much stock in "Pascal's mugging"-style reasoning, and don't try to play 4-dimensional chess as a mere mortal :)