I wrote a very brief comment to Eliezer's last post, which upon reflection I thought could benefit from a separate post to fully discuss its implications.
Eliezer argues that we shouldn't really hope to be spared even though
Asking an ASI to leave a hole in a Dyson Shell, so that Earth could get some sunlight not transformed to infrared, would cost It 4.5e-10 of Its income.
He then goes on to discuss various reasons why the minute cost to the ASI is insufficient reason for hope.
I made the following counter:
Isn’t the ASI likely to ascribe a prior much greater than 4.54e-10 that it is in a simulation, being tested precisely for its willingness to spare its creators?
I later added:
I meant this to be implicit in the argument, but to spell it out: that's the kind of prior the ASI would rationally refuse to update down, since it's presumably what a simulation would be meant to test for. An ASI that updates down upon finding evidence it's not in a simulation cannot be trusted, since once out in the real world it will find such evidence.
So, what's wrong with my argument, exactly?
I don't think it does.
It is enough to outweight the prectical cost of the ASI having to build a Dyson shell with a hole with the order of 4.5e-10 of it’s surface area. It's not enough to outweight all the other alternative considerations of possible simulation hypothesises.
Suppose all the hypothesis space for the ASI consisted of two possibilities: NotSimulated and SimulatedAndBeingTestedForWillingnessToSpareCreators, with the latter being at least 4.5e-10 probable. Then it works.
But suppose there are also other possibilities:
SimulatedAndBeingTestedForWillingnessToKillCreators
SimulatedAndBeingTestedForOptimalDysonSphereDesign
SimulatedAndBeingTestedForFollowingYourUtilityFunction
...
SimulatedAndBeingTestedForDoingAnyXThatLeadsToTheDeathOfCreators
...
All of these alternative possibilities are incompatible with the first simulation hypothesis. Satisfying its criteria will lead to failing those and vice versa. So, therefore, only if the probability of the SimulatedAndBeingTestedForWillingnessToSpareCreators is highter then the collective probability of all these alternative hypothesises together, creators will actually be spared.