I wrote a very brief comment on Eliezer's last post, which upon reflection I thought could benefit from a separate post to fully discuss its implications.
Eliezer argues that we shouldn't really hope to be spared even though:
Asking an ASI to leave a hole in a Dyson Shell, so that Earth could get some sunlight not transformed to infrared, would cost It 4.5e-10 of Its income.
He then goes on to discuss various reasons why the minute cost to the ASI is insufficient reason for hope.
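As a sanity check on the quoted 4.5e-10 figure, here is a minimal sketch. It assumes the figure is simply the fraction of the Sun's output that Earth intercepts: the area of Earth's disc divided by the area of a sphere with a radius of 1 AU. The constants and the function name `intercepted_fraction` are my own illustrative choices, not something taken from Eliezer's post.

```python
import math

# Assumed physical constants (standard reference values).
EARTH_RADIUS_KM = 6371.0       # mean radius of Earth
AU_KM = 1.495978707e8          # one astronomical unit

def intercepted_fraction(radius_km: float, orbit_km: float) -> float:
    """Fraction of a star's total output hitting a planet's disc.

    The planet's cross-section (pi * r^2) is divided by the area of
    the sphere at its orbital distance (4 * pi * d^2).
    """
    disc_area = math.pi * radius_km ** 2
    sphere_area = 4.0 * math.pi * orbit_km ** 2
    return disc_area / sphere_area

fraction = intercepted_fraction(EARTH_RADIUS_KM, AU_KM)
print(f"Fraction of sunlight Earth intercepts: {fraction:.2e}")
```

Under these assumptions the result comes out at about 4.5e-10, which matches the order of magnitude used in the quote.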
I made the following counter:
Isn’t the ASI likely to ascribe a prior much greater than 4.54e-10 that it is in a simulation, being tested precisely for its willingness to spare its creators?
I later added:
I meant this to be implicit in the argument, but to spell it out: that's the kind of prior the ASI would rationally refuse to update down, since it's presumably what a simulation would be meant to test for. An ASI that updates down upon finding evidence it's not in a simulation cannot be trusted, since once out in the real world it will find such evidence.
So, what's wrong with my argument, exactly?
Essentially the same question was asked in May 2022, although you did a better job of wording it. Back then the question received 3 answers and some back-and-forth discussion:
https://www.lesswrong.com/posts/vaX6inJgoARYohPJn/
I'm the author of one of the 3 answers and am happy to continue the discussion. I suggest we continue it here rather than in the 2-year-old web page.
Clarification: I acknowledge that sparing our lives would be so easy for an ASI that it would do so if it thought that killing us all carried even a one-in-100,000 chance of something really bad happening to it (assuming, as is likely, that the state of reality many thousands of years from now matters to the ASI). I just estimate the probability of the ASI's thinking the latter to be about 0.03 or so, and most of that 0.03 comes from considerations other than the one we are discussing here (i.e., that the ASI is being fed fake sensory data as a test). (I suggest tabooing the terms "simulate" and "simulation".)
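The comparison underlying this exchange can be written as a simple expected-value check. This is only a sketch under loud assumptions: the ASI is treated as a plain expected-value maximizer, the cost of sparing Earth is the 4.5e-10 fraction of income quoted in the post, and `loss_fraction` (the share of its income the ASI would lose if the "something really bad" happened) is a hypothetical parameter that nobody in the thread specifies.

```python
# Cost of sparing Earth as a fraction of the ASI's income (from the quote).
SPARING_COST = 4.5e-10

def should_spare(p_bad: float, loss_fraction: float,
                 cost: float = SPARING_COST) -> bool:
    """True if the expected loss from killing exceeds the cost of sparing.

    p_bad: the ASI's probability that killing us leads to the bad outcome.
    loss_fraction: hypothetical share of income lost in that outcome.
    """
    return p_bad * loss_fraction > cost

def break_even_probability(loss_fraction: float,
                           cost: float = SPARING_COST) -> float:
    """Smallest p_bad at which sparing starts to pay off."""
    return cost / loss_fraction

# Illustrative case: the bad outcome costs the ASI everything.
print(should_spare(p_bad=1e-5, loss_fraction=1.0))
print(break_even_probability(loss_fraction=1.0))
```

On this toy model, a one-in-100,000 chance of total loss is far above the break-even point, so the disagreement is not about the arithmetic but about whether the ASI would assign that much probability in the first place.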
It's the first: there's a lot of uncertainty. I don't think anyone is lying deliberately, although everyone's beliefs tend to follow what they think will produce good outcomes. This is called motivated reasoning.
I don't think this changes the situation much, except to make it harder to coordinate. Rushing full speed ahead while we don't even know the dangers is pretty dumb. But some people really believe the dangers are small so they're going to rush ahead. There aren't strong arguments or a strong consensus for the danger being extremely high, even though...