Reply to: Late great filter is not bad news
Suppose that you build an AI, and Omega appears to it and says:
Here's a button. A million years ago I calculated the umpteenth digit of pi. If it is even, I calculated whether you would press this button (in such a way that your human creator was never simulated as a conscious being). If I predicted that you wouldn't press the button, I destroyed Earth right then and there.* If it is odd, I created a doomsday device that will destroy the solar system if you press this button.
[* ETA: Assume that if the digit is even and the AI is predicted to press the button, Omega does not destroy Earth, but does turn Alpha Centauri purple (say). The point is for this to be a scenario that you, the AI creator, know not to have come to pass.]
Suppose you're the kind of AI creator whose AI is time consistent in a certain sense from the beginning of time and presses the button. Then you have an AI that satisfies a certain kind of philosopher, wins big in a certain logically impossible world, and destroys humanity.
Suppose, on the other hand, that you're a very similar kind of AI creator, only you program your AI not to take into account impossible possible worlds that had already turned out to be impossible (when you created the AI | when you first became convinced that timeless decision theory is right). Then you've got an AI that most of the time acts the same way, but does worse in worlds we know to be logically impossible, and destroys humanity less often in worlds we do not know to be logically impossible.
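To make the contrast between the two kinds of AI creator concrete, here is a minimal sketch of the policy comparison. Everything in it is an illustrative assumption rather than part of the problem statement: the names `Payoffs`, `policy_value`, and `best_policy` are hypothetical, the utility numbers are made up (all that matters is that losing Earth a million years ago is scored as worse than losing the solar system now), and `p_even` stands for whatever weight the AI's "prior" gives to the world where the digit is even.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Payoffs:
    """Hypothetical utilities for the four outcomes of Omega's button problem."""
    even_press: float   # digit even, AI predicted to press: Earth is spared
    even_refuse: float  # digit even, AI predicted to refuse: Earth destroyed long ago
    odd_press: float    # digit odd, AI presses: solar system destroyed
    odd_refuse: float   # digit odd, AI refuses: nothing happens


def policy_value(press: bool, p_even: float, u: Payoffs) -> float:
    """Expected utility of a fixed policy, weighting the even world by p_even."""
    even = u.even_press if press else u.even_refuse
    odd = u.odd_press if press else u.odd_refuse
    return p_even * even + (1.0 - p_even) * odd


def best_policy(p_even: float, u: Payoffs) -> str:
    """Return the policy with the higher expected utility (ties go to 'refuse')."""
    if policy_value(True, p_even, u) > policy_value(False, p_even, u):
        return "press"
    return "refuse"


# Made-up numbers, chosen only so that the even world's stakes are larger.
ILLUSTRATIVE = Payoffs(even_press=0.0, even_refuse=-2.0,
                       odd_press=-1.0, odd_refuse=0.0)

# First kind of creator: the prior still gives the even world weight 1/2.
first_kind = best_policy(0.5, ILLUSTRATIVE)

# Second kind of creator: the even world was already known to be
# impossible when the AI was built, so it gets weight 0.
second_kind = best_policy(0.0, ILLUSTRATIVE)
```

Under these assumed numbers the first AI presses and the second refuses. Since we already know we are not in the even world, only the second choice avoids destroying the solar system.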
Wei Dai's great filter post seems to suggest that under UDT, you should be the first kind of AI creator. I don't think that's true, actually; I think that in UDT, you should probably not start with a "prior" probability distribution that gives significant weight to logical propositions you know to be false: do you think the AI should press the button if it was the first digit of pi that Omega calculated?
But obviously, you don't want tomorrow's you to pick the prior that way just after Omega has appeared to it in a counterfactual mugging (because according to your best reasoning today, there's a 50% chance this loses you a million dollars).
The most convincing argument I know for timeless flavors of decision theory is that if you could modify your own source code, the course of action that maximizes your expected utility is to modify yourself into a timeless decider. So yes, you should do that. Any AI you build should be timeless from the start; and it's reasonable to make yourself into the kind of person that will decide timelessly with your probability distribution today (if you can do that).
But I don't think you should decide that updateless decision theory is therefore so pure and reflectively consistent that you should go and optimize your payoff even in worlds whose logical impossibility was clear before you first decided to be a timeless decider (say). Perhaps it's less elegant to justify UDT through self-modification at some arbitrary point in time than through reflective consistency all the way from the big bang on; but in the worlds we can't rule out yet, it's more likely to win.
My intent when I said "never instantiated as a conscious being" was that Omega used some accurate statistical method of prediction that did not include a whole simulation of what you are experiencing right now. I agree that I can't resolve the confusion about what "conscious" means, but when considering Omega problems, I don't think it's going too far to postulate that Omega can use statistical models that predict very accurately what I'll do without that prediction leading to a detailed simulation of me.
Ok, I can't rigorously justify a fundamental difference between "a brain being simulated (and thus experiencing things)" and "a brain not actually simulated (and therefore not experiencing things)," so perhaps I can't logically conclude that Omega didn't destroy Earth even if its prediction algorithm doesn't simulate me. But it still seems important to me to work well if there is such a difference (if there isn't, why should I care whether Omega "really" destroys Earth "a million years before my subjective now", if I go on experiencing my life the way I "only seem" to experience it now?)
The point is that the accurate statistical method is going to predict what the AI would do if it were created by a conscious human, so the decision theory cannot use the fact that the AI was created by a conscious human to discriminate between the two cases. It has equal strength beliefs in that fact in both cases, so the likelihood rati...