Reply to: Late great filter is not bad news
Suppose that you build an AI, and Omega appears to it and says:
Here's a button. A million years ago I calculated the umpteenth digit of pi. If it is even, I calculated whether you would press this button (in such a way that your human creator was never simulated as a conscious being). If I predicted that you wouldn't press the button, I destroyed Earth right then and there.* If it is odd, I created a doomsday device that will destroy the solar system if you press this button.
[* ETA: Assume that if the digit is even and the AI is predicted to press the button, Omega does not destroy Earth, but does turn Alpha Centauri purple (say). The point is for this to be a scenario that you, the AI creator, know not to have come to pass.]
Suppose you're the kind of AI creator whose AI is time-consistent in a certain sense from the beginning of time and presses the button. Then you have an AI that satisfies a certain kind of philosopher, wins big in a certain logically impossible world, and destroys humanity.
Suppose, on the other hand, that you're a very similar kind of AI creator, only you program your AI not to take into account impossible possible worlds that had already turned out to be impossible (when you created the AI | when you first became convinced that timeless decision theory is right). Then you've got an AI that most of the time acts the same way, but does worse in worlds we know to be logically impossible, and destroys humanity less often in worlds we do not know to be logically impossible.
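To put numbers on that trade-off, here is a minimal sketch in Python. The 50/50 prior over the digit's parity and all of the utility values are my own illustrative assumptions, chosen only to exhibit the structure of the argument; nothing here is fixed by the scenario itself.

```python
# Illustrative sketch of the button scenario. Utilities and the 50/50 prior
# over the digit's parity are made-up numbers, not part of Omega's statement.

P_EVEN = 0.5  # prior probability, before anyone checks, that the digit is even

U = {
    "earth_destroyed_long_ago": -1000.0,  # even digit, AI predicted not to press
    "solar_system_destroyed":    -900.0,  # odd digit, AI presses the button
    "nothing_bad_happens":          0.0,  # every other combination
}

def updateless_eu(presses_button: bool) -> float:
    """Expected utility over the prior, ignoring everything observed since."""
    if presses_button:
        # Even branch: Omega predicted the press, so Earth was spared.
        # Odd branch: the doomsday device goes off.
        return P_EVEN * U["nothing_bad_happens"] + (1 - P_EVEN) * U["solar_system_destroyed"]
    else:
        # Even branch: Omega predicted no press and destroyed Earth long ago.
        # Odd branch: nothing happens.
        return P_EVEN * U["earth_destroyed_long_ago"] + (1 - P_EVEN) * U["nothing_bad_happens"]

def updated_eu(presses_button: bool) -> float:
    """Expected utility after conditioning on what the creator already knows:
    Earth exists and Alpha Centauri is not purple, so the even branch is out."""
    return U["solar_system_destroyed"] if presses_button else U["nothing_bad_happens"]

print(updateless_eu(True), updateless_eu(False))  # -450.0 -500.0 -> pressing "wins"
print(updated_eu(True), updated_eu(False))        # -900.0    0.0 -> pressing loses
```

The first kind of AI creator builds an AI that maximizes the first pair of numbers; the second kind builds one that maximizes the second pair, which is the only pair computed over worlds the creator could actually be living in.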
Wei Dai's great filter post seems to suggest that under UDT, you should be the first kind of AI creator. I don't think that's true, actually; I think that in UDT, you should probably not start with a "prior" probability distribution that gives significant weight to logical propositions you know to be false: do you think the AI should press the button if it was the first digit of pi that Omega calculated?
But obviously, you don't want tomorrow's you to pick the prior that way just after Omega has appeared to it in a counterfactual mugging (because according to your best reasoning today, there's a 50% chance this loses you a million dollars).
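That parenthetical is just the usual counterfactual-mugging arithmetic. A minimal sketch, using the million-dollar prize mentioned above and an assumed $100 payment on the tails branch (the $100 figure is my own illustrative choice):

```python
# Counterfactual mugging, evaluated from today, before Omega's coin is flipped.
# Stakes assumed for illustration: pay $100 on tails; receive $1,000,000 on
# heads if Omega predicts you would have paid on tails.

P_HEADS = 0.5

def expected_value(pays_on_tails: bool) -> float:
    heads_payoff = 1_000_000 if pays_on_tails else 0
    tails_payoff = -100 if pays_on_tails else 0
    return P_HEADS * heads_payoff + (1 - P_HEADS) * tails_payoff

print(expected_value(True))   # 499950.0 -> today, committing to pay looks best
print(expected_value(False))  #      0.0
# A you that re-picks its prior tomorrow, after seeing tails, would refuse to
# pay -- and by today's reckoning that policy forfeits the 50% shot at the million.
```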
The most convincing argument I know for timeless flavors of decision theory is that if you could modify your own source code, the course of action that maximizes your expected utility is to modify into a timeless decider. So yes, you should do that. Any AI you build should be timeless from the start; and it's reasonable to make yourself into the kind of person that will decide timelessly with your probability distribution today (if you can do that).
But I don't think you should decide that updateless decision theory is therefore so pure and reflectively consistent that you should go and optimize your payoff even in worlds whose logical impossibility was clear before you first decided to be a timeless decider (say). Perhaps it's less elegant to justify UDT through self-modification at some arbitrary point in time than through reflective consistency all the way from the big bang on; but in the worlds we can't rule out yet, it's more likely to win.
I agree. The way I think of it is this:
If you find yourself deciding contrary to UDT, you may infer that either

1. your designer was stupider or crazier than you, or
2. your designer was smarter or less crazy than you.
In the first case, your designer failed to realize something that you would have realized, had you been in your designer's place. In the second case, your designer knew something that you would not have known in your designer's place.
For example, if you find yourself turning down Omega in Pascal's Mugging [ETA: I meant the Counterfactual Mugging], you may infer that either
1. your designer failed to design you in a way that you would have expected would maximize your utility, or
2. your designer somehow predicted, with sufficient confidence, how Omega's coin would land, and so designed you to do what is best in that case.
Of course, as a present-day human, you already knew that (1) is the case, not (2). You were designed by a stupid process of natural selection. You already knew that your designer didn't take into account the possibility of being confronted by something like Omega. It's true that other humans contributed some modest amount to your design, through education and the art of rationality. However, even these additions were made by people who were, for the most part, crazier or stupider than you are now.
So, finding that you are declining Omega, contrary to UDT, isn't even necessarily bad news. It probably just means that your designers lucked out more than they deserved, which would be good news.
But that doesn't change the fact that your designers were stupid, and that you ought to want to do better. You ought to modify yourself and any of your progeny to act more in accord with UDT in the future.