dankane comments on Causal decision theory is unsatisfactory - LessWrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (158)
This is closer to describing the self-modifying CDT approach. One of the motivations for development of TDT and UDT is that you don't necessarily get an opportunity to do such self-modification beforehand, let alone to compute the optimal decisions for all possible scenarios you think might occur.
So the idea of UDT is that the design of the code should already suffice to guarantee that if you end up in a newcomblike situation you behave "as if" you did have the opportunity to do whatever precommitment would have been useful. When prompted for a decision, UDT asks "what is the (fixed) optimal conditional strategy" and outputs the result of applying that strategy to its current state of knowledge.
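As a rough illustration of "compute the fixed optimal conditional strategy, then apply it to what you currently know", here is a minimal toy sketch. Everything in it is an assumption made for illustration: the payoff numbers, the perfect predictor, the one-world prior, and the names `PRIOR`, `expected_utility` and `udt_decide` are hypothetical and are not taken from any actual UDT formalization.

```python
from itertools import product

# Hypothetical toy model: a single possible observation and two actions.
OBSERVATIONS = ["newcomb"]
ACTIONS = ["one_box", "two_box"]

def newcomb_payoff(policy):
    """Payoff in a world where a (assumed perfect) predictor reads the policy."""
    # The predictor fills the opaque box based on the strategy itself,
    # not on anything the agent does after the boxes are set up.
    opaque = 1_000_000 if policy["newcomb"] == "one_box" else 0
    transparent = 1_000 if policy["newcomb"] == "two_box" else 0
    return opaque + transparent

# Assumed prior over worlds: (probability, payoff as a function of the policy).
PRIOR = [(1.0, newcomb_payoff)]

def all_policies():
    """Enumerate every mapping from observations to actions."""
    for choice in product(ACTIONS, repeat=len(OBSERVATIONS)):
        yield dict(zip(OBSERVATIONS, choice))

def expected_utility(policy):
    """Score a whole policy against the prior, ignoring the current observation."""
    return sum(p * payoff(policy) for p, payoff in PRIOR)

def udt_decide(observation):
    """Pick the best whole policy, then look up its answer for this observation."""
    best = max(all_policies(), key=expected_utility)
    return best[observation]

print(udt_decide("newcomb"))
```

The point of the sketch is only that the maximization ranges over whole policies rather than over the current action, so no separate precommitment step is needed to end up one-boxing.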
Basically this, except there's no need to actually do it beforehand.
If you like, you can consider the UDT agent's code itself to be the output of such "preprocessing"... except that there is no real pre-computation required, apart from giving the UDT agent a realistic prior.
Actually, no. To implement things correctly, UDT needs to determine its entire strategy all at once. It cannot decide whether to one-box or two-box in Newcomb just by considering the Newcomb problem that it is currently dealing with. It must also consider all possible hypothetical scenarios where any other agent's action depends on whether or not UDT one-boxes.
Furthermore, UDT cannot decide what it does in Newcomb independently of what it does in the Counterfactual Mugging, because some hypothetical entity might give it rewards based on some combination of the two behaviors. UDT needs to compute its entire strategy (i.e. its response to all possible scenarios) all at the same time before it can determine what it should do in any particular situation. [OK, not quite true. It might be able to prove that whatever the optimal strategy is, it involves doing X in situation Y, without actually determining the optimal strategy. Then again, this seems really hard, since doing almost anything directly from Kolmogorov priors is basically impossible.]
Conceptually, yes. The point is that you don't need to actually literally explicitly compute your entire strategy at t=-∞. All you have to do is prove a particular property of the strategy (namely, its action in situation Y) at the time when you are asked for a decision.
Obviously, like every computational activity ever, you must still make approximations, because it is usually infeasible to make inferences over the entire Tegmark IV multiverse when you need to make a decision. An example of such an approximation would be neglecting the measure of "entities that give it rewards based on some combination of [Newcomb's and counterfactual mugging]" in many situations, because I expect such things to be rare (significantly rarer than Newcomb's and counterfactual mugging themselves).
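To make the disagreement concrete, here is a toy sketch in which the policy covers both Newcomb and the counterfactual mugging, plus a hypothetical "combo" entity that pays out only for one particular combination of the two behaviors. All payoffs, the equal split of the remaining measure, and the names `combo` and `best_policy` are assumptions chosen for illustration, not part of any real UDT specification.

```python
from itertools import product

# Hypothetical situations and the actions available in each.
SITUATIONS = {"newcomb": ["one_box", "two_box"], "mugging": ["pay", "refuse"]}

def newcomb(policy):
    """Assumed perfect predictor fills the opaque box iff the policy one-boxes."""
    opaque = 1_000_000 if policy["newcomb"] == "one_box" else 0
    return opaque + (1_000 if policy["newcomb"] == "two_box" else 0)

def mugging(policy):
    """Fair coin: heads pays 10,000 iff the policy would pay 100 on tails."""
    pays = policy["mugging"] == "pay"
    return 0.5 * (10_000 if pays else 0) + 0.5 * (-100 if pays else 0)

def combo(policy):
    """Hypothetical entity rewarding one specific combination of behaviors."""
    chosen = (policy["newcomb"], policy["mugging"])
    return 10_000_000 if chosen == ("two_box", "refuse") else 0

def best_policy(combo_measure):
    """Jointly optimize the whole policy for a given measure of the combo world."""
    rest = (1.0 - combo_measure) / 2  # assumed: remaining measure split evenly
    prior = [(rest, newcomb), (rest, mugging), (combo_measure, combo)]
    policies = [dict(zip(SITUATIONS, choice))
                for choice in product(*SITUATIONS.values())]
    return max(policies, key=lambda pol: sum(w * world(pol) for w, world in prior))

for measure in (0.0, 1e-6, 0.5):
    print(measure, best_policy(measure))
```

With these made-up numbers, a combo measure of 0 or 1e-6 gives the same policy (one-box, pay), so neglecting the combo entity costs nothing; only when its measure is made large does the answer in both situations flip together, which is the sense in which the two decisions are not independent in principle.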