We could make an ad-hoc repair to TDT by saying that you're not allowed to infer from one logical fact to another logical fact by way of a physical (empirical) fact.
In this case, the mistake happened because we went from "My decision algorithm's output" (Logical) to "Money in box" (Physical) to "Digits of Pi" (Logical), where the last step involved following an arrow on a causal graph backwards: the digits of Pi have a causal arrow going into the "money in box" node.
The TDT dependency inference could be implemented, for example, by first making all sufficiently simple logical inferences from "My decision algorithm's output", which generates a limited set of logical nodes, and then tracking physical influences forward from those nodes.
The key is that in the step where you infer logical consequences of the logical node for your decision algorithm, you should only be able to use mathematical proofs, not empirical evidence. Once you've done all you can with proofs (logical influence), then place all relevant derived logical facts in your causal graph, and use causal decision theory as usual.
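A minimal sketch of this two-phase procedure, assuming the graph is given as two separate edge maps: one for proof-based (logical) implications and one for ordinary causal arrows. The node names, the edge maps, and both function names are hypothetical and chosen only to mirror the example in this thread; nothing here is an official TDT implementation.

```python
from collections import deque

def logical_closure(start, proof_edges):
    """Phase 1: collect every logical node reachable from `start`
    using proof-only edges (no empirical evidence is consulted)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in proof_edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def forward_physical_effects(logical_nodes, causal_edges):
    """Phase 2: follow causal arrows forward only, starting from the
    logical nodes found in phase 1. Arrows are never followed backwards."""
    effects = set()
    queue = deque(logical_nodes)
    while queue:
        node = queue.popleft()
        for child in causal_edges.get(node, ()):
            if child not in effects and child not in logical_nodes:
                effects.add(child)
                queue.append(child)
    return effects

# Hypothetical graph for the scenario discussed here (names are assumptions).
proof_edges = {
    "my_output": ["simulator_output"],  # provable: the simulator computes my output
}
causal_edges = {
    "simulator_output": ["money_in_box"],
    "pi_digit": ["money_in_box"],       # pi's digit is a causal parent of the box
    "money_in_box": ["payoff"],
    "my_output": ["payoff"],
}

controlled_logical = logical_closure("my_output", proof_edges)
controlled_physical = forward_physical_effects(controlled_logical, causal_edges)
```

Because phase 2 only walks arrows forward, the `pi_digit` node is never reached from `money_in_box`, so the decision is not treated as controlling the digits of Pi.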
The logical/physical distinction itself can be seen as ad-hoc: you can consider the whole set-up Q as a program that is known to you (R), because the rules of the game were explained, and also consider yourself (R) as a program known to Q. Then, Q can reason about R in interaction with various situations (that is, run, given R, but R is given as part of Q, so "given R" doesn't say anything about Q), and R can do the same with Q (and with the R within that Q, etc.). Prisoner's dilemma can also be represented in this way, even though nobody is pull...
According to Ingredients of Timeless Decision Theory, when you set up a factored causal graph for TDT, "You treat your choice as determining the result of the logical computation, and hence all instantiations of that computation, and all instantiations of other computations dependent on that logical computation", where "the logical computation" refers to the TDT-prescribed argmax computation (call it C) that takes all your observations of the world (from which you can construct the factored causal graph) as input, and outputs an action in the present situation.
I asked Eliezer to clarify what it means for another logical computation D to be either the same as C, or "dependent on" C, for purposes of the TDT algorithm. Eliezer answered:
I replied as follows (which Eliezer suggested I post here).
If that's what TDT means by the logical dependency between Platonic computations, then TDT may have a serious flaw.
Consider the following version of the transparent-boxes scenario. The predictor has an infallible simulator D that predicts whether I one-box here [EDIT: if I see $1M]. The predictor also has a module E that computes whether the ith digit of pi is zero, for some ridiculously large value of i that the predictor randomly selects. I'll be told the value of i, but the best I can do is assign an a priori probability of 0.1 that the specified digit is zero.