But what if the system proves it will one-box, then forms a counterfactual that two-boxing will get it $10^10, and so two-boxes? This makes the thing that it proved false, which makes the system inconsistent.
If we know the inference system to be consistent, this proves that the line of reasoning you describe can't happen. Indeed, this is essentially the way we prove that the diagonal step guarantees that the agent doesn't infer its decision: if it did, that would make its inference system unsound, and we assume it's not. So what happens is that if the system proves that it will one-box, it doesn't prove that two-boxing leads to $10^10; instead, it proves something that would make it one-box, such as that two-boxing leads to minus $300.
At orthonormal's suggestion, I am moving this out of the comments.
Consider a CDT agent making a decision in Newcomb's problem, in which Omega is known to make predictions by perfectly simulating the players. Assume further that the agent is capable of anthropic reasoning about simulations. Then, while making its decision, the agent will be uncertain about whether it is in the real world or in Omega's simulation, since the world would look the same to it either way.
The resulting problem has a structural similarity to the Absent-Minded Driver problem [1]. As in that problem, directly assigning probabilities to each of the two possibilities is incorrect. The planning-optimal decision, however, is readily available to CDT, and it is, naturally, to one-box.
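To make the planning-optimal calculation concrete, here is a minimal sketch. The payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) are the usual Newcomb figures and are an assumption, since this post does not fix them. The function names are made up for illustration. The sketch also assumes that a mixed policy is drawn independently in the simulation and in the real world, which is one possible modelling choice and not the only one.

```python
# Assumed payoffs (standard Newcomb figures, not specified in the post).
BIG = 1_000_000  # opaque box content if Omega's simulation one-boxed
SMALL = 1_000    # transparent box content, always present


def payoff(real_one_boxes: bool, sim_one_boxed: bool) -> int:
    """Money the real agent walks away with."""
    opaque = BIG if sim_one_boxed else 0
    return opaque if real_one_boxes else opaque + SMALL


def planning_value(p_one_box: float) -> float:
    """Expected payoff of the policy 'one-box with probability p'.

    The same policy runs twice: once in Omega's simulation, once for real.
    Assumption: the two random draws are independent.
    """
    total = 0.0
    for sim_one_boxed in (True, False):
        p_sim = p_one_box if sim_one_boxed else 1 - p_one_box
        for real_one_boxes in (True, False):
            p_real = p_one_box if real_one_boxes else 1 - p_one_box
            total += p_sim * p_real * payoff(real_one_boxes, sim_one_boxed)
    return total


def best_policy(steps: int = 100) -> float:
    """Grid search over p for the policy with the highest planning value."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=planning_value)
```

Under these assumptions the expected payoff works out to 1000 + 999000 * p, so the grid search picks p = 1, that is, always one-box. The point of the sketch is only that the agent evaluates one policy that is executed at both decision points, instead of guessing which of the two points it currently occupies.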
Objection 1. This argument requires that Omega is known to make predictions by simulation, which is not necessarily the case.
Answer: It appears to be sufficient that the agent only knows that Omega is always correct. If this is the case, then a simulating-Omega and some-other-method-Omega are indistinguishable, so the agent can freely assume simulation.
[This reasoning is rather shaky, and I'm not sure it is correct in general. However, I hypothesise that whatever method Omega uses, if the CDT agent knows the method, it will one-box. It is only a "magical Omega" that throws CDT off.]
Objection 2. The argument does not work for problems where Omega is not always correct but is correct with, say, 90% probability.
Answer: Such problems are underspecified, because it is unclear how the probability is calculated. [For example, an Omega that always predicts "two-box" will be correct in 90% of cases if 90% of agents in the population are two-boxers.] A "natural" way to complete the problem definition is to stipulate that there is no correlation between the correctness of Omega's predictions and any property of the players. But this is equivalent to Omega first making a perfectly correct prediction and then adding 10% random noise. In this case, the CDT agent is again free to consider Omega a perfect simulator (with added noise), which again leads to one-boxing.
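The "perfect simulator plus noise" reading can be checked with a few lines of arithmetic. This is a sketch under assumptions: the payoffs are again the usual $1,000,000 and $1,000 (not given in this post), the function name is hypothetical, and "10% noise" is modelled as Omega flipping its otherwise perfect prediction with probability 1/10, independently of the player.

```python
from fractions import Fraction

# Assumed payoffs (standard Newcomb figures, not specified in the post).
BIG = 1_000_000  # opaque box content if Omega predicts one-boxing
SMALL = 1_000    # transparent box content, always present


def noisy_value(one_box: bool, noise: Fraction = Fraction(1, 10)) -> Fraction:
    """Expected payoff of a deterministic policy against a noisy Omega.

    Omega first simulates the agent perfectly, then flips the resulting
    prediction with probability `noise` (an assumed noise model).
    """
    total = Fraction(0)
    for flipped, prob in ((False, 1 - noise), (True, noise)):
        # A flip turns a correct prediction into an incorrect one.
        predicted_one_box = one_box != flipped
        opaque = BIG if predicted_one_box else 0
        total += prob * (opaque if one_box else opaque + SMALL)
    return total
```

With 10% noise, one-boxing is worth 900,000 in expectation and two-boxing 101,000, so the policy-level comparison still favours one-boxing. With these particular payoffs the comparison only reverses once the flip probability reaches just under one half.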
Objection 3. In order for the CDT agent to one-box, it needs a special "non-self-centered" utility function, one that, when inside the simulation, would value things outside it.
Answer: The agent in the simulation has exactly the same experiences as the agent outside, so it is the same self, so it values the Omega-offered utilons the same. This seems to be a general consequence of reasoning about simulations. Of course, it is possible to give the agent a special irrational simulation-fearing utility, but what would be the purpose?
Objection 4. CDT still won't cooperate in the Prisoner's Dilemma against a CDT agent with an orthogonal utility function.
Answer: damn.
[1] Thanks to Will_Newsome for pointing me to this.