You're right, and I don't know how to fix the problem without adding the equivalent of "playing chicken". But now I wonder why the "playing chicken" step needs to consider the possible actions one at a time. Instead suppose that using an appropriate oracle, the agent looks through all theorems of PA, and returns action a as soon as it sees a theorem of the form A()≠a, for any a. This was my original interpretation of cousin_it's post. Does this have the same problem, if two otherwise identical agents look through the theorems of PA in different orders?
Well, it depends on how different the order is...
If there is a theorem of PA of the form "If there are theorems of the form A()≠a and of the form A'()≠a', then the a and the a' whose corresponding theorems come first in the appropriate orderings must be identical", then you should be okay in the prisoner's dilemma setting. Otherwise there will be a model of PA in which the two players end up playing different actions, and we end up in the same situation as in the post.
More generally, no matter how you try to cut it, there will always be a model of P...
Let's consider the agent given in A model of UDT with a halting oracle. One will notice that that agent is not quite well defined because it doesn't tell us in what order we are supposed to consider actions in step 1. But surely that doesn't matter, right? Wrong.
Let's consider the prisoner's dilemma with payoff matrix given by
and consider agent A, which considers whether there is a proof that A()≠D before considering whether there is a proof that A()≠C, and agent A', which does things in the opposite order. If A or A' is pitted against itself, everything is well and mutual cooperation is the result of the game. But what if A is pitted against A'? Then A breaks down and cries.
Let's call the utility function of A U and the utility function of A' U', and consider a model of PA in which PA is inconsistent (such a model must exist if PA is consistent). In such a model we will have A()=D and A'()=C, and so U()=5 and U'()=0. That means that A will not be able to prove A()=D => U()=u for any u different from 5, and so either A will defect and A' will cooperate, or A will break down and cry. But A' will not cooperate, because it cannot prove A'()=C => U'()=u' for any u' except possibly 0. So A will break down and cry. QED
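To see why the order matters in such a model, here is a minimal Python sketch of step 1 only. It is a toy, not the agent from the original post: the real agent needs a halting oracle, which is not computable, so the `proves` argument is a hypothetical stand-in, and `chicken_step`, `everything_provable` and `nothing_provable` are names invented for this illustration. The oracle that answers "yes" to everything plays the role of a model in which PA is inconsistent.

```python
from typing import Callable, Optional, Sequence

# Hypothetical stand-in for the provability oracle: statement -> "is it provable?"
Oracle = Callable[[str], bool]


def chicken_step(agent: str, order: Sequence[str], proves: Oracle) -> Optional[str]:
    """Return the first action a (in the given order) for which the oracle
    reports a proof of '<agent>()≠a', or None if no such proof is found."""
    for action in order:
        if proves(f"{agent}()≠{action}"):
            return action
    return None


def everything_provable(statement: str) -> bool:
    # Assumption: mimics a model in which PA is inconsistent,
    # so every statement counts as provable.
    return True


def nothing_provable(statement: str) -> bool:
    # Assumption: mimics the case where no statement 'A()≠a' is provable.
    return False


# A checks D first, A' checks C first.
a_move = chicken_step("A", ["D", "C"], everything_provable)
a_prime_move = chicken_step("A'", ["C", "D"], everything_provable)
print(a_move, a_prime_move)  # prints "D C"
```

With `nothing_provable` both calls return `None`, meaning step 1 does not fire and the agent would move on to the utility proofs, which this sketch does not model.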
More generally, if M is a model of PA in which PA is inconsistent, an agent defined in this way will never be able to prove A()=a => U()=u (where a is the first action considered in step 1), except possibly for u=u0, where u0 is the value of U() in M. That seems to create a huge problem for that approach to UDT.