Please stop using the words "rational" and "optimal", and give me some sign that you've read the linked post on counterfactuals rather than asking counterfactual questions whose assumptions you refuse to spell out.
The only difficult question here concerns the imbalance in knowledge between Omega and a human, per comment by shminux. Because of this, I don't actually know what TDT does here (much less 'rationality').
Assumptions: The game uses the payout matrix described in the OP, and the second player learns of the first player's move before making his own move. Both players know that both players are trying to win and will not use a strategy that does not result in them winning.
My conclusion is that both players defect. My problem is that it would be better for player 2 if player 2 did not have the option to defect if player 1 cooperated.
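The reasoning behind that conclusion can be sketched as a small backward-induction calculation. This is only an illustration: the OP's payout matrix is not reproduced here, so the numbers in `PAYOFFS` are assumed values with the usual prisoner's dilemma ordering, and the names `best_reply`, `solve`, and `allowed_after_c` are hypothetical.

```python
# Assumed payoffs, NOT the OP's matrix: keys are (player 1 move, player 2 move),
# values are (player 1 payoff, player 2 payoff).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
MOVES = ("C", "D")


def best_reply(p1_move, allowed=MOVES):
    """Player 2 has seen player 1's move and picks the reply that pays player 2 most."""
    return max(allowed, key=lambda m2: PAYOFFS[(p1_move, m2)][1])


def solve(allowed_after_c=MOVES):
    """Backward induction: player 1 anticipates player 2's best reply to each move."""
    def reply(m1):
        # Player 2's options may be restricted only after player 1 cooperates.
        return best_reply(m1, allowed_after_c if m1 == "C" else MOVES)

    p1_move = max(MOVES, key=lambda m1: PAYOFFS[(m1, reply(m1))][0])
    return p1_move, reply(p1_move)


print(solve())                        # player 2 free to defect after C
print(solve(allowed_after_c=("C",)))  # player 2 unable to defect after C
```

Under these assumed numbers, the unrestricted game ends in mutual defection, while removing player 2's option to defect after cooperation ends in mutual cooperation, which pays player 2 more.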
I've thrown out cooperatebot and reverse quid pro quo as candidates for best strategy.
FYI: I'm using this as my reference, and this hinges ...
Sometimes I see new ideas that, without offering any new information, offer a new perspective on old information, and a new way of thinking about an old problem. So it is with this lecture and the prisoner's dilemma.
Now, I've worked a lot with the prisoner's dilemma, with superrationality, negotiations, fairness, retaliation, Rawlsian veils of ignorance, etc. I've studied the problem, and its possible resolutions, extensively. But the perspective of that lecture was refreshing and new to me:
The prisoner's dilemma is resolved only when the off-diagonal outcomes of the dilemma are known to be impossible.
The "off-diagonal outcomes" are the "(Defect, Cooperate)" and the "(Cooperate, Defect)" squares where one person walks away with all the benefit and the other has none.
Facing an identical (or near identical) copy of yourself? Then the off-diagonal outcomes are impossible, because you're going to choose the same thing. Facing Tit-for-tat in an iterated prisoner's dilemma? Well, the off-diagonal squares cannot be reached consistently. Is the other prisoner a Mafia don? Then the off-diagonal outcomes don't exist as written: there's a hidden negative term (you being horribly murdered) that isn't taken into account in that matrix. Various agents with open code are essentially publicly declaring the conditions under which they will not reach for the off-diagonal. The point of many contracts and agreements is to make the off-diagonal outcome impossible or expensive.
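Two of these cases can be sketched in a few lines. This is a rough illustration under stated assumptions: the payoff numbers are made up (any values with the usual prisoner's dilemma ordering would do), and `defect_dominates`, `best_move_against_copy`, `with_penalty`, and the penalty size are hypothetical names and values chosen for the example.

```python
# Assumed payoffs for illustration; keys are (my move, their move),
# values are (my payoff, their payoff).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def defect_dominates(payoffs):
    """True if defecting pays me strictly more than cooperating against every opposing move."""
    return all(payoffs[("D", other)][0] > payoffs[("C", other)][0] for other in "CD")


def best_move_against_copy(payoffs):
    """Against an identical copy, only the diagonal (C, C) and (D, D) is reachable."""
    return max("CD", key=lambda move: payoffs[(move, move)][0])


def with_penalty(payoffs, penalty):
    """Hidden-term variant: whoever defects against a cooperator also pays `penalty`."""
    adjusted = dict(payoffs)
    me, them = payoffs[("D", "C")]
    adjusted[("D", "C")] = (me - penalty, them)
    me, them = payoffs[("C", "D")]
    adjusted[("C", "D")] = (me, them - penalty)
    return adjusted


print(defect_dominates(PAYOFFS))                    # plain dilemma
print(best_move_against_copy(PAYOFFS))              # off-diagonal unreachable
print(defect_dominates(with_penalty(PAYOFFS, 10)))  # off-diagonal made expensive
```

With these numbers, defection dominates in the plain matrix, cooperation is the best move once only the diagonal is reachable, and defection stops dominating once the off-diagonal squares carry the extra penalty.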
As I said, nothing fundamentally new, but I find the perspective interesting. To my mind, it suggests that when resolving the prisoner's dilemma with probabilistic outcomes allowed, I should be thinking "blocking off possible outcomes", rather than "reaching agreement".