Assumptions: The game uses the payout matrix described in the OP, and the second player learns of the first player's move before making his own. Both players know that each of them is trying to win and will not use a strategy that does not result in winning.
My conclusion is that both players defect. My problem is that it would be better for player 2 if player 2 did not have the option to defect if player 1 cooperated.
I've thrown out cooperatebot and reverse quid pro quo as candidates for best strategy.
FYI: I'm using this as my reference, and this hinges on reflexive inconsistency. I can't find a reflexively consistent strategy even with only two options available. (Note that defectbot equals or outperforms quid pro quo in every case.)
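For readers who want to check the backward-induction argument in the assumptions above, here is a minimal Python sketch. The payoff numbers are the conventional textbook values (5, 3, 1, 0), assumed here because the OP's matrix is not reproduced in this thread, and `solve` and `allowed_replies` are hypothetical names chosen for illustration.

```python
# Payoffs are (player 1, player 2). These numbers are ASSUMED, not taken
# from the OP: the usual temptation=5, reward=3, punishment=1, sucker=0.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def solve(allowed_replies):
    """Backward induction for the sequential game.

    allowed_replies maps each move of player 1 to the replies
    player 2 is permitted to make after seeing that move.
    """
    def best_reply(m1):
        # Player 2 sees m1 and maximises his own payoff.
        return max(allowed_replies[m1], key=lambda m2: PAYOFF[(m1, m2)][1])

    # Player 1 anticipates player 2's reply and maximises his own payoff.
    m1 = max(allowed_replies, key=lambda m: PAYOFF[(m, best_reply(m))][0])
    return m1, best_reply(m1)


# Player 2 may answer either move with either reply.
full = solve({"C": ["C", "D"], "D": ["C", "D"]})
# Player 2 has lost the option to defect after player 1 cooperates.
restricted = solve({"C": ["C"], "D": ["C", "D"]})

print(full, PAYOFF[full])
print(restricted, PAYOFF[restricted])
```

Under these assumed payoffs the unrestricted game ends in mutual defection, while removing player 2's option to defect after cooperation ends in mutual cooperation, which pays player 2 more. That is the tension described in the comment above; it does not address the reflexive-consistency question.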
Again, you don't sound like you've read this post here. Let's say that, in fact, "it would be better for player 2 if player 2 did not have the option to defect if player 1 cooperated" - though I'm not at all sure of that, when player 2 is Omega - and let's say Omega uses TDT. Then it will ask counterfactual questions about what "would" happen if Omega's own abstract decision procedure gave various answers. Because of the nature of the counterfactuals, these will screen off any actions by player 1 that depend on said answers, even 'known...
Sometimes I see new ideas that, without offering any new information, offer a new perspective on old information and a new way of thinking about an old problem. So it is with this lecture and the prisoner's dilemma.
Now, I've worked a lot with the prisoner's dilemma, with superrationality, negotiations, fairness, retaliation, Rawlsian veils of ignorance, etc. I've studied the problem, and its possible resolutions, extensively. But the perspective of that lecture was refreshing and new to me:
The prisoner's dilemma is resolved only when the off-diagonal outcomes of the dilemma are known to be impossible.
The "off-diagonal outcomes" are the "(Defect, Cooperate)" and the "(Cooperate, Defect)" squares where one person walks away with all the benefit and the other has none:
Facing an identical (or near identical) copy of yourself? Then the off-diagonal outcomes are impossible, because you're going to choose the same thing. Facing Tit-for-tat in an iterated prisoner's dilemma? Well, the off-diagonal squares cannot be reached consistently. Is the other prisoner a Mafia don? Then the off-diagonal outcomes don't exist as written: there's a hidden negative term (you being horribly murdered) that isn't taken into account in that matrix. Various agents with open code are essentially publicly declaring the conditions under which they will not reach for the off-diagonal. The point of many contracts and agreements is to make the off-diagonal outcome impossible or expensive.
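To make a couple of these cases concrete, here is a small Python sketch. The payoff values (5, 3, 1, 0) are the conventional ones and are an assumption, as is the size of the `penalty`; the function names are hypothetical and only meant for illustration.

```python
# ASSUMED payoffs for "me": temptation, reward, punishment, sucker's payoff.
T, R, P, S = 5, 3, 1, 0


def payoff(me, other, penalty=0):
    """My payoff; `penalty` is a hidden cost for defecting on a cooperator
    (the Mafia-don case, or a contract with damages)."""
    base = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}[(me, other)]
    if (me, other) == ("D", "C"):
        base -= penalty
    return base


def best_against_copy():
    """Facing a copy of myself, only the diagonal outcomes are reachable."""
    return max("CD", key=lambda move: payoff(move, move))


def defect_dominates(penalty=0):
    """True if defecting beats cooperating whatever the other player does."""
    return all(payoff("D", o, penalty) > payoff("C", o, penalty) for o in "CD")


print(best_against_copy())       # choice when the off-diagonal is impossible
print(defect_dominates(0))       # plain dilemma
print(defect_dominates(10))      # off-diagonal made expensive
```

With these assumed numbers, cooperation is the better choice once only the diagonal is reachable, and a large enough penalty on the (Defect, Cooperate) square means defection no longer dominates. Defecting is still the better reply to a defector in this sketch, so the penalty changes the game rather than forcing cooperation outright.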
As I said, nothing fundamentally new, but I find the perspective interesting. To my mind, it suggests that when resolving the prisoner's dilemma with probabilistic outcomes allowed, I should be thinking in terms of "blocking off possible outcomes" rather than "reaching agreement".