The Truly Iterated Prisoner's Dilemma

Eliezer Yudkowsky

Followup to: The True Prisoner's Dilemma

For everyone who thought that the rational choice in yesterday's True Prisoner's Dilemma was to defect, a follow-up dilemma:

Suppose that the dilemma was not one-shot, but was rather to be repeated exactly 100 times, where for each round, the payoff matrix looks like this:

	Humans: C	Humans: D
Paperclipper: C	(2 million human lives saved, 2 paperclips gained)	(+3 million lives, +0 paperclips)
Paperclipper: D	(+0 lives, +3 paperclips)	(+1 million lives, +1 paperclip)

As most of you probably know, the king of the classical iterated Prisoner's Dilemma is Tit for Tat, which cooperates on the first round, and on succeeding rounds does whatever its opponent did last time. But what most of you may not realize, is that, if you know when the iteration will stop, Tit for Tat is - according to classical game theory - irrational.
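As a rough illustration of the strategy just described, here is a minimal Python sketch of Tit for Tat playing 100 rounds under the payoff matrix above. The names PAYOFF, tit_for_tat, always_defect, and play are made up for this example, and always_defect is only a stand-in for an opponent who defects every round.

```python
# Per-round payoffs as (humans, paperclipper): millions of lives, paperclips.
# Keys are (human_move, clipper_move); "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate on the first round, then copy the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Hypothetical opponent that defects no matter what."""
    return "D"


def play(human_strategy, clipper_strategy, rounds=100):
    """Run the iterated game and return (total lives, total paperclips)."""
    human_moves, clipper_moves = [], []
    human_total = clipper_total = 0
    for _ in range(rounds):
        # Each side sees only the other side's past moves.
        h = human_strategy(clipper_moves)
        c = clipper_strategy(human_moves)
        human_moves.append(h)
        clipper_moves.append(c)
        gain_h, gain_c = PAYOFF[(h, c)]
        human_total += gain_h
        clipper_total += gain_c
    return human_total, clipper_total
```

Calling play(tit_for_tat, tit_for_tat) has both sides cooperating throughout, while play(tit_for_tat, always_defect) costs Tit for Tat only the first round before it starts retaliating.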

Why? Consider the 100th round. On the 100th round, there will be no future iterations, no chance to retaliate against the other player for defection. Both of you know this, so the game reduces to the one-shot Prisoner's Dilemma. Since you are both classical game theorists, you both defect.

Now consider the 99th round. Both of you know that you will both defect in the 100th round, regardless of what either of you do in the 99th round. So you both know that your future payoff doesn't depend on your current action, only your current payoff. You are both classical game theorists. So you both defect.

Now consider the 98th round...
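The backward-induction argument can be sketched mechanically. The Python below is only an illustration under the assumptions of the argument itself: each round's continuation value is treated as fixed regardless of the current move, so each round is solved as a one-shot game. The names STAGE, stage_equilibrium, and backward_induction are invented for this sketch.

```python
MOVES = ("C", "D")

# STAGE[(human_move, clipper_move)] = (human payoff, clipper payoff) for one round.
STAGE = {
    ("C", "C"): (2, 2),
    ("C", "D"): (0, 3),
    ("D", "C"): (3, 0),
    ("D", "D"): (1, 1),
}


def stage_equilibrium(continuation):
    """Return a move pair from which neither player gains by switching.

    The continuation value is added to every outcome alike, which is
    exactly why it cannot influence the choice made in this round.
    """
    h_future, c_future = continuation
    for h in MOVES:
        for c in MOVES:
            human_ok = all(
                STAGE[(h, c)][0] + h_future >= STAGE[(alt, c)][0] + h_future
                for alt in MOVES
            )
            clipper_ok = all(
                STAGE[(h, c)][1] + c_future >= STAGE[(h, alt)][1] + c_future
                for alt in MOVES
            )
            if human_ok and clipper_ok:
                return h, c
    raise ValueError("no pure-strategy equilibrium found")


def backward_induction(rounds=100):
    """Solve the last round first, then work back to round 1."""
    continuation = (0, 0)  # nothing follows the final round
    plan = []
    for _ in range(rounds):
        h, c = stage_equilibrium(continuation)
        plan.append((h, c))
        gain_h, gain_c = STAGE[(h, c)]
        continuation = (continuation[0] + gain_h, continuation[1] + gain_c)
    plan.reverse()  # plan[0] is now round 1
    return plan, continuation
```

Under these assumptions the plan comes out as mutual defection in every round, for a total of 100 million lives and 100 paperclips, against 200 of each under mutual cooperation.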

With humanity and the Paperclipper facing 100 rounds of the iterated Prisoner's Dilemma, do you really truly think that the rational thing for both parties to do, is steadily defect against each other for the next 100 rounds?

I think he means "I cooperate with the Paperclipper iff it would one-box on Newcomb's problem with myself (with my present knowledge) playing the role of Omega, where I get sent to rationality hell if I guess wrong". In other words: suppose Eliezer and Clippy were in a situation where Eliezer would prepare for one-boxing if he expected Clippy to one-box, and for two-boxing if he expected Clippy to two-box. If Eliezer believes that Clippy would one-box in that situation, then Eliezer will cooperate with Clippy. Or in other words still: if Eliezer believes Clippy to be ignorant enough that it can't predict Eliezer's actions, but rational enough to use game theory at the same level as him, then Eliezer will cooperate.

In the one-shot prisoner's dilemma, there is no evidence, so it comes down to priors. If all players are rational mutual one-boxers, and all players are blind except for knowing they're all mutual one-boxers, then they should expect everyone to make the same choice. If you just decide that you'll defect/two-box to outsmart the others, you should expect everyone else to do the same, so you'll be worse off than if you had decided not to defect (in which case nobody else would rationally defect either). Even if you decide whether to defect using a true random number generator, then for

(2,2) (0,3)

(3,0) (1,1)

the best option is still to cooperate 100% of the time.

If there are less rational agents afoot, the game changes. Let r be the fraction of agents who are rational, d the fraction expected to defect, (1-d-r) the fraction of agents who will always cooperate, and x the probability with which you (and by extension other rational agents) will cooperate. The probability that a given opponent cooperates is then xr+(1-d-r). The expected reward for cooperation becomes 2(xr+(1-d-r)), and the reward for defection becomes 3(xr+(1-d-r))+d+(1-x)r = 1+2(xr+(1-d-r)). Optimise for x in 2x(xr+(1-d-r))+(1-x)(1+2(xr+(1-d-r))) = 1-x+2(xr+1-d-r) = x(2r-1)+(3-2d-2r). This is linear in x with slope 2r-1, which means you should cooperate 100% of the time if the fraction of agents who are rational r > 0.5, and defect 100% of the time if r < 0.5.
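To check the algebra numerically, here is a small Python sketch of the expected reward as a function of x, r, and d, using the same (2,2) (0,3) / (3,0) (1,1) payoffs. The function names are made up for this illustration, and the model is only the one stated above: rational agents cooperate with probability x, a fraction d defect, and the rest always cooperate.

```python
def opponent_cooperation_probability(x, r, d):
    """Chance a random opponent cooperates: rational ones do so with
    probability x, and the (1 - d - r) unconditional cooperators always do."""
    return x * r + (1 - d - r)


def expected_payoff(x, r, d):
    """Expected reward when you cooperate with probability x."""
    p = opponent_cooperation_probability(x, r, d)
    reward_if_cooperate = 2 * p              # 2 if they cooperate, 0 if not
    reward_if_defect = 3 * p + 1 * (1 - p)   # 3 if they cooperate, 1 if not
    return x * reward_if_cooperate + (1 - x) * reward_if_defect


def best_x(r, d):
    """The payoff is linear in x (slope 2r - 1), so the optimum sits at an
    endpoint. A tie at r = 0.5 is reported as 0.0 purely by convention."""
    if expected_payoff(1.0, r, d) > expected_payoff(0.0, r, d):
        return 1.0
    return 0.0
```

For example, with r = 0.8 and d = 0.1 always cooperating beats always defecting, while with r = 0.3 and d = 0.1 the comparison flips, in line with the r > 0.5 threshold.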

In the iterated prisoner's dilemma, this becomes more algebraically complicated, since cooperation is evidence of being cooperative. So, qualitatively: superintelligences which have managed to open bridges between universes are probably/hopefully (P>0.5) rational, so they should cooperate on the last round, and by extension on every round before that. If someone defects, that's strong evidence that they are not rational or have bad priors, and if the probability of them being rational drops below 0.5, you should switch to defecting. I'm not sure whether you should cooperate if your opponent cooperates after defecting on the first round. Common sense says to give them another chance, but that may be anthropomorphising the opponent.

If the prior probability of inter-universal traders like Clippy and thought experiment::Eliezer being rational is r>0.5, and thought experiment::Eliezer has managed not to make his mental makeup knowable to Clippy and vice versa, then both Eliezer and Clippy ought to expect r>0.5. Therefore they should both decide to cooperate. If Eliezer suspects that Clippy knows Eliezer well enough to predict his actions, then for Eliezer 'd' becomes large (Eliezer suspects Clippy will defect if Eliezer decides to cooperate). Eliezer unfortunately can't let himself be convinced that Clippy would cooperate at this point, because if Clippy knows Eliezer, then Clippy can fake that evidence. This means both players also have a strong motivation not to create suspicion in the other player: knowing the other player would still mean you lose, if the other player finds out you know. Still, if it saves a billion people, both players would want to investigate the other to take victory in the final iteration of the prisoner's dilemma (using methods which provide as little evidence of the investigation as possible; the appropriate response to catching spies of any sort is defection).
