The Truly Iterated Prisoner's Dilemma

Eliezer Yudkowsky

Followup to: The True Prisoner's Dilemma

For everyone who thought that the rational choice in yesterday's True Prisoner's Dilemma was to defect, a follow-up dilemma:

Suppose that the dilemma was not one-shot, but was rather to be repeated exactly 100 times, where for each round, the payoff matrix looks like this:

	Humans: C	Humans: D
Paperclipper: C	(2 million human lives saved, 2 paperclips gained)	(+3 million lives, +0 paperclips)
Paperclipper: D	(+0 lives, +3 paperclips)	(+1 million lives, +1 paperclip)

As most of you probably know, the king of the classical iterated Prisoner's Dilemma is Tit for Tat, which cooperates on the first round, and on succeeding rounds does whatever its opponent did last time. But what most of you may not realize is that, if you know when the iteration will stop, Tit for Tat is - according to classical game theory - irrational.

Why? Consider the 100th round. On the 100th round, there will be no future iterations, no chance to retaliate against the other player for defection. Both of you know this, so the game reduces to the one-shot Prisoner's Dilemma. Since you are both classical game theorists, you both defect.

Now consider the 99th round. Both of you know that you will both defect in the 100th round, regardless of what either of you does in the 99th round. So you both know that your future payoff doesn't depend on your current action; only your current payoff does. You are both classical game theorists. So you both defect.

Now consider the 98th round...

With humanity and the Paperclipper facing 100 rounds of the iterated Prisoner's Dilemma, do you really truly think that the rational thing for both parties to do, is steadily defect against each other for the next 100 rounds?
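To make the stakes of that question concrete, here is a minimal Python sketch of the backward-induction argument, using the per-round payoffs from the matrix above. The names (`PAYOFF`, `backward_induction`, `total`) are hypothetical and chosen only for illustration, and the sketch assumes both players reason purely as classical game theorists. It walks from the last round back to the first, finds the move pair in which each side is best-replying to the other, and then compares the resulting totals with 100 rounds of mutual cooperation.

```python
# Per-round payoffs from the post's matrix, as (million lives, paperclips),
# keyed by (human_move, clipper_move). "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): (2, 2),
    ("D", "C"): (3, 0),
    ("C", "D"): (0, 3),
    ("D", "D"): (1, 1),
}
MOVES = ("C", "D")


def best_reply_human(clipper_move, continuation):
    # `continuation` is the value of all later rounds. It is the same
    # whichever move is played now, so it cannot change the best reply.
    return max(MOVES, key=lambda m: PAYOFF[(m, clipper_move)][0] + continuation)


def best_reply_clipper(human_move, continuation):
    return max(MOVES, key=lambda m: PAYOFF[(human_move, m)][1] + continuation)


def backward_induction(rounds):
    """Solve the known-horizon game from the last round back to the first."""
    plan = []
    human_future, clipper_future = 0, 0
    for _ in range(rounds):
        # Keep the move pairs where each side is best-replying to the other.
        equilibria = [
            (h, c)
            for h in MOVES
            for c in MOVES
            if best_reply_human(c, human_future) == h
            and best_reply_clipper(h, clipper_future) == c
        ]
        assert len(equilibria) == 1  # holds for this particular matrix
        h, c = equilibria[0]
        plan.append((h, c))
        human_future += PAYOFF[(h, c)][0]
        clipper_future += PAYOFF[(h, c)][1]
    plan.reverse()  # rounds were solved last-to-first
    return plan, human_future, clipper_future


def total(plan):
    """Sum (million lives, paperclips) over a list of (human, clipper) moves."""
    lives = sum(PAYOFF[p][0] for p in plan)
    clips = sum(PAYOFF[p][1] for p in plan)
    return lives, clips


plan, lives, clips = backward_induction(100)
print("backward induction:", set(plan), lives, "million lives,", clips, "paperclips")
print("mutual cooperation:", total([("C", "C")] * 100))
```

Under these assumptions the induction picks mutual defection in every round, for 100 million lives and 100 paperclips in total, whereas 100 rounds of mutual cooperation would yield 200 million lives and 200 paperclips. The sketch only mechanizes the classical argument; it takes no position on whether that argument is the right standard of rationality.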

Eliezer: the rationality of defection in these finitely repeated games has come under some fire, and there's a HUGE literature on it. Reading some of the more prominent examples may help you sort out your position on it.

My position is already sorted, I assure you. I cooperate with the Paperclipper if I think it will one-box on Newcomb's Problem with myself as Omega.

As Paul says, this is very well trodden ground. Since it hasn't been assumed that we are sure we know how the other party reasons, we might want to invest some early rounds in probing to see how the party thinks.

As someone who rejects defection as the inevitable rational solution to both the one-shot PD and the iterated PD, I'm interested in the inconsistency of those who accept defection as the rational equilibrium in the one-shot PD, but find excuses to reject it in the finitely iterated known-horizon PD.

True, the iteration does present the possibility of "exploiting" an "irrational" opponent whose "irrationality" you can probe and detect, if there's any doubt about it in your mind. But that doesn't resolve the fundamental issue of rationality; it's like saying that you'll one-box on Newcomb's Problem if you think there's even a slight chance that Omega is hanging around and will secretly manipulate box B after you make your choice. What if neither party to the IPD thinks there's a realistic chance that the other party is stupid - if they're both superintelligences, say? Do they automatically defect against each other for 100 rounds?

And are you really "exploiting" an "irrational" opponent, if the party "exploited" ends up better off? Wouldn't you end up wishing you were stupider, so you could be exploited - wishing to be unilaterally stupider, regardless of the other party's intelligence? Hence the phrase "regret of rationality"...

Do you mean "I cooperate with the Paperclipper if AND ONLY IF I think it will one-box on Newcomb's Problem with myself as Omega AND I think it thinks I'm Omega AND I think it thinks I think it thinks I'm Omega, etc."? This seems to require an infinite amount of knowledge, no?

Edit: and you said "We have never interacted with the paperclip maximizer before", so do you think it would one-box?
