Prerequisites: Familiarity with decision theories (in particular, Eliezer's Timeless Decision Theory) and of course the Prisoner's Dilemma.
Summary: I show an apparent paradox in a three-agent variant of the Prisoner's Dilemma: despite full knowledge of each others' source codes, TDT agents allow themselves to be exploited by CDT, and lose completely to another simple decision theory. Please read the post and think for yourself about the Exercises and the Problem below before reading the comments; this is an opportunity to become a stronger expert at and on decision theory!
We all know that in a world of one-shot Prisoner's Dilemmas with read-access to the other player's source code, it's good to be Timeless Decision Theory. A TDT agent in a one-shot Prisoner's Dilemma will correctly defect against an agent that always cooperates (call this CooperateBot) or always defects (call this DefectBot, and note that CDT trivially reduces to this agent), and it will cooperate against another TDT agent (or any other type of agent whose decision depends on TDT's decision in the appropriate way). In fact, if we run an evolutionary contest as Robert Axelrod famously did for the Iterated Prisoner's Dilemma, and again allow players to read the other players' source codes, TDT will annihilate both DefectBot and CooperateBot over the long run, and it beats or ties any other decision theory.1 But something interesting happens when we take players in threes...
Consider a population of agents in a simulated world. Omega, being the trickster from outside the Matrix that ze is, decides to spend a couple of eons playing the following game with these agents: ze selects three of them at random (call them X, Y and Z), wipes their memories,2 gives them each other's source codes, and privately asks each whether they cooperate or defect. If X defects, then Omega will create 2 "children" of X (distinct near-copies of X, with the same decision theory as X) and return them to the simulation. If X cooperates, then Omega will return 3 "children" of Y and 3 "children" of Z to the simulation. Simultaneously, Y and Z make the analogous decisions.
(Just to reiterate: cooperating gives +3 to each other player, nothing to oneself; defecting gives +2 to oneself, nothing to anyone else. The analogy to the Prisoner's Dilemma should be obvious.)
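For concreteness, here is a minimal Python sketch of a single round under these payoffs (the function name and the "C"/"D" encoding are just my conventions for illustration, not part of Omega's setup):

```python
# A minimal sketch (mine, not part of Omega's formal setup) of a single round
# under the payoffs above. Each player picks "C" or "D"; the result is how
# many children Omega returns for each of the three players.

def children(choices):
    """choices: a tuple of 'C'/'D' for players (X, Y, Z)."""
    kids = [0, 0, 0]
    for i, choice in enumerate(choices):
        if choice == "D":
            kids[i] += 2                  # defecting: +2 to oneself
        else:
            for j in range(3):
                if j != i:
                    kids[j] += 3          # cooperating: +3 to each other player
    return tuple(kids)

print(children(("C", "C", "C")))   # (6, 6, 6)
print(children(("D", "C", "C")))   # (8, 3, 3): the lone defector free-rides
print(children(("D", "D", "D")))   # (2, 2, 2)
```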
Assume maximal selfishness: each agent is motivated solely to maximize its own number of children (the agent itself doesn't get returned!), and doesn't care about the other agents using the same decision theory, or even about its other "relatives" in the simulation.3 Although we've had to explicitly state quite a few conditions, this seems like a pretty simple and fair evolutionary tournament.
It's clear that CDT agents will simply defect each time. What about TDT agents?
Exercise 1: Prove that if the population consists of TDT agents and DefectBots, then a TDT agent will cooperate precisely when at least one of the other agents is also TDT. (Difficulty: 1 star.)
Notice that we've created a free-rider problem. Any DefectBot paired with two TDT agents gets 8 children, even better than the 6 that each of three TDT agents gets in their best case! As you might expect, this bonus balances against the fact that three TDTs played together will fare much better than three DefectBots played together, and so it turns out that the population settles into a nontrivial equilibrium:
Exercise 2: Prove that if a very large population starts with equal numbers of TDTs and DefectBots, then the expected population growth in TDTs and DefectBots is practically equal. (If Omega samples with replacement, assuming that the agents don't care about their exact copy's children, then the expected population growth is precisely equal.) (Difficulty: 2 stars.)
Exercise 3: Prove that if the initial population consists of TDTs and DefectBots, then the ratio of the two will (with probability 1) tend to 1 over time. (Difficulty: 3 stars.)
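If you like to sanity-check things numerically before proving them, here is a rough Monte Carlo sketch relevant to Exercise 2. It hard-codes the TDT behavior stated in Exercise 1 (cooperate exactly when at least one other player is TDT) and draws the other two players independently from a large population; it is an illustration under those assumptions, not a proof.

```python
import random

# Payoff constants (the baseline payoffs from the post).
COOP_OTHER = 3    # children a cooperator gives to each other player
DEFECT_SELF = 2   # children a defector gives to itself

def choose(agent, others):
    """Behavioral rules assumed here: DefectBot always defects; TDT cooperates
    exactly when at least one other player is TDT (the Exercise 1 behavior)."""
    if agent == "TDT" and "TDT" in others:
        return "C"
    return "D"

def round_children(types):
    """Children earned by each of the three selected players."""
    choices = [choose(t, [types[j] for j in range(3) if j != i])
               for i, t in enumerate(types)]
    kids = [0, 0, 0]
    for i, c in enumerate(choices):
        if c == "D":
            kids[i] += DEFECT_SELF
        else:
            for j in range(3):
                if j != i:
                    kids[j] += COOP_OTHER
    return kids

def avg_children(p_tdt, trials=200_000, seed=0):
    """Monte Carlo estimate of the expected number of children for a selected
    TDT and a selected DefectBot, when each of the other two players is
    independently TDT with probability p_tdt (i.e., a very large population)."""
    rng = random.Random(seed)
    totals = {"TDT": 0, "DBot": 0}
    for _ in range(trials):
        others = ["TDT" if rng.random() < p_tdt else "DBot" for _ in range(2)]
        for focal in ("TDT", "DBot"):
            totals[focal] += round_children([focal] + others)[0]
    return {t: totals[t] / trials for t in totals}

# At a 50/50 mix, both estimates come out around 3.5 children (cf. Exercise 2).
print(avg_children(0.5))
```

A full birth-and-death simulation of the whole population is a natural extension for Exercise 3, but keep in mind that Exercise 3 is a claim about the long-run limit, so a short run need not converge visibly.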
This should already perplex the reader who believes that rationalists should win, and that in particular TDT should beat the socks off of DefectBots in any fair fight. The DefectBots aren't harmless parasites, either: the TDTs' rate of reproduction in equilibrium with DefectBots is less than 30% of their rate of reproduction in a population of TDTs alone! (Easy to verify if you've done Exercise 2.)
And it gets worse, in two ways.
First, if we adjust the payoff matrix so that defecting gets (+200,+0,+0) and cooperating gets (+0,+201,+201), then any population of TDTs and DefectBots ends up (with probability 1) with the DefectBots outnumbering TDTs by a ratio of 100:1. (Easy to verify if you've done Exercise 3.)
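As a numerical cross-check, reusing the avg_children sketch from above (since 201 > 200, the assumed Exercise 1 behavior carries over unchanged): at a 1:100 TDT-to-DefectBot mix, the two types expect essentially the same number of children, consistent with the claimed 100:1 equilibrium.

```python
# Reusing the sketch above with the adjusted payoffs (+200 for defecting,
# +201 to each other player for cooperating). At a 1:100 mix, the two types'
# expected child counts roughly coincide, consistent with the claimed equilibrium.
COOP_OTHER, DEFECT_SELF = 201, 200
print(avg_children(1 / 101))   # both estimates should come out near 200
```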
Second, and more damningly, we can introduce CliqueBots, who cooperate only if both other agents are CliqueBots. These, and not TDTs, are the champions of this three-way Prisoner's Dilemma:
Exercise 4: Prove that if the initial population consists of CliqueBots, DefectBots and TDTs4 in any proportion, then the ratio of each of the other types to CliqueBots approaches 0 (with probability 1). (Difficulty: 4 stars.)
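If you want to explore Exercise 4 numerically, here is a sketch of the behavioral rules with CliqueBots added. The TDT rule used here (cooperate exactly when at least one other player is TDT) is my assumed extension of Exercise 1 to this population, something to verify as part of the exercise; and since Exercise 4 is a probability-1 limit claim, a short simulated run can only suggest the answer, not establish it.

```python
# Behavioral rules with CliqueBots added (assumptions to check, not givens):
#   CliqueBot cooperates exactly when both other players are CliqueBots;
#   DefectBot always defects;
#   TDT cooperates exactly when at least one other player is TDT.

def choose3(agent, others):
    if agent == "Clique":
        return "C" if all(o == "Clique" for o in others) else "D"
    if agent == "TDT":
        return "C" if "TDT" in others else "D"
    return "D"   # DefectBot

def round_children3(types):
    """Children for each of the three selected players, using choose3."""
    kids = [0, 0, 0]
    for i, t in enumerate(types):
        others = [types[j] for j in range(3) if j != i]
        if choose3(t, others) == "D":
            kids[i] += DEFECT_SELF
        else:
            for j in range(3):
                if j != i:
                    kids[j] += COOP_OTHER
    return kids

# Spot checks with the baseline payoffs (+2 / +3):
COOP_OTHER, DEFECT_SELF = 3, 2
print(round_children3(["Clique", "Clique", "Clique"]))  # [6, 6, 6]
print(round_children3(["Clique", "TDT", "TDT"]))        # [8, 3, 3]: CliqueBot free-rides on TDTs
print(round_children3(["TDT", "Clique", "DBot"]))       # [2, 2, 2]
```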
Problem: The setup looks perfectly fair for TDT agents. So why do they lose? (Difficulty: 2+3i stars.)
Note: This problem is solved in a more general followup post; but do try and think about it yourself first! Also, I've posted solutions to the exercises in the discussion section. It's worth noting that I asked Eliezer, Vladimir Nesov, Wei Dai, cousin_it, and other decision-theory heavyweights to avoid posting spoilers on the main problem below, and they obliged; many of the other commenters figured out the problem, but nobody posted solutions to the exercises in the comments (as of 9/5/11).
Footnotes:
1. What's meant by "beat or tie" is sort of complicated: there are decision theories that (with high probability) crush TDT when starting out as a majority, and (with equally high probability) get crushed by TDT when starting out as a minority. Also, it gets intractably messy if you allow populations consisting of three or more decision theories (because then A can decide, not just based on how B will decide to act towards A, but also on how B would act towards C, etc.).
2. No playing Tit-for-Tat in this game! Omega will make sure there's no funny business steganographically hidden in the source codes, either.
3. I'm intentionally blocking the analogue of kin altruism among related populations, in order to exclude any reasons for cooperation besides the purely decision-theoretical ones.
4. Yeah, I know I said that populations with three different decision theories are basically intractable. That's true in general, but here we have the simplifying factor that two of them (CliqueBots and DefectBots) are simple decision theories that don't bother to simulate the other agents.