Prerequisites: Familiarity with decision theories (in particular, Eliezer's Timeless Decision Theory) and of course the Prisoner's Dilemma.
Summary: I show an apparent paradox in a three-agent variant of the Prisoner's Dilemma: despite full knowledge of each other's source code, TDT agents allow themselves to be exploited by CDT, and lose completely to another simple decision theory. Please read the post and think for yourself about the Exercises and the Problem below before reading the comments; this is an opportunity to become stronger at (and on) decision theory!
We all know that in a world of one-shot Prisoner's Dilemmas with read-access to the other player's source code, it's good to be Timeless Decision Theory. A TDT agent in a one-shot Prisoner's Dilemma will correctly defect against an agent that always cooperates (call this CooperateBot) or always defects (call this DefectBot, and note that CDT trivially reduces to this agent), and it will cooperate against another TDT agent (or any other type of agent whose decision depends on TDT's decision in the appropriate way). In fact, if we run an evolutionary contest as Robert Axelrod famously did for the Iterated Prisoner's Dilemma, and again allow players to read the other players' source codes, TDT will annihilate both DefectBot and CooperateBot over the long run, and it beats or ties any other decision theory.1 But something interesting happens when we take players in threes...
Consider a population of agents in a simulated world. Omega, being the trickster from outside the Matrix that ze is, decides to spend a couple of eons playing the following game with these agents: ze selects three of them at random (call them X, Y and Z), wipes their memories,2 gives them each other's source code, and privately asks each whether they cooperate or defect. If X defects, then Omega will create 2 "children" of X (distinct near-copies of X, with the same decision theory as X) and return them to the simulation. If X cooperates, then Omega will return 3 "children" of Y and 3 "children" of Z to the simulation. Simultaneously, Y and Z make the analogous decisions.
(Just to reiterate: cooperating gives +3 to each other player, nothing to oneself; defecting gives +2 to oneself, nothing to anyone else. The analogy to the Prisoner's Dilemma should be obvious.)
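As a concrete restatement, here is a minimal sketch of how Omega tallies children in a single game; the function name and list representation are my own illustration, not part of the original setup:

```python
def children_awarded(moves):
    """Given the three agents' moves ('C' or 'D'), return the number of
    children Omega creates for each agent in this single game.

    Defecting earns the defector 2 children of its own; cooperating earns
    the cooperator nothing but gives 3 children to each of the other two.
    """
    counts = [0, 0, 0]
    for i, move in enumerate(moves):
        if move == 'D':
            counts[i] += 2                # defector rewards only itself
        else:
            for j in range(3):
                if j != i:
                    counts[j] += 3        # cooperator rewards the other two
    return counts

# Sanity checks for the numbers used later in the post:
assert children_awarded(['C', 'C', 'C']) == [6, 6, 6]   # three cooperators
assert children_awarded(['D', 'C', 'C']) == [8, 3, 3]   # a free-rider gets 8
assert children_awarded(['D', 'D', 'D']) == [2, 2, 2]   # three defectors
```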
Assume maximal selfishness: each agent is motivated solely to maximize its own number of children (the agent itself doesn't get returned!), and doesn't care about the other agents using the same decision theory, or even about its other "relatives" in the simulation.3 Although we've had to explicitly state quite a few conditions, this seems like a pretty simple and fair evolutionary tournament.
It's clear that CDT agents will simply defect each time. What about TDT agents?
Exercise 1: Prove that if the population consists of TDT agents and DefectBots, then a TDT agent will cooperate precisely when at least one of the other agents is also TDT. (Difficulty: 1 star.)
Notice that we've created a free-rider problem. Any DefectBot paired with two TDT agents gets 8 children, even better than the 6 that each of three TDT agents get in their best case! As you might expect, this bonus balances against the fact that three TDTs played together will fare much better than three DefectBots played together, and so it turns out that the population settles into a nontrivial equilibrium:
Exercise 2: Prove that if a very large population starts with equal numbers of TDTs and DefectBots, then the expected population growth in TDTs and DefectBots is practically equal. (If Omega samples with replacement, assuming that the agents don't care about their exact copy's children, then the expected population growth is precisely equal.) (Difficulty: 2 stars.)
Exercise 3: Prove that if the initial population consists of TDTs and DefectBots, then the ratio of the two will (with probability 1) tend to 1 over time. (Difficulty: 3 stars.)
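If you'd like to sanity-check Exercises 2 and 3 numerically before proving them, here is a rough Monte Carlo sketch of the tournament. This is my own illustration: the function names and the bookkeeping simplifications are mine, it hard-codes the TDT policy stated in Exercise 1, and it samples triples with replacement as in the parenthetical of Exercise 2.

```python
import random

def tdt_move(my_index, kinds):
    """The policy stated in Exercise 1: cooperate iff at least one
    OTHER member of the triple is a TDT agent."""
    others = [k for j, k in enumerate(kinds) if j != my_index]
    return 'C' if 'TDT' in others else 'D'

def play_game(kinds):
    """Run one three-player game; return each player's number of children."""
    moves = ['D' if k == 'DefectBot' else tdt_move(i, kinds)
             for i, k in enumerate(kinds)]
    counts = [0, 0, 0]
    for i, move in enumerate(moves):
        if move == 'D':
            counts[i] += 2                # defector rewards only itself
        else:
            for j in range(3):
                if j != i:
                    counts[j] += 3        # cooperator rewards the other two
    return counts

def simulate(n_tdt=1000, n_defectbot=1000, games=100_000):
    """Track only the counts of each kind; sample triples with replacement."""
    pop = {'TDT': n_tdt, 'DefectBot': n_defectbot}
    for _ in range(games):
        kinds = random.choices(['TDT', 'DefectBot'],
                               weights=[pop['TDT'], pop['DefectBot']], k=3)
        for kind, c in zip(kinds, play_game(kinds)):
            pop[kind] += c - 1   # children are added; the player itself is not returned
    return pop['TDT'] / pop['DefectBot']

print(simulate())   # should stay close to 1.0, as Exercises 2 and 3 predict
```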
This should already perplex the reader who believes that rationalists should win, and that in particular TDT should beat the socks off of DefectBots in any fair fight. The DefectBots aren't harmless parasites, either: the TDTs' rate of reproduction in equilibrium with DefectBots is less than 30% of their rate of reproduction in a population of TDTs alone! (Easy to verify if you've done Exercise 2.)
And it gets worse, in two ways.
First, if we adjust the payoff matrix so that defecting gets (+200,+0,+0) and cooperating gets (+0,+201,+201), then any population of TDTs and DefectBots ends up (with probability 1) with the DefectBots outnumbering TDTs by a ratio of 100:1. (Easy to verify if you've done Exercise 3.)
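(Light spoiler for Exercises 2 and 3.) Here is a back-of-the-envelope check of that 100:1 figure under my own reading of the setup (a very large population, triples sampled with replacement, TDT playing the policy from Exercise 1); it is a heuristic, not a proof. Write $d$ for what a defector gives itself, $c$ for what a cooperator gives each other player, $p$ for the TDT fraction and $q = 1 - p$ for the DefectBot fraction. Since every agent is equally likely to be selected, a stable ratio requires the expected number of children per selected agent to be the same for both kinds:

$$\mathbb{E}[\text{TDT}] = q^2 d + 2pq\,c + 2p^2 c, \qquad \mathbb{E}[\text{DefectBot}] = q^2 d + 2pq\,d + p^2(d + 2c) = d + 2c\,p^2.$$

Setting these equal gives $2pqc = d(1 - q^2) = d\,p(1+q)$, hence $q = d/(2c - d)$. With the original payoffs $(d, c) = (2, 3)$ this is $q = 1/2$, matching the 1:1 ratio of Exercise 3; with $(d, c) = (200, 201)$ it is $q = 200/202 = 100/101$, i.e. DefectBots outnumber TDTs 100 to 1.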
Second, and more damningly, we can introduce CliqueBots, which cooperate only if both other agents are CliqueBots. These, and not TDTs, are the champions of the three-way Prisoner's Dilemma:
Exercise 4: Prove that if the initial population consists of CliqueBots, DefectBots and TDTs4 in any proportion, then the ratio of each of the other two populations to the CliqueBots approaches 0 (with probability 1). (Difficulty: 4 stars.)
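For numerical experimentation, the CliqueBot rule itself is trivial to write down in the style of the earlier sketch (again my own illustration); the hard part of Exercise 4 is the analysis, including deciding how a TDT agent responds to a triple containing CliqueBots:

```python
def cliquebot_move(my_index, kinds):
    """Cooperate only if BOTH other members of the triple are CliqueBots."""
    others = [k for j, k in enumerate(kinds) if j != my_index]
    return 'C' if all(k == 'CliqueBot' for k in others) else 'D'
```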
Problem: The setup looks perfectly fair for TDT agents. So why do they lose? (Difficulty: 2+3i stars.)
Note: This problem is solved in a more general followup post; but do try and think about it yourself first! Also, I've posted solutions to the exercises in the discussion section. It's worth noting that I asked Eliezer, Vladimir Nesov, Wei Dai, cousin_it, and other decision-theory heavyweights to avoid posting spoilers on the main problem below, and they obliged; many of the other commenters figured out the problem, but nobody posted solutions to the exercises in the comments (as of 9/5/11).
Footnotes:
1. What's meant by "beat or tie" is sort of complicated: there are decision theories that (with high probability) crush TDT when starting out as a majority, and (with equally high probability) get crushed by TDT when starting out as a minority. Also, it gets intractably messy if you allow populations consisting of three or more decision theories (because then A can decide, not just based on how B will decide to act towards A, but also on how B would act towards C, etc.).
2. No playing Tit-for-Tat in this game! Omega will make sure there's no funny business steganographically hidden in the source codes, either.
3. I'm intentionally blocking the analogue of kin altruism among related populations, in order to exclude any reasons for cooperation besides the purely decision-theoretical ones.
4. Yeah, I know I said that populations with three different decision theories are basically intractable. That's true in general, but here we have the simplifying factor that two of them (CliqueBots and DefectBots) are simple decision theories that don't bother to simulate the other agents.
The problem is underspecified. I think the post is really operating under the assumption that each generation Omega takes the population, divides it into triples, kills off the 0-2 leftover agents (who leave no children), runs the PD, and takes the result as the population for the next generation.* But if (as the post seems to say) Omega replaces each triple by its children before picking the next triple, Exercise 1 is considerably more difficult than the others. In the first version, each agent cares only about its own children, and not about the children of other agents or of DefectBots. But in the second version, an agent might encounter someone else's child when it plays the game, so it does care about the composition of the population. Then the situation is not a one-shot PD and cannot be analyzed so simply. In particular, the answer depends on the utility function: it is not enough to say that the agent prefers more children to fewer; the answer depends on by how much.
Consider the modified situation in which I claim that a homogeneous population of TDT/UDT agents defects. In this version, Omega replaces a triple with its children before picking the next triple. The payoffs depend on the number of triples Omega has run: defection is always worth 1 copy, but cooperation by the N-th triple is worth N copies. If everyone cooperates, the population grows quadratically. There is a chance that an agent will never get picked. A population of sufficiently risk-averse agents will prefer to defect and get 1 child with probability 1 over cooperating and risking having no children at all. (There are some anthropic issues here, but I don't think that they matter.)
I had to change a lot of details to get there, but that means that the answer to Exercise 1 must depend on those details. If Omega replaces the population after every trial, then the game cannot be analyzed as a triple in isolation; one must consider the population and the utility function explicitly. If Omega runs it generation by generation, then the exercise does reduce to an isolated triple. But since the agents only care about their own children, they only care about what happens in the first generation, and don't care whether Omega runs any more generations, which makes the other questions about later generations seem out of place.
* If you change the payoffs from 2/3 to 6/9, then there are no stragglers to kill (each game then returns a multiple of three children, so a population that starts out divisible by three stays divisible by three). Or you could let the stragglers survive to play in the next generation; if the population is large enough, this will not affect strategy.
I don't understand what you mean by this.
Here, I believe you're saying that you can pick a rate of growth in the payoffs such that, with probability 1, if everyone cooperates, then eventually one particular lineage (chosen randomly) comes to dominate to the extent that nobody else even gets selected to ...