Last year, AlexMennen ran a prisoner's dilemma tournament with bots that could see each other's source code, which was dubbed a "program equilibrium" tournament. This year, I will be running a similar tournament. Here's how it's going to work: anyone can submit a bot that plays the iterated PD against other bots. Bots can not only remember previous rounds, as in the standard iterated PD, but also run perfect simulations of their opponent before making a move. Please see the GitHub repo for the full list of rules and a brief tutorial.
There are a few key differences this year:
1) The tournament is in Haskell rather than Scheme.
2) The time limit for each round is shorter (5 seconds rather than 10) but the penalty for not outputting Cooperate or Defect within the time limit has been reduced.
3) Bots cannot directly see each other's source code, but they can run their opponent, specifying the initial conditions of the simulation, and then observe the output.
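To make point 3 concrete, here is a toy model of "run your opponent and observe the output". It is a sketch in Python, not the tournament's actual Haskell interface: the bot signature `(opponent, history)`, the `play` helper, and all bot names are assumptions made up for illustration. See the repo for the real API.

```python
# Toy model only: a bot is a function (opponent, history) -> "C" or "D".
# history is a list of (my_move, their_move) pairs from this bot's point of view.
# None of these names come from the tournament code; they are hypothetical.

def cooperate_bot(opponent, history):
    return "C"

def defect_bot(opponent, history):
    return "D"

def tit_for_tat(opponent, history):
    # Copy the opponent's previous move; cooperate on the first round.
    return history[-1][1] if history else "C"

def simulating_bot(opponent, history):
    # Simulate the opponent with chosen initial conditions: the real history
    # (seen from the opponent's side) but facing cooperate_bot instead of us.
    flipped = [(theirs, mine) for mine, theirs in history]
    predicted = opponent(cooperate_bot, flipped)
    # Cooperate only with opponents who would cooperate with a cooperator.
    return "C" if predicted == "C" else "D"

def play(bot_a, bot_b, rounds):
    """Play the iterated PD and return the history from bot_a's point of view."""
    hist_a, hist_b = [], []
    for _ in range(rounds):
        a = bot_a(bot_b, hist_a)
        b = bot_b(bot_a, hist_b)
        hist_a.append((a, b))
        hist_b.append((b, a))
    return hist_a
```

Note that `simulating_bot` simulates its opponent against a fixed simple bot rather than against itself, so two copies of it playing each other do not recurse forever. In the real tournament the time limit makes that kind of design choice matter.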
All submissions should be emailed to pdtournament@gmail.com or PM'd to me here on LessWrong by September 15th, 2014. LW users with 50+ karma who want to participate but do not know Haskell can PM me with an algorithm/pseudocode, and I will translate it into a bot for them. (If there is a flood of such requests, I would appreciate some volunteers to help me out.)
My main point is that the NE for the fixed-length IPD is "always defect", but the fact that we really ought to be able to do much better than "always defect" is what makes that case particularly interesting.
(2 & 3) Sorry; I misunderstood. I was thinking about the infinite-horizon case. If you have a probability distribution over possible lengths of the game then the problem is indeed more complex, but I don't see that much benefit to it - it really doesn't change things all that much.
In particular, if you still have a limit on the number of rounds, then the same reasoning by backwards induction still applies (i.e., the optimal strategy is to always defect), and so the same interesting aspect of the problem is still there.
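The backwards-induction argument can be sketched in a few lines of Python. This is only an illustration: the payoff values (5, 3, 1, 0) are the usual textbook ones and are an assumption, not the tournament's scoring, and the function names are made up. The key step is that once play in all later rounds is pinned down, the continuation value is a constant that does not depend on the current move, so each round reduces to the one-shot game.

```python
# Assumed textbook payoffs (row player, column player); not from the tournament rules.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
MOVES = ("C", "D")

def stage_equilibria(continuation):
    """Pure Nash equilibria of one round when both players also receive a
    fixed continuation value that does not depend on this round's moves."""
    found = []
    for a in MOVES:
        for b in MOVES:
            row_ok = all(
                PAYOFF[(a, b)][0] + continuation[0] >= PAYOFF[(a2, b)][0] + continuation[0]
                for a2 in MOVES
            )
            col_ok = all(
                PAYOFF[(a, b)][1] + continuation[1] >= PAYOFF[(a, b2)][1] + continuation[1]
                for b2 in MOVES
            )
            if row_ok and col_ok:
                found.append((a, b))
    return found

def backward_induction(rounds):
    """Solve from the last round to the first; return (plan, total payoffs)."""
    continuation = (0, 0)
    plan = []
    for _ in range(rounds):
        equilibria = stage_equilibria(continuation)
        if len(equilibria) != 1:
            raise ValueError("stage game has no unique pure equilibrium")
        move = equilibria[0]
        plan.append(move)
        continuation = (
            continuation[0] + PAYOFF[move][0],
            continuation[1] + PAYOFF[move][1],
        )
    # Every round has the same solution, so the order of the plan is immaterial.
    return plan, continuation
```

The same loop runs unchanged for any finite upper bound on the number of rounds, which is the point being made above.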
Similarly, the optimal counter-strategy to TFT stays mostly the same. It will simply be "TFT, but always defect starting from round N" where N is some number that will depend on the specific probabilities in question.
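As a rough check of how N depends on the probabilities, here is a small Python sketch that searches over "TFT, but always defect starting from round N" strategies against a plain TFT opponent. The payoffs (T=5, R=3, P=1) and the example length distribution are assumptions chosen for illustration, and the function names are hypothetical. It only compares strategies of this threshold form, not every possible strategy.

```python
from fractions import Fraction

# Assumed textbook payoffs: temptation, mutual cooperation, mutual defection.
T, R, P = 5, 3, 1

def payoff_vs_tft(switch_round, length):
    """Total payoff against TFT when we cooperate before switch_round
    and defect from switch_round onward, in a game of the given length."""
    if switch_round > length:
        return R * length  # the game ended before we ever defected
    # Cooperation up to N-1, one exploited round, then mutual defection.
    return R * (switch_round - 1) + T + P * (length - switch_round)

def expected_payoff(switch_round, length_weights):
    """Expected payoff when game length n has relative weight length_weights[n]."""
    total = sum(length_weights.values())
    return sum(
        Fraction(weight, total) * payoff_vs_tft(switch_round, n)
        for n, weight in length_weights.items()
    )

def best_switch_round(length_weights):
    """Smallest N that maximises expected payoff; N = max length + 1 means never defect."""
    candidates = range(1, max(length_weights) + 2)
    return max(candidates, key=lambda n: expected_payoff(n, length_weights))
```

With a fixed length of 10, `best_switch_round({10: 1})` gives 10, i.e. defect only in the last round. With lengths 8 to 12 weighted 1:2:4:2:1, it also gives 10, so the defection starts before the longest possible game would end.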
The interesting aspect of the problem is still the part that comes from the finite limit, regardless of whether it's fixed or has some kind of distribution over a finite number of possibilities.
Not so fast.
Once a prisoner condemned to death was brought before the king to set the date of the execution. But the king was in a good mood, having just had a tasty breakfast, and so he said: "You will be executed next week, but to spare you the agony of counting down the hours of your life, I promise you that you will not know the day of your execution until the jailers come to take you to the gallows".
The prisoner was brought back to...