Click here to participate. Entries must be submitted on October 18th, 2020 or earlier.
Entry is now closed.
In 2017, Zvi posted an exciting story about The Darwin Game, a variation of iterated prisoner's dilemma.
I will run my own version of the game in the week following October 18th, 2020. You do not have to know how to program in order to participate. I will code simple bots for non-programmers. If you do know how to program then you may create your own complicated bot.
Here are the rules. Changes from Zvi's original game are in brackets [like this].
For the first round, each player gets 100 copies of their program in the pool, and the pool pairs those programs at random. You can and often will play against yourself.
Each pair now plays an iterated prisoner’s dilemma variation, as follows. Each turn, each player simultaneously submits [an integer] from 0 to 5. If the two numbers add up to 5 or less, both players earn points equal to their number. If the two numbers add up to 6 or more, neither player gets points. This game then lasts for a large but unknown number of turns, so no one knows when the game is about to end; [I guarantee it will be at least 100 turns per iterated prisoner's dilemma].
Each pairing is independent of every other pairing. [You do know what round of the game it is and that you are facing an opponent. If you face a copy of yourself you are automatically awarded the maximum 5 points per round (2.5 points per bot). You otherwise do not know any history of the game to this point.] Your decision algorithm does the same thing each pairing.
At the end of the round, all of the points scored by all of your copies are combined. Your percentage of all the points scored by all programs becomes the percentage of the pool your program gets in the next round. So if you score 10% more points, you get 10% more copies next round, and over time successful programs will displace less successful programs. Hence the name, The Darwin Game.
Your goal is to have as many copies in the pool at the end of the last round as possible, or failing that, to survive as many rounds as possible with at least one copy.
[I will attempt to iterate until there is a stable equilibrium.]
I will add some silly bots of my own to make the early game more interesting.
Instructions for non-programmers
Please give a simple explanation of what you want your bot to do. Write it with mathematical precision. If your specification is even slightly ambiguous then you will be disqualified.
Instructions for programmers
Write a program of the following format.
class TitForTatBot():
def __init__(self, round=0): # Edit: Changed "1" to "0"
self.turn = 0
self.round = round
self.previous = None
def move(self, previous=None):
self.turn += 1
if self.previous:
output = self.previous
self.previous = previous
return output
else:
return 2
# Edit 2020-10-11: The above code is wrong. The properly-implemented TFT `move` method looks like this.
# def move(self, previous=None):
# self.turn += 1
# if previous == None:
# return 2
# else:
# return previous
Your class must have an __init__(self, round=1)
intializer and a move(self, previous=None)
method. You may write your class in Python 3 or Hy.
Unlike Zvi's original game, you do get to know what round it is. Rounds are indexed starting at 0. The previous
parameter of the move
method is an integer indicating what your opponent did last iteration. If it is the first iteration then previous
equals None
.
A new instance of your class will be initialized in each round. You may save whatever information you want into the class instance's fields but you may not save information between rounds or between class instantiations. The move
method must always return an integer from 0 to 5 inclusive.
You may import standard libraries like random
, scikit
and numpy
.
Coordinating with other players
Anyone can play, but only players with a Less Wrong account that existed before I declared this tournament will be allowed to coordinate out-of-game. This rule exists to prevent players from submitting multiple entries to this contest and self-coordinating. Coordinating with other people is encouraged. Coordinating with yourself between multiple separate entries is cheating.
Click here to participate. Entries must be submitted on October 18th, 2020 or earlier.
Entry is now closed.
So, in case anyone's wondering what I did...
I cared enough to think and enter, but not to program.
I designed a simulator, but was told it wouldn't be coded for me, so that was out.
So instead, I wrote this:
Until symmetry breaks, if the round # is 1, is 3 or is even, play sequence 23322232323322233 and then repeat 22222223 until symmetry breaks. If round # is odd and 5 or more, the reverse, except repeating 22222223 at the end.
Once it breaks, alternate 2 and 3 for 4 turns.
Then, if the last turn added to 5, keep doing that until they don't.
Once they don't...
If they've always played 0 or less, play 5.
If they've always played 1 or less, play 4.
If they've always played 2 or less, play 3.
Otherwise, depending on round number:
Rounds 1-50: Keep alternating until turn t+10. After that, if last turn added to 5, alternate 2 and 3. Otherwise, check their average score per round after symmetry. If it's 2.5 or lower, play 2, otherwise play 3.
Rounds 51-100: Same as above, except you also always play 3 if their score is 5 or more higher than yours.
Rounds 101+: Same as above, except you also always play 3 if their score is higher than yours.
(We could improve by adding more logic to properly exploit in strange situations, but we don't care enough so we won't.)
That's it. Keep it simple. Still call it BendBot I guess.
The intention here was pretty basic. Endgame behavior varies by round to get stingier if anyone tries something, to grab early pool share without being exploited later.
The big thing is that this bot is deterministic. I intentionally avoid calling the random function by choosing a semi-random set of 2s and 3s, on the theory that it's unlikely anyone else would choose an identical sequence, and if I meet myself I get the 2.5 anyway.
If they are not simulating or checking code, it won't matter.
If they are looking at all, then my not loading their code and not being random tells them the water's fine, take a look, see what's going on, and we can cooperate fully - you can see what I'm starting with, and we can get 2.5 each without incident. I'm sad that some people thought that more than two parenthesis was high risk to simulate/examine - I thought that the obvious thing to do was check to see if someone ever loads code or uses a random function, and if you don't do either, you should be safe.
So the thought was, many of the best bots would be simulator bots and I'd get full cooperation from them, whereas when they faced each other, they'd have to do some random matching to cooperate, so I'd have an edge there, and I'd do reasonably well against anything else that went late unless some alliance was afoot.
Turns out an alliance is afoot after all, but I certainly didn't care enough to worry about that. Let them come, and let the backstabbers profit, I say.
I was told that I had by far the most complicated non-coded entry even then, and that my endgame logic was being replaced with randomly 50% 2, 50% 3. I was asked, submit as-is, fix it, or withdraw?
That modification definitely didn't work, and the code that was written was not something I felt OK touching. So I explained why, and suggested it be replaced with this:
If (last round added to 5 or less) play whatever they played last.
Else If (their score > my score and round > 5) play 3.
Else Play 2.
I figured that was one extra line of code and should take like 2 minutes tops, and if that was 'too complex' then that was fine, I'd sit out.
So basically, let myself get exploited very early since there would likely be at least one all-3s in the mix but all such things would swiftly lose, then shift to hardcore mode a little faster to keep it simple.
I didn't get a reply to that, so I don't know if my entry is in or not. I hope it is, but either way, good luck everyone.
Zack_M_Davis isn't the only one.
I've also written Hissp now. I'm curious how they compare for data science work (and other applications).
I've also seen evhub, the author of Coconut, here on LessWrong.