All of selbram's Comments + Replies

I went and ran this another 100 times, so I could see what the results look like with the run-to-run variance averaged out. The mean scores are:

A: 32.03
B: 28.53
C: 32.48
D: 24.94
E: 28.75
F: 29.62
G: 28.42
H: 26.12
I: 26.06
J: 26.10
K: 36.15
L: 27.21
M: 25.14
N: 34.37
O: 31.06
P: 26.55
Q: 34.95
R: 32.93
S: 37.08
T: 26.43
U: 24.24

If you're interested, here's the code for the test (takes a day to run) and the raw output for my run (an inconvenient format, but it shows the stats for the matchups).

Caspar Oesterheld
I tried to run this with racket and #lang scheme (as well as #lang racket) but didn't get it to work (though I didn't try for very long), perhaps because of backward compatibility issues. This is a bit unfortunate because it makes it harder for people interested in this topic to profit from the results and submitted programs of this tournament. Maybe you or Alex could write a brief description of how one could get the program tournament to run? Edit: I think after posting this, I did manage to get it to work (though I don't remember how).

And for the lazy, here are these scores in sorted order (with original scores in parentheses):

S: 37.08 - Quinn (33)
K: 36.15 - selbram (34)
Q: 34.95 - Margaret Sy (39)
N: 34.37 - BloodyShrimp (34)
R: 32.93 - So8res, NateBot (33)
C: 32.48 - THE BLACK KNIGHT (36)
A: 32.03 - rpglover64 (38)
O: 31.06 - caa (32)
F: 29.62 - Billy, Mimic-- (27)
E: 28.75 - Devin Bayer (30)
B: 28.53 - Watson Ladd (27)
G: 28.42 - itaibn (34)
L: 27.21 - Alexei (25)
P: 26.55 - nshepperd (25)
T: 26.43 - HonoreDB (23)
H: 26.12 - CooperateBot (24)
J: 26.10 - oaz (26)
I: 26.06 - Sean Nolan (28)
M: 25.14 - L...

One good way to interpret code is to run it with "eval", which many submitted bots did. This method has no problems with the examples you gave. One important place it breaks down is with bots that behave randomly. In that case a bot may, by chance, cooperate and defect in simulation in whatever sequence makes it seem worth cooperating with, even if it actually ends up defecting. This, combined with a little luck, made the random bots come out ahead. There are ways to get around this problem, and a few bots did so, but they still didn't do better than the random bots because they had less potential for exploitation in this particular pool of entrants.
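To make the failure mode concrete, here is a minimal sketch in Python rather than the tournament's Scheme. Everything in it is a hypothetical stand-in: bots are plain functions instead of source code, calling the opponent plays the role of "eval", the simulator tests its opponent against a CooperateBot, and coin flips come from an explicit sequence so every outcome can be enumerated instead of sampled. None of this is taken from an actual entry.

```python
from itertools import product


def cooperate_bot(opponent, flips):
    """Always cooperates; ignores the opponent and the coin flips."""
    return "C"


def random_bot(opponent, flips):
    """Cooperates or defects according to the next coin flip."""
    return "C" if next(flips) else "D"


def make_simulator(trials):
    """Build a bot that runs its opponent `trials` times before deciding."""
    def simulator(opponent, flips):
        # Stand-in for eval: call the opponent and observe its move.
        # Assumption: the opponent is simulated against cooperate_bot.
        simulated = [opponent(cooperate_bot, flips) for _ in range(trials)]
        # Cooperate only if every simulated move was cooperation.
        return "C" if all(move == "C" for move in simulated) else "D"
    return simulator


def count_exploited(trials):
    """Enumerate every possible flip sequence and count the ones where the
    simulator cooperates but the random bot's real move is defection."""
    simulator = make_simulator(trials)
    exploited = 0
    total = 0
    for seq in product([True, False], repeat=trials + 1):
        flips = iter(seq)
        sim_move = simulator(random_bot, flips)   # consumes `trials` flips
        real_move = random_bot(simulator, flips)  # consumes the final flip
        total += 1
        if sim_move == "C" and real_move == "D":
            exploited += 1
    return exploited, total
```

With two simulated trials, `count_exploited(2)` returns `(1, 8)`: in one of the eight equally likely flip sequences the random bot looks like a reliable cooperator in simulation and then defects for real. More trials make this rarer, but the simulation alone never rules it out.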

BloodyShrimp
I agree, my fellow top-ranking-non-source-ignoring player. Saying "nobody could do any better than randomness in this tournament" is strictly true but a bit misleading; the tiny, defect-happy pool with almost 20% random players (the top 3 and also G; he just obfuscated his somewhat) didn't provide a very favorable structure for more intelligent bots to intelligently navigate, but there was still certainly some navigation. I'm pretty pleased with how my bot performed; it never got deterministically CD'd and most of its nonrandom mutual defections were against bots who had some unusual trigger condition for defecting based on source composition, not performance, or had very confused performance triggers (e.g. O--why would you want to play your opponent's anti-defectbot move when you determine they cooperate with cooperatebot?). Some of its mutual defections were certainly due to my detect-size-changes exploit, but so were its many DCs.

You're right. K is a MimicBot with an additional check for proper quining. I primarily intended it to cause defection against CooperateBots, RandomBots, and others that don't simulate their opponents meaningfully. I expected a lot more MimicBot variants and mutual cooperations...