Observe the payoff matrix at right (the unit of reward? Cookies.). Each player wants to play 'A', but only so long as the two players play different moves.
Suppose that Red got to move first. There are some games where moving first is terrible - take Rock Paper Scissors for example. But in this game, moving first is great, because you get to narrow down your opponent's options! If Red goes first, Red picks 'A', and then Blue has to pick 'B' to get a cookie.
This is basically kidnapping. Red has taken all three cookies hostage, and nobody gets any cookies unless Blue agrees to Red's demands for two cookies. Whoever gets to move first plays the kidnapper, and the other player has to decide whether to accede to their ransom demand in exchange for a cookie.
What if neither player gets to move before the other, but instead they have their moves revealed at the same time?
Pre-Move Chat:
Red: "I'm going to pick A, you'd better pick B."
Blue: "I don't care what you pick, I'm picking A. You can pick A too if you really want to get 0 cookies."
Red: "Okay I'm really seriously going to pick A. Please pick B."
Blue: "Nah, don't think so. I'll just pick A. You should just pick B."
And so on. They are now playing a game of Chicken. Whoever swerves first is worse off, but if neither of them give in, they crash into each other and die and get no cookies.
So, The Question: is it better to play A, or to play B?
Waidaminute - those aren't "A-players" or "B-players" any more, but variants. And if two "A-player variants" vary in a slightly different way, will they recognize each other as such? Doing so is not a trivial problem!
Anyway, my algorithm would be something like:
"Calculate a hash of my code, hashMe, and a hash of my opponent, hashYou. If (hashMe + hashYou) is odd, the player with the highest hash shall be the Chosen One, otherwise it's the player with the lowest hash (if the hashes are identical, find some other unambiguous higher/lower criteria both players share, player IDs or something or lexicographic ordering of source code, if none exists, flip a coin). If I'm the Chosen One, play A, otherwise play B. Also, random string of comments (to ensure different instances of this algorithm have different hash functions)."
This should have an expected value 1.5 against predictors (it can be simulated safely without recursion problems), 0.5 against A-players, 1 against B-players and 0.75 against coin-flippers. And more importantly, it's resistant to exploitation/blackmail, and gets the max utility of 1.5 against itself.
The only problem with this algorithm is that it's not good at coordinating with similar algorithms that have a different "Chosen One" criteria (e.g. "look at the parity of the (hashMe+HashYou)th digit of pi"). That could be fixed, but is considerably more complicated.
This is way too complicated. Instead of ending with "flip a coin" why not just start with it?