Observe the payoff matrix at right (the unit of reward? Cookies.). Each player wants to play 'A', but only so long as the two players play different moves.
Suppose that Red got to move first. There are some games where moving first is terrible - take Rock Paper Scissors for example. But in this game, moving first is great, because you get to narrow down your opponent's options! If Red goes first, Red picks 'A', and then Blue has to pick 'B' to get a cookie.
This is basically kidnapping. Red has taken all three cookies hostage, and nobody gets any cookies unless Blue agrees to Red's demands for two cookies. Whoever gets to move first plays the kidnapper, and the other player has to decide whether to accede to their ransom demand in exchange for a cookie.
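To make the first-mover reasoning concrete, here is a minimal Python sketch. It assumes the payoffs implied by the text: the lone 'A' player gets 2 cookies, the 'B' player gets 1, and matching moves get nothing. The (B, B) entry is an assumption, since the prose only says the players must differ, and the names `PAYOFFS`, `blue_best_reply`, and `red_first_move` are made up for illustration.

```python
# Payoffs as (red_cookies, blue_cookies), read off the prose above.
# ASSUMPTION: (B, B) pays (0, 0); the text only says matching moves pay nothing.
PAYOFFS = {
    ("A", "A"): (0, 0),
    ("A", "B"): (2, 1),
    ("B", "A"): (1, 2),
    ("B", "B"): (0, 0),
}
MOVES = ("A", "B")


def blue_best_reply(red_move):
    """Blue moves second and simply maximizes its own cookies."""
    return max(MOVES, key=lambda b: PAYOFFS[(red_move, b)][1])


def red_first_move():
    """Red moves first, anticipating Blue's best reply to each option."""
    return max(MOVES, key=lambda r: PAYOFFS[(r, blue_best_reply(r))][0])


red = red_first_move()
blue = blue_best_reply(red)
print(red, blue, PAYOFFS[(red, blue)])  # A B (2, 1)
```

Under these assumed payoffs, Red commits to 'A' and Blue's best remaining option is 'B', which matches the kidnapping story: the second mover takes the one cookie rather than none.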
What if neither player gets to move before the other, but instead they have their moves revealed at the same time?
Pre-Move Chat:
Red: "I'm going to pick A, you'd better pick B."
Blue: "I don't care what you pick, I'm picking A. You can pick A too if you really want to get 0 cookies."
Red: "Okay I'm really seriously going to pick A. Please pick B."
Blue: "Nah, don't think so. I'll just pick A. You should just pick B."
And so on. They are now playing a game of Chicken. Whoever swerves first is worse off, but if neither of them gives in, they crash into each other and die and get no cookies.
So, The Question: is it better to play A, or to play B?
Don't conflate the player with the bot.
In any cliquing strategy, you (the player) want to submit a bot that always defects against a bot that is not functionally equivalent to itself: this is crucial to guarantee the stability of the Nash equilibrium that you hope to reach.
You also want your bot to cooperate with any bot it can prove to be functionally equivalent to itself, and to always compute an answer in a finite time. Due to Rice's theorem, functional equivalence is undecidable in general, therefore you need to use a stronger, decidable equivalence relation that may underestimate, but never overestimates, functional equivalence.
Textual equality is the most obvious of such relations. You can invent many other, weaker equivalence relations that still guarantee functional equivalence, but they would add complexity to your bot and presumably make coordination with the other player more difficult, therefore it is not an obvious choice to use any of them.
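A minimal sketch of a textual-equality clique bot, in Python. It assumes a setup where each bot is handed its own source and the opponent's source as strings and returns 'C' (cooperate) or 'D' (defect); that interface, the function name `clique_bot`, and the two sample sources are all hypothetical.

```python
def clique_bot(my_source: str, opponent_source: str) -> str:
    """Cooperate only with an exact textual copy of myself; defect otherwise.

    String comparison always terminates, and it never calls two
    functionally different bots equal.
    """
    return "C" if opponent_source == my_source else "D"


# Hypothetical sources, for illustration only.
CLIQUE_SRC = "return 'C' if opponent_source == my_source else 'D'"
# Same behaviour with renamed variables: functionally equivalent, textually different.
RENAMED_SRC = "return 'C' if other == me else 'D'"

print(clique_bot(CLIQUE_SRC, CLIQUE_SRC))   # C
print(clique_bot(CLIQUE_SRC, RENAMED_SRC))  # D
```

The second call shows the underestimation: a bot that differs only in variable names is treated as an outsider and defected against, which is the price paid for a check that is decidable and cannot be fooled.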
Once you have chosen to use a cliquing strategy, you still have to choose between (infinitely) many cliques, therefore you face a "choosing sides" coordination game with the other player. This is a non-trivial problem, but fortunately it is easier than the original game.