Epictetus comments on Open Thread, Apr. 27 - May 3, 2015 - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (352)
Still More to the Prisoner's Dilemma
After reading http://www.pnas.org/content/early/2012/05/16/1206569109.full.pdf+html , the detail that's caught my attention: "The player with the shortest memory sets the terms of the game." If a strategy remembers 0 turns, and simply Always Cooperates, or Always Defects, or randomly chooses between them, then no matter how clever its opponent might be, it can't do any better than by acting as if it were also a Memory-0 strategy. Tit-for-Tat is a Memory-1 strategy - and despite all the analysis that I've read on it before, I now see it from a new perspective, in that it's one of the few Memory-1 strategies that gracefully falls back to the appropriate Memory-0 strategy when faced with All-C or All-D... and any strategy which tries to implement a more complicated scheme based on longer strings is faced with the fact that Tit-for-Tat simply doesn't remember anything beyond a single turn.
I would like to see if this perspective can be extended to a Memory-2 strategy that falls gracefully back to appropriate Memory-1 strategies such as Tit-for-Tat when faced with Memory-1 strategies, and like Tit-for-Tat, to a suitable Memory-0 strategy when faced with Memory-0 ones.
Does anyone have a link to a suitable set of programs to run some experimental tourneys, and instructions on how to apply them? (If it matters, the OSes I have available are WinXP and Fedora 21.)
There's still some subtlety here. A Memory-0 strategy picks C with probability p and D with probability q, independent of any past results. If you know p and q, you can devise a strategy to optimize your score. The result in the paper is that this new strategy is Memory-0 and that you can't do better by increasing your memory.
The advantage of a longer memory is that, given enough iterations, you can get a good approximation for p and q and so deduce the appropriate Memory-0 strategy. Something like Tit-for-Tat is devised to basically get the same score as its opponent (the opponent can get an advantage of one defection). It's not going to do worse than any individual opponent, but neither is it going to do better. A strategy that remembers the entire game can recognize, say, All-C and exploit it by defecting, which Tit-for-Tat can't do.
A Memory-1 strategy is one where p and q are functions of the previous round. In general, they'll depend both on what it did last round and what the opponent did last round. There are four possible results (C-C, C-D, D-C, D-D), which means that the strategy will have up to four distinct probabilities for cooperation next round. If you can learn those, you can come up with the optimal strategy for playing against it. This strategy can be modeled as a Memory-1 strategy.
The big difference, I think, is that having a longer memory is helpful if you're in a diverse environment. In any individual game, there's always a strategy with a shorter memory that will do as well as yours. However, the same short-memory strategy will not be optimal against every opponent, while you can use your longer memory to devise the best short-memory strategy for a given match.