Here's another strategy computed from the paper: cooperate with probabilities (0.9, 0.7, 0.2, 0.1) in the case (CC, CD, DC, DD). Supposedly this strategy is one that sets the opponent's score to an average of 2, regardless of his or her actions. You could come up with a similar strategy to force any outcome in the interval (1,3), excluding the endpoints.
I've yet to check how this works.
Bill "Numerical Recipes" Press and Freeman "Dyson sphere" Dyson have a new paper on iterated prisoner dilemas (IPD). Interestingly they found new surprising results:
They discuss a special class of strategies - zero determinant (ZD) strategies of which tit-for-tat (TFT) is a special case:
The evolutionary player adjusts his strategy to maximize score, but doesn't take his opponent explicitly into account in another way (hence has "no theory of mind" of the opponent). Possible outcomes are:
A)
B)
This latter case sounds like a formalization of Hosfstadter's superrational agents. The cooperation enforcement via cross-setting the scores is very interesting.
Is this connection true or am I misinterpreting it? (This is not my field and I've only skimmed the paper up to now.) What are the implications for FAI? If we'd get into an IPD situation with an agent for which we simply can not put together a theory of mind, do we have to live with extortion? What would effectively mean to have a useful theory of mind in this case?
The paper ends in a grand style (spoiler alert):