I didn't find this paper particularly interesting, mostly because it shows not the strength of extortionate strategies but the limits of evolution as the paper defines it. These kinds of "evolutionary" strategies have also never been empirically shown to be particularly successful in IPD matches of infinite length, so their exploitation is not a "significant mathematical feature" as claimed.
To sum up the paper: In a non-zero-sum game of this kind, strategies that only care about gradually improving their own score cannot fully utilize defection as a punishment or deterrent, and thus are permanently exploited by strategies that can.
Note that the kind of evolution the paper talks about has little to do with actual evolution, and the last paragraph is nothing but an empty phrase.
Alright, thank you. As for the last paragraph, I of course took it more on the "metaphorical" level. I agree their evolutionary agent might be too restricted to be fully interesting (though it is valuable that its inferiority is demonstrated analytically and not only from simulations).
Since it seems you have lots of experience with IPD, what do you think about case B)? The paper makes the claim specifically for the ZD strategies, but do you think this "superrational" result could generalize to any strategy which has also a ...
Bill "Numerical Recipes" Press and Freeman "Dyson sphere" Dyson have a new paper on the iterated prisoner's dilemma (IPD). Interestingly, they report some surprising new results:
They discuss a special class of strategies, zero-determinant (ZD) strategies, of which tit-for-tat (TFT) is a special case:
The evolutionary player adjusts his strategy to maximize his own score, but doesn't explicitly take his opponent into account in any other way (hence he has "no theory of mind" of the opponent). Possible outcomes are:
A)
B)
This latter case sounds like a formalization of Hofstadter's superrational agents. The cooperation enforcement via cross-setting the scores is very interesting.
Is this connection true or am I misinterpreting it? (This is not my field and I've only skimmed the paper up to now.) What are the implications for FAI? If we got into an IPD situation with an agent for which we simply cannot put together a theory of mind, do we have to live with extortion? What would it effectively mean to have a useful theory of mind in this case?
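For anyone who wants to see the score-setting effect numerically, here is a minimal sketch, not code from the paper. It assumes the conventional payoffs T=5, R=3, P=1, S=0 and memory-one strategies, builds an extortionate ZD strategy from two parameters (called `chi` and `phi` here), and estimates both players' long-run scores against an arbitrary opponent. The function names are hypothetical, and the power iteration is a simplification that assumes the resulting Markov chain settles to a single stationary distribution.

```python
# Payoffs for the states (CC, CD, DC, DD), where the first letter is X's move.
# Assumed conventional values: T=5, R=3, P=1, S=0.
R, S, T, P = 3.0, 0.0, 5.0, 1.0
PAYOFF_X = (R, S, T, P)
PAYOFF_Y = (R, T, S, P)


def extortion_strategy(chi, phi):
    """Memory-one ZD strategy meant to enforce s_X - P = chi * (s_Y - P).

    Returns X's cooperation probabilities after (CC, CD, DC, DD).
    phi must be small enough that all four values stay within [0, 1].
    """
    return tuple(
        base + phi * ((sx - P) - chi * (sy - P))
        for base, sx, sy in zip((1.0, 1.0, 0.0, 0.0), PAYOFF_X, PAYOFF_Y)
    )


def long_run_scores(p, q, steps=5000):
    """Approximate per-round scores (s_X, s_Y) by power iteration.

    p and q are cooperation probabilities after (CC, CD, DC, DD), each
    indexed from that player's own point of view.
    """
    # Y sees X's state CD as DC and vice versa, so swap the middle entries.
    q_seen = (q[0], q[2], q[1], q[3])
    dist = [0.25] * 4  # start from a uniform distribution over states
    for _ in range(steps):
        nxt = [0.0] * 4
        for state, mass in enumerate(dist):
            px, py = p[state], q_seen[state]
            nxt[0] += mass * px * py              # both cooperate
            nxt[1] += mass * px * (1 - py)        # X cooperates, Y defects
            nxt[2] += mass * (1 - px) * py        # X defects, Y cooperates
            nxt[3] += mass * (1 - px) * (1 - py)  # both defect
        dist = nxt
    s_x = sum(d * v for d, v in zip(dist, PAYOFF_X))
    s_y = sum(d * v for d, v in zip(dist, PAYOFF_Y))
    return s_x, s_y


if __name__ == "__main__":
    chi = 3.0
    p = extortion_strategy(chi, phi=1.0 / 26.0)
    opponents = [(1, 1, 1, 1), (0.9, 0.6, 0.4, 0.3), (0.5, 0.5, 0.5, 0.5)]
    for q in opponents:
        s_x, s_y = long_run_scores(p, q)
        # The last column should be (numerically) zero for every opponent.
        print(q, round(s_x, 4), round(s_y, 4),
              round((s_x - P) - chi * (s_y - P), 9))
```

Whatever `q` the opponent picks, X's surplus over the mutual-defection payoff comes out as `chi` times Y's surplus, so an opponent who only tries to raise his own score raises X's score faster.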
The paper ends in a grand style (spoiler alert):