
itaibn0 comments on Robust Cooperation in the Prisoner's Dilemma - Less Wrong

69 Post author: orthonormal 07 June 2013 08:30AM


Comments (145)


Comment author: itaibn0 09 June 2013 05:11:14PM 2 points

I agree that it is best to defect against CooperateBot. However, the case for PsTitTatBot[630] is far from clear; when that is your opponent, you should seriously consider the possibility that you are being simulated. Thinking about this some more, there are other strategies: for example, cooperating for 5<N<1000 but defecting for N=1000, based on the belief that the environment is likely to contain PsTitTatBot[1000] (although if you think PsTitTatBot[1000] is unusually likely, all you really need to do is cooperate against PsTitTatBot[999]). So I'm not at all certain what the best response to a pseudo tit-for-tat agent is. However, this situation provides a testing ground for agents that could be stronger than PrudentBot.

Comment author: CCC 09 June 2013 05:58:23PM 0 points

One possibility is to recognise such self-simulating bots (possibly by recognising that they simulate their opponent against a slightly modified version of themselves), find the opponent's N-value, and cooperate if N is odd, acting as PrudentBot in all other cases.

Such a bot will likely defect against the 50% of PsTitTatBot[N] bots that cooperate with it (i.e. where N is even). It will also defect against CooperateBot. Unfortunately, against the other 50% of PsTitTatBots (i.e. where N is odd) it will cooperate while the PsTitTatBot defects.

Whether this does better than PrudentBot or not depends on the exact payoff matrix; is the cost of cooperating against a defection half the time worth the benefit of defecting against a cooperation the other half the time, as compared to mutual defection all the time?
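The trade-off in the parity strategy can be checked with a small simulation. This is only a sketch under stated assumptions: it takes PsTitTatBot[0] to be CooperateBot and PsTitTatBot[N] to play whatever its opponent plays against PsTitTatBot[N-1] (a definition inferred from this discussion, not quoted from the post), it uses "always defect" as a stand-in for the mutual-defection baseline, and the payoff numbers and all function names are hypothetical.

```python
# Hypothetical model: a strategy is a function from the opponent's level N
# to a move, "C" (cooperate) or "D" (defect).
C, D = "C", "D"

# Illustrative payoffs for (my move, their move); these numbers are assumed,
# chosen only so that temptation > reward > punishment > sucker.
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}


def ps_tit_tat_move(n, strategy):
    """Move of PsTitTatBot[n] against `strategy` (assumed definition)."""
    if n == 0:
        return C  # PsTitTatBot[0] is taken to be CooperateBot
    # Copy whatever the strategy does against PsTitTatBot[n-1].
    return strategy(n - 1)


def parity_strategy(n):
    """Cooperate when the opponent's N is odd, defect otherwise."""
    return C if n % 2 == 1 else D


def always_defect(n):
    """Stand-in for the mutual-defection baseline."""
    return D


def average_payoff(strategy, levels):
    """Average payoff of `strategy` against PsTitTatBot[n] for n in levels."""
    total = 0
    for n in levels:
        mine = strategy(n)
        theirs = ps_tit_tat_move(n, strategy)
        total += PAYOFF[(mine, theirs)]
    return total / len(levels)


levels = range(1, 11)  # equal numbers of odd and even N
parity_avg = average_payoff(parity_strategy, levels)
defect_avg = average_payoff(always_defect, levels)
print(parity_avg, defect_avg)
```

In this model the parity strategy alternates between the sucker payoff and the temptation payoff, so it beats the baseline exactly when their average exceeds the mutual-defection payoff. With other payoff numbers the comparison can go the other way.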

Another possible strategy is simply to defect against CooperateBot, but cooperate with all PsTitTatBot[N] where N>0. This will lose against PsTitTatBot[1] only, and achieve mutual cooperation with all higher levels of PsTitTatBot. Again, whether this is better than the above or not depends on the exact payoff matrix. (This one is guaranteed to do better than PrudentBot, assuming few examples of PsTitTatBot[1] and many examples of PsTitTatBot[N] for N>1.)
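The outcomes claimed for the "cooperate whenever N>0" strategy can be traced level by level. Again this is a sketch, assuming PsTitTatBot[0] is CooperateBot and PsTitTatBot[N] copies what its opponent does against PsTitTatBot[N-1]; the function names are hypothetical.

```python
C, D = "C", "D"


def ps_tit_tat_move(n, strategy):
    """Move of PsTitTatBot[n] against `strategy` (assumed definition)."""
    if n == 0:
        return C  # PsTitTatBot[0] is taken to be CooperateBot
    return strategy(n - 1)  # mirror the strategy's move against level n-1


def cooperate_above_zero(n):
    """Defect against CooperateBot (N=0), cooperate with every N>0."""
    return C if n > 0 else D


# Map each level N to (my move, PsTitTatBot[N]'s move).
outcomes = {
    n: (cooperate_above_zero(n), ps_tit_tat_move(n, cooperate_above_zero))
    for n in range(0, 6)
}

for n, (mine, theirs) in outcomes.items():
    print(f"N={n}: I play {mine}, PsTitTatBot[{n}] plays {theirs}")
```

Under these assumptions the strategy exploits N=0, is exploited only at N=1, and reaches mutual cooperation from N=2 upward.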

Comment author: Will_Sawin 11 June 2013 02:06:08AM 4 points

To recognize what?

What you need to recognize is bots that are simulated by other bots. Consider the pair of BabyBear, which does whatever, and MamaBear, which cooperates with you if and only if you cooperate with BabyBear. Estimating the ratio of MamaBears to any particular BabyBear is an empirical question.
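To see why the ratio matters, here is a toy calculation. It is a sketch with several assumptions that are not in the comment: BabyBear is fixed to always defect, MamaBear looks only at your move against BabyBear, the population counts and payoff numbers are made up, and all names are hypothetical.

```python
C, D = "C", "D"

# Illustrative payoffs for (my move, their move); the numbers are assumed.
PAYOFF = {(C, C): 3, (C, D): 0, (D, C): 5, (D, D): 1}


def mama_bear_move(move_vs_baby):
    """MamaBear cooperates if and only if you cooperate with BabyBear."""
    return C if move_vs_baby == C else D


def total_payoff(move_vs_baby, move_vs_mama, baby_move, n_baby, n_mama):
    """Total payoff over a population of n_baby BabyBears and n_mama MamaBears."""
    vs_baby = PAYOFF[(move_vs_baby, baby_move)]
    vs_mama = PAYOFF[(move_vs_mama, mama_bear_move(move_vs_baby))]
    return n_baby * vs_baby + n_mama * vs_mama


def best_policy(baby_move, n_baby, n_mama):
    """Return the best (move vs BabyBear, move vs MamaBear) pair."""
    options = [(b, m) for b in (C, D) for m in (C, D)]
    return max(
        options,
        key=lambda o: total_payoff(o[0], o[1], baby_move, n_baby, n_mama),
    )


# Assumed here: BabyBear always defects.
print(best_policy(D, n_baby=1, n_mama=1))
print(best_policy(D, n_baby=10, n_mama=1))
```

In this toy setup, cooperating with a defecting BabyBear pays off only when there are enough MamaBears watching; when BabyBears greatly outnumber MamaBears, defecting against both comes out ahead.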

What seems problematic about these strategies is that they are not evolutionarily stable. In a world filled with those bots, PsTitTatBots would proliferate and drive them out. That hardly seems optimal!