
solipsist comments on Robust Cooperation in the Prisoner's Dilemma

Post author: orthonormal, 07 June 2013 08:30AM


Comment author: solipsist, 13 June 2013 01:37:07AM, 5 points

Suppose we decide that proof search is too (NP) hard and we want to restrict agents to polynomial time. If an agent has access to a random coin, can it interactively prove that its opponent is cooperative?

I stumbled across an interesting class of agent while attempting to design a probabilistically proving bot for AlexMennen's contest. The agent below always cooperates with opponents who always cooperate back, (almost) always defects against opponents who always defect against them, and runs in probabilistic polynomial time if its opponent does as well.

import random

EPSILON = 0.001  # ε: any small positive constant; 0.001 is an arbitrary choice

def mimic_bot(opponent):
    # quine_my_source(), eval(program, opponent_source), and COOPERATE are
    # assumed to be supplied by the tournament harness.
    if random.random() > EPSILON:
        my_source = quine_my_source()
        return eval(opponent, my_source)  # do as your opponent does to you
    else:
        # unlikely, but necessary to break out of recursive simulations in polynomial time
        return COOPERATE

Note why two MimicBots cooperate with probability 1: the recursion eventually bottoms out at some level's ε-branch COOPERATE, and each enclosing level faithfully copies that result back up. This particular agent is a bit naïve, but more sophisticated versions (e.g., ones that do not cooperate with CooperateBot) are possible; one such variant is sketched below. Does anyone have any insight on probabilistically proving that opponents will cooperate?
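For illustration (this sketch is mine, not solipsist's), one such variant might probe the opponent against an unconditional defector before mimicking, and exploit anyone who cooperates regardless. The names DEFECT_BOT_SOURCE and shrewd_mimic_bot are hypothetical; the tournament primitives are the same assumed ones as in mimic_bot above:

DEFECT_BOT_SOURCE = "def defect_bot(opponent): return DEFECT"  # hypothetical probe program

def shrewd_mimic_bot(opponent):
    # Assumes the same primitives as mimic_bot: random, EPSILON,
    # quine_my_source(), eval(program, opponent_source), COOPERATE, DEFECT.
    if random.random() > EPSILON:
        # Probe: does the opponent cooperate even against pure defection?
        if eval(opponent, DEFECT_BOT_SOURCE) == COOPERATE:
            return DEFECT  # exploit CooperateBot-like agents
        return eval(opponent, quine_my_source())  # otherwise mimic, as before
    else:
        return COOPERATE  # rare escape hatch that keeps the recursion shallow

One caveat: the probe itself runs the opponent, whose own ε-branch can fire, so two shrewd_mimic_bots cooperate with high probability rather than with certainty, and each probe adds simulation work of its own.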

Comment author: orthonormal, 13 June 2013 06:53:54PM, 2 points

An interesting idea! One problem is that it doesn't halt (with probability 1 - epsilon) if the opponent doesn't halt, and I don't see a way to fix this while keeping its nice properties. (But there might well be a suitable fix!)

Comment author: solipsist, 13 June 2013 08:03:31PM, 2 points

Proposed method to make MimicBot halt:

If eval() takes more than 1,000,000,000 * n^3 steps (where n is the length of the input programs), it aborts and returns an error flag. MimicBot defects if the simulation returns an error flag.
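As a hedged sketch of this fix (my code, not solipsist's), assuming a hypothetical step-counting simulator limited_eval(program, opponent_source, max_steps) that returns a distinguished ERROR value once the budget is exhausted:

def bounded_mimic_bot(opponent):
    my_source = quine_my_source()
    n = len(opponent) + len(my_source)  # n = the length of the input programs
    budget = 10**9 * n**3               # the 1,000,000,000 * n^3 step cap
    if random.random() > EPSILON:
        result = limited_eval(opponent, my_source, budget)
        if result == ERROR:
            return DEFECT  # simulation ran over budget (or never halted): defect
        return result      # otherwise, do as the opponent does
    else:
        return COOPERATE   # rare escape from the recursion, as before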

Programming eval to count its steps will add overhead, and that overhead could compound exponentially as the stack of simulations grows deeper. But that's OK! The expected depth of the simulation stack depends on ε, not on n, and ε is a constant. MimicBot still halts in expected polynomial time.
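To make that expectation concrete (a calculation of mine, not part of the original comment): each level recurses with probability 1 - ε, so the depth of the simulation stack is geometrically distributed, with

\[
\mathbb{E}[\mathrm{depth}] \;=\; \sum_{k=1}^{\infty} (1-\varepsilon)^k \;=\; \frac{1-\varepsilon}{\varepsilon},
\]

a constant that depends only on ε, not on n. Whatever per-level overhead compounds, it compounds over this depth, not over anything growing with n, which is why the polynomial bound in n survives.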

"But wait!" I hear you protest. "If the outermost program kills simulations after a fixed number of cycles, then an inner simulation will find that its world ends before the simulation was expecting the world to end in its subjective timeline. That means simulations aren't completely faithful mimics."

That is true, but aborting a simulation should not be the usual case. Most simulations will take fewer than 1,000,000,000 * n^3 steps; only dawdling or non-halting agents will have their simulations aborted.

Comment author: cousin_it, 13 June 2013 07:05:58PM, 1 point

The idea looks similar to scenario 2 from my first PD post ;-)