Qiaochu_Yuan comments on Prisoner's Dilemma (with visible source code) Tournament - Less Wrong

47 Post author: AlexMennen 07 June 2013 08:30AM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (232)

You are viewing a single comment's thread.

Comment author: Qiaochu_Yuan 05 June 2013 07:38:40PM *  13 points [-]

Injecting some keywords: this field of study is known as program equilibrium. Previous LW post on the subject, with links.

Edit: Can you explain how you decided on the parts of the payoff matrix involving "other"? These seem quite important as they affect the viability of strategies based on either convincing your opponent not to halt or not halting yourself.

Comment author: lukeprog 05 June 2013 10:28:31PM 7 points [-]

Handy introductory article: Computation and the Prisoner's Dilemma.

Comment author: shminux 05 June 2013 11:21:01PM *  7 points [-]

If HisProgram == MyProgram then
do (C);
else
do (D);
end.

TIL that ethnic hatred and tribalism is a Nash (folk) equilibrium.

Comment author: SilasBarta 06 June 2013 09:58:22PM 3 points [-]

Make sure the equality comparison only depends on things that affect functionality -- i.e. it will declare any functionally equivalent programs equal even they use a different nameset for variables or something.

(Yes, I know that's reducible to the halting problem; in practice, you'll use a computable, polynomial time approximation for it that will inevitably have to throw out equivalent programs that are too complex or otherwise be too 'clever'.)

Comment author: shminux 06 June 2013 10:20:52PM 2 points [-]

Patrick discusses this issue here in some depth.

Comment author: PeterisP 07 June 2013 01:38:52PM 1 point [-]

It's quite likely that the optimal behaviour should be different in case the program is functionally equal but not exactly equal.

If you're playing yourself, then you want to cooperate.

If you're playing someone else, then you'd want to cooperate if and only if that someone else is smart enough to check if you'll cooperate; but if it's decision doesn't depend on yours, then you should defect.

Comment author: ThrustVectoring 06 June 2013 02:35:06AM 0 points [-]

You only need to evaluate the equivalence of the first two lines of the program, by the way. It cooperates with those who can't not cooperate with it if it goes through that branch of the logic, and does something else to everyone else.

Comment author: iopq 06 June 2013 01:40:24AM 0 points [-]

Can you write that in Scheme so I can submit this? Thanks

Comment author: AlexMennen 05 June 2013 09:52:07PM 6 points [-]

The payoffs for "other" were designed so that neither failing to halt, nor convincing the other player not to halt, should ever be a worthwhile strategy. If you don't halt, it gives you the same payoff as if you had cooperated, and gives the other player the same payoff as if you had defected. That way, not halting should be strictly dominated by defecting, since you are better off if you defect, and the other player should react the same way to each threat. And tricking the other player into not halting is also a bad idea, since the payoff you get from it is the same as if they defected.

Comment author: ThrustVectoring 06 June 2013 12:43:11PM 4 points [-]

And tricking the other player into not halting is also a bad idea, since the payoff you get from it is the same as if they defected.

(Defect, non-halt) is actually better than (Defect, Defect) for you, since it gives you a relative advantage over competitors in the tournament.

Comment author: AlexMennen 06 June 2013 04:03:32PM 1 point [-]

True, but I still don't expect it will be a big problem. If there are a lot of submissions, the effect will be small, and if it is paying enough attention to your source code for it to be possible to trick it into not halting, then it is probably looking for a way to achieve mutual cooperation, so tricking it is still not a good strategy.

Comment author: Gurkenglas 07 June 2013 02:47:29PM 4 points [-]

If you trust all sufficiently smart players to try to induce (Defect, non-halt) if (Defect, Defect) is otherwise inevitable, the effect adds up over a hopefully significant portion of the submissions.

Comment author: wedrifid 11 June 2013 08:27:31AM 2 points [-]

The payoffs for "other" were designed so that neither failing to halt, nor convincing the other player not to halt, should ever be a worthwhile strategy. If you don't halt, it gives you the same payoff as if you had cooperated, and gives the other player the same payoff as if you had defected.

Your game world implements an "enthusiastic consent" policy