# Benja comments on An Alternative Approach to AI Cooperation - Less Wrong

6 31 July 2009 12:14PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Sort By: Best

Comment author: 01 August 2009 02:39:39PM 3 points [-]

Hm. I see. Thanks for pointing that out. Embarrassingly, I completely ignored mixed strategies above -- implicitly assumed that the constructions of G_M and G_S would be over pure strategies of G, and analyzed only pure strategy profiles of G_M and G_S.

I do see how constructing G_M and G_S over mixed strategies of G would make the two values equal by the minimax theorem, but I think there are complications when we analyze mixed strategies. Mixed Nash equilibria of G_S can enforce correlated play in G, even without relying on cryptography, as follows: Have the pure strategies in the equilibrium correspond to a common quined program (as before) plus a natural number < n, for some given n. Then the program adds the different players' numbers modulo n, and uses the result to determine the strategy profile in G. If a player chooses their number with uniform probability through a mixed strategy of G_S, then they know that all sums modulo n are equally likely, no matter how the other players choose their numbers. So everybody choosing their numbers uniformly at random can be a Nash equilibrium, because no player can change the expected result by unilaterally switching to a different way of choosing their number.

But G_M, as defined above (whether over pure or mixed strategies), can, as far as I can see, not enforce correlated play in G.

A natural way to rectify this is to define the first component of G_M to be neither just a pure nor just a mixed strategy profile of G, but an arbitrary distribution over pure strategy profiles of G -- a "correlated strategy profile." Can the earlier proof then be used to show that G_M has a mixed strategy profile realizing a certain outcome ⇔ G has "correlated strategy profile" realizing that outcome that pays each player at least their security value ⇔ G_S has a mixed strategy profile realizing that outcome?

Comment author: 01 August 2009 05:59:59PM *  2 points [-]

Good catch. To tell the truth, I didn't even think about mixing strategies in G_M and G_S, only playing deterministically and purely "on top of" mixed strategies in G. When we add mixing, G_S does turn out to be stronger than G_M due to correlated play; your construction is very nice.

Your final result is correct, here's a proof:

1) Any Nash equilibrium of G_S or "new" G_M plays a correlated strategy profile of G (by definition of correlated strategy profile, it's broad enough) that on average gives each player no less than their security value (otherwise they'd switch).

2) Any such "good" profile can be implemented as a Nash equilibrium of G_S in mixed strategies, using your construction above and the usual method of punishment by security level. If all of the enemy's strategies are quine-compatible with yours, the profile gets played exactly thanks to the niceties of your construction. If any small part of his strategies are incompatible with yours, that's enough to bring him down on average.

3) For "new" G_M you just submit the profile and the punishment fallback. So "new" G_M doesn't actually need mixed strategies, our latest deus ex machina is too strong.

Comment author: 01 August 2009 11:21:19PM *  1 point [-]

Thanks, Benja and cousin_it, for working out the math. The "new" G_M looks correct to me (and let's just call it G_M from now, since I see no reason why the joint machine can't be programmed with a correlated strategy profile). To make it a little more realistic, we can add a step after the joint machine is constructed, where each player has a choice to transfer his resources or not. It's pretty obvious that adding this step makes no difference in the outcome, since the players can program the joint machine to execute their fallback strategies if one of the player fails to transfer.

Benja's method for G_S to enforce correlated play in G seems to require simultaneous source-swapping. Otherwise, if I choose my random number first, then the second player can pick his number to favor him, right? It seems that the common quined program has to use some cryptographic method to generate a common random number in a way that's not vulnerable to manipulation by the players.