This post was prompted by Vladimir Nesov's comments, Wei Dai's intro to cooperative games and Eliezer's decision theory problems. Prerequisite: Re-formalizing PD.
Some people here have expressed interest in how AIs that know each other's source code should play asymmetrical games, e.g. slightly asymmetrized PD. The problem is twofold: somehow assign everyone a strategy so that the overall outcome is "good and fair", then somehow force everyone to play the assigned strategies.
For now let's handwave around the second problem thus: AIs that have access to each other's code and common random bits can enforce any correlated play by using the quining trick from Re-formalizing PD. If they all agree beforehand that a certain outcome is "good and fair", the trick allows them to "mutually precommit" to this outcome without at all constraining their ability to aggressively play against those who didn't precommit. This leaves us with the problem of fairness.
(Get ready, math ahead. It sounds massive, but is actually pretty obvious.)
Pure strategy plays of an N-player game are points in the N-dimensional space of utilities. Correlated plays form the convex hull of this point set, an N-polytope. Pareto-optimal outcomes are points on the polytope's surface where the outward normal vector has all positive components. I want to somehow assign each player a "bargaining power" (by analogy with Nash bargaining solutions); collectively the powers will determine the slope of a hyperplane that touches the Pareto-optimal surface at a single point, which we will dub "fair".
Utilities of different players are classically treated as incomparable, like metres to kilograms, i.e. as having different physical dimensions; thus we'd like the "fair point" to be invariant under positive affine recalibrations of each player's utility scale. Coefficients of tangent hyperplanes transform as covectors under such recalibrations; components of a covector must have dimensions inverse to those of the corresponding vector components for the application operation to make sense; thus the bargaining power of each player must have dimension 1/(utility of that player).
(Whew! It'll get easier from now.)
A little mental visualization involving a sphere and a plane confirms that when a player stretches their utility scale 2x, stretching the sphere along one of the coordinate axes, the player's power (the coefficient of that coordinate in the tangent hyperplane equation) must indeed go down 2x to keep the fair point from moving. Incidentally, this means that we cannot somehow assign each player "equal power" in a way that's consistent under recalibration.
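The same scaling can be checked on paper. What follows is only a sketch in notation that is not in the post: p_i is player i's power, u_i their utility, and a_i > 0, b_i are the (assumed) constants of a recalibration.

```latex
% Tangent hyperplane in the original utility coordinates:
\sum_i p_i u_i = c
% Recalibrate each player's scale: u_i' = a_i u_i + b_i with a_i > 0,
% so u_i = (u_i' - b_i) / a_i. Substituting:
\sum_i \frac{p_i}{a_i}\, u_i' = c + \sum_i \frac{p_i b_i}{a_i}
% The same hyperplane therefore has coefficients
p_i' = \frac{p_i}{a_i}
```

With a_i = 2 and b_i = 0 this is the 2x stretch described above: the coefficient halves, and the shift b_i only moves the constant on the right-hand side.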
Now, there are many ways to process an N-polytope and obtain N values, dimensioned as 1/coordinate each. A natural way would be to take the inverse length of the polytope's projection onto each coordinate axis, but this approach fails because irrelevant alternatives can skew the result wildly. A better idea would be taking the inverse lengths of the projections of just the Pareto-optimal surface region onto the coordinate axes; this choice passes the smoke test of bargaining games, so it might be reasonable.
To reiterate the hypothesis: assign each player the amount of bargaining power inversely proportional to the range of their gains possible under Pareto-optimal outcomes. Then pick the point on the polytope's surface that touches a hyperplane with those bargaining powers for coefficients, and call this point "fair".
(NB: this idea doesn't solve cases where the hyperplane touches the polytope at more than one point, e.g. risk-neutral division of the dollar. Some more refined fairness concept is required for those.)
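For two players the hypothesis is easy to try out in code. The sketch below is my reading of it, not a reference implementation: the function name `fair_outcomes` and the example payoffs are made up for illustration, and the shortcut for finding the ends of the Pareto-optimal region (each player's best outcome, ties broken in the other player's favour) only works for N = 2.

```python
from fractions import Fraction

def fair_outcomes(outcomes):
    """Indices of the pure outcomes picked as "fair" in a two-player game.

    `outcomes` is a list of (u1, u2) payoff pairs, one per pure strategy play.
    More than one index means the hyperplane touches at several points.
    """
    # Exact arithmetic, so that ties are detected reliably.
    pts = [(Fraction(a), Fraction(b)) for a, b in outcomes]
    # Ends of the Pareto-optimal region of the convex hull: the best outcome
    # for player 1 (ties broken in favour of player 2) and vice versa.
    best1 = max(pts, key=lambda p: (p[0], p[1]))
    best2 = max(pts, key=lambda p: (p[1], p[0]))
    if best1 == best2:
        # A single Pareto-optimal outcome: nothing to bargain over.
        return [i for i, p in enumerate(pts) if p == best1]
    # Bargaining power = 1 / (range of gains over Pareto-optimal outcomes).
    w1 = 1 / (best1[0] - best2[0])
    w2 = 1 / (best2[1] - best1[1])
    # A linear function on a polytope is maximized at a vertex, so it is
    # enough to score the pure outcomes.
    scores = [w1 * p[0] + w2 * p[1] for p in pts]
    top = max(scores)
    return [i for i, s in enumerate(scores) if s == top]

# Hypothetical asymmetrized PD, listed as CC, CD, DC, DD.
pd = [(3, 3), (0, 6), (5, 0), (1, 1)]
print(fair_outcomes(pd))                  # -> [0], i.e. mutual cooperation

# Recalibrating player 2's scale (u2 -> 2*u2 + 7) does not move the answer.
pd_rescaled = [(u1, 2 * u2 + 7) for u1, u2 in pd]
print(fair_outcomes(pd_rescaled))         # -> [0]

# Risk-neutral division of the dollar: the hyperplane touches at both ends.
print(fair_outcomes([(1, 0), (0, 1)]))    # -> [0, 1]
```

One thing this makes visible: with these powers the two ends of the Pareto-optimal region always score the same, so for two players the "fair" point is a vertex that sticks out furthest beyond the line joining them, and when no vertex does, the answer is not unique.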
At this point I must admit that I don't possess a neat little list of "fairness properties" that would make my solution unique and inevitable, Shapley value style. It just... sounds natural. It's an equilibrium, it's symmetric, it's invariant under recalibrations, it often gives a unique answer, it solves asymmetrized PD just fine, and the True PD, and other little games I've tried it on, and something like it might someday solve the general problem outlined at the start of the post; but then again, we've tossed out quite a lot of information along the way. For example, we didn't use the row/column structure of strategies at all.
What should be the next step in this direction?
Can we solve fairness?
EDIT: thanks to Wei Dai for the next step! Now I know that any "purely geometric" construction that looks only at the Pareto set will fail to incentivize players to adopt it. The reason: we can, without changing the Pareto set, give any player an additional non-Pareto-optimal strategy that always assigns them higher utility than my proposed solution, thus making them want to defect. Pretty conclusive! So much for this line of inquiry, I guess.
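A toy instance of the objection, with numbers I made up: a standard PD plus a hypothetical third move E for player 1 that pays them 4 no matter what. The value `FAIR = (3, 3)` is assumed to be what the proposed solution assigns on the plain PD (worked out by hand), and the domination check only compares against pure outcomes, which is enough here to show the new outcomes are not Pareto-optimal.

```python
# Payoffs as (player 1, player 2), keyed by (player 1's move, player 2's move).
game = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
    # Hypothetical extra move E for player 1: a guaranteed 4, opponent gets 0.
    ("E", "C"): (4, 0), ("E", "D"): (4, 0),
}
FAIR = (3, 3)  # assumption: the proposed solution's pick on the plain PD

def dominated(p, points):
    """True if some different point is at least as good for both players."""
    return any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in points)

def security_level(game, move):
    """Worst payoff player 1 can get from `move`, whatever player 2 does."""
    return min(u[0] for (m1, _), u in game.items() if m1 == move)

# The new outcomes are dominated by (5, 0), so the Pareto set is unchanged...
extra = [u for (m1, _), u in game.items() if m1 == "E"]
frontier_unchanged = all(dominated(p, game.values()) for p in extra)
print(frontier_unchanged)              # -> True

# ...yet E guarantees player 1 more than the "fair" share.
print(security_level(game, "E"))       # -> 4, against FAIR[0] == 3
```

Since the construction only looks at the Pareto set, it still recommends (3, 3), and player 1 has no reason to go along with it.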
My understanding is that cousin_it is suggesting just such a base decision algorithm, which works like this:
We can all probably see a whole bunch of difficulties here, both technical and philosophical. But Eliezer, it's not clear from reading your comment what your objection is exactly.
ETA: I just noticed that cousin_it's proposed "good and fair" formula doesn't actually ensure my point above in parentheses (that anyone who doesn't follow the decision algorithm will fail to maximize its utility). To see this, suppose in PD one of the players can choose a third option, which is not Pareto-optimal but unilaterally gives it a higher payoff than assigned by cousin_it's formula.
cousin_it, if you're reading this, please see http://lesswrong.com/lw/102/indexical_uncertainty_and_the_axiom_of/sk8 , where Vladimir Nesov proposed a notion that turns out to coincide with the concept of the core in cooperative game theory. This is necessary to ensure that the "good and fair" solution will be self-enforcing.
If one of the PD players has a third option of "get two bucks guaranteed and screw everyone else" - if the game structure doesn't allow other players to punish him - then no algorithm at all can punish him. Or did you mean something else?
Yep, I know what the core is, and it does seem relevant. But seeing as my solution is definitely wrong for stability reasons, I'm currently trying to think of any stable solution (continuous under small changes in game payoffs), and failing so far. Will think about the core later.