In my experience, constant-sum games are considered to provide "maximally unaligned" incentives, and common-payoff games are considered to provide "maximally aligned" incentives. How do we quantitatively interpolate between these two extremes? That is, given an arbitrary payoff table representing a two-player normal-form game (like Prisoner's Dilemma), what extra information do we need in order to produce a real number quantifying agent alignment?
If this question is ill-posed, why is it ill-posed? And if it's not, we should probably understand how to quantify such a basic aspect of multi-agent interactions, if we want to reason about complicated multi-agent situations whose outcomes determine the value of humanity's future. (I started considering this question with Jacob Stavrianos over the last few months, while supervising his SERI project.)
Thoughts:
- Assume the alignment function has range $[0,1]$ or $[-1,1]$.
- Constant-sum games should have minimal alignment value, and common-payoff games should have maximal alignment value.
- The function probably has to consider a strategy profile (since different parts of a normal-form game can have different incentives; see e.g. equilibrium selection).
- The function should probably be a function of player A's alignment with player B; for example, in a prisoner's dilemma, player A might always cooperate and player B might always defect. Then it seems reasonable to consider whether A is aligned with B (in some sense), while B is not aligned with A (they pursue their own payoff without regard for A's payoff).
- So the function need not be symmetric over players.
- The function should be invariant to applying a separate positive affine transformation to each player's payoffs; it shouldn't matter whether you add 3 to player 1's payoffs, or multiply the payoffs by a half.
- The function may or may not rely only on the players' orderings over outcome lotteries, ignoring the cardinal payoff values. I haven't thought much about this point, but it seems important. EDIT: I no longer think this point is important, but rather confused.
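The constraints in the list above can be written down compactly. This is only a sketch of one possible formalization: the symbols $f_{1 \to 2}$, $U_1$, $U_2$, $\sigma$, and $R$ are notation introduced here for illustration, and the range $R$ is deliberately left unspecified.

```latex
% Hypothetical notation: f_{1 \to 2} is the alignment of player 1 with player 2,
% U_1 and U_2 are the players' payoff matrices, \sigma is a strategy profile,
% and R is the (bounded) range of the alignment function.
f_{1 \to 2}(U_1, U_2, \sigma) \in R \subset \mathbb{R}

% The function need not be symmetric over players:
f_{1 \to 2}(U_1, U_2, \sigma) \neq f_{2 \to 1}(U_1, U_2, \sigma)
\quad \text{is allowed}

% Invariance under a separate positive affine transformation of each player's
% payoffs (b_i is added to every entry of U_i):
f_{1 \to 2}(a_1 U_1 + b_1,\; a_2 U_2 + b_2,\; \sigma) = f_{1 \to 2}(U_1, U_2, \sigma)
\quad \text{for all } a_1, a_2 > 0,\; b_1, b_2 \in \mathbb{R}

% Extreme cases: constant-sum games at the bottom, common-payoff games at the top.
U_1 + U_2 \equiv c \;\Rightarrow\; f_{1 \to 2} = \min R,
\qquad
U_1 = U_2 \;\Rightarrow\; f_{1 \to 2} = \max R
```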
If I were interested in thinking about this more right now, I would:
- Do some thought experiments to pin down the intuitive concept. Consider simple games where my "alignment" concept returns a clear verdict, and use these to derive functional constraints (like symmetry in players, or the range of the function, or the extreme cases).
- See if I can get enough functional constraints to pin down a reasonable family of candidate solutions, or at least pin down the type signature.
Are you sure zero-sum games are maximally misaligned? Consider the joint payoff matrix
$$P = \begin{bmatrix} (1,1) & (1,1) \\ (1,1) & (1,1) \end{bmatrix} \sim \begin{bmatrix} (0,0) & (0,0) \\ (0,0) & (0,0) \end{bmatrix}$$

This matrix doesn't appear minimally aligned to me; instead, it seems maximally aligned. It might be a trivial case but has to be accounted for in the analysis, as it's simultaneously a constant-sum game and a symmetric/common-payoff game.
I suppose alignment should be understood in terms of payoff sums. Let $s$ be the (random!) strategy of player 1 and $r$ be the strategy of player 2, and let $A$ and $B$ be their individual payoff matrices. (So that the expected payoff of player 1 is $s^T A r$.) Then they are aligned at $s, r$ if the sum of expected payoffs $s^T A r + s^T B r$ is "large" and misaligned if it is "small", where "large" and "small" need to be quantified, perhaps in relation to the maximal individual payoff, or perhaps something else.
For the matrix $P$ above (with 1s), every strategy will yield the same large sum compared to the maximal individual payoff, so the game appears to be maximally aligned. In the case of, say,

$$Q = \begin{bmatrix} (1,-1) & (-1,1) \\ (-1,1) & (1,-1) \end{bmatrix},$$

any strategy will yield a sum that is minimally small (0) compared to the maximal individual payoff (1), which isn't itself minimally small, so the game is minimally aligned.
(Comparing the sum of payoffs to the maximal individual payoff may be wrong though, as it's not invariant under affine transformations. For instance, the sum of payoffs in the (0,0) representation of $P$ is 0 and the individual payoffs are 0...)
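The payoff-sum computation above can be checked numerically. The following is a minimal sketch, not a proposed alignment measure: the helper names (`expected_payoffs`, `payoff_sum`, `max_individual_payoff`) are made up for illustration, and the mixed strategies are sampled at random just to show that the sums do not depend on them.

```python
import numpy as np

def expected_payoffs(A, B, s, r):
    """Return (s^T A r, s^T B r) for mixed strategies s (player 1) and r (player 2)."""
    s = np.asarray(s, dtype=float)
    r = np.asarray(r, dtype=float)
    return float(s @ A @ r), float(s @ B @ r)

def payoff_sum(A, B, s, r):
    """Sum of both players' expected payoffs at the strategy profile (s, r)."""
    u1, u2 = expected_payoffs(A, B, s, r)
    return u1 + u2

def max_individual_payoff(A, B):
    """Largest payoff that any single player can receive in any outcome."""
    return float(max(A.max(), B.max()))

# P: both players always receive 1 (constant-sum and common-payoff at once).
P_A = np.ones((2, 2))
P_B = np.ones((2, 2))

# The (0,0) representation of P: subtract 1 from every payoff of each player.
P0_A = P_A - 1.0
P0_B = P_B - 1.0

# Q: a zero-sum game, player 2's payoff is the negative of player 1's.
Q_A = np.array([[1.0, -1.0], [-1.0, 1.0]])
Q_B = -Q_A

# Sample a few random mixed-strategy profiles (illustrative only).
rng = np.random.default_rng(0)
profiles = [(rng.dirichlet(np.ones(2)), rng.dirichlet(np.ones(2))) for _ in range(5)]

sums_P = [payoff_sum(P_A, P_B, s, r) for s, r in profiles]
sums_P0 = [payoff_sum(P0_A, P0_B, s, r) for s, r in profiles]
sums_Q = [payoff_sum(Q_A, Q_B, s, r) for s, r in profiles]

print("P  sums:", sums_P, "max individual payoff:", max_individual_payoff(P_A, P_B))
print("P0 sums:", sums_P0, "max individual payoff:", max_individual_payoff(P0_A, P0_B))
print("Q  sums:", sums_Q, "max individual payoff:", max_individual_payoff(Q_A, Q_B))
```

The two representations of $P$ describe the same game up to a positive affine transformation, yet the raw sum differs between them, which is the invariance problem noted in the parenthetical above.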
It's a game, just a trivial one. Snakes and Ladders is also a game, and its payoff matrix is similar to this one, just with a little bit of randomness involved.
My intuition says that this game not only has maximal alignment, but is the only game (up to equivalence) with maximal alignment for any set of strategies $s, r$. No matter what player 1 and player 2 do, the world is as good as it could be.
The case can be compared to the $R^2$ when the variance of the dependent variable is 0. How much of the variance in the dependent variable does th...