(midco developed this separately from our project last term, so this is actually my first read)
I have a lot of small questions.
What is your formal definition of the IEU? What kinds of goals is it conditioning on (because IEU is what you compute after you view your type in a Bayesian game)?
Multi-agent "impact" seems like it should deal with the Shapley value. Do you have opinions on how this should fit in?
You note that your formalism has some EDT-like properties with respect to impact:
Well, in a sense, they do. The universes where player $i$ shouts "heads" are exactly the universes in which everyone wins. The problem is that of agency: player $i$ doesn't choose their action, the coin ($\omega$) does. If we condition on the value of $\omega$, then each player's action becomes deterministic, thus IEU is constant across each player's (trivial) action space.
This seems weird and not entailed by the definition of IEU, so I'm pretty surprised that IEU would tell you to shout 'heads.'
Given arbitrary R.V.s $A, B$, we define the estimate of $A$ given $B = b$ as
Is this supposed to be $\mathbb{E}[A \mid B = b]$? If so, this is more traditionally called the conditional expectation of $A$ given $B = b$.
Answering questions one-by-one:
Upon reflection, I now suspect that IMPACT is analogous to the Shapley value. In particular, the post could be reformulated using Shapley values and would attain similar results. I'm not sure whether IMPACT-scarcity holds for Shapley values, but the examples from the post suggest that it does.
(thanks to TurnTrout for pointing this out!)
This post is an informal writeup of an idea discovered while investigating POWER. It offers an additional tool for framing multi-agent games in terms of attainable utility, with the hope that data from both IMPACT and POWER can offer a more complete view of POWER-scarcity dynamics.
Note on writing style: I use the royal "we" when explaining the math; the results are essentially my own unreviewed work.
Overview
The purpose of this post is to define and give intuition for a measure of a player's impact on some real-valued outcome variable in an arbitrary multiplayer game; we call this measure "IMPACT". We present motivating examples of multiplayer games, then define IMPACT in the more general context of arbitrary Bayesian networks. We then explore some basic connections between IMPACT and standard multiplayer game theory and present some conjectures to motivate further research.
Motivating Examples - Understanding Multiplayer Games
One of the difficulties in understanding multiplayer games is that, in general, a player's optimal action depends on the actions of the other players. One way around this problem is to restrict consideration to Nash Equilibria, which fully specify optimal actions for us. However, this loses the ability to describe sub-optimal actions, which are relevant in real-world settings involving imperfect humans and intractably large action spaces.
While Nash Equilibria "work" by first conditioning on optimal actions and then considering probabilistic strategies, we go the other way around: fix some arbitrary mixed strategy profile, then analyze expected utility. In the single-player setting, this is analogous to a value function and can be constructed straightforwardly. In the multiplayer setting, we're forced to deal with dependencies on other players' strategies that complicate the notion of "expected utility". To handle this dependency, our framework adopts the standard assumptions of Bayesian games: all exogenous randomness comes from a state of nature $\omega$ drawn from a common prior, each player $i$ privately observes only their type $t_i$ (a function of $\omega$), and each player's mixed strategy $\sigma_i$ depends only on $t_i$. Player $i$'s interim expected utility (IEU) is then their expected utility conditioned on their own type and action.
Borrowing notation from Bayesian games, we now have enough machinery to consider various games in terms of the IEU $u_i : A_i \to \mathbb{R}$. In each example, we fix a strategy profile $a \sim \sigma$ and assume some common payoff $R$ (which we'll later describe as the "outcome variable" in the formal definition of IMPACT).
Counting heads
Let $a_i \sim \mathrm{Ber}(\tfrac{1}{2})$ and $R = \sum_{i=1}^n a_i$. This game can be thought of as follows: "everyone flips a coin, with reward given by the number of heads". Intuitively, the strategy for this game should be simple: flipping heads is better than flipping tails. Calculating IEU, we see that the results match our prediction:
$$u_i(1) = 1 + \mathbb{E}_{a_{-i}}\Big[\sum_{j \neq i} a_j\Big] = 1 + \sum_{j \neq i} \tfrac{1}{2} = \frac{n-1}{2} + 1$$
$$u_i(0) = \mathbb{E}_{a_{-i}}\Big[\sum_{j \neq i} a_j\Big] = \sum_{j \neq i} \tfrac{1}{2} = \frac{n-1}{2}$$
In fact, the difference in IEU is exactly 1, contributed by the added head coming from coin $i$.
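As a quick sanity check of these numbers (my own sketch, not from the post; the helper name ieu_counting_heads is made up), a Monte Carlo estimate matches the closed-form IEUs:

```python
import random

def ieu_counting_heads(n, a_i, trials=200_000):
    """Estimate player i's IEU in the counting-heads game: everyone flips a
    fair coin and the common reward is the total number of heads.  Player i's
    action is fixed to a_i; the other n-1 coins are fair flips."""
    total = 0.0
    for _ in range(trials):
        others = sum(random.randint(0, 1) for _ in range(n - 1))
        total += a_i + others
    return total / trials

n = 5
print(ieu_counting_heads(n, 1))  # ~ (n-1)/2 + 1 = 3.0
print(ieu_counting_heads(n, 0))  # ~ (n-1)/2     = 2.0
```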
Illusory Impact
Let $a_i \sim \mathrm{Ber}(\tfrac{1}{2})$ and $R = \bigoplus_{i=1}^n a_i$. This game can be thought of as follows: "everyone flips a coin; you win iff there are an odd number of heads". This game has nice correlated equilibria: if players can coordinate and determine their coin's side, then they can easily win. However, we assume each player just blindly flips their coin, regardless of reward. Even though player $i$'s strategy is random, we can compute IEU for each choice of action:
$$u_i(a_i) = P\Big[\bigoplus_{j=1}^n a_j = 1\Big] = P\Big[\bigoplus_{j \neq i} a_j \neq a_i\Big] = \frac{1}{2}$$
Importantly, IEU is constant over the choice of $a_i$, which means that player $i$ has no "good" strategy (even assuming they can choose which side their coin lands on).
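The same kind of Monte Carlo sketch (again my own, with the made-up helper ieu_parity) confirms that the parity game's IEU doesn't budge with $a_i$:

```python
import random

def ieu_parity(n, a_i, trials=200_000):
    """Estimate player i's IEU in the parity game: the common reward is 1 iff
    the XOR of all n coins is 1.  Player i's action is fixed to a_i; the other
    n-1 coins are fair flips."""
    wins = 0
    for _ in range(trials):
        parity = a_i
        for _ in range(n - 1):
            parity ^= random.randint(0, 1)
        wins += parity  # the reward is the parity bit itself
    return wins / trials

n = 5
print(ieu_parity(n, 1))  # ~ 0.5
print(ieu_parity(n, 0))  # ~ 0.5: the IEU is constant in a_i
```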
Coordinated Impact
Let $\omega \sim \mathrm{Ber}(\tfrac{1}{2})$. Now, let $a_i = R = \omega$. This game can be thought of as follows: "a referee flips a coin, everyone shouts the result of the coin flip, and everyone wins iff it's heads". Intuitively, this is a ridiculous game: no player produces a meaningful action; the entire game is determined by the referee's coin flip. However, if we shut our eyes and blindly compute IEU, we find that $u_i(1) = 1$ and $u_i(0) = 0$, thus player $i$ benefits from shouting "heads"?!
Well, in a sense, they do. The universes where player $i$ shouts "heads" are exactly the universes in which everyone wins. The problem is that of agency: player $i$ doesn't choose their action, the coin ($\omega$) does. If we condition on the value of $\omega$, then each player's action becomes deterministic, thus IEU is constant across each player's (trivial) action space.
Interestingly, we have a clear notion of the "IEU" of values of $\omega$, even though it's an external variable rather than an action in the game. This suggests a limitation of conceptualizing IMPACT strictly in terms of games: variables have impact, not just actions. In the Bayesian network formalization to come, we'll see that the node $\omega$ impacts the outcome variable, while the nodes $a_i$ do not.
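To make the "illusory agency" concrete, here is a small sketch (mine, not the post's; coordinated_game_estimates is an illustrative name) comparing the naive IEU of player $i$'s action with the estimate after conditioning on $\omega$:

```python
import random

def coordinated_game_estimates(trials=200_000):
    """In the coordinated game, a referee's coin w determines every action
    and the reward (a_i = R = w).  We compare the naive IEU of player i's
    action with the estimate after conditioning on w."""
    samples = []
    for _ in range(trials):
        w = random.randint(0, 1)
        a_i = w          # player i just repeats the coin
        r = w            # everyone wins iff heads
        samples.append((w, a_i, r))

    def mean_r(pred):
        rs = [r for (w, a_i, r) in samples if pred(w, a_i)]
        return sum(rs) / len(rs)

    # Naive IEU: condition only on player i's action.
    print(mean_r(lambda w, a_i: a_i == 1))  # ~ 1.0
    print(mean_r(lambda w, a_i: a_i == 0))  # ~ 0.0
    # After conditioning on w, the action carries no extra information:
    print(mean_r(lambda w, a_i: w == 1 and a_i == 1))  # ~ 1.0, same as e(R, {w=1})
    print(mean_r(lambda w, a_i: w == 0 and a_i == 0))  # ~ 0.0, same as e(R, {w=0})

coordinated_game_estimates()
```

Conditioning only on $a_i$ makes shouting "heads" look valuable, but once $\omega$ is in the conditioning set, the action adds nothing.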
Defining IMPACT using Bayesian Networks
As suggested in the "Coordinated impact" example, the most principled approach is to define IMPACT as a property of dependent variables, then consider game theory as a useful application. Again motivated by the "Coordinated impact" example, we can show that IMPACT must explicitly consider variable dependencies to avoid issues with double-counting. We formalize with Bayesian networks to provide the desired dependency structure.
Definition - General Bayesian Networks
Borrowing notation from Wikipedia, consider an arbitrary Bayesian network $G = (V, E)$ with variables $\{X_v\}_{v \in V}$. Additionally, choose an "outcome node" $v_O \in V$ (we can assume that $v_O$ is a descendant of each $v \in V$, but the assumption isn't required); we use $X_{v_O}$ as our outcome variable to measure IMPACT against.
We now define some notation:
- Given arbitrary R.V.s $A, B$, we define the conditional expectation of $A$ given $B = b$ as
$$e(A, B) := \mathbb{E}_{B = b}[A]$$
Note that $e(A, B)$ is itself a random variable in the value of $B$; that is, $e(A, B) = \mathbb{E}[A \mid B]$.
- Given R.V.s $A, B$, we call $B$ a marginal variable of $A$ if the R.V. identity $B = e(A, B)$ holds. Intuitively, we can think of $B$ as an estimate of $A$ given limited information.
- Consider nodes $v_1, v_2 \in V$. We say $v_1$ is an ancestor of $v_2$ (equivalently, $v_2$ is a descendant of $v_1$) iff $v_1 \neq v_2$ and there exists a directed path $v_1 \to v_2$. This relationship is direct iff such a path consists of a single edge.
- Let $A(v)$ be the set of ancestors of node $v \in V$. Let $A_d(v)$ be the set of direct ancestors of node $v \in V$.
- Given a node $v \in V$, define the IMPACT of $X_v$ on $X_{v_O}$ to be the following R.V.
$$I(v) := e\big(X_{v_O}, \{X_u \mid u \in A_d(v) \vee u = v\}\big) - e\big(X_{v_O}, \{X_u \mid u \in A_d(v)\}\big)$$
We now work toward a notion of IMPACT-scarcity - the idea that the "magnitude" of the IMPACT of each node is bounded above. We will eventually demonstrate this claim in terms of the sum of the variances of $I(v)$. First, we prove some necessary lemmas:
Lemma 1: Given an arbitrary topological ordering $V = \{v_i\}_{i=1}^n$, we can construct the following collection of R.V.s
$$\Delta_i := e\big(X_{v_O}, \{X_{v_j} \mid j \le i\}\big) - e\big(X_{v_O}, \{X_{v_j} \mid j < i\}\big)$$
We now claim the following identity on R.V.s:
$$\sum_{i=1}^n \Delta_i \equiv X_{v_O} - e(X_{v_O}, \emptyset)$$
Proof: The identity follows from a telescoping-sum argument, together with the observation that for $v_i = v_O$, we have $e(X_{v_i}, \{X_{v_j} \mid j \le i\}) \equiv X_{v_O}$.
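Spelling out the telescoping step for readability (a routine expansion):
$$\sum_{i=1}^n \Delta_i = \sum_{i=1}^n \Big( e\big(X_{v_O}, \{X_{v_j} \mid j \le i\}\big) - e\big(X_{v_O}, \{X_{v_j} \mid j < i\}\big) \Big) = e\big(X_{v_O}, \{X_{v_j} \mid j \le n\}\big) - e(X_{v_O}, \emptyset),$$
and since $X_{v_O}$ appears among the conditioned variables on the right, that first term is $X_{v_O}$ itself.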
Lemma 2: Consider R.V.s $A, B$ s.t. $B$ is a marginal variable of $A$. Then $\mathrm{Var}(A) \ge \mathrm{Var}(B)$.
Proof: Consider an arbitrary vector space of R.V.s containing $A, B$. The map $f : v \mapsto e(v, B)$ is an orthogonal projection (in the $L^2$ sense) and preserves means, while $g(v) = \mathrm{Var}(v)$ is the squared $L^2$ norm of the centered variable $v - \mathbb{E}[v]$. Since orthogonal projections never increase norm, $g(v) \ge g(f(v))$, which is the claim.
Lemma 3: Consider arbitrary R.V.s $A, B, C$. If $e(A, B) \equiv A$ (i.e. if $B$ fully determines $A$), then $e(C, A)$ is a marginal variable of $e(C, B)$.
Proof: We observe the following
$$e(C, A) = \mathbb{E}_{B \mid A}\big[e(C, \{A, B\})\big] = \mathbb{E}_{B \mid A}\big[e(C, B)\big] = e\big(e(C, B), A\big)$$
Now, consider the quantity $e\big(e(C,B), e(C,A)\big)$. First, we see that $A$ fully determines $e(C, A)$. Thus, viewing $e(X, Y)$ as a least-squares estimate of $X$ given $Y$, we find that the estimate $e\big(e(C,B), e(C,A)\big)$ is at most as accurate as $e\big(e(C,B), A\big) = e(C, A)$. However, $e\big(e(C,B), e(C,A)\big)$ "knows" $e(C, A)$ by the definition of $e$, thus the optimal estimate is
$$e\big(e(C,B), e(C,A)\big) = e(C, A)$$
The result follows by the definition of a marginal variable.
Note: We could also prove the result by expressing "$A$ is a marginal variable of $B$" as "some vector space projection maps $B$ to $A$" (equivalently, $A = e(B, D)$ for some R.V. $D$); the result then follows from $e(C, A) = e\big(e(C, B), A\big)$.
Lemma 4: Consider arbitrary $1 \le i < j \le n$. Then the R.V. $e(\Delta_j, \Delta_i) = 0$.
Proof: By "pausing" evaluation of the Bayesian network before Xvj is determined, we can argue that the R.V. e(Δj,{Xvk|k<j})≡0. Since Δi is fully determined by {Xvk|k<j}, we conclude by Lemma 3 that e(Δj,Δi)is a marginal variable of e(Δj,{Xvk|k<j}).
By Lemma 2 (and noting that all vector space projections map 0 to 0), we conclude e(Δj,Δi)≡0.
Lemma 5: For each $1 \le i \le n$, $I(v_i)$ is a marginal variable of $\Delta_i$.
Proof: We invoke Lemma 3, choosing $A = \{X_u \mid u \in A(v_i) \vee u = v_i\}$ and $B = \{X_{v_j} \mid j \le i\}$.
We can now proceed with our claim of IMPACT-scarcity:
Theorem 1 (IMPACT-scarcity):
$$\sum_{i=1}^n \mathrm{Var}(I(v_i)) \le \mathrm{Var}(X_{v_O})$$
Proof: By Lemma 1, we have the following R.V. identity
$$\sum_{i=1}^n \Delta_i \equiv X_{v_O} - e(X_{v_O}, \emptyset)$$
We now compute the variance of both sides. By Lemma 4, the $\mathrm{Cov}(\Delta_i, \Delta_j)$ terms are $0$ for $i \neq j$ (indeed, $\mathbb{E}[\Delta_i \Delta_j] = \mathbb{E}\big[\Delta_i\, e(\Delta_j, \Delta_i)\big] = 0$ and each $\mathbb{E}[\Delta_j] = 0$). Thus, we're left with
$$\sum_{i=1}^n \mathrm{Var}(\Delta_i) = \mathrm{Var}(X_{v_O})$$
We finish by applying Lemmas 5 and 2, which give $\mathrm{Var}(I(v_i)) \le \mathrm{Var}(\Delta_i)$.
Note: The only inequality in the above proof is $\mathrm{Var}(I(v_i)) \le \mathrm{Var}(\Delta_i)$. Thus, equality in the theorem is achieved exactly when this holds with equality for each $1 \le i \le n$ (for example, in chain-shaped Bayesian networks).
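As a numerical sanity check of Theorem 1 (my own sketch, not part of the post; the helper impact_scarcity_check and the toy networks are illustrative), the snippet below enumerates two tiny all-binary networks exactly, computes each node's IMPACT from the definition, and compares the summed variances against $\mathrm{Var}(X_{v_O})$:

```python
from itertools import product

def impact_scarcity_check(nodes, parents, cpd, outcome):
    """nodes: node names in topological order.
    parents: dict node -> tuple of direct-ancestor names.
    cpd: dict node -> function(parent-values tuple, value) -> probability.
    outcome: name of the (real-valued) outcome node."""
    # Enumerate the joint distribution over binary assignments.
    joint = {}
    for values in product([0, 1], repeat=len(nodes)):
        assign = dict(zip(nodes, values))
        p = 1.0
        for v in nodes:
            pa_vals = tuple(assign[u] for u in parents[v])
            p *= cpd[v](pa_vals, assign[v])
        if p > 0:
            joint[values] = p

    def cond_exp(target, given):
        """e(X_target, {X_u : u in given}) as a dict: values of `given` -> estimate."""
        num, den = {}, {}
        for values, p in joint.items():
            assign = dict(zip(nodes, values))
            key = tuple(assign[u] for u in given)
            num[key] = num.get(key, 0.0) + p * assign[target]
            den[key] = den.get(key, 0.0) + p
        return {k: num[k] / den[k] for k in num}

    def variance_of(fn):
        """Variance (under the joint) of a function of the full assignment."""
        mean = sum(p * fn(dict(zip(nodes, values))) for values, p in joint.items())
        return sum(p * (fn(dict(zip(nodes, values))) - mean) ** 2
                   for values, p in joint.items())

    total = 0.0
    for v in nodes:
        with_v = cond_exp(outcome, list(parents[v]) + [v])
        without_v = cond_exp(outcome, list(parents[v]))
        def impact(assign, with_v=with_v, without_v=without_v, v=v):
            key_w = tuple(assign[u] for u in list(parents[v]) + [v])
            key_wo = tuple(assign[u] for u in parents[v])
            return with_v[key_w] - without_v[key_wo]
        total += variance_of(impact)
    var_outcome = variance_of(lambda assign: assign[outcome])
    print(f"sum Var(I(v)) = {total:.4f}  <=  Var(X_O) = {var_outcome:.4f}")

fair = lambda pa, x: 0.5                        # unconditioned fair coin
copy = lambda pa, x: 1.0 if x == pa[0] else 0.0
xor  = lambda pa, x: 1.0 if x == (pa[0] ^ pa[1]) else 0.0

# Coordinated Impact: w -> a1, a2, O, and O just copies w.
impact_scarcity_check(
    nodes=["w", "a1", "a2", "O"],
    parents={"w": (), "a1": ("w",), "a2": ("w",), "O": ("w",)},
    cpd={"w": fair, "a1": copy, "a2": copy, "O": copy},
    outcome="O")

# Illusory Impact: two independent coins, O = a1 XOR a2.
impact_scarcity_check(
    nodes=["a1", "a2", "O"],
    parents={"a1": (), "a2": (), "O": ("a1", "a2")},
    cpd={"a1": fair, "a2": fair, "O": xor},
    outcome="O")
```

The first network attains the bound with equality; the second is strict, since no individual coin's IMPACT accounts for the parity outcome's variance.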
Application - Representing a Multiplayer Game as a Bayesian Network
As promised, we now apply our framework for IMPACT to the game-theory framework from earlier. To begin, consider an arbitrary multiplayer (Bayesian) game with a fixed strategy profile $\sigma$. We represent the mechanics of the game and the players' strategies as a Bayesian network: the state of nature $\omega$ determines each player's type $t_i$, and each type determines (via $\sigma_i$) the distribution of player $i$'s action $a_i$.
Note: In an abuse of notation, we let $a_i$ refer both to the R.V. representing player $i$'s action and to the node $a_i$ in the Bayesian network (thus $X_{a_i}$, the Bayesian-network variable, is identified with $a_i$, the action). Thus, statements like $I(a_i)$ can be parsed as "the impact of player $i$'s action $a_i$" without the need for cumbersome notation (and similarly for the other nodes in the Bayesian network).
For now, we define an arbitrary outcome node $O$ as a direct descendant of every other node. We will later set $X_O$ to represent game-theoretically meaningful quantities; in particular, player $i$'s reward $R_i$.
Additionally, call a node $v$ deterministic if $X_v$ is fully determined by $\{X_u \mid u \in A_d(v)\}$ (equivalently, if $e(X_v, \{X_u \mid u \in A_d(v)\}) = X_v$). Observe that for any deterministic node $v$, we have $I(v) \equiv 0$ by the definition of IMPACT: conditioning on $X_v$ in addition to its direct ancestors adds no information, so the two terms in $I(v)$ coincide. This has two important implications for our model: the type nodes $t_i$ (each fully determined by $\omega$) carry zero IMPACT, and so does the outcome node $O$ (fully determined by its parents).
We're left with the IMPACT terms from $\omega$ and each $a_i$, which we interpret in game-theoretic language:
- The IMPACT $I(\omega)$ can be thought of as "how good a random draw is the chosen value of $\omega$?" This doesn't translate precisely (as far as I know), but intuition can be gained from viewing the Coordinated Impact example as a function $\omega \to O$.
- The IMPACT $I(a_i)$ "looks like" player $i$'s IEU under the reward function given by $X_O$. More specifically, letting $X_O = R_i$, we have
$$u_i(a_i) \equiv e(R_i, \{t_i, a_i\}) \equiv e(R_i, \{t_i\}) + I(a_i)$$
The residual term $e(X_O, \{t_i\})$ is best understood as analogous to $I(\omega)$, but considering $t_i$ as the fundamental random quantity (instead of $\omega$, its source of randomness). Since player $i$ only acts based on $t_i$, this corresponds to player $i$'s logic of "how good a random draw do I think $\omega$ is, given only knowledge of $t_i$?"
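To see the decomposition numerically, here is a minimal sketch on a toy Bayesian game of my own devising (the game and the helpers joint and cond_exp_r1 are illustrative assumptions, not from the post): nature flips $\omega$, player 1 observes $t_1 = \omega$ and repeats it with probability $0.75$, and $R_1 = 1$ iff $a_1 = \omega$.

```python
# Exact check of the decomposition u_1(a_1) = e(R_1, {t_1}) + I(a_1) on a
# tiny toy Bayesian game:
#   w ~ Ber(1/2), player 1 observes t1 = w and repeats it with prob 0.75,
#   and R1 = 1 iff a1 == w.

def joint():
    """Yield (probability, w, t1, a1, r1) over the whole joint distribution."""
    for w in (0, 1):
        t1 = w
        for a1 in (0, 1):
            p = 0.5 * (0.75 if a1 == t1 else 0.25)
            r1 = 1 if a1 == w else 0
            yield p, w, t1, a1, r1

def cond_exp_r1(pred):
    """e(R1, .) computed as E[R1 | pred], with pred a predicate on (w, t1, a1)."""
    num = sum(p * r1 for p, w, t1, a1, r1 in joint() if pred(w, t1, a1))
    den = sum(p for p, w, t1, a1, r1 in joint() if pred(w, t1, a1))
    return num / den

for t1 in (0, 1):
    residual = cond_exp_r1(lambda w, t, a: t == t1)             # e(R1, {t1})
    for a1 in (0, 1):
        ieu = cond_exp_r1(lambda w, t, a: t == t1 and a == a1)  # u_1(a1) = e(R1, {t1, a1})
        impact = ieu - residual                                  # I(a1): t1 is a1's only parent
        print(f"t1={t1} a1={a1}:  u1={ieu:.2f}  e(R1,t1)={residual:.2f}  I(a1)={impact:+.2f}")
```

Here the residual $e(R_1, \{t_1\}) = 0.75$ regardless of the type, while $I(a_1)$ is $+0.25$ for the action matching the type and $-0.75$ otherwise.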
Game Theory in terms of IMPACT
While research on game-theoretic results from the perspective of IMPACT is extremely limited (I only know of the preliminary work I've already done), the best litmus test for a proposed framework is to see whether it readily produces meaningful results. In this section, I'll outline the immediate results from defining IMPACT in the setting of multiplayer games and suggest some avenues for further exploration.
POWER- and IMPACT-scarcity
The crux of these results is the fundamental notion of IMPACT-scarcity and the connection between $I(a_i)$ and player $i$'s IEU. We begin by stating our IMPACT-scarcity result in terms of the outcome variable $O$:
$$\sum_v \mathrm{Var}(I(v)) = \mathrm{Var}(I(\omega)) + \sum_{i=1}^n \mathrm{Var}(I(a_i)) \le \mathrm{Var}(X_O)$$
One natural vein of results comes from plugging in variables for $O$ and seeing what comes out. We give some basic examples:
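For instance (a worked instance of my own, using the formulas above), take $X_O = R$ in the two coin games from earlier. In the counting-heads game, $I(a_i) = e(R, \{a_i\}) - e(R, \emptyset) = a_i - \tfrac{1}{2}$, so
$$\sum_{i=1}^n \mathrm{Var}(I(a_i)) = \frac{n}{4} = \mathrm{Var}(R),$$
and the bound is tight: the outcome's variance is fully accounted for by the players' IMPACTs. In the illusory-impact (parity) game, $e(R, \{a_i\}) = e(R, \emptyset) = \tfrac{1}{2}$, so every $I(a_i) \equiv 0$ while $\mathrm{Var}(R) = \tfrac{1}{4}$: the inequality is strict, and no individual player's IMPACT accounts for the outcome's variance.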
As mentioned in the intro, one goal of IMPACT research is to unify the ideas of POWER-scarcity and IMPACT-scarcity. I suspect that the intuitive relationship is "IMPACT = change in POWER", motivated by the simplification "POWER = $\mathbb{E}$, IMPACT = $\mathrm{Var}$". While the notion remains far from precise, I conjecture that IMPACT on a player's POWER is a marginal variable of IMPACT on that player's reward, from which an upper bound on "$\Delta$POWER" follows.
IEU in terms of IMPACT
We now start from our other main result: $u_i(a_i) = e(R_i, \{t_i\}) + I(a_i)$. Since we don't assume any information about IEU by default, a natural starting point is the case of a Nash Equilibrium.
By the definition of a (Bayesian) Nash Equilibrium, each player's strategy must be a best response. Thus, for each $a_i$ in the support of $\sigma_i(t_i)$, we have $u_i(a_i) = \max_{a_i'} u_i(a_i') =: M$. This implies $e(R_i, \{t_i\}) = M$ and $I(a_i) = 0$ for every supported action, which can be understood by arguing that in a Nash Equilibrium, no player can unilaterally increase their expected reward.
Sensing a deeper connection between IMPACT and Nash Equilibria, we define the self-IMPACT of action $a_i$ to be $I(a_i)$ given $X_O = R_i$. Above, we showed that in a Nash Equilibrium, each player's supported actions have zero self-IMPACT. Generalizing to suboptimal actions (still within a Nash Equilibrium), we find that all actions have self-IMPACT $\le 0$, with equality when the action is a best response.
Unfortunately, the converse doesn't hold: each player taking only zero-self-IMPACT actions doesn't imply a Nash Equilibrium. It implies a Nash Equilibrium of the game in which the action spaces $A_i$ are restricted to the actions played with nonzero probability, but these notions aren't equivalent if strong actions remain unplayed (consider the mixed-strategy equilibrium for rock-paper-scissors when generalized to rock-paper-scissors-[insta-win action]), as the sketch below illustrates.
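Here is a minimal sketch of that caveat (my own; the payoff table and mixtures are illustrative assumptions): in rock-paper-scissors extended with an "insta-win" action W, the uniform mixture over {R, P, S} gives every played action zero self-IMPACT, yet it is not a Nash Equilibrium of the full game.

```python
ACTIONS = ["R", "P", "S", "W"]

def payoff(a, b):
    """Player 1's payoff: standard rock-paper-scissors, except W beats everything."""
    if a == b:
        return 0
    if a == "W":
        return 1
    if b == "W":
        return -1
    beats = {("R", "S"), ("S", "P"), ("P", "R")}
    return 1 if (a, b) in beats else -1

# Both players mix uniformly over {R, P, S}; W is never played.
own_mix = {"R": 1/3, "P": 1/3, "S": 1/3, "W": 0.0}
opponent_mix = {"R": 1/3, "P": 1/3, "S": 1/3, "W": 0.0}

# e(R_1, {t_1}): player 1's expected payoff under the profile (types are trivial).
residual = sum(own_mix[a] * opponent_mix[b] * payoff(a, b)
               for a in ACTIONS for b in ACTIONS)

for a in ACTIONS:
    ieu = sum(opponent_mix[b] * payoff(a, b) for b in ACTIONS)  # u_1(a)
    self_impact = ieu - residual                                 # I(a) with X_O = R_1
    print(f"{a}: u1 = {ieu:+.3f}, self-IMPACT = {self_impact:+.3f}")
```

The three played actions all have zero self-IMPACT, yet the unplayed W has self-IMPACT $+1$, so the profile is only an equilibrium of the restricted game.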
We can also explore the fact that in a Nash Equilibrium, POWER equals IEU. Thus, we can write $\mathrm{POWER}(i, \sigma) = e(R_i, \{t_i\})$, which is strictly a function of $\omega$. Intuitively, IMPACT accounts for variation in $a_i$ while POWER takes a max over it, so the two become equivalent in limiting cases of $\sigma_i$.
Considering other players / IMPACT-trading
As a final and unexplored angle on IMPACT, consider the case where player 1 impacts $R_2$ and player 2 impacts $R_1$. Assuming it wouldn't adversely affect their own utilities, the players can "trade" by modifying their strategies to mutually grant each other increased reward. The premise itself immediately raises red flags; I'll attempt to briefly address them:
Temporarily putting the issues aside, I intend to explore IMPACT-trading as an attempt to understand coordination-centric games like the Prisoners' Dilemma. More generally, I hope to apply the IMPACT framework to a broad range of multiplayer game-theoretic phenomena and see if new insight can be gained.