This visual reduces mental load, shortens feedback loops, and makes effective use of visual intuition.
Before, my understanding of Shapley values mostly came from the desirable properties (listed at the end of this article). But the formula itself had no justification in my mind beyond "well, it's uniquely determined by the desirable properties". I had seen justifications in terms of building up the coalition in a certain sequence, taking the marginal value a player contributes when they join, and then averaging over all the ways the coalition could've been built up (like Alice, then Bob, then Charlie; or Alice, then Charlie, then Bob). However, that framing is both less intuitively compelling and harder to compute. The beauty of this explanation is that:
That second property is rather important, for the same reason that code that compiles in a second allows for faster debugging than code that takes 5 minutes. There are more detailed posts on the subject that I cannot remember right now, but the idea is that by making the feedback quicker, you can more easily experiment and fiddle to understand whatever situation you are dealing with.
Here are some examples of concepts that came up in a casual conversation about Shapley values that I would've had a much harder time understanding without the post's explanation, to the point that I might not have bothered (they get a bit into the weeds):
I've added the wonderful diagrams to the Wikipedia page, and I hope that more people will think of the Shapley value that way. The picture can be explained to a child, and having it makes complicated properties intuitive. The topic is important, and teaches us about actual bargaining and coordination - so thinking effectively about it matters.
Hervé Moulin (2004). *Fair Division and Collective Welfare*. Cambridge, Massachusetts: MIT Press. ISBN 9780262134231, pp. 147–156.
same ref as before, or ibid as the cool kids say.
For the curious, here it is (my pictures not included; you, however, should probably draw them):
Alice, Bob, and Charlie are buying software.
Software W suffices for Alice and Charlie and costs $8
Software X suffices for Bob and Charlie and costs $9
Software Y suffices for Alice and Bob and costs $10
Software Z suffices for everyone and costs $17
So for example, Alice by herself has the best option of W for $8, Bob by himself has X for $9, and Charlie by himself has W for $8.
They together buy software Z, as that works for all of them, and then start bickering about who should pay what.
Let $x_A, x_B, x_C$ be the cost each player pays in the game. The stand-alone core is the system of inequalities

$$x_A + x_C \le 8, \qquad x_B + x_C \le 9, \qquad x_A + x_B \le 10;$$

add the three together and divide by 2 to get

$$x_A + x_B + x_C \le 13.5,$$

but to foot the whole bill they need to pay at least $17 and ideally shouldn't be charged more than the actual cost. Thus, contradiction.
Lastly, the Shapley value would charge Alice $5.5, Bob $6.5, and Charlie $5, whereas if Alice and Bob played as a single combined player, they would together be charged $9.5 while Charlie would pay $7.5.
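A quick way to verify those numbers is to average each player's marginal cost over all six join orders. A minimal Python sketch (the function and the encoding of the game are mine, not from the original):

```python
from itertools import permutations

# Cheapest sufficient software for each coalition (from the example above).
cost = {
    frozenset(): 0,
    frozenset("A"): 8, frozenset("B"): 9, frozenset("C"): 8,
    frozenset("AB"): 10, frozenset("AC"): 8, frozenset("BC"): 9,
    frozenset("ABC"): 17,
}

def shapley_costs(players, cost):
    """Average each player's marginal cost over every order of joining."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            totals[p] += cost[coalition | {p}] - cost[coalition]
            coalition = coalition | {p}
    return {p: t / len(orders) for p, t in totals.items()}

print(shapley_costs("ABC", cost))  # {'A': 5.5, 'B': 6.5, 'C': 5.0}
```

Merging Alice and Bob into one player and rerunning on the 2-player game gives the $9.5 / $7.5 split mentioned above.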
Shapley values are the ONLY way to guarantee: {Efficiency, Symmetry, Linearity, Null player}
Well it doesn't end at that: it turns out Shapley values for more than 2 players are not nicely behaved and instead violate Maximin Dominance, as demonstrated in https://www.lesswrong.com/posts/vJ7ggyjuP4u2yHNcP/threat-resistant-bargaining-megapost-introducing-the-rose#ROSE_Value__N_Player_Case__.
The article I link showed how this is fixed:
Shapley values are about adding everyone one-by-one to a team in a random order, with everyone getting the marginal value they contributed to the team.
And that's kinda like giving everyone a random initiative ordering and giving everyone the surplus they can extract in the resulting initiative game.
If we're doing that, then maybe a player, regardless of their position, can ensure they get their maximin value? Maybe this sort of Random-Order Surplus Extraction can work. ROSE.
Curated. This was a quite nice introduction. I normally see Shapley values brought up in a context that's already moderately complicated, and having a nice simple explainer is helpful!
I'd like it if the post went into a bit more detail about when/how Shapley values tend to get used in real world contexts.
This is (mostly) a crosspost of my (pending review? and so I can't link to it?) comment from the EA Forum, replying to a commenter also asking for actual uses of Shapley values.
The first real-world example that comes to mind... isn't about agents bargaining. Namely: statistical models. The idea is that you have some subparts that each contribute to the prediction, and you want to know which are the most important, so you calculate Shapley values ("how well does this model do if it only uses age and sex to predict life expectancy, but not race?", etc. for the other coalitions).
Here's a microecon Stack Exchange question that asks a similar thing to yours. The only non-stats answer states that a bank used Shapley values to determine capital allocation in investments. It sounds like they didn't have a problem using a 'time machine' because they had the performance of the investments and so could simply evaluate what returns they would've gotten had they invested differently. But I haven't read it thoroughly, so for all I know they stopped using it soon after, or had some other way to evaluate counterfactuals, etc.
Also the Lightcone (so, including you?) fundraising post mentioned trying to charge for half the surplus produced when setting Lighthaven prices (i.e. the 2 player special case of the Shapley value).
Of course, the 2-player case is much easier than even the 3-player case, because you only need to know the other person's willingness to pay (that is, their value over BATNA) and can then estimate your own costs (in total, only one advantage-over-BATNA that doesn't just involve you needs to be determined). For 3 players you need 3*2 = 6 comparisons from the others, and for n players you need $\sum_{k=2}^{n} k\binom{n}{k} = n(2^{n-1}-1)$ total comparisons (each player giving the benefit they'd get if that coalition occurred), of which $\sum_{k=2}^{n} \binom{n-1}{k-1} = 2^{n-1}-1$ are your comparisons (which, to be clear, aren't trivial, but at least you know your own preferences and situation and don't have to ask others about them). The first sum grows faster than exponentially, while the second sum is merely exponential, which means that discounting the comparisons that are about the value you get doesn't make the asymptotics better. This suggests that even just the communication costs get pretty high pretty fast unless you have a compact way to encode how much value you get out of the interactions (like in the bank example, where I think you only need to be told the individual performance history, and can then just compute the value in each investment counterfactual). So if there are nonlinear relationships between people (read: real life most of the time), my intuition is that you are screwed?
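To make the counting concrete, here's a small Python sketch under the assumption above (each member of every coalition of two or more players reports the benefit they'd get from it):

```python
from math import comb

def report_counts(n):
    """Count the value reports needed for n players, assuming every member
    of every coalition of size >= 2 reports their own benefit from it."""
    total = sum(k * comb(n, k) for k in range(2, n + 1))      # n * (2**(n-1) - 1)
    yours = sum(comb(n - 1, k - 1) for k in range(2, n + 1))  # 2**(n-1) - 1
    return total, yours, total - yours

for n in (2, 3, 5, 10):
    total, yours, others = report_counts(n)
    print(f"n={n}: {total} reports total, {yours} yours, {others} from others")
# n=2: 2 total, 1 yours, 1 from others
# n=3: 9 total, 3 yours, 6 from others
```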
Explaining the Shapley value in terms of the "synergies" (and the helpful split in the Venn diagram) makes much more intuitive sense than the more complex normal formula without synergies, which is usually just given without motivation. That being said, it requires first computing the synergies, which seems somewhat confusing for more than three players. The article itself doesn't mention the formula for the synergy function, but Wikipedia has it.
I thought this too. Working with Shapley values is quite intuitive, and the article does an excellent job of it. But how do we derive the synergy values to plug in in the first place? How do we know that Liam + Emma's synergy = 0?
Liam alone makes $10
Emma alone makes $20
Liam + Emma make $30
$30 - ($10 + $20) = $0, their synergy.
In general: a coalition's synergy is how much more (or less) it gets than the sum of each member's individual contribution plus all of its subsets' synergies.
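In code, that recursion looks something like this (a minimal Python sketch; the names and setup are illustrative):

```python
from itertools import combinations

def synergy(coalition, value):
    """w(S) = v(S) minus the synergies of every nonempty proper subset of S."""
    s = frozenset(coalition)
    subsets = [frozenset(t) for k in range(1, len(s))
               for t in combinations(s, k)]
    return value[s] - sum(synergy(t, value) for t in subsets)

value = {
    frozenset({"Liam"}): 10,
    frozenset({"Emma"}): 20,
    frozenset({"Liam", "Emma"}): 30,
}
print(synergy({"Liam", "Emma"}, value))  # 30 - (10 + 20) = 0
```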
A problem I have with Shapley Values is that they can be exploited by "being more people".
Suppose Alice and Bob can make a joint venture with a payout of $300. Synergies:

Alice: $0
Bob: $0
Alice + Bob: $300
Shapley says they each get $150. So far, so good.
Now suppose Bob partners with Carol and they make a deal that any joint ventures require both of them to approve; they each get a veto. Now the synergies are:

Alice: $0
Bob: $0
Carol: $0
Alice + Bob: $0 (Carol can veto)
Alice + Carol: $0
Bob + Carol: $0
Alice + Bob + Carol: $300
Shapley now says Alice, Bob, and Carol each get $100, which means Bob+Carol are getting more total money ($200) than Bob alone was ($150), even though they are (together) making exactly the same contribution that Bob was paid $150 for making in the first example.
(Bob personally made less, but if he charges Carol a $75 finder's fee then Bob and Carol both end up with more money than in the first example, while Alice ends up with less.)
By adding more partners to their coalition (each with veto power over the whole collective), the coalition can extract an arbitrarily large share of the value.
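For concreteness, here's a small Python sketch of both games (my encoding of the scenarios; Shapley values computed by averaging marginal contributions over join orders):

```python
from itertools import permutations

def shapley(players, v):
    """Shapley values by averaging marginal contributions over join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        s = frozenset()
        for p in order:
            totals[p] += v(s | {p}) - v(s)
            s = s | {p}
    return {p: t / len(orders) for p, t in totals.items()}

# Original game: the venture needs Alice and Bob.
v1 = lambda s: 300 if s >= {"Alice", "Bob"} else 0
# Veto game: it now also needs Carol's approval.
v2 = lambda s: 300 if s >= {"Alice", "Bob", "Carol"} else 0

print(shapley(["Alice", "Bob"], v1))           # Alice: 150, Bob: 150
print(shapley(["Alice", "Bob", "Carol"], v2))  # each gets 100
```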
I'm not sure what you're trying to say.
My concern is that if Bob knows that Alice will consent to a Shapley distribution, then Bob can seize more value for himself without creating new value. I feel that a person or group shouldn't be able to get a larger share by intentionally hobbling themselves.
You can make it work without an explicit veto. Bob convinces Alice that Carol will be a valuable contributor to the team. In fact, Carol does nothing, but Bob follows a strategy of "Do nothing unless Carol is present". This achieves the same synergies:

Alice: $0
Bob: $0
Carol: $0
Alice + Bob: $0 (Bob does nothing)
Alice + Carol: $0
Bob + Carol: $0
Alice + Bob + Carol: $300
In this way Bob has managed to redirect some of Alice's payouts by introducing a player who does nothing except remove a bottleneck he added into his own playstyle in order to exploit Alice.
Shapley values are constructed such that introducing a null player doesn't change the result. You are doing something different by considering the wrong counterfactual (one where C exists but isn't part of the coalition, vs. one where C doesn't exist).
Sounds like you agree with both me and Ninety-Three about the descriptive claim that the Shapley Value has, in fact, been changed, and have not yet expressed any position regarding the normative claim that this is a problem?
No, the Shapley value hasn't been changed. The correct way to do this would have been: if
A: $0
B: $0
A+B: $300
then
A: $0
B: $0
C: $0
A+B: ~~$0 (Carol vetoes)~~ $300
A+C: $0
B+C: $0
A+B+C: $300
Your example is wrong because you are not leaving the A+B case unchanged.
I agree that "being more people" is a problem in coalitional dynamics with vetoes, but I don't think this is a problem with the Shapley value solution. I agree that when trying to apply the Shapley value solution, you should make sure to set C's value as zero (even though it might hurt egos), etc.
Your example is wrong because you are not leaving the A+B case unchanged.
On what basis do you claim that the A+B case should be unchanged? The entire point of the example is that Carol now actually has the power to stop A+B and thus they actually can't do anything without her on board.
If you are intending to make some argument along the lines of "a veto is only a formal power, so we should just ignore it" then the example can trivially be modified so that B's resources are locked in a physical vault with a physical lock that literally can't be opened without C. The fact that B can intentionally surrender some of his capabilities to C is a fact of physical reality and exists whether you like it or not.
I taught game theory at Princeton and wish I'd seen this explanation beforehand, excellent framing.
Do you know whether the person who wrote this would be OK with crossposting the complete content of the article to LW? I would be interested in curating it and sending it out in our 30,000 subscriber curation newsletter, if they were up for it.
The notion of "who was involved" is kinda weird. Like, suppose there is Greg. Greg will firebomb The Project if he is not involved. If he's involved, he will put in modest effort. Should he receive an enormous share just because of this threat? That seems very unfair.
What is the counterfactual construction procedure here? Like, assume that the other players stopped existing for the purpose of calculating the value of a coalition that doesn't include them? But they are still there, in the world. And often it's not even clear what it would mean for them to do nothing.
o1 suggested modeling The Greg Gambit as a Partition Function Game, but claimed it's all complicated there. Or maybe model it as bargaining.
I think you need a decision theory (+ a theory of counterfactuals, which is basically going to have to be a theory of logical counterfactuals if you want to prevent extortion from Omega, and uhh good luck figuring that out) for this. We compare to counterfactuals where the other agents aren't destroying value for the sake of extortion because agents with a good decision theory will refuse to give in in those cases. Now let's imagine that Greg, for genuinely unrelated reasons, will lead to the project's downfall (say, he usually mows his lawn in the morning, and the project requires quiet at that time). If Greg chooses to not mow his lawn to help the project, I'd call that "participating in the coalition", and he should get some value from doing so. The point, after all, is to incentivize people to contribute to the project and also to be resistant to extortion.
Yeah, I don't know how it should work properly when people factor in information about other people's decision procedures. I guess Shapley values might be a Newton's laws versus special relativity kind of deal, where they mostly work most of the time. Or it might be more like an applied design thing, where everything switches to a completely different underlying logic if that gets you even a modest improvement. Idk.
Thank you for this insightful post! When discussing value distribution with my partners, we faced the challenge of fairly allocating contributions without precise knowledge of their impact. I proposed a solution: involving an external evaluator with business expertise but no direct access to the function. Their task was to predict value splits, and their reward was proportional to how accurate their estimates were compared to the final distribution.
This approach aimed to handle uncertainty while guiding team efforts strategically. It’s fascinating to see how Shapley values offer a theoretical foundation for such practical challenges.
To clarify: the claim is that Shapley values are the only way to guarantee the set containing all four properties: {Efficiency, Symmetry, Linearity, Null player}. There are other metrics that can achieve proper subsets.
Hopefully, you have gained some intuition for why Shapley values are “fair” and why they account for interactions among players.
The article fails to make a key point: in political economy and game theory, there are many definitions of "fairness" that seem plausible at face value, especially when considered one at a time. Even if one puts normative questions to the side, there are mathematical limits and constraints as one tries to satisfy various combinations simultaneously. Keeping these in mind, you can think of this as a design problem; it takes some care to choose metrics that reinforce some set of desired norms.
I think you may have mixed up the ordering halfway through the example: in the first and third tables 'Emma and you' is $90 while 'Emma and Liam' is $30, but in the second it's the other way around, and some of the charts seem odd as a result?
Shapley values are the ONLY way to guarantee:
- Efficiency — The sum of Shapley values adds up to the total payoff for the full group (in our case, $280).
- Symmetry — If two players interact identically with the rest of the group, their Shapley values are equal.
- Linearity — If the group runs a lemonade stand on two different days (with different team dynamics on each day), a player’s Shapley value is the sum of their payouts from each day.
- Null player — If a player contributes nothing on their own and never affects group dynamics, their Shapley value is 0.
I don't think this is true. Consider an alternative distribution in which each player receives their full "solo profits", plus a share of each synergy bonus equal to their solo profits divided by the sum of the solo profits of all players involved in that bonus. In the above example, you receive 100% of your solo profits, 30/(30+10) = 3/4 of the You-Liam synergy, 30/(30+20) = 3/5 of the You-Emma synergy, and 30/(30+20+10) = 1/2 of the everyone synergy, for a total payout of $159. This is justified on the intuition that your higher solo profits suggest you are doing "more work" and deserve a larger share.
This distribution does have the unusual property that if a player's solo profits are 0, they can never receive any payouts even if they do produce synergy bonuses. This seems like a serious flaw, since it gives "synergy-only" players no incentive to participate, but unless I've missed something it does meet all the above criteria.
I don't think this proposal satisfies Linearity (sorry, didn't see kave's reply before posting). Consider two days, two players.
Day 1:
A: $200, B: $0, A+B: $400
Result: $400 to A, $0 to B.
Day 2:
A: $100, B: $100, A+B: $200
Result: $100 to A, $100 to B.
Combined:
A: $300, B: $100, A+B: $600
Result: $450 to A, $150 to B. Whereas if you add the results for day 1 and day 2, you get $500 to A, $100 to B.
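For anyone who wants to replay the arithmetic, here's a minimal Python sketch (`proportional_split` is my rendering of the proposal above):

```python
def proportional_split(solo, synergy):
    """Each player keeps their solo profits plus a share of each synergy
    bonus proportional to solo profits (the proposed distribution)."""
    payout = dict(solo)
    for coalition, bonus in synergy.items():
        pool = sum(solo[p] for p in coalition)
        for p in coalition:
            payout[p] += bonus * solo[p] / pool
    return payout

print(proportional_split({"A": 200, "B": 0},   {("A", "B"): 200}))  # A: 400, B: 0
print(proportional_split({"A": 100, "B": 100}, {("A", "B"): 0}))    # A: 100, B: 100
print(proportional_split({"A": 300, "B": 100}, {("A", "B"): 200}))  # A: 450, B: 150
# 450 != 400 + 100 and 150 != 0 + 100, so Linearity fails.
```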
Ah, I was going off the given description of linearity which makes it pretty trivial to say "You can sum two days of payouts and call that the new value", looking up the proper specification I see it's actually about combining two separate games into one game and keeping the payouts the same. This distribution indeed lacks that property.
I'm just learning this, please forgive me if I'm misunderstanding. I'm calculating your example differently though:
Day 1: (200 + (400-200-0)/2) = 300 to A, (0 + (400-200-0)/2) = 100 to B
Day 2: (100 + (200-100-100)/2) = 100 to A, (100 + (200-100-100)/2) = 100 to B
Day 1+2: (300 + (600-300-100)/2) = 400 to A, (100 + (600-300-100)/2) = 200 to B
300+100 does equal 400, 100+100 does equal 200
Sum of parts does equal the combined?
Doh! Thanks for the clarification. I see I misunderstood: you were targeting Ninety-Three's proposal about locking in a "more work" ratio.
For me, locking in the ratio of solo profits intuitively feels unfair, and would not be a deal I'd agree to. Translating feeling to words, my personally-intuitive Alice (A) and Bethany (B) story would go:
Alice is a trained watchmaker, Bethany makes robots. They both go into the business of watch-making.
Alone, Alice pulls in $10,000/day. Expensive watches, but very slow to make.
Alone, Bethany pulls in a mere $150/day. Cheapo ones, but she can produce tons!
Together, with Alice's expertise + Bethany's robot automation, they make $150,000/day!
Alone, neither is able to compensate for their weakness: Alice's is production speed, Bethany's is quality. The magic of their synergy comes from each one's weakness being offset by the other's strength. The added value is entirely separate from the ratio of their solo efforts; it simply does not exist unless they partner up. Hence, I must treat it separately, and that difference between the total value and the sum of their individual efforts rightfully should be divided equally.
The value/cost of doing business w/ others, perhaps?
Emma, Liam | $20 + $30 + $40 | $90
I think the emma/liam and you/emma rows are switched in the synergy table
Playing around with the math, it looks like Shapley Values are also cartel-independent, which was a bit of a surprise to me given my prior informal understanding. Consider a lemonade stand where Alice (A) has the only lemonade recipe and Bob (B1) and Bert (B2) have the only lemon trees. Let's suppose that the following coalitions all make $100 (all others make $0):

A + B1
A + B2
A + B1 + B2
Then the Shapley division is:

Alice: $50
Bob: $25
Bert: $25
If Bob and Bert form a cartel/union/merger and split the profits then the fair division is the same.
Previously I was expecting that if there are a large number of Bs and they don't coordinate, then Alice would get a higher proportion of the profits, which is what we see in real life. This also seems to be the instinct of others (example).
I think I'm still missing something, not sure what.
If B1 and B2 structure their cartel such that each of them gets a veto over the other, then the synergies change so that A+B1 and A+B2 both generate nothing, and you need A+B1+B2 to make the $100, which means B1 and B2 each now have a Shapley value of $33.3 (up from $25).
Also, I wouldn't describe the original Shapley Values as "no coordination". With no coordination, there's no reason the end result should involve paying any non-zero amount to both B1 and B2, since you only need one of them to assent. I think Shapley Values represent a situation that's more like "everyone (including Alice) coordinates".
They definitely aren't Cartel independent! Let's take your example, and imagine that our "cartel" is Alice and Bob, forming a combined coalition player ("AlicoBob, sitting in a tree, k-i-s-s-i-n-g")
AlicoBob by themself can make $100. Bert by himself can make $0. AlicoBob + Bert can make $100. The synergy is $0, so the Shapley value is that AlicoBob gets everything and Bert goes home sad.
However, Alice could've also formed a cartel with Bert, and then Bob would go home sad. So there's an equilibrium thing going on here, where both Bob and Bert want to be Alice's sole partner and leave the other one out. What I expect would happen in real life is that if one of the two got there first, they would naturally form a coalition and demand more concessions from the excluded party, while Alice would then also demand more concessions from Bob, because she can threaten to go and collude with Bert instead.
This is basically the "stand alone core" property that is sometimes logically impossible to satisfy, so I guess it's not too sad that the Shapley value doesn't live up to it.
A huge thanks to @Agustín Covarrubias 🔸 for his feedback and support on the following article:
Shapley values are an extremely popular tool in both economics and explainable AI.
In this article, we use the concept of “synergy” to build intuition for why Shapley values are fair. Shapley values are uniquely determined by four properties, and all of them can be justified visually. Let’s dive in!
The Game
On a sunny summer day, you and your two best friends decide to run a lemonade stand. Everyone contributes something special: Emma shares her family’s secret recipe, Liam finds premium-quality sugar, and you draw colorful posters.
The stand is a big hit! The group ends up making $280. But how best to split the profits? Each person contributed in a different way, and the success was clearly due to teamwork…
Luckily, Emma has a time machine. She goes back in time — redoing the day with different combinations of team members and recording the profits. This is how each simulation went:

| Team | Profit |
| --- | --- |
| Emma | $20 |
| You | $30 |
| Liam | $10 |
| Emma, You | $90 |
| You, Liam | $100 |
| Emma, Liam | $30 |
| Emma, You, Liam | $280 |
Individually, Emma makes 20 dollars running the lemonade stand, and you make 30 dollars. But working together, the team makes 90 dollars.
The sum of individual profits is 20 + 30 = 50 dollars, which is clearly less than 90 dollars. That extra 90 - 50 = 40 dollars can be attributed to team dynamics. In game theory, this bonus is called the “synergy” of you and Emma. Let’s visualize our scenario as a Venn diagram.
The synergy bonuses in the Venn diagram are “unlocked” when the intersecting people are part of the team. To calculate total profit, we add up all areas relevant to that team.
For example, when the team consists of just you and Liam, three portions of the Venn diagram are unlocked: the area exclusive to you (30 dollars), the area exclusive to Liam (10 dollars), and the area exclusively shared by you and Liam (60 dollars). Adding these areas together, the total profit for team “You and Liam” comes out to 30 + 10 + 60 = 100 dollars.
Referring to our Venn diagram, the same formula holds true for every other team:

| Team | Synergies unlocked | Total profit |
| --- | --- | --- |
| Emma | $20 | $20 |
| You | $30 | $30 |
| Liam | $10 | $10 |
| Emma, You | $20 + $30 + $40 | $90 |
| You, Liam | $30 + $10 + $60 | $100 |
| Emma, Liam | $20 + $10 + $0 | $30 |
| Emma, You, Liam | $20 + $30 + $10 + $40 + $60 + $0 + $120 | $280 |
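For the programmers, this “unlocking” rule is one line of code: sum every synergy region that is a subset of the team. A minimal Python sketch (with our synergy numbers hard-coded):

```python
# Synergy regions of the Venn diagram and their dollar values.
w = {
    frozenset({"Emma"}): 20, frozenset({"You"}): 30, frozenset({"Liam"}): 10,
    frozenset({"Emma", "You"}): 40, frozenset({"You", "Liam"}): 60,
    frozenset({"Emma", "Liam"}): 0, frozenset({"Emma", "You", "Liam"}): 120,
}

def team_profit(team, w):
    """Add up every synergy bonus unlocked by the team."""
    return sum(bonus for region, bonus in w.items() if region <= frozenset(team))

print(team_profit({"You", "Liam"}, w))           # 30 + 10 + 60 = 100
print(team_profit({"Emma", "You", "Liam"}, w))   # 280
```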
Emma and Liam are impatient and want their fair share of money. They turn to you, the quick-witted leader, for help. While staring at the Venn diagram, an idea strikes!
Take a moment to look over the visual. How would you slice up the Venn diagram fairly? Pause here, and continue when ready.
You decide to take each “synergy bonus” and cut it evenly among those involved.
Doing the math, each person’s share comes out to:
$$\begin{aligned}
\text{Emma's share} &= 20 + \tfrac{1}{2}\cdot 40 + \tfrac{1}{2}\cdot 0 + \tfrac{1}{3}\cdot 120 = 80 \\
\text{Your share} &= 30 + \tfrac{1}{2}\cdot 40 + \tfrac{1}{2}\cdot 60 + \tfrac{1}{3}\cdot 120 = 120 \\
\text{Liam's share} &= 10 + \tfrac{1}{2}\cdot 60 + \tfrac{1}{2}\cdot 0 + \tfrac{1}{3}\cdot 120 = 80
\end{aligned}$$

Emma and Liam agree the splits are fair. The money is handed out, and everyone skips happily home to dinner.
In this story, the final payouts are the Shapley values of each team member. This intuition is all you need to understand Shapley values. For the adventurous reader, we now tie things back to formal game theory.
The Formalities
Shapley values are a concept from cooperative game theory. You, Liam, and Emma are all considered “players” in a “coalition game”. Every possible “coalition” (or team) has a certain “payoff” (or profit). The mapping between coalition and payoff (which just corresponds to our first table of profits) is called the “characteristic function” (as it defines the nature, or *character*, of the game).
We define a set of players N (which, in this case, is You, Emma, and Liam), and a characteristic function v(S), where S⊆N:
$$\begin{aligned}
v(\{\text{Emma}\}) &= 20 \\
v(\{\text{You}\}) &= 30 \\
v(\{\text{Liam}\}) &= 10 \\
v(\{\text{Emma},\text{You}\}) &= 90 \\
v(\{\text{You},\text{Liam}\}) &= 100 \\
v(\{\text{Liam},\text{Emma}\}) &= 30 \\
v(\{\text{Emma},\text{You},\text{Liam}\}) &= 280
\end{aligned}$$

We can see how this is the same mapping we had in our table of profits by players:
We also define a synergy function labeled w(S) where S⊆N:
$$\begin{aligned}
w(\{\text{Emma}\}) &= 20 \\
w(\{\text{You}\}) &= 30 \\
w(\{\text{Liam}\}) &= 10 \\
w(\{\text{Emma},\text{You}\}) &= 40 \\
w(\{\text{You},\text{Liam}\}) &= 60 \\
w(\{\text{Liam},\text{Emma}\}) &= 0 \\
w(\{\text{Emma},\text{You},\text{Liam}\}) &= 120
\end{aligned}$$

Similarly, the synergy function just corresponds to areas of the Venn diagram:
Thus, for a given player i, the Shapley value is written as:
$$\sum_{\substack{\text{all groups } S \\ \text{including } i}} \frac{\text{synergy of group } S}{\text{number of players in } S}$$

Which, in more compact notation, becomes:

$$\sum_{\substack{\text{all groups } S \\ \text{including } i}} \frac{w(S)}{\text{number of players in } S} \;=\; \sum_{\substack{\text{all groups } S \\ \text{including } i}} \frac{w(S)}{|S|} \;=\; \sum_{i \in S \subseteq N} \frac{w(S)}{|S|}$$

The last is exactly the formula described on Wikipedia.
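The formula is short enough to run directly. A minimal Python sketch using the synergy values from the story:

```python
# Synergy regions of the Venn diagram and their dollar values.
w = {
    frozenset({"Emma"}): 20, frozenset({"You"}): 30, frozenset({"Liam"}): 10,
    frozenset({"Emma", "You"}): 40, frozenset({"You", "Liam"}): 60,
    frozenset({"Emma", "Liam"}): 0, frozenset({"Emma", "You", "Liam"}): 120,
}

# phi_i = sum over all groups S containing i of w(S) / |S|
for player in ("Emma", "You", "Liam"):
    share = sum(bonus / len(S) for S, bonus in w.items() if player in S)
    print(player, share)
# Emma 80.0, You 120.0, Liam 80.0
```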
Concluding Notes
Shapley values are the ONLY way to guarantee:

- Efficiency — The sum of Shapley values adds up to the total payoff for the full group (in our case, $280).
- Symmetry — If two players interact identically with the rest of the group, their Shapley values are equal.
- Linearity — If the group runs a lemonade stand on two different days (with different team dynamics on each day), a player’s Shapley value is the sum of their payouts from each day.
- Null player — If a player contributes nothing on their own and never affects group dynamics, their Shapley value is 0.
Take a moment to justify these properties visually.
No matter what game you play and who you play with, Shapley values always preserve these natural properties of “fairness”.
Hopefully, you have gained some intuition for why Shapley values are “fair” and why they account for interactions among players. Proofs and more rigorous definitions can be found on Wikipedia.
Thanks for reading! :)