This is a supplemental post to Geometric Utilitarianism (And Why It Matters), in which I show how I derived the weights ψ which make any Pareto optimal point p optimal according to the geometric weighted average. This is a subproblem of the proof laid out in the first post of this sequence, and the main post describes why that problem is interesting.
Overview
So how are we going to calculate weights ψ which make p optimal among F?
The idea here is to identify the Harsanyi hyperplane H, which contains all of the joint utilities $u\in\mathbb{R}^n$ which satisfy H(u,ϕ)=H(p,ϕ), where ϕ are the weights which make our chosen point $p\in\mathbb{R}^n$ optimal with respect to H(_,ϕ). We're then going to calculate new weights ψ which make p optimal with respect to G(_,ψ). It turns out it's sufficient to make p optimal among H: p will then also be optimal across our entire feasible set F.
In terms of calculus, we're going to be constructing a function G∘H, which tells us about how moving around on H changes G. And we're going to choose weights ψ which make the gradient ∇v(G∘H) equal 0 at p. This makes it a local optimum, and it will turn out to be a global maximum across H, which in turn will make it a global maximum across F.
Geometrically, we can think of that as the surface gradient of G across H. And so in terms of the overall gradient ∇uG, we're designing ψ so that ∇uG is perpendicular to H at p.
Parameterizing the Harsanyi Hyperplane
When thinking about moving around on the Harsanyi hyperplane H, we have a linear constraint that says no matter which u∈H we pick, we know that u⋅ϕ=p⋅ϕ=H(p,ϕ). If we know u lies on H, we can calculate the n-th agent's utility $u_n$ from the first n−1 utilities. We'll be referring to these first n−1 utilities a lot, so let's call them $v\in\mathbb{R}^{n-1}$, so that $v_i=u_i$ for all $i<n$.
H and G are both symmetrical with respect to shuffling the indices of agents around, so without loss of generality we'll assume that the n-th agent is one we're assigning positive Harsanyi weight to: ϕn > 0. This is necessary for the reconstruction to work for all ϕ.
So we can think of H as a function $H:\mathbb{R}^{n-1}\to\mathbb{R}^n$, where the j-th output is $H_j(v)=u_j$ for $j<n$. We can use the n-th output to reconstruct $u_n$ given $v$ like this: $H_n(v)=\frac{H(p,\phi)-\sum_{i=1}^{n-1}v_i\phi_i}{\phi_n}$. This lets us move around $\mathbb{R}^{n-1}$ to pick $v$ however we want, and the function H will map that to its image, helpfully also called H!
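Here's a minimal sketch of this parameterization in Python (not from the original post; the function name and example numbers are my own), which the later checks will reuse:

```python
import numpy as np

def harsanyi_point(v, p, phi):
    """Map v in R^(n-1) to the point on the Harsanyi hyperplane
    H = {u : u . phi = p . phi}, assuming phi[-1] > 0 so the last
    coordinate can be reconstructed. (Sketch, not a canonical implementation.)"""
    v, p, phi = np.asarray(v, float), np.asarray(p, float), np.asarray(phi, float)
    # Solve v . phi[:-1] + u_n * phi[-1] = p . phi for u_n:
    u_n = (p @ phi - v @ phi[:-1]) / phi[-1]
    return np.append(v, u_n)
```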
Alright, now we have H:Rn−1→Rn and we also have G:Rn×[0,1]n→R, which is the geometric weighted average whose gradient we're trying to design through our choice of ψ. So let's compose them together to form G∘H:Rn−1×[0,1]n→R. And since we want p to be an optimum of G across the hyperplane H, we can set the gradient ∇(G∘H)(q,ψ)=0, where q∈Rn−1 are the first n−1 utilities of our target joint utility p. Solving this equation for ψ will give us the weights we need!
This amounts to solving a family of n−1 equations $\frac{\partial(G\circ H)}{\partial v_i}(q,\psi)=0$, where we're holding the weights constant for the purposes of differentiation, but we'll be solving for the weights that make the derivative 0 at p.
How Does G Change as We Change These Parameters?
Ok, we've built up a few layers of abstraction, so let's start unpacking. By the chain rule, and using the notation that Hj is the j-th element of the output of H:
$$\frac{\partial (G\circ H)}{\partial v_i}(v,\psi)=\sum_{j=1}^{n}\frac{\partial G}{\partial H_j}(H(v),\psi)\,\frac{\partial H_j}{\partial v_i}(v)$$
How does our point on H change as we change these parameters?
Let's start by computing ∂Hj∂vi(v).
For the first n−1 terms this is simple, because $H_j$ simply returns $v_j$. So $\frac{\partial H_j}{\partial v_i}$ is 1 when $i=j$, and 0 otherwise, which we can represent using the Kronecker delta $\delta^j_i$. And $\frac{\partial H_n}{\partial v_i}=-\frac{\phi_i}{\phi_n}$.
Geometrically, this is telling us about the slope of H. Note that:
$\frac{\partial H_n}{\partial v_i}$ is constant and doesn't depend on our choice of $v_i$
$\frac{\partial H_n}{\partial v_i}\leq 0$ (We can never increase agent n's utility by increasing another agent's utility. This is always true at the Pareto frontier.)
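If you want to sanity-check those slopes numerically, here's a quick finite-difference sketch (it assumes the harsanyi_point helper from the sketch above, and the example p and ϕ are arbitrary):

```python
# dH_n/dv_i should equal -phi_i/phi_n for every i < n.
p, phi = np.array([2.0, 1.0, 3.0]), np.array([0.2, 0.3, 0.5])
v0, eps = p[:-1].copy(), 1e-6
for i in range(len(v0)):
    dv = np.zeros_like(v0); dv[i] = eps
    slope = (harsanyi_point(v0 + dv, p, phi)[-1] - harsanyi_point(v0, p, phi)[-1]) / eps
    assert abs(slope - (-phi[i] / phi[-1])) < 1e-6
```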
How Does G Change as We Move Around on H?
We can start solving for ∂G∂Hj by substituting in the definition of G:
$$\frac{\partial G}{\partial H_j}=\frac{\partial}{\partial H_j}\left(\prod_{k=1}^{n}H_k^{\psi_k}\right)$$
From here we can apply the n-factor product rule:
$$\frac{\partial G}{\partial H_j}=\left(\prod_{k=1}^{n}H_k^{\psi_k}\right)\left(\sum_{k=1}^{n}\frac{\frac{\partial}{\partial H_j}\left(H_k^{\psi_k}\right)}{H_k^{\psi_k}}\right)$$
Thankfully, $\frac{\partial}{\partial H_j}\left(H_k^{\psi_k}\right)=0$ whenever $k\neq j$, leaving just $\frac{\partial}{\partial H_j}\left(H_j^{\psi_j}\right)=\psi_j H_j^{\psi_j-1}$. We can also notice $\prod_{k=1}^{n}H_k^{\psi_k}=G$, leaving us with the much nicer $\frac{\partial G}{\partial H_j}=G\,\frac{\psi_j H_j^{\psi_j-1}}{H_j^{\psi_j}}=G\,\frac{\psi_j}{H_j}=\frac{\psi_j}{H_j}G$.
It will be important later that this partial derivative is undefined when Hj=0, aka wherever any agent is receiving their least feasible utility.
Writing function arguments explicitly:
$$\frac{\partial G}{\partial H_j}(H(v),\psi)=\frac{\psi_j}{H_j(v)}\,G(H(v),\psi)$$
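As a quick sanity check, here's a small Python sketch comparing this formula against finite differences (the function G below is my own stand-in for the weighted geometric average, and the example numbers are arbitrary):

```python
# Check dG/dH_j = (psi_j / H_j) * G numerically.
def G(u, psi):
    """Weighted geometric average: the product of u_k ** psi_k."""
    return float(np.prod(np.asarray(u, float) ** np.asarray(psi, float)))

u, psi, eps = np.array([2.0, 1.0, 3.0]), np.array([0.2, 0.3, 0.5]), 1e-6
for j in range(len(u)):
    du = np.zeros_like(u); du[j] = eps
    numeric = (G(u + du, psi) - G(u, psi)) / eps
    analytic = psi[j] / u[j] * G(u, psi)
    assert abs(numeric - analytic) < 1e-4
```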
Putting These Terms Together
Let's start putting these together. We can start by breaking apart the two cases of ∂Hj∂vi, like this:
$$\frac{\partial (G\circ H)}{\partial v_i}(v,\psi)=\sum_{j=1}^{n}\frac{\partial G}{\partial H_j}(H(v),\psi)\,\frac{\partial H_j}{\partial v_i}(v)$$
$$\frac{\partial (G\circ H)}{\partial v_i}(v,\psi)=\left(\sum_{j=1}^{n-1}\frac{\partial G}{\partial H_j}(H(v),\psi)\,\frac{\partial H_j}{\partial v_i}(v)\right)+\frac{\partial G}{\partial H_n}(H(v),\psi)\,\frac{\partial H_n}{\partial v_i}(v)$$
$$\frac{\partial (G\circ H)}{\partial v_i}(v,\psi)=\left(\sum_{j=1}^{n-1}\frac{\partial G}{\partial H_j}(H(v),\psi)\,\delta^j_i\right)-\frac{\partial G}{\partial H_n}(H(v),\psi)\,\frac{\phi_i}{\phi_n}$$
Here's one reason why it's useful to know about the Kronecker delta δji: it filters out all but the i-th element of a sum: ∑jajδji=ai. When you're working in Einstein notation (which is great by the way), you just write it as ajδji=ai and you can think of the j's as "cancelling".
That leaves us with:
$$\frac{\partial (G\circ H)}{\partial v_i}(v,\psi)=\frac{\partial G}{\partial H_i}(H(v),\psi)-\frac{\partial G}{\partial H_n}(H(v),\psi)\,\frac{\phi_i}{\phi_n}$$
And we know ∂G∂Hi(H(v),ψ), so let's plug that in:
$$\frac{\partial (G\circ H)}{\partial v_i}(v,\psi)=\frac{\psi_i}{H_i(v)}\,G(H(v),\psi)-\frac{\psi_n}{H_n(v)}\,G(H(v),\psi)\,\frac{\phi_i}{\phi_n}$$
$$\frac{\partial (G\circ H)}{\partial v_i}(v,\psi)=\left(\frac{\psi_i}{H_i(v)}-\frac{\psi_n\,\phi_i}{H_n(v)\,\phi_n}\right)G(H(v),\psi)$$
And that is the family of n−1 equations that we want to all be 0 when v=q. (This causes H(v)=u=p.) We'll call this gradient ∇v(G∘H) to remind ourselves that this is the gradient of (G∘H)(v,ψ) where we're holding the weights ψ constant.
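Here's a rough Python sketch of this gradient (reusing the harsanyi_point and G helpers from the earlier sketches; the example numbers are mine), spot-checked against finite differences at an arbitrary interior point:

```python
# The analytic gradient of (G o H) with respect to v, following the formula above.
def grad_GH(v, p, phi, psi):
    u = harsanyi_point(v, p, phi)
    g = G(u, psi)
    return np.array([(psi[i] / u[i] - psi[-1] * phi[i] / (u[-1] * phi[-1])) * g
                     for i in range(len(v))])

# Spot-check against finite differences:
p, phi, psi = np.array([2.0, 1.0, 3.0]), np.array([0.2, 0.3, 0.5]), np.array([0.3, 0.3, 0.4])
v0, eps = np.array([1.5, 1.2]), 1e-6
for i in range(len(v0)):
    dv = np.zeros_like(v0); dv[i] = eps
    numeric = (G(harsanyi_point(v0 + dv, p, phi), psi) - G(harsanyi_point(v0, p, phi), psi)) / eps
    assert abs(numeric - grad_GH(v0, p, phi, psi)[i]) < 1e-4
```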
Solving for the Geometric Weights
Ok, now we can set v=q, ∂(G∘H)∂vi=0 and solve for ψi, for i<n:
$$\left(\frac{\psi_i}{H_i(q)}-\frac{\psi_n\,\phi_i}{H_n(q)\,\phi_n}\right)G(H(q),\psi)=0$$
$$\frac{\psi_i}{H_i(q)}\,G(H(q),\psi)-\frac{\psi_n\,\phi_i}{H_n(q)\,\phi_n}\,G(H(q),\psi)=0$$
$$\frac{\psi_i}{H_i(q)}\,G(H(q),\psi)=\frac{\psi_n\,\phi_i}{H_n(q)\,\phi_n}\,G(H(q),\psi)$$
$$\frac{\psi_i}{p_i}\,G(p,\psi)=\frac{\psi_n\,\phi_i}{p_n\,\phi_n}\,G(p,\psi)$$
$$\frac{\psi_i}{p_i}=\frac{\psi_n\,\phi_i}{p_n\,\phi_n}$$
$$\psi_i=\frac{\psi_n\,p_i\,\phi_i}{p_n\,\phi_n}$$
This is still a system of linear equations we need to solve, since each ψi for i<n depends on ψn, which in turn satisfies ψn=1−∑n−1i=1ψi. So let's solve it for ψn!
$$\psi_n=1-\sum_{i=1}^{n-1}\psi_i$$
$$\psi_n=1-\sum_{i=1}^{n-1}\frac{\psi_n\,p_i\,\phi_i}{p_n\,\phi_n}$$
$$\psi_n=1-\psi_n\sum_{i=1}^{n-1}\frac{p_i\,\phi_i}{p_n\,\phi_n}$$
$$\psi_n+\psi_n\sum_{i=1}^{n-1}\frac{p_i\,\phi_i}{p_n\,\phi_n}=1$$
$$\psi_n\left(1+\sum_{i=1}^{n-1}\frac{p_i\,\phi_i}{p_n\,\phi_n}\right)=1$$
$$\psi_n=\frac{1}{1+\sum_{i=1}^{n-1}\frac{p_i\,\phi_i}{p_n\,\phi_n}}$$
Remembering that $H(p,\phi)=\sum_{i=1}^{n}p_i\phi_i=\sum_{i=1}^{n-1}p_i\phi_i+p_n\phi_n$, we can notice that:
$$\frac{H(p,\phi)}{p_n\,\phi_n}=\frac{\sum_{i=1}^{n-1}p_i\phi_i+p_n\phi_n}{p_n\,\phi_n}$$
$$\frac{H(p,\phi)}{p_n\,\phi_n}=\sum_{i=1}^{n-1}\frac{p_i\,\phi_i}{p_n\,\phi_n}+1$$
This lets us simplify ψn down to
$$\psi_n=\frac{1}{\left(\frac{H(p,\phi)}{p_n\,\phi_n}\right)}$$
$$\psi_n=\frac{p_n\,\phi_n}{H(p,\phi)}$$
$$\psi_n=\frac{\phi_n\,p_n}{H(p,\phi)}$$
And now we can plug that back into the formula for all the other ψi!
$$\psi_i=\frac{\phi_n\,p_n}{H(p,\phi)}\cdot\frac{p_i\,\phi_i}{p_n\,\phi_n}$$
$$\psi_i=\frac{p_i\,\phi_i}{H(p,\phi)}$$
$$\psi_i=\frac{\phi_i\,p_i}{H(p,\phi)}$$
Well isn't that convenient! The formula for all ψi has the same form, and we can think of it like starting with the Harsanyi weights ϕ (which make p optimal according to H(_,ϕ), along with anything else with the same Harsanyi score H(p,ϕ)), and then tweaking them to get G(_,ψ) to target p in particular.
We can simplify our formula by noting that $H(p,\phi)=p\cdot\phi=\sum_{j=1}^{n}p_j\phi_j$:
$$\psi_i=\frac{p_i\,\phi_i}{p\cdot\phi}$$
To make the formula a little prettier, and to get some extra geometric insight, we can introduce the element-wise product ⊙, where (p⊙ϕ)i=piϕi.
$$\psi=\frac{p\odot\phi}{p\cdot\phi}$$
Here's a good opportunity to make sure our weights ψ sum up to 1:
$$\sum_{i=1}^{n}\psi_i=\sum_{i=1}^{n}\frac{p_i\,\phi_i}{p\cdot\phi}$$
$$\sum_{i=1}^{n}\psi_i=\frac{1}{p\cdot\phi}\sum_{i=1}^{n}p_i\,\phi_i$$
$$\sum_{i=1}^{n}\psi_i=\frac{1}{p\cdot\phi}\,(p\cdot\phi)$$
$$\sum_{i=1}^{n}\psi_i=1$$
Great! p⋅ϕ is acting like a normalization term, and we can think of p⊙ϕ as telling us which direction ψ points in. This vector of weights is then scaled to land on the hypersurface of weights that sum to 1, known as the standard simplex Δn, which we'll discuss more later.
We can also think of ϕ as a function ϕ:Rn×Cn→Rn denoted as ϕ(p,F) which returns the Harsanyi weights ϕ for p in the context of a compact, convex subset F⊂Rn. This is it, so let's make a new heading to find it later!
How to Calculate Weights for p
We now have a formula for $\psi:\mathbb{R}^n\times[0,1]^n\to[0,1]^n$, which we can write as
$$\psi(p,\phi(p,F))=\frac{p\odot\phi(p,F)}{p\cdot\phi(p,F)}$$
Or we can suppress function arguments and simply write
$$\psi=\frac{p\odot\phi}{p\cdot\phi}$$
Where $p\odot\phi\in\mathbb{R}^n$ is the element-wise product of p and ϕ, $(p\odot\phi)_i=p_i\phi_i$, and $p\cdot\phi\in\mathbb{R}$ is the dot product $p\cdot\phi=\sum_{j=1}^{n}p_j\phi_j$.
For a single component ψi, we have
$$\psi_i=\frac{p_i\,\phi_i}{p\cdot\phi}$$
Note that ψ isn't defined when p⋅ϕ=0.
Is this a problem? Not really! $p\cdot\phi=H(p,\phi)$, the Harsanyi aggregate utility of p when ϕ has been chosen to make p optimal under H(_,ϕ). When this is 0, it means the individual utilities must all be 0 and the entire feasible set F must be a single point at the origin. When that happens, any weights will make p optimal according to G(_,ψ) or H(_,ϕ). Feel free to use any convention that works for your application: if we're in a context where ϕ(0) is defined, we can inherit ψ(0)=ϕ(0), and if F is shrinking towards becoming a single point, we can use $\psi(0)=\lim_{p\to 0}\psi(p)$.
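As a sketch, the whole formula fits in a few lines of Python (the name geometric_weights is my own, and the convention used at p⋅ϕ=0 is just one of the options above):

```python
def geometric_weights(p, phi):
    """psi = (p (element-wise product) phi) / (p . phi)."""
    p, phi = np.asarray(p, float), np.asarray(phi, float)
    denom = p @ phi
    if denom == 0.0:          # F is a single point at the origin; any weights work,
        return phi.copy()     # so here we just inherit the Harsanyi weights.
    return p * phi / denom

print(geometric_weights([2.0, 1.0, 3.0], [0.2, 0.3, 0.5]))  # ~[0.182, 0.136, 0.682], sums to 1
```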
Checking Our Solution
Assuming we calculated ∇v(G∘H)(q,ψ) correctly, we can verify that these weights lead to ∇v(G∘H)(q,ψ)=0. This requires ∂(G∘H)∂vi(q,ψ)=0 for the first n−1 utilities, so let's check that:
$$\frac{\partial (G\circ H)}{\partial v_i}(q,\psi)=0$$
$$\left(\frac{\psi_i}{H_i(q)}-\frac{\psi_n\,\phi_i}{H_n(q)\,\phi_n}\right)G(H(q),\psi)=0$$
$$\left(\frac{\psi_i}{p_i}-\frac{\psi_n\,\phi_i}{p_n\,\phi_n}\right)G(p,\psi)=0$$
$$\frac{\psi_i}{p_i}\,G(p,\psi)=\frac{\psi_n\,\phi_i}{p_n\,\phi_n}\,G(p,\psi)$$
$$\frac{1}{p_i}\,\frac{p_i\,\phi_i}{p\cdot\phi}\,G(p,\psi)=\frac{\phi_i}{p_n\,\phi_n}\,\frac{p_n\,\phi_n}{p\cdot\phi}\,G(p,\psi)$$
$$\frac{\phi_i}{p\cdot\phi}\,G(p,\psi)=\frac{\phi_i\,\phi_n}{\phi_n\,(p\cdot\phi)}\,G(p,\psi)$$
$$\frac{\phi_i}{p\cdot\phi}\,G(p,\psi)=\frac{\phi_i}{p\cdot\phi}\,G(p,\psi)$$
Success! p is an optimum of G among H. But is it unique?
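Before tackling uniqueness, we can also confirm this numerically (reusing the earlier sketches and their example numbers): the gradient along H vanishes at q, and sliding away from q along H only lowers G.

```python
p, phi = np.array([2.0, 1.0, 3.0]), np.array([0.2, 0.3, 0.5])
psi = geometric_weights(p, phi)
q = p[:-1]
print(grad_GH(q, p, phi, psi))  # ~[0, 0]

# Moving away from q along the hyperplane strictly decreases G:
for dv in ([0.1, 0.0], [0.0, 0.1], [-0.1, 0.05]):
    assert G(harsanyi_point(q + np.array(dv), p, phi), psi) < G(p, psi)
```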
P Is the Unique Optimum of G When Weights Are Positive
Let's see how ψ and p influenced the outcome here, and keep track of the critical points which can make $\nabla_v(G\circ H)_i=0$ or undefined. These are the only points which can be extrema, and for each we need to check whether it's a minimum or a maximum among H. (G doesn't have any saddle points, and H doesn't have any boundaries of its own to worry about. Where H meets the boundary of G's domain, the $u_i=0$ axes, G is 0.)
For example, whenever any individual utility ui=0, ∂G∂Hi is undefined, which causes ∇v(G∘H)i to be undefined. But note that these will be minimal points of G, unless ψi=0. To find maximal points u∈H of G across H we need
$$\frac{\psi_i}{u_i}-\frac{\psi_n\,\phi_i}{u_n\,\phi_n}=0$$
$$\frac{\psi_i\,u_n\,\phi_n}{u_i\,u_n\,\phi_n}-\frac{u_i\,\psi_n\,\phi_i}{u_i\,u_n\,\phi_n}=0$$
$$\frac{\psi_i\,u_n\,\phi_n-u_i\,\psi_n\,\phi_i}{u_i\,u_n\,\phi_n}=0$$
If ui or un are 0, then ∇v(G∘H)i is undefined, and we'll check later if these can still be optimal. We assumed that the index n refers to an agent with ϕn>0, in order to prevent that exact case from breaking our entire solution.
$$\psi_i\,u_n\,\phi_n-u_i\,\psi_n\,\phi_i=0$$
$$u_n\,\phi_n\,\psi_i-u_i\,\phi_i\,\psi_n=0$$
If we were handed ψ from some external source, we could solve this equation to see which u∈H happened to be optimal. But we designed ψ, so let's see what we caused to be optimal.
$$u_n\,\phi_n\,\frac{p_i\,\phi_i}{p\cdot\phi}-u_i\,\phi_i\,\frac{p_n\,\phi_n}{p\cdot\phi}=0$$
$$\frac{u_n\,\phi_n\,p_i\,\phi_i-u_i\,\phi_i\,p_n\,\phi_n}{p\cdot\phi}=0$$
If $p\cdot\phi=0$ then $\nabla_v(G\circ H)_i$ is undefined. This only happens when F is a single point, in which case p is indeed the unique optimum of G.
$$u_n\,\phi_n\,p_i\,\phi_i-u_i\,\phi_i\,p_n\,\phi_n=0$$
$$u_i\,p_n\,\phi_i\,\phi_n=u_n\,p_i\,\phi_i\,\phi_n$$
Here we're going to be careful about which weights can be 0. We'll again use the fact that ϕn>0 to safely divide it from both sides.
$$u_i\,p_n\,\phi_i=u_n\,p_i\,\phi_i$$
Here again we can see that u=p solves this family of n-1 equations. And this is very exciting because this is our first maximum of G! Are there any other solutions?
Each of these equations is satisfied when one of the following is true:
$\phi_i=0$
$u_i\,p_n=u_n\,p_i$
In other words, assigning an agent 0 Harsanyi weight ϕi (and thus geometric weight ψi) can allow G(_,ψ) to have multiple optima among H, which can give it multiple optima among F.
What about when all geometric weights ψ are positive? Are there any other solutions to that second family of n-1 equations?
Having all positive geometric weights ψ implies having all positive Harsanyi weights ϕ, and all positive individual utilities p. It also implies that any optimum of G(_,ψ) will have all positive individual utilities u. This lets us freely divide by any of these terms, without needing to worry that we might be dividing by 0.
$$\frac{u_i}{u_n}=\frac{p_i}{p_n}$$
Since un and pn are both positive, we can think of un as a scaled version of pn.
$$u_n=\lambda\,p_n$$
How does this scalar influence the other terms in these equations?
$$\frac{u_i}{\lambda\,p_n}=\frac{p_i}{p_n}$$
$$u_i=\lambda\,p_i$$
$$u=\lambda\,p$$
This is the line through the origin and p, which only intersects H at p. (Since scaling p up or down changes H(p,ϕ).) So when all geometric weights ψ are positive, p is the unique optimum of G(_,ψ) among H!
When ϕi=0, ψi is also 0, so ui doesn't affect G(_,ψ). We can start with p, and then freely vary the utilities of any agent with 0 weight and remain optimal.
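Here's a crude numerical version of that uniqueness claim (again reusing the earlier sketches and their example numbers): sample random points on H and confirm that none of them score higher than p when all the weights are positive.

```python
rng = np.random.default_rng(0)
p, phi = np.array([2.0, 1.0, 3.0]), np.array([0.2, 0.3, 0.5])
psi = geometric_weights(p, phi)
best = G(p, psi)
for _ in range(10_000):
    v = rng.uniform(0.01, 4.0, size=2)   # an arbitrary window of the v-plane
    u = harsanyi_point(v, p, phi)
    if np.all(u > 0):                    # stay inside G's domain of non-negative utilities
        assert G(u, psi) <= best + 1e-12
```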
Interactive Implementations
We can also check our entire calculation, including those pages of calculus, by actually implementing our solution and seeing if it works! A graphing calculator is sufficient to check this in 2 and 3 dimensions. We can show all the points which satisfy G(s,ψ)=G(p,ψ) and they should trace out the contours of G, showing all the joint utilities which have the same G score as p.
In 2 dimensions, the graph looks like this:
The Harsanyi hyperplane is a line, and the contour curves are skewed hyperbolas.
As expected, taking p out of the positive quadrant violates our assumption that utilities are non-negative, leading to invalid settings for ψ. Similarly, if H has a positive slope, this violates our assumption that p is on the Pareto frontier. (A positive slope implies that we can make both players better off simultaneously. If we calculate ϕ anyway, a positive slope implies that ϕi is negative for the agent on the x axis.) This allows H to pass up through the hyperbola at a point other than p, but this never happens when ϕi≥0.
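If you'd rather script it than use a graphing calculator, here's a rough matplotlib sketch of the same 2-agent picture (the example p and ϕ are arbitrary, and it assumes the helpers from the earlier sketches):

```python
import matplotlib.pyplot as plt

# Two agents: plot the contour of G through p and the Harsanyi line H.
p, phi = np.array([2.0, 1.0]), np.array([0.3, 0.7])
psi = geometric_weights(p, phi)

xs = np.linspace(0.01, 4, 400)
X, Y = np.meshgrid(xs, xs)
plt.contour(X, Y, X**psi[0] * Y**psi[1], levels=[G(p, psi)])  # the contour through p
plt.plot(xs, (p @ phi - phi[0] * xs) / phi[1])                # the Harsanyi line H
plt.scatter(*p)                                               # p, where they touch
plt.gca().set_aspect("equal"); plt.xlim(0, 4); plt.ylim(0, 4)
plt.show()
```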
With 3 agents, the graph looks like this:
In 3 dimensions, the Harsanyi hyperplane is a plane, and the contour surfaces are skewed hyperboloids.
We can move p around on the hyperplane, and this changes ψ, which changes where the contour touches H. We can see that p always lies at the intersection of this contour curve and H, and this is a visual proof that p maximizes G(_,ψ) among H. And when H corresponds to all agents having positive Harsanyi weight ϕ, this intersection only happens at p!