Edit: Moved to main at ThrustVectoring's suggestion.

A suggestion as to how to split the gains from trade in some situations.

The problem of Power

A year or so ago, people in the FHI embarked on a grand project: to try and find out if there was a single way of resolving negotiations, or a single way of merging competing moral theories. This project made a lot of progress in finding out how hard this was, but very little in terms of solving it. It seemed evident that the correct solution was to weigh the different utility functions, and then for everyone maximise the weighted sum, but all ways of weighting had their problems (the weighting with the most good properties was a very silly one: use the "min-max" weighting that sets your maximal attainable utility to 1 and your minimal to 0).
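
As a concrete illustration of that "min-max" weighting, here is a minimal sketch in Python; the outcomes and utility numbers are invented, not from the FHI project:

```python
# Sketch of the "min-max" weighting: rescale each player's utility so that
# their worst attainable outcome is worth 0 and their best is worth 1, then
# maximise the sum.  Outcomes and utility numbers here are invented.

def min_max_normalise(utility, outcomes):
    values = [utility[o] for o in outcomes]
    lo, hi = min(values), max(values)
    return {o: (utility[o] - lo) / (hi - lo) for o in outcomes}

outcomes = ["war", "truce", "alliance"]
alice = {"war": -10.0, "truce": 0.0, "alliance": 5.0}
bob = {"war": 2.0, "truce": 3.0, "alliance": 4.0}

alice_n = min_max_normalise(alice, outcomes)
bob_n = min_max_normalise(bob, outcomes)
print(max(outcomes, key=lambda o: alice_n[o] + bob_n[o]))  # -> alliance
```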

One thing that we didn't get close to addressing is the concept of power. If two partners in the negotiation have very different levels of power, then abstractly comparing their utilities seems the wrong solution (more to the point: it wouldn't be accepted by the powerful party).

The New Republic spans the Galaxy, with Jedi knights, battle fleets, armies, general coolness, and the manufacturing and human resources of countless systems at its command. The dull slug, ARthUrpHilIpDenu, moves very slowly around a plant, and possibly owns one leaf (or not - he can't produce the paperwork). Both these entities have preferences, but if they meet up, and their utilities are normalised abstractly, then ARthUrpHilIpDenu's preferences will weigh in far too much: a sizeable fraction of the galaxy's production will go towards satisfying the slug. Even if you think this is "fair", consider that the New Republic is the merging of countless individual preferences, so it doesn't make any sense that the two utilities get weighted equally.

The default point

After looking at various blackmail situations, it seems to me that it's the concept of default, or status quo, that most clearly differentiates between a threat and an offer. I wouldn't want you to make a credible threat, because this worsens the status quo; I would want you to make a credible offer, because this improves it. How this default is established is another matter - there may be some super-UDT approach that solves it from first principles. Maybe there is some deep way of distinguishing between threats and promises in some other way, and the default is simply the point between them.

In any case, without going any further into its meaning or derivation, I'm going to assume that the problem we're working on has a definitive default/disagreement/threat point. I'll use the default point terminology, as that is closer to the concept I'm considering.

Simple trade problems often have a very clear default point. These are my goods, those are your goods, the default is we go home with what we started with. This is what I could build, that's what you could build, the default is that we both build purely for ourselves.

If we imagine ARthUrpHilIpDenu and the New Republic were at opposite ends of a regulated wormhole, and they could only trade in safe and simple goods, then we've got a pretty clear default point.

Having a default point opens up a whole host of new bargaining equilibriums, such as the Nash Bargaining Solution (NBS) and the Kalai-Smorodinsky Bargaining Solution (KSBS). But neither of these are really quite what we'd want: the KSBS is all about fairness (which generally reduces expected outcomes), while the NBS uses a product of utility values, something that makes no intrinsic sense at all (NBS has some nice properties, like independence of irrelevant alternatives, but this only matters if the default point is reached through a process that has the same properties - and it can't be).

What am I really offering you in trade?

When two agents meet, especially if they are likely to meet more in the future (and most especially if they don't know the number of times and the circumstances in which they will meet), they should merge their utility functions: fix a common scale for their utility functions, add them together, and then both proceed to maximise the sum.

This explains what's really being offered in a trade. Not a few widgets or stars, but the possibility of copying your utility function into mine. But why would you want that? Because that will change my decisions in a direction you find more pleasing. So what I'm actually offering you is access to my decision points.

What is actually on offer in a trade, is access by one player's utility function to the other player's decision points.

This gives a novel way of normalising utility functions. How much, precisely, is access to my decision points worth to you? If the default point gives a natural zero, then complete control over the other player's decision points is a natural one. "Power" is a nebulous concept, and different players may disagree as to how much power they each have. But power can only be articulated through making decisions (if you can't change any of your decisions, you have no power), and this normalisation allows each player to specify exactly how much they value the power/decision points of the other. Outcomes that involve one player controlling the other player's decision points can be designated the "utopia" point for that first player. These are what would happen if everything went exactly according to what they wanted.
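
Here is a rough sketch of that normalisation in Python, under the simplifying assumption that a player's "utopia" point is just the feasible outcome they like best (what they would pick if they controlled both players' decision points); the toy resource-splitting numbers are mine:

```python
# Sketch of the Mutual Worth normalisation: the default point maps to 0 and
# the player's "utopia" (their favourite feasible outcome, i.e. what they'd
# pick with full control of both players' decisions) maps to 1; then pick the
# feasible outcome maximising the sum.  Assumes each player strictly prefers
# their utopia to the default.

def mwbs(outcomes, u1, u2, default):
    def normalise(u):
        d = u(default)
        utopia = max(u(o) for o in outcomes)         # value of controlling everything
        return lambda o: (u(o) - d) / (utopia - d)   # default -> 0, utopia -> 1
    n1, n2 = normalise(u1), normalise(u2)
    return max(outcomes, key=lambda o: n1(o) + n2(o))

# Toy usage: split 10 units between the players; the default is an even split.
splits = [(a, 10 - a) for a in range(11)]
u1 = lambda s: s[0]             # player 1 values every unit equally
u2 = lambda s: min(s[1], 8)     # player 2 is satiated after 8 units
print(mwbs(splits, u1, u2, default=(5, 5)))  # -> (2, 8)
```

Note that in this toy run the chosen split leaves player 1 below the default point; that possibility is discussed in the properties section below.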

What does this mean for ARthUrpHilIpDenu and the New Republic? Well, the New Republic stands to gain a leaf (maybe). From its perspective, the difference between default (all the resources of the galaxy and no leaf) and utopia (all the resources of the galaxy plus one leaf) is tiny. And yet that tiny difference will get normalised to one: the New Republic's utility function will get multiplied by a huge amount. It will weigh heavily in any sum.

What about ARthUrpHilIpDenu? It stands to gain the resources of a galaxy. The difference between default (a leaf) and utopia (all the resources of a galaxy dedicated to making leaves) is unimaginably humongous. And yet that huge difference will get normalised to one: the ARthUrpHilIpDenu's utility function will get divided by a huge amount. It will weigh very little in any sum.

Thus if we add the two normalised utility functions, we get one that is nearly totally dominated by the New Republic. Which is what we'd expect, given the power differential between the two. So this bargaining system reflects the relative power of the players. Another way of thinking of this is that each player's utility is normalised taking into account how much they would give up to control the other. I'm calling it the "Mutual Worth Bargaining Solution" (MWBS), as it's the worth to players of the other player's decision points that are key. Also because I couldn't think of a better title.
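
To put deliberately crude, made-up magnitudes on that scaling:

```python
# Deliberately crude, made-up magnitudes for the two normalisations above
# (raw utilities in arbitrary units, default at 0 for both players).
republic_default, republic_utopia = 0.0, 1e-30   # gaining the leaf is worth almost nothing
slug_default, slug_utopia = 0.0, 1e30            # gaining the galaxy is worth a vast amount

republic_scale = 1 / (republic_utopia - republic_default)   # 1e30: huge multiplier
slug_scale = 1 / (slug_utopia - slug_default)               # 1e-30: huge divisor

# An outcome that costs the Republic one leaf's worth of raw utility but gives
# the slug a millionth of the galaxy sums, after normalisation, to roughly:
print(-1e-30 * republic_scale + 1e24 * slug_scale)   # about -1 + 1e-6 < 0: never chosen over the default
```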

Properties of the Mutual Worth Bargaining Solution

How does the MWBS compare with the NBS and the KSBS? The NBS is quite different, because it has no concept of relative power, normalising purely by the players' preferences. Indeed, one player could have no control at all, no decision points, and the NBS would still be unchanged.

The KSBS is more similar to the MWBS: the utopia points of the KSBS are the same as those of the MWBS. If we set the default point to (0,0) and the utopia points to (1,-) and (-,1), then the KSBS is given by the highest h such that (h,h) is a possible outcome, whereas the MWBS is given by the outcome (x,y) such that x+y is the highest possible.
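
A small sketch of the difference, on an invented set of already-normalised joint outcomes (default at (0,0), each player's utopia value equal to 1); with a finite set of points, maximising min(x,y) stands in for the KSBS diagonal:

```python
# Invented, already-normalised joint outcomes (x, y): default (0, 0), and each
# player's utopia value is 1.  With a finite set of points, maximising
# min(x, y) stands in for "the highest h such that (h, h) is possible".
points = [(0.0, 0.0), (1.0, 0.1), (0.5, 0.6), (0.3, 0.9), (-0.2, 1.0)]

ksbs = max(points, key=lambda p: min(p))   # -> (0.5, 0.6): the most equal gain
mwbs = max(points, key=lambda p: sum(p))   # -> (0.3, 0.9): the largest total gain
print(ksbs, mwbs)
```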

Which is preferable? Obviously, if they knew exactly what the outcomes and utilities were on offer, then each player would have preferences as to which system to use (the one that gives them more). But if they didn't, if they had uncertainties as to what players and what preferences they would face in the future, then MWBS generally comes out on top (in expectation).

How so? Well, if a player doesn't know what other players they'll meet, they don't know in what way their decision points will be relevant to the other, and vice versa. They don't know what pieces of their utility will be relevant to the other, and vice versa. So they can expect to face a wide variety of normalised situations. To a first approximation, it isn't too bad an idea to assume that one is equally likely to face a certain situation as its symmetric complement. Using the KSBS, you'd expect to get a utility of h (same in both cases); under the MWBS, a utility of (x+y)/2 (x in one case, y in the other). Since x+y ≥ h+h = 2h by the definition of the MWBS, it comes out ahead in expectation.
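
That argument can be illustrated (not proved) numerically: pair each randomly generated feasible set with its mirror image and compare what player 1 collects under the two rules. The random-set construction below is an arbitrary choice of mine:

```python
import random

# Numerical illustration (not a proof) of the symmetry argument: for each
# randomly generated feasible set, also face its mirror image (x, y) -> (y, x),
# and total up what player 1 collects under each rule.
random.seed(0)

ksbs_rule = lambda p: min(p)   # finite-set stand-in for the KSBS
mwbs_rule = lambda p: sum(p)   # the MWBS

def player1_take(points, rule):
    return max(points, key=rule)[0]

ksbs_total = mwbs_total = 0.0
for _ in range(10_000):
    pts = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(5)]
    pts = [p for p in pts if sum(p) > 0] + [(0.0, 0.0)]   # the default is always feasible
    mirror = [(y, x) for (x, y) in pts]
    ksbs_total += player1_take(pts, ksbs_rule) + player1_take(mirror, ksbs_rule)
    mwbs_total += player1_take(pts, mwbs_rule) + player1_take(mirror, mwbs_rule)

print(mwbs_total >= ksbs_total)   # True: x + y >= 2h on every pair of mirrored sets
```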

Another important distinction is that while the KSBS and the NBS only allow Pareto improvements from the default point, the MWBS does allow for some situations where one player will lose from the deal. It is possible, for instance, that (1/2,-1/4) is a possible outcome (summed utility 1/4), and there are no better options possible. Doesn't this go against the spirit of the default point? Why would someone go into a deal that leaves them poorer than before?

First of all, that situation will be rare. All MWBS outcomes must be in the triangle bounded by x<1, y<1 and x+y>0. The first bounds are definitional: one cannot get more expected utility than one's utopia point. The last bound comes from the fact that the default point is itself an option, with summed utility 0+0=0, so all summed utilities must be above zero. Sprinkle a few random outcome points into that triangle, and it is very likely that the one with highest summed utility will be a Pareto improvement over (0,0).
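
A quick way to check that last claim is to sprinkle the points at random and count; the sampling choices below (five points per trial, uniform in the triangle) are arbitrary:

```python
import random

# Sprinkle a handful of points uniformly into the triangle x < 1, y < 1,
# x + y > 0 and see how often the max-sum point is a Pareto improvement over
# (0, 0).  Five points per trial is an arbitrary choice.
random.seed(1)

def sample_triangle():
    while True:
        x, y = random.uniform(-1, 1), random.uniform(-1, 1)
        if x + y > 0:
            return (x, y)

trials, pareto = 10_000, 0
for _ in range(trials):
    pts = [sample_triangle() for _ in range(5)]
    best = max(pts, key=sum)
    pareto += best[0] >= 0 and best[1] >= 0

print(pareto / trials)   # usually around 0.9 with five points, higher with more
```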

But the other reason to accept the risk of losing is the opportunity of gain. One could modify the MWBS to only allow Pareto improvements over the default: but in expectation, this would perform worse. The player would be immune from losing 1/4 utility from (1/2,-1/4), but unable to gain 1/2 from (-1/4,1/2): the argument is the same as above. In ignorance as to the other player's preferences, accepting the possibility of loss improves the expected outcome.

It should be noted that the maximum that a player could theoretically lose by using the MWBS is equal to the maximum they could theoretically win. So the New Republic could lose at most a leaf, meaning that even powerful players would not be reluctant to trade. For less powerful players, the potential losses are higher, but so are the potential rewards.

Directions of research

The MWBS is somewhat underdeveloped, and the explanation here isn't as clear as I'd have liked. However, Miriam and I are about to have a baby, so I'm not expecting to have any spare time soon; I'm pushing out the idea unpolished.

Some possible routes for further research: what are the other properties of MWBS? Are they properties that make MWBS feel more or less likely or acceptable? The NBS is characterised by certain properties: what are the properties that are necessary and sufficient for the MWBS (and can they suggest better Bargaining Solutions)? Can we replace the default point? Maybe we can get a zero by imagining what would happen if the second player's decision nodes were under the control of an anti-agent (an agent that's the opposite of the first player), or a randomly selected agent?

The most important research route is to analyse what happens if several players come together at different times, and repeatedly normalise their utilities using the MWBS: does the order in which they meet matter? I strongly feel that exploring this avenue will reach "the ultimate" bargaining solution, if such a thing is to be found. Some solution that is stable under large numbers of agents, who don't know each other or how many they are, coming together in an order they can't predict.

Comments

If I understand the proposal correctly, I think this bargaining solution can heavily favor someone with diminishing marginal utility vs someone with linear utility. For example suppose Alice and Bob each own 1 (infinitely divisible) unit of resource, and that's the default point. Alice values all resources linearly, with utility function A(a,b)=a, where a is resources consumed by Alice, b is resources consumed by Bob. Bob's utility function is B(a,b) = b if b<=1, else 1+(b-1)/9 if b<=1.9, else 1.1. Normalization causes Bob's utility function to be multiplied by 10, and the bargaining solution ends up giving Bob 1.9 units of resources. Correct?
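
A quick numeric check of this example (the allocation grid below is an arbitrary choice) confirms the claimed outcome:

```python
# Numeric check of the example above, on an arbitrary 0.01 grid of allocations
# (a, b) with a + b = 2, default (1, 1), and the utilities as defined.
def A(a, b):
    return a

def B(a, b):
    if b <= 1:
        return b
    if b <= 1.9:
        return 1 + (b - 1) / 9
    return 1.1

allocations = [(i / 100, 2 - i / 100) for i in range(201)]

def normalise(u, default):
    d = u(*default)
    utopia = max(u(*x) for x in allocations)
    return lambda x: (u(*x) - d) / (utopia - d)

nA = normalise(A, (1, 1))
nB = normalise(B, (1, 1))
print(max(allocations, key=lambda x: nA(x) + nB(x)))   # -> (0.1, 1.9): Bob ends up with 1.9 units
```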

Here's another strange property of the bargaining solution. (Again I would appreciate confirmation that I'm understanding it correctly.) Suppose Alice and Carol both have linear utility, and Alice owns a tiny bit more resources than Carol. Then Alice's utility function gets normalized to have a higher marginal utility than Carol's, and all the resources go to Alice. If Carol instead brought a tiny bit more resources to the table, then all the resources would go to Carol.

I find it puzzling that the post is getting upvoted so much with so little discussion. Is anyone else looking at whether the solution produces outcomes that make intuitive sense?

It's even worse. Suppose Alice and Carol are as you say, with Alice having x more resources than Carol. Before the trade, Alice is given the chance (by Omega) to benefit them both; if she accepts the offer, she will gain y resources, and Carol will gain y+2x. This is a Pareto improvement, and prima facie it seems she should do so.

But alas! If she does so, Carol will have x more resources than Alice does - so all the resources will go to Carol! So Alice definitely does not want to accept the offer - even though accepting Omega's offer could leave them both better off post-trade.

Ideally the default point would be set before Omega's offer. After all, that is a decision point of Alice's, that Carol would value being able to decide...

I'm certainly not looking at the solution; I have no idea how to work any of this maths, and am mostly skimming along with a bazillion other blogs anyway! I know I trust Lesswrong, so obviously if there's a confident-sounding post here I assume it's correct.

[/confession]

You are correct. But if both utilities are linear in resources, then the outcome space forms a line in the joint utility space. And when maximising a summed utility over a line segment, the generic outcome is to pick one end of the line or the other, never a midpoint.

This is a feature of any approach that maximises a summed utility: you (almost always) will pick the end point of any straight segment.

To me that suggests looking at approaches (like NBS) that don't involve directly maximizing summed utility. The solution could still guarantee a Pareto optimal outcome, which means it ends up being maximal for some summed utility function, but you're much more likely to end up in the middle of a line segment (which often seem to be the most intuitively acceptable outcome).

Why? If you can end up on either end of a long line segment, then you have a chance of winning a lot or losing a lot. But you shouldn't be risk averse with your utility - risk aversion should already be included. So "towards the middle" is no better in expectation than "right end or left end".

Maybe you're thinking we shouldn't be maximising expected utility? I'm actually quite sympathetic to that view...

And with complex real world valuations (eg anything with a diminishing marginal utility), then any Pareto line segments are likely to be short.

Nonlinear utility functions (as a function of resources) do not accurately model human risk aversion. That could imply that we should either change our (or their) risk aversion or not be maximising expected utility.

Nonlinear jumps in utility from different amounts of a resource seem common for humans at least at some points in time. Example: Either I have enough to pay off the loan shark, or he'll break my legs.

Yep. Humans are not expected utility maximisers. But there's strong arguments that an AI would be...

If you have a line segment that crosses the quadrant of joint utility space that represents Pareto improvements over the status quo, then ending up on an end means one of the parties is made worse off. To generalize this observation, it's hard to guarantee that no one is made worse off, unless the bargaining solution explicitly tries to do that. If you maximize summed utility, and your weights are not picked to ensure that the outcome is a Pareto improvement (which generally involves picking the Pareto improvement first and then working backwards to find the weights), then there will be situations where it makes one party worse off.

You talk about "in expectation" but I'm not sure that changes the picture. It seems like the same argument applies: you can't guarantee that nobody is made worse off in expectation, unless you explicitly try to.

You can restrict the MWBS to only consider strict Pareto improvements over default, if you want. That's another bargaining solution - call it PMWBS.

My (informal) argument was that in situations of uncertainty as to who you are facing, MWBS gives you a higher expected value than PMWBS (informally because you expect to gain more when the deal disadvantages your opponent, than you expect to lose when it disadvantages you). Since the expected value of PMWBS is evidently positive, that of MWBS must be too.

I think you may have to formalize this to figure out what you need to assume to make the argument work. Clearly MWBS doesn't always give positive expected value in situations of uncertainty as to who you are facing. For example suppose Alice expects to face either Carol or Dave, with 50/50 probability. Carol has slightly more resources than Alice, and Dave has almost no resources. All three have linear utility. Under MWBS, Alice now has 50% probability of losing everything and 50% probability of gaining a small amount.

I was trying to assume maximum ignorance - maximum uncertainty as to who you might meet, and their abilities and values.

If you have a better idea as to what you face, then you can start shopping around bargaining solutions to get the one you want. And in your example, Alice would certainly prefer KSBS and NBS over PMWBS, which she would prefer over MWBS.

But if, for instance, Dave had slightly fewer resources than Alice, then it's no longer true. And if any of them depart from (equal) linear utility in every single resource, then it's likely no longer true either.

Indeed. That is discussed in the last part of the "properties" section.

I think the argument is that this possible outcome is acceptable to Alice because she expects an equal chance of encountering trade opportunities where she benefits from the bargain.

I see a similarity between this risk and the Newcomb problem, but I'm not sure what additional assumptions this brings into the theory. What knowledge of your trading partner's decision mechanisms (source code) is necessary to commit to this agreement?

Seems correct. But high marginal utility in strategic places can work too. What if Alice had linear utility, up until 1.05, where she suddenly has utility 2? Then the split is Alice 1.05, Bob 0.95.

Diminishing marginal returns isn't so much the issue, rather it's the low utopia point (as a consequence of diminishing marginal returns).

I do not think that it is true that it is "very likely" that the solution will be net positive for both players. If players have a variety of marginal utilities from resources, it seems reasonable to expect that this will cause most 'negotiations' to result in pure redistribution, and there are many cases (such as Wei_Dai's second example) where one can simply lose all their resources.

It also seems like a very bad assumption for agents to assume that they'll be exposed to these situations symmetrically; most agents should be able to have a rough idea where they lie on the spectrum compared to their likely trading partners.

More than that, in a world where this was an enforced negotiating style, it seems that you have a dystopia where the best way to gain utility is do a combination of modifying your utility function such that you gain transfers of resources, and/or seeking out trading partners who will be forced to give you resources, and that such efforts will rapidly consume a growing share of the resources. That is certainly what happens when I game out a real world test, with Omega enforcing the rules!

I do not think that it is true that it is "very likely" that the solution will be net positive for both players.

In the triangle of possible outcomes, if any of the joint utility points lie in the sub-triangle where x+y>1 (which occupies a quarter of the space of possibilities), then a net loss for either player becomes impossible (and that's a sufficient, not necessary, condition for that).

But if you want, you can restrict to strict Pareto improvements over the default...

True, but points are decreasingly likely to be possible as they become more positive - it's relatively easy to find trades that are bad ideas, or that have really bad distributions (especially assuming multiple resources, which is presumably why you trade in the first place). They're also highly correlated: chances are either there are no such points available, or a lot of such points are available.

I've looked at it a number of ways, and in each case x+y>1 seems unlikely to exist in a given negotiation unless what is brought to the table is very narrowly defined and the gains from trade are very large relative to quantities traded.

What problem are we trying to solve? What are we trying to optimize?

(e.g.: What determines a better or worse outcome of the sum of these deals? Which agents have to agree to it, and what information are they assumed to have access to about the other agents? Which of that information do they have before agreeing to use this method in general? Is it imposed from on high? How much self-modification can agents make to achieve better deals? Etc, etc.)

I think this is worth putting on the main page, as opposed to discussion.

To a first approximation, it isn't too bad an idea to assume that one is equally likely to face a certain situation as its symmetric complement. Using the KSBS, you'd expect to get a utility of h (same in both cases); under the MWBS, a utility of (x+y)/2 (x in one case, y in the other). Since x+y ≥ h+h = 2h by the definition of the MWBS, it comes out ahead in expectation.

Not sure I'm understanding fully, but it sounds like this reasoning might fall prey to the two envelopes paradox.

The assumption about situations and their symmetric complements seems like it implies that you're equally likely to trade with an agent with twice or with half as much power as yourself. In which case, drawing the conclusion that MWBS comes out ahead in expectation is analogous to deciding to switch in the two envelopes problem.

So it seems like you can't make that assumption. Is that not the case?

You can face both situation A, and complement A, against agents weaker and more powerful than yourself (not strictly true, but true if you don't look at options worse than the default point, which you don't care about anyway).

Hmm, okay. I guess what I really want to know is whether your relative level of power compared to other agents you expect to meet affects whether you'd want to employ the MWBS vs something else.

Based on my naive understanding, it sounds to me like an agent who believes themselves to have low expected relative power might prefer the h utility of KSBS versus the ~min(x,y) utility of MWBS. I'm not sure whether the details actually work out that way though.

I believe that is incorrect (low relative power is a high risk-high reward situation). But we'd have to analyse it properly, with prior probs, etc... Which I have no time for now! :-(

Ah, yeah, I was thinking the high risk, high reward thing might be the answer, based on other statements in your post. Fair enough. Thanks for taking the time to respond!

(if you can't change any of your decisions, you have no power)

Change presumes comparison to something else. You determine your decisions, but there doesn't need to be anything that can be compared to them, so there doesn't need to be any "change". (I understand that you are talking in the context of "status quo", so this is more of a foundational nitpick.) In determining your decisions, what matters are the reasons/causes of these decisions. If decisions are such that they optimize your values, that gives them "power" of enacting these values. Someone else's decisions can also have the "power" to optimize your values, which is when "trade" is useful.

I knew this would come up! :-)

I shouldn't have been so informal, I was just trying to get a concept across. But yeah, philosophers disagree and there's more complexity and issues and caveats.

Due to the Pareto improvement problem, I don't think this actually describes what people mean by the word "trade".

I asked this in person, but I don't think you've addressed it in the write-up:

The use of utility functions to try to capture the power dynamic seems to run into problems. Wei_Dai has an example with non-linear utility functions, but we can make it even more obvious.

In your original example, say the slug just doesn't really care about anything other than his leaf. It would marginally prefer the entire galaxy to be turned into a garden of leaves, but that preference is really minuscule compared to its normal preferences about its own domain. Then we are in the situation where the two agents -- the galactic empire and slug -- each care a lot about their own domain, and very little about that of the other. If it happens that the slug is even more self-centred than the empire, then with your solution the slug's preferences win out ... to the limited degree permitted by the constraint that the empire can't lose more than it can gain.

Even if you don't think you'd naturally find utility functions like this (although many people seem to have preferences of roughly this shape), if this were a standard bargaining system it would become pragmatic to modify your own utility function to care very little about things you don't expect to be able to affect.

As a side note, I wonder if it would be better to name apart what you're trying to do here as something other than 'bargaining'. It runs against my intuitive understanding of the word, and I believe the standard use in the literature (but I'm not certain), to establish a defection point then allow 'bargains' which are worse for one side than the defection point.

then with your solution the slug's preferences win out...

If the slug gains little, it can lose little. That's the only constraint; we don't know whose preferences will "win".

This was an interesting post.

A potential extension of this problem is to worry about what happens when agents can lie about their utility function. In business this isn't usually a problem, since everyone is trying to maximize profit, but it often is in social interactions.

There's something to be said for this model... it looks rather realistic in the case where there is a large imbalance of power between trading partners. The powerful partner is likely to get everything it wants, and the powerless one may quite plausibly be worse off than if the trade never happens. I think poor little Slug isn't going to keep that leaf for very long.

The powerful partner is likely to get everything it wants

Currently, I see no evidence for that claim (apart from the utility-equally-linear-in-every-resource case). Slug can easily have a utility that gives it two leaves, for instance.

The only constraint is that no player can lose more than they could theoretically gain. Within that constraint, there exists utilities that give Slug the maximum that he could get.

Currently, I see no evidence for that claim (apart from the utility-equally-linear-in-every-resource case). Slug can easily have a utility that gives it two leaves, for instance.

I can't see how that works, provided the Galaxy partner has declining marginal utility in leaves. Suppose the New Republic has N leaves already (where N is a very large number). By definition of the zero-point, and scaling, New Republic has Utility(N leaves) = 0, Utility(N+1 leaves) = 1. With declining marginal utility, it must have Utility(N-1 leaves) < -1.

At most, Slug can have utility(2 leaves) = 1, in the case where its limited mind thinks one extra leaf is as good as a whole galaxy of extra leaves, but then it is still impossible for the optimal trade to be a transfer of one leaf from Galaxy to Slug. The same applies for transfer of two or more leaves. The only exception I can see is if the Slug happens to have some sort of really exotic leaf which is more valuable to Galaxy than two or more regular leaves. Then there could be an exchange. The problem is that Slug is then likely to be adapted to its exotic leaf, and can't eat the regular ones.

The only other possible trades are: No exchange at all; Slug loses its leaf and gets nothing in return; Slug loses its leaf and gets something else in return (such as a slug pellet). Whether Slug keeps its leaf seems to depend on whether it has utility(0 leaves) < -1, that is, whether its own marginal utility declines so rapidly that its single existing leaf is more valuable than the incremental value from a whole galaxy of additional leaves! That seems a bit unlikely if it has any sort of Darwinian utility (the expected fitness gain from N leaves is truly enormous, compared to the fitness loss of its own life).

Basically, under any plausible utility functions, Slug is going to get squashed by Galaxy. Which sounds realistic, as I said.

In your example, you setup the galaxy to care more about losing a leaf than gaining one (which is fine!), hence triggering the "you can only lose as much as you can gain".

But within these constraints, there is no reason to suspect one player will come out ahead.

Your x+y > 2h proof is flawed, since my utility may be normalised differently in different scenarios, but this does not mean I will personally weight scenarios where it is normalised to a large number higher than those where it is normalised to a small number. I would give an example if I had more time.

I will personally weight scenarios where it is normalised to a large number higher than those where it is normalised to a small number.

Yes. But you can face both situation A and its symmetric complement in a high-normalisation scenario, and in a low one.

We need proper priors over unknown players' utilities to solve this correctly (finding a specific counter example is easy).

If situation A is one where I am more powerful, then I will always face it at high-normalisation, and always face its complement at low normalisation. Since this system generally gives almost everything to the more powerful player, if I make the elementary error of adding the differently normalised utilities I will come up with an overly rosy view of my future prospects.

Since this system generally gives almost everything to the more powerful player

It does not. See this post ( http://lesswrong.com/lw/i20/even_with_default_points_systems_remain/ ): any player can lie about their utility to force their preferred outcome to be chosen (as long as it's admissible). The weaker player can thus lie to get the maximum possible out of the stronger player. This means that there are weaker players with utility functions that would naturally give them the maximum possible. We can't assume either the weaker player or the stronger one will come out ahead in a trade, without knowing more.

If situation A is one where I am more powerful, then I will always face it at high-normalisation, and always face its complement at low normalisation.

If you don't know the opposing player, then you don't know what you'll find important with them and what they'll find important with you. Suppose for instance that you can produce ten million different goods, at various inefficiencies and marginal prices. Then you meet someone who only cares about good G, and only offers good H. Then the shape of your trade situation is determined entirely by each player's valuations of G and H and their ability to produce them. Even if you're extraordinarily powerful, you and they can have valuations/abilities to produce G and H that make the situation take any shape you want (the default point removes most of your options from consideration, so only a very few of them matter).

I don't have time to do the maths, but if your values are complicated enough, you can certainly face both A and symmetric A against (different) weaker players (and against stronger ones).

It does not. See this post ( http://lesswrong.com/lw/i20/even_with_default_points_systems_remain/ ): any player can lie about their utility to force their preferred outcome to be chosen (as long as it's admissible). The weaker player can thus lie to get the maximum possible out of the stronger player. This means that there are weaker players with utility functions that would naturally give them the maximum possible. We can't assume either the weaker player or the stronger one will come out ahead in a trade, without knowing more.

Alice has $1000. Bob has $1100. The only choices available to them are to give some of their money to the other. With linear utility on both sides, the most obvious utility function, Alice gives all her money to Bob. There is no pair of utility functions under which Bob gives all his money to Alice.

The only choices available to them are to give some of their money to the other.

Are you enforcing that choice? Because it's not a natural one.

With linear utility on both sides, the most obvious utility function, Alice gives all her money to Bob.

Linear utility is not the most obviously correct utility function: diminishing marginal returns, for instance.

There is no pair of utility functions under which Bob gives all his money to Alice.

Let Alice value $2100 at 1, $1000 at 0, and $0 at -1. Let Bob value $2100 at 1, $1100 at 0, and $0 at -0.5 (interpolate utility linearly between these values).

These utility functions are already normalised for the MWBS, and since they interpolate linearly, only these three points are possible solutions: Alice $2100, default ($1000,$1100), and Bob $2100. The first has a summed utility of 0.5, the second 0, the third 0 as well.

Thus Alice gets everything.

That example is artificial, but it shows that unless you posit that everyone has (equal) linear utility in every resource, there is no reason to assume the powerful player will get everything: varying marginal valuations can push the solution in one direction or the other.
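
A quick numeric check of the two claims traded above (the dollar grid and the linear interpolation helper are my own choices):

```python
# Numeric check of the two claims above.  Alice starts with $1000, Bob with
# $1100; outcomes only move money between them, so Alice's holding runs over
# $0..$2100 and Bob holds the rest.
def mwbs_split(alice_u, bob_u):
    """Alice's final holding when the sum of normalised utilities is maximised."""
    dollars = range(0, 2101)
    def norm(u, default_holding):
        d, utopia = u(default_holding), max(u(x) for x in dollars)
        return lambda x: (u(x) - d) / (utopia - d)   # default -> 0, utopia -> 1
    nA, nB = norm(alice_u, 1000), norm(bob_u, 1100)
    return max(dollars, key=lambda a: nA(a) + nB(2100 - a))

def interp(points):
    """Piecewise-linear utility through the given (dollars, utility) points."""
    def u(d):
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x0 <= d <= x1:
                return y0 + (y1 - y0) * (d - x0) / (x1 - x0)
        raise ValueError(d)
    return u

linear = lambda d: d
print(mwbs_split(linear, linear))   # -> 0: with both utilities linear, Alice gives Bob everything

alice = interp([(0, -1.0), (1000, 0.0), (2100, 1.0)])
bob = interp([(0, -0.5), (1100, 0.0), (2100, 1.0)])
print(mwbs_split(alice, bob))       # -> 2100: with the quoted utilities, Alice gets everything
```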

You're right, I made a false statement because I was in a rush. What I meant to say was that as long as Bob's utility was linear, whatever utility function Alice has there is no way to get all the money.

Are you enforcing that choice? Because it's not a natural one.

It simplifies the scenario, and suggests.

Linear utility is not the most obviously correct utility function: diminishing marginal returns, for instance.

Why are diminishing marginal returns any more obvious than accelerating marginal returns? The former happens to be the human attitude to the thing humans most commonly gamble with (money) but there is no reason to privilege it in general. If Alice and Bob have accelerating returns then in general the money will always be given to Bob; if they have linear returns, it will always be given to Bob; if they have diminishing returns, it could go either way. This does not seem fair to me.

varying marginal valuations can push the solution in one direction or the other.

This is true, but the default is for them to go to the powerful player.

Look at a moderately more general example, the treasure splitting game. In this version, if Alice and Bob work together, they can get a large treasure haul, consisting of a variety of different desirable objects. We will suppose that if they work separately, Bob is capable of getting a much smaller haul for himself, while Alice can get nothing, making Bob more powerful.

In this game, Alice's value for the whole treasure gets sent to 1, Bob's value for the whole treasure gets sent to a constant more than 1, call it k. For any given object in the treasure, we can work out what proportion of the total value each thinks it is, if Alice's number is at least k times Bob's, then she gets it, otherwise Bob does. This means, if their valuations are identical or even roughly similar, Bob gets everything. There are ways for Alice to get some of it if she values it more, but there are symmetric solutions that favour Bob just as much. The 'central' solution is vastly favourable to Bob.
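
A small sketch of that allocation rule, with invented object values and solo haul:

```python
# Sketch of the allocation rule described above: give each object to Alice iff
# her proportional valuation is at least k times Bob's.  Object values and
# Bob's solo haul are invented.
def split_treasure(objects, alice_values, bob_values, bob_solo_haul):
    total_a = sum(alice_values[o] for o in objects)
    total_b = sum(bob_values[o] for o in objects)
    k = total_b / (total_b - bob_solo_haul)   # Bob's normalisation constant, k > 1
    return {o: ("Alice" if alice_values[o] / total_a >= k * bob_values[o] / total_b
                else "Bob")
            for o in objects}

objects = ["crown", "coins", "map"]
alice_values = {"crown": 5.0, "coins": 3.0, "map": 2.0}
bob_values = {"crown": 6.0, "coins": 3.0, "map": 1.0}
print(split_treasure(objects, alice_values, bob_values, bob_solo_haul=2.0))
# -> {'crown': 'Bob', 'coins': 'Bob', 'map': 'Alice'}: with similar valuations,
#    most of the haul goes to the more powerful player, as argued above.
```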

What I meant to say was that as long as Bob's utility was linear, whatever utility function Alice has there is no way to get all the money.

Technically true: if he's linear, Bob can't lose more than $1000, because he can't gain more than $1000.

But Alice can certainly get almost everything. Say she has this: $1999.99 (or above): utility 1, $1000-$1999.99: utility 0, below $1000: utility -100. Then Alice gets $1999.99 and Bob loses $999.99.

Look at a moderately more general example, the treasure splitting game.

If the value of the hoard is large, then k is very close to 1. Alice will get the things she really likes (relative to Bob's valuation of them).

The 'central' solution is vastly favourable to Bob.

In the default, Alice gets nothing. If k is small, she'll likely get a good chunk of the stuff. If k is large, that means that Bob can generate most of the value on his own: Alice isn't contributing much at all, but will still get something if she really cares about it. I don't see this as ultra-unfavourable to Alice!

I admit there is an issue with (quasi)-linear preferences if both players have similar relative valuations. However I don't see anything that argues that "the default is for them to go to the powerful player", apart from in that linear case. In the real world, agents' marginals vary a lot, and the gains from trade are huge, so this isn't likely to come up.

In the real world, agents' marginals vary a lot, and the gains from trade are huge, so this isn't likely to come up.

I doubt this claim, particularly the second part.

True, many interactions have gains from trade, but I suspect the weight of these interactions is overstated in most people's minds by the fact that they are the sort of thing that spring to mind when you talk about making deals.

Probably the most common form of interaction I have with people is when we walk past each other in the street and neither of us hands the other the contents of their wallet. I admit I am using the word 'interaction' quite strangely here, but you have given no reason why this shouldn't count as a game for the purposes of bargaining solutions (we certainly both stand to gain more than the default outcome if we could control the other). My reaction to all but a tiny portion of humanity is to not even think about them, and in a great many cases there is not much to be gained by thinking about them.

I suspect the same is true of marginal preferences: in games with small amounts at stake, preferences should be roughly linear, and where desirable objects are fungible, as they often are, they will be very similar across agents.

In the default, Alice gets nothing. If k is small, she'll likely get a good chunk of the stuff. If k is large, that means that Bob can generate most of the value on his own: Alice isn't contributing much at all, but will still get something if she really cares about it. I don't see this as ultra-unfavourable to Alice!

If k is moderately large, e.g. 1.5 at least, then Alice will probably get less than half of the remaining treasure (i.e. treasure Bob couldn't have acquired on his own) even by her own valuation. Of course the are individual differences, but it seems pretty clear to me that compared to other bargaining solutions, this one is quite strongly biased towards the powerful.

This question isn't precisely answerable without a good prior over games, and any such prior is essentially arbitrary, but I hope I have made it clear that it is at the very least not obvious that there is any degree of symmetry between the powerful and the weak. This renders the x+y > 2h 'proof' in your post bogus, as x and y are normalised differently, so adding them is meaningless.

Your "walking by in the street" example is interesting. But the point of weighting your utilities is to split the gains from every single future transaction and interaction with them. Since you're both part of the same economic system, they will have (implicit or explicit) interactions in the future. Though I don't yet know the best way of normalising multiple agents utilities, which we'd need to make this fully rigorous.

And seeing how much world GDP is dependent on trade, I'd say the gains from trade are immense! I note your treasure hunting example has rather large gains from trade...

So, what we do know:

1) If everyone has utility equally linear in every resource (which we know is false), then the more powerful player wins everything (note that this is one of the rare cases where there is an unarguable "most powerful player")

2) In general, to within the usual constraints of not losing more than you can win, any player can get anything out of the deal (http://lesswrong.com/r/discussion/lw/i20/even_with_default_points_systems_remain/ , but you consider these utilities naturally occurring, rather than the product of lying)

I don't therefore see strong evidence I should reject my informal proof at this point.

I don't therefore see strong evidence I should reject my informal proof at this point.

I think you and I have very different understandings of the word 'proof'.

It's a proof based on premises of uncertain validity. So it certainly proves something, in some situations - the question is whether these situations are narrow, or broad.

Would it be possible to make those clearer in the post?

I had thought, from the way you phrased it, that the assumption was that for any game, I would be equally likely to encounter a game with the choices and power levels of the original game reversed. This struck me as plausible, or at least a good point to start from.

What you in fact seem to need, is that I am equally likely to encounter a game with the outcome under this scheme reversed, but the power levels kept the same. This continues to strike me as a very substantive and almost certainly false assertion about the games I am likely to face.

Would it be possible to make those clearer in the post?

After the baby, when I have time to do it properly :-)

Fair enough

What about ARthUrpHilIpDenu? It stands to gain the resources of a galaxy. The difference between default (a leaf) and utopia (all the resources of a galaxy dedicated to making leaves) is unimaginably humongous. And yet that huge difference will get normalised to one: the ARthUrpHilIpDenu's utility function will get divided by a huge amount. It will weigh very little in any sum.

[Emphasis added]

Why is the difference normalized to one instead of zero when considering ARthUrpHilIpDenu?

[This comment is no longer endorsed by its author]

? It's normalised to one for both players...

I did not understand the math. I have more to learn. Thanks.

The "default point" sounds like it's a special case of a Schelling point.

which is itself a special case of a Nash equilibrium.

This is probably (and quite possibly by an order of magnitude so) the most important contribution from lesswrong in its entirety in several months.

I like your skilled use of understatement! ;-)

It's interesting, but it assumes that human desires can be meaningfully mapped into something like a utility function, in a way which makes me skeptical about its usefulness. (Though I have a hard time articulating my objection more clearly than that.)

I recognise that argument, but surely we can use consideration of utility functions in models in order to make progress along thinking about these things.

Even if we crudely imagine a typical human who happens to be ticking all Maslow's boxes with access to happiness, meaning and resources tending to be more towards our (current...) normalised '1' and someone in solitary confinement, in psychological torture, tending towards our normalised '0' as a utility point – even then the concept is sufficiently coherent and grokable to allow use of these kinds of models?

Do you disagree? I am curious – I have encountered this point several times and I'd like to see where we differ.

human desires can be meaningfully mapped into something like a utility function

I don't believe this is possible in a useful way. However, having a utility solution may mean we can generalise to other situations...

human desires can be meaningfully mapped into something like a utility function

I don't believe this is possible in a useful way.

Do you mean not possible for humans with current tools or theoretically impossible? (It seems to me that in principle human preferences can be mapped to something like a utility function in a way that is at least useful, even if not ideal.)

That's a whole conversation! I probably shouldn't start talking about this, since I don't have the time to do it justice.

In the main, I feel that humans are not easily modelled by a utility function, but we have meta-preferences that cause us to hate facing the kind of trade-offs that utility functions imply. I'd bet most people would pay to not have their preferences replaced with a utility function, no matter how well defined it was.

But that's a conversation for after the baby!