You're looking at Less Wrong's discussion board. This includes all posts, including those that haven't been promoted to the front page yet. For more information, see About Less Wrong.

Lying in negotiations: a maximally bad problem

13 Stuart_Armstrong 17 November 2014 03:17PM

In a previous post, I showed that the Nash Bargaining Solution (NBS), the Kalai-Smorodinsky Bargaining Solution (KSBS) and own my Mutual Worth Bargaining Solution (MWBS) were all maximally vulnerable to lying. Here I can present a more general result: all bargaining solutions are maximally vulnerable to lying.

Assume that players X and Y have settled on some bargaining solution (which only cares about the defection point and the utilities of X and Y). Assume further that player Y knows player X's utility function. Let player X look at the possible outcomes, and let her label any outcome O "admissible" if there is some possible bargaining partner YO with utility function uO such that O would be the outcome of the bargain between X and YO.

For instance, in the case of NBS and KSBS, the admissible outcomes would be the outcomes Pareto-better than the disagreement point. The MWBS has a slightly larger set of admissible outcomes, as it allows players to lose utility (up to the maximum they could possibly gain).

Then the general result is:

If player Y is able to lie about his utility function while knowing player X's true utility (and player X is honest), he can freely select his preferred outcome among the outcomes that are admissible.

The proof of this is also derisively brief: player Y need simply claim to have utility uO, in order to force outcome O.

Thus, if you've agreed on a bargaining solution, all that you've done is determined the set of outcomes among which your lying opponent will freely choose.

There may be a subtlety: your system could make use of an objective (or partially objective) disagreement point, which your opponent is powerless to change. This doesn't change the result much:

If player Y is able to lie about his utility function while knowing player X's true utility (and player X is honest), he can freely select his preferred outcome among the outcomes that are admissible given the disagreement point.

 

Exploitation and gains from trade

Note that the above result did not make any assumptions about the outcome being Pareto - giving up Pareto doesn't make you non-exploitable (or "strategyproof" as it is often called).

But note also that the result does not mean that the system is exploitable! In the random dictator setup, you randomly assign power to one player, who then makes all the decisions. In terms of expected utility, this is a pUX+(1-p)UY, where UX is the best outcome ("Utopia") for X and UY the best outcome for Y, and p the probability that X is the random dictator. The theorem still holds for this setup: player X knows that player Y will be able to select freely among the admissible outcomes, which is the set S={pUX+(1-p)O | O an outcome}. However, player X knows that player Y will select pUX+(1-p)UY as this maximises his expected utility. So a bargaining solution which has a particular selection of admissible outcomes can be strategyproof.

However, it seems that the only strategyproof bargaining solutions are variants of random dictators! These solutions do not allow much gain from trade. Conversely, the more you open your bargaining solution up to gains from trade, the more exploitable you become from lying. This can be seen in the examples above: my MWBS tried to allow greater gains (in expectation) by not restricting to strict Pareto improvements from the disagreement point. As a result, it makes itself more vulnerable to liars.

 

What to do

What can be done about this? There seem to be several possibilities:

  1. Restrict to bargaining solutions difficult to exploit. This is the counsel of despair: give up most of the gains from trade, to protect yourself from lying. But there may be a system where the tradeoff between exploitability and potential gains is in some sense optimal.
  2. Figure out your opponent's true utility function. The other obvious solution: prevent lying by figuring out what your opponent really values, by inspecting their code, their history, their reactions, etc... This could be combined with refusing to trade with those who don't make their true utility easy to discover (or only using non-exploitable trades with those).
  3. Hide your own true utility. The above approach only works because the liar knows their opponent, and their opponent doesn't know them. If both utilities are hidden, it's not clear how exploitable the system really is.
  4. Play only multi-player. If there are many different trades with many different people, it becomes harder to construct a false utility that exploits them all. This is in a sense a variant of "hiding your own true utility": in that situation, the player has to lie given their probability distribution of your possible possible utilities; in this this situation, they have to lie, given the known distribution of multiple true utilities.

So there does not seem to be a principled way of getting rid of liars. But the multi-player (or hidden utility function) may point to a single "best" bargaining solution: the one that minimises the returns to lying and maximises the gains to trade, given ignorance of the other's utility function.

Cooperating with agents with different ideas of fairness, while resisting exploitation

38 Eliezer_Yudkowsky 16 September 2013 08:27AM

There's an idea from the latest MIRI workshop which I haven't seen in informal theories of negotiation, and I want to know if this is a known idea.

(Old well-known ideas:)

Suppose a standard Prisoner's Dilemma matrix where (3, 3) is the payoff for mutual cooperation, (2, 2) is the payoff for mutual defection, and (0, 5) is the payoff if you cooperate and they defect.

Suppose we're going to play a PD iterated for four rounds.  We have common knowledge of each other's source code so we can apply modal cooperation or similar means of reaching a binding 'agreement' without other enforcement methods.

If we mutually defect on every round, our net mutual payoff is (8, 8).  This is a 'Nash equilibrium' because neither agent can unilaterally change its action and thereby do better, if the opponents' actions stay fixed.  If we mutually cooperate on every round, the result is (12, 12) and this result is on the 'Pareto boundary' because neither agent can do better unless the other agent does worse.  It would seem a desirable principle for rational agents (with common knowledge of each other's source code / common knowledge of rationality) to find an outcome on the Pareto boundary, since otherwise they are leaving value on the table.

But (12, 12) isn't the only possible result on the Pareto boundary.  Suppose that running the opponent's source code, you find that they're willing to cooperate on three rounds and defect on one round, if you cooperate on every round, for a payoff of (9, 14) slanted their way.  If they use their knowledge of your code to predict you refusing to accept that bargain, they will defect on every round for the mutual payoff of (8, 8).

I would consider it obvious that a rational agent should refuse this unfair bargain.  Otherwise agents with knowledge of your source code will offer you only this bargain, instead of the (12, 12) of mutual cooperation on every round; they will exploit your willingness to accept a result on the Pareto boundary in which almost all of the gains from trade go to them.

(Newer ideas:)

Generalizing:  Once you have a notion of a 'fair' result - in this case (12, 12) - then an agent which accepts any outcome in which it does worse than the fair result, while the opponent does better, is 'exploitable' relative to this fair bargain.  Like the Nash equilibrium, the only way you should do worse than 'fair' is if the opponent also does worse.

So we wrote down on the whiteboard an attempted definition of unexploitability in cooperative games as follows:

"Suppose we have a [magical] definition N of a fair outcome.  A rational agent should only do worse than N if its opponent does worse than N, or else [if bargaining fails] should only do worse than the Nash equilibrium if its opponent does worse than the Nash equilibrium."  (Note that this definition precludes giving in to a threat of blackmail.)

(Key possible-innovation:)

It then occurred to me that this definition opened the possibility for other, intermediate bargains between the 'fair' solution on the Pareto boundary, and the Nash equilibrium.

Suppose the other agent has a slightly different definition of fairness and they think that what you consider to be a payoff of (12, 12) favors you too much; they think that you're the one making an unfair demand.  They'll refuse (12, 12) with the same feeling of indignation that you would apply to (9, 14).

Well, if you give in to an arrangement with an expected payoff of, say, (11, 13) as you evaluate payoffs, then you're giving other agents an incentive to skew their definitions of fairness.

But it does not create poor incentives (AFAICT) to accept instead a bargain with an expected payoff of, say, (10, 11) which the other agent thinks is 'fair'.  Though they're sad that you refused the truly fair outcome of (as you count utilons) 11, 13 and that you couldn't reach the Pareto boundary together, still, this is better than the Nash equilibrium of (8, 8).  And though you think the bargain is unfair, you are not creating incentives to exploit you.  By insisting on this definition of fairness, the other agent has done worse for themselves than other (12, 12).  The other agent probably thinks that (10, 11) is 'unfair' slanted your way, but they likewise accept that this does not create bad incentives, since you did worse than the 'fair' outcome of (11, 13).

There could be many acceptable negotiating equilibria between what you think is the 'fair' point on the Pareto boundary, and the Nash equilibrium.  So long as each step down in what you think is 'fairness' reduces the total payoff to the other agent, even if it reduces your own payoff even more.  This resists exploitation and avoids creating an incentive for claiming that you have a different definition of fairness, while still holding open the possibility of some degree of cooperation with agents who honestly disagree with you about what's fair and are trying to avoid exploitation themselves.

This translates into an informal principle of negotiations:  Be willing to accept unfair bargains, but only if (you make it clear) both sides are doing worse than what you consider to be a fair bargain.

I haven't seen this advocated before even as an informal principle of negotiations.  Is it in the literature anywhere?  Someone suggested Schelling might have said it, but didn't provide a chapter number.

ADDED:

Clarification 1:  Yes, utilities are invariant up to a positive affine transformation so there's no canonical way to split utilities evenly.  Hence the part about "Assume a magical solution N which gives us the fair division."  If we knew the exact properties of how to implement this magical solution, taking it at first for magical, that might give us some idea of what N should be, too.

Clarification 2:  The way this might work is that you pick a series of increasingly unfair-to-you, increasingly worse-for-the-other-player outcomes whose first element is what you deem the fair Pareto outcome:  (100, 100), (98, 99), (96, 98).  Perhaps stop well short of Nash if the skew becomes too extreme.  Drop to Nash as the last resort.  The other agent does the same, starting with their own ideal of fairness on the Pareto boundary.  Unless one of you has a completely skewed idea of fairness, you should be able to meet somewhere in the middle.  Both of you will do worse against a fixed opponent's strategy by unilaterally adopting more self-favoring ideas of fairness.  Both of you will do worse in expectation against potentially exploitive opponents by unilaterally adopting looser ideas of fairness.  This gives everyone an incentive to obey the Galactic Schelling Point and be fair about it.  You should not be picking the descending sequence in an agent-dependent way that incentivizes, at cost to you, skewed claims about fairness.

Clarification 3:  You must take into account the other agent's costs and other opportunities when ensuring that the net outcome, in terms of final utilities, is worse for them than the reward offered for 'fair' cooperation.  Offering them the chance to buy half as many paperclips at a lower, less fair price, does no good if they can go next door, get the same offer again, and buy the same number of paperclips at a lower total price.

Even with default points, systems remain exploitable

9 Stuart_Armstrong 19 July 2013 04:24PM

Still exploitable, even with defaults

A while ago, I posted a brief picture-proof of the fact that whatever bargaining system you use to reach deals, they are all exploitable, in some situations, by liars (as long as the outcome is Pareto and a few other assumptions).

That included any system with an internally assigned default point. The picture proofs work no matter how you calculate the bargaining outcome: if you use the utility value data to assign a default point, before picking the Nash bargaining equilibrium, then the whole process is susceptible to exploitation by lying.

Is the same thing true for externally assigned default points (i.e. default points that come from outside the data, and are not a mere function of everyone's preferences and the available outcomes)? A moment's thought shows that this is the case. The picture proofs never used translations, or scaling, or anything that would shift an external default point. So having an externally assigned default point does not solve the problem of lying.

But "any Pareto bargaining system is exploitable by lying" is an existence proof: in at least one circumstance, one player may be able to derive a non-zero benefit by lying about their utility function. This doesn't give an impression of the scale of the problem.

The scale of the problem

The problem is very severe, for the Nash Bargaining Solution (NBS), the Kalai-Smorodinsky Bargaining Solution (KSBS) and my Mutual Worth Bargaining Solution (MWBS). Essentially, it's as bad as it can get.

For KSBS and NBS, let's call an outcome admissible if it's Pareto-better than the default. For the MWBS, call an outcome admissible if the combined utility values it more than the default point (as we've seen, this needn't be an improvement for both players). In all three approaches, the bargaining solution must be admissible.

Then the dismal result is:

  • Let O be any admissible pure outcome. Then either player can lie, if they know everyone's preferences, to force the bargaining solution to pick O.
continue reading »

No independence of irrelevant alternatives (picture proof)

7 Stuart_Armstrong 03 May 2012 05:48PM

Back in the old days, when people were wise and the government was just, I did a post on the Nash bargaining solution for two player games. Here each player has their own utility function and they're choosing amongst joint options, and trying to bargain to find the best one. What was nice about this solution is that it is independent of irrelevant alternatives (IIA): once you've found the best solution, you can erase any other option, and it remains the best.

In order to do that, the Nash bargaining solution makes use of a "disagreement point", a special point that provides a zero to both utilities. This seems - and is - ugly. Can we preserve IIA without this clunky disagreement point?

By the title of the this post, you may have guessed that we can't. Specifically, assume the outcome is symmetric across both players (i.e. permuting the two utility functions preserves the outcome choice), the outcome is Pareto-optimal (any change will reduce the utility of at least one player) and there is no outside canonical choices for the utility functions (no special scales, no zeroes, no disagreement points). Then IIA must fail. It fails under weaker conditions as well, but the above lead to an easy picture-proof. And picture proofs are nice.

continue reading »

In the Pareto world, liars prosper

15 Stuart_Armstrong 08 December 2011 02:15PM

This is a simple picture proof to show that if there is any decision process that will find a Pareto outcome for two people, it must be that liars will prosper: there are some circumstances where you would come out ahead if you were to lie about your utility function.

Apart from Pareto, the only other assumption it needs are that if the data is perfectly symmetric, then the outcome will be symmetric as well. We won't even need to use affine independence or other scalings of utility functions.

Now, given Pareto-optimality, symmetry allows us to solve symmetric problems by taking the unique symmetric Pareto option. Two such symmetric problems presented here, and in one of them, one of the two players must be able to prosper by lying.

So first assume Pareto-optimality, symmetry, and (by contradiction) that liars don't prosper. The players are x and y, and we will plot their utilities in the (x,y) plane. The first setup is presented in this figure:


 

continue reading »