Note: it seems the paper itself is linked from your original blog, but not on LW? So for the benefit of LW readers, it's: Dodis, Y., Halevi, S., & Rabin, T. (2000, August). A cryptographic solution to a game theoretic problem. In Annual International Cryptology Conference (pp. 112-130). Springer, Berlin, Heidelberg.
By using the techniques mentioned above, two players can then execute the following protocol:
I had trouble following this, but I think I get it. To add some maybe-helpful clarifications:
- At the beginning of the protocol, the Preparer randomly permutes the list, encrypts it element-wise and sends the resulting list to the Chooser.
Encrypting it element-wise means encrypting the components of the pairs separately, not just the list elements separately. So if the list starts as [(C, D), (D, C), (C, C)], it becomes [(Enc(C), Enc(D)), (Enc(D), Enc(C)), (Enc(C), Enc(C))]. The encryption is randomized, so that even if m1 = m2, the encryptions Enc(m1) ≠ Enc(m2). That way you can't look at the frequencies in the encrypted list (see the sketch at the end of this comment).
- The Chooser picks a random pair of ciphertexts (c, d) from the permuted list.
Here the Chooser picks one random list element, which is a pair of ciphertexts.
It then blinds c with 0 (i.e. makes a random encryption of the same plaintext), blinds d with a random blinding factor β, and sends the resulting pair of ciphertexts (e, f) back to the Preparer.
The blinding with 0 here is necessary so the Preparer won't be able to recognize e as her own ciphertext c, which would tell her which pair the Chooser picked (and hence what the Chooser's action is).
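On the frequency point: here is a tiny sketch of what a deterministic scheme would leak (det_enc is a hypothetical stand-in, not the actual scheme):

```python
from collections import Counter
import hashlib

def det_enc(m):
    # Hypothetical *deterministic* "encryption": the same plaintext
    # always maps to the same ciphertext.
    return hashlib.sha256(m.encode()).hexdigest()[:8]

pairs = [("C", "D"), ("D", "C"), ("C", "C")]
enc_pairs = [(det_enc(a), det_enc(b)) for a, b in pairs]

# C's ciphertext shows up four times and D's twice, so an observer
# can spot which element is (C, C) without any key at all.
print(Counter(c for pair in enc_pairs for c in pair))
```

Randomized encryption makes every ciphertext fresh, so these counts carry no information.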
I'm not sure I see how you're modeling the trusted mediator and the option of early stopping. Shouldn't you extend the payoff matrix to 3 dimensions (including the trusted mediator) with 3 outcomes for the players (C/D/Stopped) and 2 for the mediator (stop or continue, with payoffs that switch based on whether a player defected in the last round)?
I think this reduces, with proper mediator payoffs, to an overall Nash equilibrium that's the same as a standard 2-player iterated game with communication: cooperate in every round, but defect on the last one. And if you can add significant uncertainty about which iteration is last (so that the expected value of possible future iterations exceeds the difference between the minimum and maximum payoffs), that end-game defection goes away too.
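To put rough numbers on that (a sketch; the payoff values and the "punish forever" response are assumptions on my part, not from the paper or the post):

```python
# If the game continues with probability delta each round, a player
# comparing "defect now" against "keep cooperating" weighs the
# one-shot gain T - R against the expected future loss from being
# punished (R - P per round) in all remaining rounds.
T, R, P = 5, 4, 0  # temptation, mutual cooperation, mutual defection

def cooperation_sustainable(delta):
    expected_future_loss = delta * (R - P) / (1 - delta)
    return T - R <= expected_future_loss

# Past delta = 0.2 the threat of future punishment outweighs the
# one-shot gain, and end-game defection disappears.
print([(d, cooperation_sustainable(d)) for d in (0.1, 0.2, 0.5)])
```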
Although the original paper didn't mention this, I think your description makes sense! Indeed, “the game must go on” even in case of stopping. One simpler idea for modeling the situation with a mediator: there are just two players, each with two actions: (1) adhere to the (crypto) mediator's advice, or (2) deviate from the advice.
Thanks for the insight about adding uncertainty to avoid ending with defection!
Crossposted from: Equilibria Club
It's time for a puzzle involving two fields: game theory and cryptography! Both fields model self-interested agents in a competitive setting. In game theory, we list actions and payoffs, after which we can determine which actions will benefit agents the most. In cryptography, we model actions to ensure agents only get a limited amount of information.
Comparison of concepts between cryptography and game theory
The problem: game of chicken
To illustrate how cryptography can be used to achieve better outcomes in a game, we're going to look at the "Game of Chicken". Imagine two players driving their cars towards each other, each in principle wanting to avoid a collision, but also not wanting to chicken out. The standard two-player game is characterized by the following payoffs:
We can display the three Nash equilibria (the action profiles that are optimal for non-coordinating, self-interested players) on a graph. The players will either play (cooperate, defect), (defect, cooperate), or both play a mixed strategy, cooperating with 50% probability.
Three Nash equilibria in the extended game
In the previous post about game theory, I referred to the Folk Theorem of game theory, which says that many equilibria can be enforced by the players in iterated games. However, those enforceable equilibria all fall in the so-called convex hull of the Nash equilibria (any payoff combination that can be formed by mixing the Nash equilibria). In the Game of Chicken, the situations in the convex hull are all the points below the purple diagonal, leading to a maximum total payoff of 6. Is there any way we can do better? Well yes! We can try to achieve a correlated equilibrium (1/3 (C, D) + 1/3 (D, C) + 1/3 (C, C)), giving the two players a total expected payoff of 2 * (5 * 1/3 + 1 * 1/3 + 4 * 1/3) = 6.67.
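To spell out the arithmetic (assuming the usual Chicken payoffs, which match the numbers used here: (C, C) = (4, 4), (C, D) = (1, 5), (D, C) = (5, 1), (D, D) = (0, 0)):

```python
# Payoffs as (row player, column player); values inferred from the
# calculation in the text, the post's payoff table is authoritative.
payoff = {("C", "C"): (4, 4), ("C", "D"): (1, 5),
          ("D", "C"): (5, 1), ("D", "D"): (0, 0)}

# The correlated equilibrium puts weight 1/3 on each of these:
dist = {("C", "D"): 1/3, ("D", "C"): 1/3, ("C", "C"): 1/3}

total = sum(p * sum(payoff[a]) for a, p in dist.items())
print(round(total, 2))  # 6.67, beating the convex hull's maximum of 6
```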
In short: we can achieve better outcomes by asking a trusted third party to tell us what to do. An important method to enforce this is to threaten the other player with defection if they do not listen to the trusted third party. Moreover, both players should be disincentivized from stopping any protocol early:
The role of cryptography: replacing the trusted mediator
There is one thing about the above protocol that still bugs a lot of people: the reliance on a trusted mediator.
On a societal level, governments and institutions are playing the role of a trusted party all the time, mediating many relationships between players. But trust can be fleeting, intangible and hard to scale. Many people grew up with the famous statement: "Trust, but verify". Many innovations in cryptography over the past decades have allowed us to replace trust by automating the verification of certain processes. For example, cryptocurrency users don't have to trust their central bank and intermediaries, but can verify open source software protocols and developer discussions.
As in almost all of our previously discussed papers, randomness plays a key role again when replacing trust. In cryptocurrency, validators are chosen at random to enforce honesty. In the Game of Chicken, an action can be chosen at random from the set of actions in the previously discussed correlated equilibrium: (C, D), (D, C) and (C, C). Importantly, players should not be able to infer ahead of the game which action the counterparty will be playing. By letting the two players follow a cryptographic protocol, they can achieve this by jointly computing a random action for each other.
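For intuition on jointly generated randomness, the classic building block is a commit-reveal coin flip, sketched below. Note its shortcoming for our purpose: both players see the outcome, while the correlated equilibrium requires each player to learn only their own action; that is exactly the gap the paper's protocol closes. (The sketch is illustrative, not taken from the paper.)

```python
import hashlib
import secrets

action_pairs = [("C", "D"), ("D", "C"), ("C", "C")]

# Player 1 commits to a random value without revealing it...
r1 = secrets.randbits(128).to_bytes(16, "big")
commitment = hashlib.sha256(r1).hexdigest()  # sent to Player 2

# ...Player 2 then replies with their own random value in the clear.
r2 = secrets.randbits(128).to_bytes(16, "big")

# Player 1 opens the commitment; Player 2 verifies it.
assert hashlib.sha256(r1).hexdigest() == commitment

# Neither player alone could bias the XOR of the two values.
joint = int.from_bytes(r1, "big") ^ int.from_bytes(r2, "big")
print(action_pairs[joint % 3])  # but both players see the whole pair!
```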
The cryptography
In standard asymmetric cryptography protocols, you can encrypt a message with a public key (pk), and you can decrypt it with a secret key (sk). The public key can typically be shared with the world - representing your identity - while the secret key is typically something to... keep a secret! To get rid of the previously mentioned trusted mediator, today's discussed paper uses the following additional technique:
By using the techniques mentioned above, two players can then execute the following protocol:
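Here is a minimal Python sketch of the exchange, pieced together from the clarifications in the comments. A toy Paillier-style additively homomorphic scheme stands in for the paper's encryption scheme, the parameters are tiny and insecure, and both parties are assumed honest; the pairing of each list element as (Preparer's action, Chooser's action) is my assumption.

```python
import secrets
from math import gcd

# --- Toy Paillier-style additively homomorphic encryption ---
# (tiny, insecure parameters; purely for illustration)
p, q = 2_147_483_647, 2_147_483_629
n, n2 = p * q, (p * q) ** 2
lam = (p - 1) * (q - 1)              # secret key
mu = pow(lam, -1, n)

def enc(m):
    r = secrets.randbelow(n - 2) + 2  # fresh randomness every call
    while gcd(r, n) != 1:
        r = secrets.randbelow(n - 2) + 2
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def dec(c):
    return (pow(c, lam, n2) - 1) // n * mu % n

def add(c1, c2):                      # Enc(a) * Enc(b) decrypts to a + b
    return c1 * c2 % n2

# --- The protocol ---
C, D = 0, 1
pairs = [(C, D), (D, C), (C, C)]      # (Preparer's action, Chooser's action)

# Preparer: randomly permute the list, encrypt it element-wise, send it.
secrets.SystemRandom().shuffle(pairs)
enc_list = [(enc(a), enc(b)) for a, b in pairs]

# Chooser: pick one random pair, blind c with 0 (a fresh encryption of
# the same plaintext) and d with a random beta, send (e, f) back.
c, d = enc_list[secrets.randbelow(len(enc_list))]
e = add(c, enc(0))
beta = secrets.randbelow(n)
f = add(d, enc(beta))

# Preparer: decrypt e to learn her own action. Decrypting f only gives
# the Chooser's action shifted by beta, which looks uniformly random;
# she sends that blinded value back.
preparer_action = dec(e)
blinded = dec(f)

# Chooser: strip beta to recover his own action.
chooser_action = (blinded - beta) % n
print(preparer_action, chooser_action)
```

In the real protocol each message would additionally carry proofs that it was formed correctly; more on that in the "Reducing trust even further" section.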
And voilà! The two players have jointly chosen an optimal pair of actions, without either of them learning the other's action!
Reducing trust even further
The above protocol still relies on two trust assumptions. The first is that the players behave honestly. Zero knowledge proofs can be used to prevent dishonest players from misbehaving:
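For a flavor of what a zero knowledge proof looks like, here is a toy Schnorr-style proof of knowledge of a discrete logarithm (an illustrative sketch with toy parameters; the proofs the protocol actually needs, about correct encryption and blinding, are more involved):

```python
import secrets

# The Prover convinces the Verifier she knows x with h = g^x (mod p)
# without revealing x. Tiny toy parameters, not a secure instantiation.
p = (1 << 127) - 1               # a Mersenne prime, fine for a demo
g = 3
x = secrets.randbelow(p - 1)     # the Prover's secret
h = pow(g, x, p)                 # public value

r = secrets.randbelow(p - 1)
t = pow(g, r, p)                 # 1. Prover sends a commitment
c = secrets.randbelow(p - 1)     # 2. Verifier sends a random challenge
s = (r + c * x) % (p - 1)        # 3. Prover responds

# Verifier accepts iff g^s == t * h^c (mod p). A transcript (t, c, s)
# can be simulated without knowing x, hence "zero knowledge".
assert pow(g, s, p) == t * pow(h, c, p) % p
```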
Such zero knowledge proofs can be quite computationally expensive to generate. A very important take-away from the field of cryptography is that generic solutions are inefficient:
The second trust assumption is that certain parameters need to be generated ahead of time:
This assumption of a trusted setup is a contentious issue in the cryptocurrency community. We're using cryptography to get rid of trust, and we end up just replacing it with a different kind of trust! In reality, cryptography is a tool to make it explicit what we trust. Besides, with enough effort even this last bit of trust can be reduced: certain groups go to great lengths to minimize the trust required in the setup.
Applications
As in any economic model, you may object that the assumption of 'self-interestedness' is not realistic. This assumption is, in many cases, just a simplification to make modeling easier. Many models can be extended or adjusted to take into account altruism, for example by pretending that one agent represents a whole group of people. In essence, there are many settings where people or groups do act selfishly and competitively, or where we want to protect against the worst case scenario where our selfish adversary wishes to gain information or utility.
So, taking the ideas of today's paper a step further, what if we all just submit our valuations and preferences to a giant algorithm which returns everyone's optimal action? I admit, having such a generic governance mechanism is a bit far-fetched, but there may be specific areas where this actually makes sense, for example in salary negotiations or data reporting. In the game of Schedule Chicken, competing teams are incentivized to schedule or report unrealistically ambitious achievements (defection) rather than honest but modest ones (cooperation). This is a real problem, occurring not just in companies but in entire countries, as during China's Great Famine. A first solution is to install a mediator in the form of a product manager, but managers often have a strong conflict of interest! Perhaps a cryptographic protocol can prove helpful here when integrated into project planning software.
Finally, on the theoretical front, the combination of the two fields can lead to new insights: