As a bonus, I'll mention the "revelation principle", which applies in games where there is some uncertainty regarding the other player's payoffs or constraints. (No math or proofs here - I just want to sketch the problem and assert the solution. For proofs and details, see a good game theory textbook, such as Myerson's.)
For example, say that Player 1 has arthritis and can't form a fist for "rock" without some pain. As a result, he suffers a penalty of 1/3 whenever he plays "rock" - meaning he nets only 2/3 when beating "scissors", actually loses 1/3 when tying against "rock", and loses 4/3 against "paper". Player 2 has heard rumors about Player 1's arthritis, but doesn't know whether the pain occurs when Player 1 plays "rock" or whether it is "paper" that causes the problem. She assigns a Bayesian prior probability of 0.5 to each kind of arthritis. Player 1 knows about these priors - that is, the priors are common knowledge.
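To make the setup concrete, here is a small sketch of Player 2's view of the game. It assumes standard rock-paper-scissors payoffs (win = +1, loss = -1, tie = 0) with the arthritis penalty of 1/3 layered on top; those base payoffs are my assumption, not something fixed by the story.

```python
# A sketch of the Bayesian setup, assuming standard rock-paper-scissors
# payoffs (win = +1, loss = -1, tie = 0). Player 1's type determines which
# move carries the extra 1/3 arthritis penalty.
MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def base_payoff(my_move, their_move):
    """Player 1's payoff in plain rock-paper-scissors."""
    if my_move == their_move:
        return 0.0
    return 1.0 if BEATS[my_move] == their_move else -1.0

def typed_payoff(my_move, their_move, sore_move):
    """Player 1's payoff when playing `sore_move` costs an extra 1/3."""
    penalty = 1/3 if my_move == sore_move else 0.0
    return base_payoff(my_move, their_move) - penalty

# Player 2's prior: each arthritis type with probability 0.5.
prior = {"rock": 0.5, "paper": 0.5}

# From Player 2's point of view, Player 1's expected payoff for each
# move pair, averaged over the unknown type:
for m1 in MOVES:
    for m2 in MOVES:
        expected = sum(p * typed_payoff(m1, m2, t) for t, p in prior.items())
        print(f"P1 plays {m1:8s} vs {m2:8s}: {expected:+.3f}")
```

Note that Player 1 himself knows his own type; only Player 2 must average over the prior, which is exactly why his type is worth keeping secret.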
Clearly, Player 1 wants to keep secret from Player 2 exactly which form of arthritis he suffers from. Harsanyi (1967-68) invented the concept of "Bayesian equilibrium" to characterize and calculate solutions to these kinds of games, where the players are unsure of each other's payoffs and beliefs.
Cousin_it just told us how the basic concept of Nash equilibrium can sometimes be improved to a "correlated equilibrium" if the players can find a mutually trusted third party to assist them. Actually, it doesn't need to be a person - a mutually trusted machine will do just fine. So the question naturally arises: can a Bayesian equilibrium be improved to a correlated equilibrium, just as the simpler Nash equilibrium could? The answer is yes - but there is a catch, and the catch is called the "revelation principle".
The principle is that both players have to trust the third party - person or machine - with their secrets. It turns out that there is always a correlated equilibrium in which both players have an incentive to reveal their secrets honestly to the arbitrator (though not to each other), and both players have an incentive to carry out the arbitrator's instructions.
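The obedience condition the arbitrator's recommendations must satisfy can be checked mechanically. Here is a minimal sketch using the textbook game of Chicken (a standard illustration, not the game from this post): the mediator draws a joint outcome from a public distribution, tells each player only his own recommended move, and we verify that neither player gains by disobeying.

```python
import random

# Correlated equilibrium sketch for the standard game of Chicken (a textbook
# example, chosen for illustration). Payoffs are (row player, column player).
C, D = "swerve", "dare"
payoff = {
    (C, C): (6, 6), (C, D): (2, 7),
    (D, C): (7, 2), (D, D): (0, 0),
}

# The mediator's public distribution over joint recommendations.
dist = {(C, C): 1/3, (C, D): 1/3, (D, C): 1/3}

def obedience_gap(player, told, deviation):
    """Expected payoff from obeying minus from deviating, conditional on
    the mediator telling `player` (0 = row, 1 = column) to play `told`."""
    cond = {rec: p for rec, p in dist.items() if rec[player] == told}
    total = sum(cond.values())
    obey = dev = 0.0
    for rec, p in cond.items():
        other = rec[1 - player]
        obey += (p / total) * payoff[rec][player]
        dev_rec = (deviation, other) if player == 0 else (other, deviation)
        dev += (p / total) * payoff[dev_rec][player]
    return obey - dev

# No player gains by disobeying any recommendation:
for player in (0, 1):
    for told, deviation in [(C, D), (D, C)]:
        assert obedience_gap(player, told, deviation) >= 0

# The mediator only ever reveals each player's own recommendation:
rec = random.choices(list(dist), weights=list(dist.values()))[0]
print("row player is told:", rec[0], "| column player is told:", rec[1])
```

The key design point is the conditioning: a player who is told "swerve" learns only that the other player is equally likely to swerve or dare, and given that uncertainty, obeying beats deviating.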
The algorithms used by the arbitrator in a correlated equilibrium are not rocket science. They can't be. They have to be understandable and understood by both players. How else can they be convinced that they will do better by trusting the arbitrator than by just playing the game without arbitration?
So if the game players are two powerful AIs, they will want to use a very simple and secure computer system as the jointly trusted arbiter. I can't help but wonder whether the best way to avoid uFAIs that secretly seek to take over the world is to somehow make use of the revelation principle.
I love the way Ken Binmore explains the revelation principle in "Fun and Games" in the context of auctions and mechanism design. Unfortunately a full explanation would take too much space here, but it's one of the really beautiful results in game theory. Here's an example of what it's good for: suppose you want to sell a house and you have a Bayesian prior over the prices that different buyers might be willing to pay. You're faced with the problem of setting up an auction that would maximize the expected selling price. Now there are many types of...
Inspired by: Swords and Armor: A Game Theory Thought Experiment
Recently, nick012000 has posted Swords and Armor: A Game Theory Thought Experiment. I was disappointed to see many confused replies to this post, even after a complete solution was given by Steve_Rayhawk. I thought someone really ought to post an explanation about mixed strategy Nash equilibria. Then I figured that that someone may as well be me.
I will assume readers are familiar with the concepts of a game (a setting with several players, each having a choice of strategies to take and a payoff which depends on the strategies taken by all players) and of a Nash equilibrium (an "optimal" assignment of strategies such that, if everyone plays their assigned strategy, no player will have a reason to switch to a different strategy). Some games, like the famous prisoner's dilemma, have a Nash equilibrium in so-called "pure strategies" (as opposed to mixed strategies, to be introduced later). Consider, however, the following variant of the matching pennies game:
Player 1 is a general leading an attacking army, and player 2 is the general of the defending army. The attacker can attack from the east or west, and the defender can concentrate his defenses on the east or west. By the time each side learns the strategy of its enemy, it is too late to switch strategies. Attacking where the defenses aren't concentrated gives a great advantage; additionally, due to unspecified tactical circumstances, attacking from the east gives a slight advantage. The sides have no interest in cooperating, so this is a zero-sum game (what one side wins, the other loses).
This elaborate description can be summarized in the following payoff matrix (these payoffs are for the attacker; the defender's payoffs are their negatives):

              Defend east   Defend west
Attack east       -1             2
Attack west        1            -2
What strategy should each side play? The attacker can think, "Overall, going east is advantageous. So I'll go east." The defender, anticipating this, might say, "Surely they will go east, so that's where I'll wait for them." But after some deliberation, the attacker will realize this, and say "They will expect me from the east! I'll surprise them and go west." You can see where this is going - the defender will think "they'll try to be smart and go west; I'll be smarter and be ready for them", the attacker will think "I was right the first time, east is the way to go" and so on, ad infinitum.
Indeed, looking at the table does not reveal any obvious Nash equilibrium. Every assignment of strategies, represented as a square in the table, will leave one side preferring the alternative. So, are the sides deadlocked in an infinite recursion of trying to outsmart each other? No. They have the option of choosing a strategy randomly.
Suppose the attacker decides to toss a coin, and go east if it lands heads, and west for tails. Suppose also that the defender, with his mastery of psychology, manages to accurately predict the attacker's bizarre behavior. What can he do with this knowledge? He cannot predict the outcome of a coin toss, so he makes do with calculating the expected outcome for each of his (the defender's) strategies. And it can be easily seen that no matter what the defender does, the expectation is 0.
Similarly, suppose the defender consults his preferred (P)RNG so that he defends east with probability 2/3, west with probability 1/3, and suppose that the attacker anticipates this. Again, either of the attacker's strategies will give an expected gain of 0.
This randomized behavior - a combination of pure strategies, one of which is chosen at random with specified probabilities - is called a "mixed strategy". A mixed strategy is typically denoted as a vector listing the probability of choosing each pure strategy, so the defender's suggested strategy is (2/3, 1/3).
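These expectation calculations are easy to check directly. The sketch below assumes a payoff matrix consistent with the numbers in the story (attacker's payoffs, rows: attack east/west, columns: defend east/west); the particular values are one choice that fits, since any positive rescaling gives the same equilibrium.

```python
# Expected payoffs under the mixed strategies described above, assuming the
# attacker's payoff matrix below (rows: attack east/west, columns: defend
# east/west). These numbers are one choice consistent with the story.
A = [[-1, 2],   # attack east: walks into the defense / exploits the east bonus
     [ 1, -2]]  # attack west: catches the defender out / walks into the defense

def expected_payoff(attacker_mix, defender_mix):
    """Attacker's expected payoff (the defender's is the negative)."""
    return sum(attacker_mix[i] * defender_mix[j] * A[i][j]
               for i in range(2) for j in range(2))

# Defender plays (2/3, 1/3): both attacker pure strategies yield 0.
print(expected_payoff((1, 0), (2/3, 1/3)))  # attack east
print(expected_payoff((0, 1), (2/3, 1/3)))  # attack west

# Attacker plays (1/2, 1/2): both defender pure strategies yield 0.
print(expected_payoff((1/2, 1/2), (1, 0)))  # defend east
print(expected_payoff((1/2, 1/2), (0, 1)))  # defend west
```

All four expectations come out to 0, which is exactly the indifference described in the text: against the right mix, neither of your pure strategies does better than the other.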
What we have seen here is that by a clever choice of mixed strategy, each side can make sure it cannot be outsmarted! By constraining ourselves to rely on randomness for deciding on an action, we deny our opponent the ability to predict it and counter it. If you yourself don't know what you'll do, there's no way your opponent can know. We conclude that sometimes, acting randomly is the rational action to take.
As we've seen, if either player uses his suggested mixed strategy, then it doesn't matter what the other player does. The corollary is that if the attacker plays (1/2, 1/2) and the defender plays (2/3, 1/3), then neither player has a reason to switch to a different strategy - this is a Nash equilibrium!
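The equilibrium claim can be verified exhaustively. Since expected payoff is linear in each player's mix, it's enough to check that no deviation to a pure strategy helps either player; the payoff matrix here is again my assumed choice of numbers consistent with the story.

```python
# A quick check, under an assumed payoff matrix consistent with the story,
# that attacker (1/2, 1/2) and defender (2/3, 1/3) form a Nash equilibrium:
# no unilateral deviation to a pure strategy helps either player.
A = [[-1, 2], [1, -2]]  # attacker's payoffs; the defender's are the negatives

def value(p, q):
    """Attacker's expected payoff when the attacker mixes p, defender mixes q."""
    return sum(p[i] * q[j] * A[i][j] for i in range(2) for j in range(2))

p_star, q_star = (1/2, 1/2), (2/3, 1/3)
v = value(p_star, q_star)

# Attacker deviations (he wants to raise the value):
assert all(value(p, q_star) <= v + 1e-9 for p in [(1, 0), (0, 1)])
# Defender deviations (he wants to lower the value):
assert all(value(p_star, q) >= v - 1e-9 for q in [(1, 0), (0, 1)])
print("equilibrium verified, game value:", round(v, 6))
```

Checking only pure-strategy deviations suffices because any mixed deviation's payoff is a weighted average of the pure ones.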
Some more points of interest:

- "Just" randomizing won't do - you have to pick the right probabilities. The exact probabilities of a mixed strategy Nash equilibrium, and the resulting payoff, depend on the specifics of the payoff matrix. In my example, the defender needs a high probability of defending east to prevent the attacker from exercising his advantage, but the symmetry is such that the attacker chooses with even odds.
- In games with more strategies per player, an equilibrium mixed strategy may be supported on (have positive probability for) all, several, or just one of the pure strategies.
- If all players play a Nash equilibrium, then any player can switch to any one of the pure strategies supporting his mixed strategy without changing his expected payoff; switching to any other strategy either decreases the payoff or leaves it unchanged.
Perhaps most interesting of all is Nash's theorem, which says that every finite game has a mixed strategy Nash equilibrium! Our original problem - that some games have no equilibrium - is solved completely once we move to mixed strategies.
One thing should be kept in mind. A Nash equilibrium strategy, much like a minimax strategy, is "safe". It makes sure your expected payoff won't be too low no matter how clever your opponent is. But what if you don't want to be safe - what if you want to win? If you have good reason to believe you are smarter than your opponent, that he will play a non-equilibrium strategy you'll be able to predict, then go ahead and counter that strategy. Nash equilibria are for smart people facing smarter people.
In fact, it is possible that some of the comments I called confused earlier were fully aware of these issues and coming from this "post-Nash" view. To them I apologize.
Examples where mixed strategies are crucial are plentiful. I'll give one more - the volunteer's dilemma. A group of people are about to suffer greatly from some misfortune, unless at least one of them volunteers to take a slightly inconvenient action. They cannot communicate to coordinate the volunteering. If everyone uses the same deterministic strategy, then either all or none will volunteer, neither of which is stable. But they have a mixed strategy equilibrium which, by carefully balancing the risks of needlessly volunteering and having nobody volunteer, leaves everyone with an expected penalty equal to just the inconvenience of volunteering. Not as good as having just one person volunteer, but at least it's stable.
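The volunteer's dilemma equilibrium is easy to compute. The sketch below uses assumed numbers (n people, volunteering costs c, everyone loses L > c if nobody volunteers); the symmetric equilibrium probability comes from making each person exactly indifferent between volunteering and not.

```python
# A sketch of the volunteer's dilemma equilibrium. Assumed numbers: n people,
# volunteering costs c, and everyone loses L (with L > c) if nobody volunteers.
# In the symmetric mixed equilibrium each person volunteers with probability p
# chosen so that volunteering and abstaining have equal expected cost:
#     c = L * (1 - p) ** (n - 1)
n, c, L = 10, 1.0, 20.0

p = 1 - (c / L) ** (1 / (n - 1))

# Expected cost of each choice, given that the other n-1 players mix with p:
cost_volunteer = c
cost_abstain = L * (1 - p) ** (n - 1)  # pay L only if no one else volunteers

print(f"volunteer with probability p = {p:.4f}")
print(f"expected cost either way: {cost_volunteer:.4f} vs {cost_abstain:.4f}")
```

The indifference condition pins down the result mentioned above: in equilibrium, everyone's expected penalty equals c, the cost of volunteering - worse than coordinating on a single volunteer, but stable.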
For further reading, Wikipedia has many articles about game theory and Nash equilibria. Also, Perplexed made this related comment recently.