Disclaimer: I'm surely reinventing the wheel here.
Last week I wrote that I was dissatisfied with von Neumann and Morgenstern's argument for playing the minimax strategy in a two-player zero-sum game. Here's an argument I like:
We have two players A,B who play strategies a,b. A is trying to maximize V(a,b). Since it's a zero-sum game, B is trying to minimize V.
A's minimax strategy is $a^* := \arg\max_a \min_b V(a,b)$
B's minimax strategy is $b^* := \arg\min_b \max_a V(a,b)$
The minimax theorem gives us $v^* := \max_a \min_b V(a,b) = \min_b \max_a V(a,b)$
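As a quick concrete illustration (my own toy example, not one from the original argument), here's a small Python sketch with a made-up 2×2 payoff matrix chosen to have a saddle point, so the pure-strategy values $\max_a \min_b V$ and $\min_b \max_a V$ already coincide; in general the minimax theorem needs mixed strategies.

```python
# Toy sketch: a 2x2 zero-sum game with a saddle point, so the pure-strategy
# values max_a min_b V and min_b max_a V agree without mixed strategies.
V = [[3, 1],
     [4, 2]]  # V[a][b] = payoff to A when A plays row a and B plays column b

# A's guaranteed floor: for each row, assume B responds by minimizing,
# then A picks the best such row.
maximin = max(min(row) for row in V)

# B's guaranteed ceiling: for each column, assume A responds by maximizing,
# then B picks the best such column.
minimax = min(max(V[a][b] for a in range(len(V))) for b in range(len(V[0])))

print(maximin, minimax)  # both print 2, so v* = 2 for this game
```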
We assume that A believes that no matter what A does, B will be epistemically and instrumentally rational. Epistemic rationality means that B's factual beliefs are accurate — i.e., B's beliefs conditional on playing b are accurate if b is the strategy that B actually plays; but B's beliefs conditional on playing other strategies (his properly counterfactual beliefs) are unconstrained. Instrumental rationality means that if B believes an action is better than all other actions, then B will take that action.
I claim that given the above assumption, it's instrumentally rational for A to play minimax. Here's why:
B believes that if he plays minimax, then V(a,b∗)≤v∗ no matter what A plays. This follows from the definitions of b∗ and v∗.
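Spelled out, this is just a chain through the definitions:

$$V(a, b^*) \;\le\; \max_{a'} V(a', b^*) \;=\; \min_{b} \max_{a'} V(a', b) \;=\; v^*.$$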
If B is epistemically rational, he has accurate beliefs about the value of V(a,b) he actually obtains. If V(a,b)>v∗, then B is instrumentally irrational, because (by the previous step) he believes he would be better off playing b∗. So if B has both kinds of rationality, V(a,b)≤v∗.
Since A believes B is epistemically and instrumentally rational, A believes that V(a,b)≤v∗, no matter what strategy A plays.
Also, A believes that if she plays minimax, V(a∗,b)≥v∗ no matter what B plays. This follows from the definitions of a∗ and v∗.
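Again, unpacking the definitions:

$$V(a^*, b) \;\ge\; \min_{b'} V(a^*, b') \;=\; \max_{a} \min_{b'} V(a, b') \;=\; v^*.$$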
Combining these: A believes that any strategy yields at most v∗, and that minimax yields exactly v∗. Thus A believes that minimax is at least as good as any other strategy.
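Continuing the toy 2×2 game from the sketch above (again, my own made-up numbers), the two halves of the argument can be checked directly: playing $a^*$ guarantees A at least $v^*$, and against $b^*$ no strategy of A does better than $v^*$.

```python
# Same toy game as before; V[a][b] is A's payoff.
V = [[3, 1],
     [4, 2]]

v_star = 2             # the game's value, found in the earlier sketch
a_star, b_star = 1, 1  # the saddle point: second row, second column

# A's half: playing a* yields at least v*, whatever B plays.
assert all(V[a_star][b] >= v_star for b in range(2))

# B's half: if B plays b*, no strategy of A yields more than v*.
assert all(V[a][b_star] <= v_star for a in range(2))
```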
Note that A can still play something other than minimax, as long as she's sure that V(a,b)=v∗. For example, in rock-paper-scissors (where v∗=0), if A is sure that B plays the uniform mixed strategy, then every strategy of A yields exactly 0, so A loses nothing by deviating.
This argument does not justify playing minimax in non-zero-sum games or games with more than 2 players. In those games it's not generally true that one player is trying to minimize another player's reward.
If you squint, this is basically the argument that von Neumann and Morgenstern use, with all the steps and assumptions written out. And if you formalize this argument, I guess you'd get Kripke semantics for games.