There has been a lot of discussion on LW about finding better decision theories. Much of the motivation for the new decision theories proposed here seems to be an effort to get over the fact that classical CDT gives the wrong answer in one-shot Prisoner's Dilemmas, Newcomb-like problems and Parfit's Hitchhiker problem. While Gary Drescher has said that TDT is "more promising than any other decision theory I'm aware of", Eliezer gives a list of problems on which his theory currently gives the wrong answer (or, at least, it did a year ago). Adam Bell's recent sequence has talked about problems for CDT, and is no doubt about to move on to problems with EDT (in one of the comments, it was suggested that EDT is "wronger" than CDT).
In the Iterated Prisoner's Dilemma, it is relatively trivial to prove that no strategy is "optimal" in the sense that it gets the best possible payoff against all opponents. The reasoning goes roughly like this: any strategy which ever cooperates does worse than it could have against, say, Always Defect. Any strategy which doesn't start off by cooperating does worse than it could have against, say, Grim. So whatever strategy you choose, there is another strategy that would do better than you against some possible opponent, and hence no strategy is "optimal". Question: is it possible to prove similarly that there is no "optimal" decision theory? In other words: given a decision theory A, can you come up with some scenario in which it performs worse than at least one other decision theory? Than any other decision theory?
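As a toy check of that argument (an illustration, not a proof), here's a quick sketch. The payoffs (T=5, R=3, P=1, S=0) and the 10-round length are assumptions, and the textbook Tit-for-Tat stands in for "a strategy which ever cooperates":

```python
# Toy check of the "no optimal IPD strategy" argument.
# Payoffs (T=5, R=3, P=1, S=0) and the 10-round length are assumptions.
ROUNDS = 10
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def always_defect(my_hist, their_hist):
    return "D"

def grim(my_hist, their_hist):
    # Cooperate until the opponent defects once, then defect forever.
    return "D" if "D" in their_hist else "C"

def tit_for_tat(my_hist, their_hist):
    # Cooperate first, then copy the opponent's last move.
    return their_hist[-1] if their_hist else "C"

def play(strat_a, strat_b, rounds=ROUNDS):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_a, hist_b), strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        hist_a.append(a); hist_b.append(b)
        score_a += pa; score_b += pb
    return score_a, score_b

# A strategy that ever cooperates scores less against Always Defect
# than a pure defector would have:
print(play(tit_for_tat, always_defect)[0])    # 9
print(play(always_defect, always_defect)[0])  # 10
# ...and a strategy that doesn't open with cooperation scores less
# against Grim than a cooperator would have:
print(play(always_defect, grim)[0])           # 14
print(play(tit_for_tat, grim)[0])             # 30
```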
One initial try would be: Omega gives you two envelopes - the left envelope contains $1 billion iff you don't implement decision theory A in deciding which envelope to choose. The right envelope contains $1000 regardless.
Or, if you don't like Omega being able to make decisions about you based entirely on your source code (or "ritual of cognition"), then how about this: in order for two decision theories to sensibly be described as "different", there must be some scenario in which they perform a different action (let's call this Scenario 1). In Scenario 1, DT A makes decision A whereas DT B makes decision B. In Scenario 2, Omega offers you the following setup: here are two envelopes, and you can pick exactly one of them. I've just simulated you in Scenario 1. If you chose decision B, there's $1,000,000 in the left envelope. Otherwise it's empty. There's $1000 in the right envelope regardless.
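Here's a minimal sketch of that construction. The two agents below are toy stand-ins rather than real decision theories, and Omega's "simulation" is just a direct function call:

```python
# Sketch of the Scenario 2 setup.  dt_a and dt_b are toy stand-ins for two
# decision theories that differ in Scenario 1; everything here is illustrative.

def dt_a(scenario):
    return "A" if scenario == 1 else "left"   # stand-in for decision theory A

def dt_b(scenario):
    return "B" if scenario == 1 else "left"   # stand-in for decision theory B

def omega_fills_envelopes(agent):
    # Omega's rule depends only on the agent's (simulated) Scenario 1 choice,
    # not on anything the agent does in Scenario 2.
    simulated_choice = agent(1)
    return {"left": 1_000_000 if simulated_choice == "B" else 0, "right": 1_000}

for name, agent in [("DT A", dt_a), ("DT B", dt_b)]:
    envelopes = omega_fills_envelopes(agent)
    picked = agent(2)                           # the agent's actual Scenario 2 choice
    print(name, "receives", envelopes[picked])  # DT A: 0 (at best 1000); DT B: 1000000
```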
I'm not sure if there's some flaw in this reasoning (are there decision theories for which Omega offering such a deal is a logical impossibility? It seems unlikely: I don't see how your choice of algorithm could affect Omega's ability to talk about it). But I imagine that some version of this should work, in which case it doesn't make sense to talk about one decision theory being "better" than another; we can only talk about decision theories being better than others for certain classes of problems.
I have no doubt that TDT is an improvement on CDT, but in order for this to even make sense, we'd have to have some way of thinking about what sort of problem we want our decision theory to solve. Presumably the answer is "the sort of problems which you're actually likely to face in the real world". Do we have a good formalism for what this means? I'm not suggesting that the people who discuss these questions haven't considered this issue, but I don't think I've ever seen it explicitly addressed. What exactly do we mean by a "better" decision theory?
Which decision theory should we use? CDT? UDT? TDT? What exactly do we mean by a "better" decision theory?
To get some practice in answering this kind of question, let's look first at a simpler set of questions: which play should I make in the game of PSS? Paper? Stone? Scissors? What exactly do we mean by a better play in this game?
Bear with me on this. I think that a careful look at the process that game theorists went through in dealing with game-level questions may be very helpful in our current confusion about decision-theory-level questions.
The first obvious thing to notice about the PSS problem is that there is no universal "best" play in the game. Sometimes one play ("stone", say) works best; sometimes another play works better. It depends on what the other player does. So we make our first conceptual breakthrough. We realize we have been working on the wrong problem. It is not "which play produces the best results?". It is rather "which play produces the best expected results?" that we want to ask.
Well, we are still a bit puzzled by that new word "expected", so we hire consultants. One consultant, a Bayesian/MAXENT theorist, tells us that the appropriate expectation is that the other player will play each of "paper", "stone", and "scissors" equally often, and hence that all plays on our part are equally good. The second consultant, a scientist, actually goes out and observes the other player. He comes back with the report that out of 100 PSS games, the other player played "paper" 35 times, "stone" 32 times, and "scissors" 33 times. So the scientist recommends "scissors" as the best play. Our MAXENT consultant has no objection. "That choice is no worse than any other," says he.
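To make the scientist's recommendation concrete, here is the expected-payoff calculation against those observed frequencies, assuming the usual +1/0/-1 win/draw/loss payoffs:

```python
# Expected payoff of each play against the observed frequencies 35/32/33.
opponent_freq = {"paper": 0.35, "stone": 0.32, "scissors": 0.33}

# payoff[my_play][their_play]: +1 win, 0 draw, -1 loss (assumed stakes).
payoff = {
    "paper":    {"paper": 0, "stone": 1, "scissors": -1},
    "stone":    {"paper": -1, "stone": 0, "scissors": 1},
    "scissors": {"paper": 1, "stone": -1, "scissors": 0},
}

for my_play in payoff:
    expected = sum(p * payoff[my_play][their_play]
                   for their_play, p in opponent_freq.items())
    print(my_play, round(expected, 2))
# paper: -0.01, stone: -0.02, scissors: 0.03  -> "scissors" comes out on top
```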
So we adopt the strategy of always playing scissors, which works fine at first, but soon starts returning abysmal results. The MAXENT fellow is puzzled. "Do you think maybe the other guy found out about our strategy?" he asks. "Maybe he hired our scientist away from us. But how can we possibly keep our strategy secret if we use it more than once?" And this leads to our second conceptual breakthrough.
We realize that it is both impossible and unnecessary to keep our strategy secret (just as a cryptographer knows that it is difficult and unnecessary to keep the encryption algorithm secret). But it is both possible and essential to keep the plays secret until they are actually made (just as a cryptographer keeps keys secret). Hence, we must have mixed strategies, where the strategy is a probability distribution and a play is a one-point sample from that distribution.
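In code, a mixed strategy and a play look something like this (the uniform weights here are just a placeholder; what the right weights are is the subject of the next few paragraphs):

```python
import random

# A mixed strategy is a probability distribution over plays.  The distribution
# can be public; only the sampled play needs to stay secret until it is made.
strategy = {"paper": 1/3, "stone": 1/3, "scissors": 1/3}

def sample_play(strategy):
    plays, weights = zip(*strategy.items())
    return random.choices(plays, weights=weights, k=1)[0]

print(sample_play(strategy))  # unpredictable even to someone who knows `strategy`
```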
Take a step back and think about this. Non-determinism of agents is an inevitable consequence of having multiple agents whose interests are not aligned (or more precisely, agents whose interests cannot be brought into alignment by a system of side payments). Lesson 1: Any decision theory intended to work in multi-agent situations must handle (i.e. model) non-determinism in other agents. Lesson 2: In many games, the best strategy is a mixed strategy.
Think some more. Agents whose interests are not aligned often should keep secrets from each other. Lesson 3: Decision theories must deal with secrecy. Lesson 4: Agents may lie to preserve secrets.
But how does game theory find the best mixed strategy? Here is where it gets weird. It turns out that, in some sense, it is not about "winning" at all. It is about equilibrium. Remember back when we were at the PSS stage where we thought that "Always play scissors" was a good strategy? What was wrong with this, of course, was that it induced the other player to switch his strategy toward "Always play stone" (assuming, of course, that he has a scientist on his consulting staff). And that shift on his part induces (assuming we have a scientist too) us to switch toward paper.
So, how is this motion brought to a halt? Well, there is one particular strategy you can choose that at least removes the motivation for the motion. There is one particular mixed strategy which makes your opponent not really care what he plays. And there is one particular mixed strategy that your opponent can play which makes you not really care what you play. So, if you both make each other indifferent, then neither of you has any particular incentive to stop making each other indifferent, so you both just stick to the strategy you are currently playing.
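In PSS that equalizing strategy is the uniform one. A quick check, with the same assumed +1/0/-1 payoffs as before: against a (1/3, 1/3, 1/3) opponent, every play has the same expected payoff, so there is nothing left to exploit.

```python
# Against the uniform mixed strategy every pure play has expected payoff 0,
# so neither player gains by moving: this is the Nash equilibrium of PSS.
uniform = {"paper": 1/3, "stone": 1/3, "scissors": 1/3}

payoff = {
    "paper":    {"paper": 0, "stone": 1, "scissors": -1},
    "stone":    {"paper": -1, "stone": 0, "scissors": 1},
    "scissors": {"paper": 1, "stone": -1, "scissors": 0},
}

for my_play in payoff:
    expected = sum(p * payoff[my_play][their_play]
                   for their_play, p in uniform.items())
    print(my_play, round(expected, 10))   # 0.0 for every play
```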
This is called Nash equilibrium. It also works on non-zero sum games where the two players' interests are not completely misaligned. The decision theory at the heart of Game Theory - the source of eight economics Nobel prizes so far - is not trying to "win". Instead, it is trying to stop the other player from squirming so much as he tries to win. Swear to God. That is the way it works.
Alright, in the last paragraph, I was leaning over backward to make it look weird. But the thing is, even though you no longer look like you are trying to win, you still actually do as well as possible, assuming both players are rational. Game theory works. It is the right decision theory for the kinds of decisions that fit into its model.
So, was this long parable useful in our current search for "the best decision theory"? I guess the answer to that must depend on exactly what you want a decision theory to accomplish. My intuition is that Lessons #1 through #4 above cannot be completely irrelevant. But I also think that there is a Lesson 5 that arises from the Nash equilibrium finale of this story. The lesson is: in any optimization problem with multi-party optimization dynamics, you have to look for the fixpoints.
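As a toy illustration of that lesson, here is fictitious play on PSS: each round, both players best-respond to the other's empirical frequencies so far, and those frequencies drift toward the uniform fixpoint. (The payoffs, the tie-breaking rule and the round count are again just illustrative assumptions.)

```python
import random
from collections import Counter

# Fictitious play on PSS: each round, best-respond to the opponent's
# empirical play frequencies so far.  In a zero-sum game the empirical
# frequencies converge toward the Nash equilibrium (1/3, 1/3, 1/3).
PLAYS = ["paper", "stone", "scissors"]
payoff = {
    "paper":    {"paper": 0, "stone": 1, "scissors": -1},
    "stone":    {"paper": -1, "stone": 0, "scissors": 1},
    "scissors": {"paper": 1, "stone": -1, "scissors": 0},
}

def best_response(opponent_counts):
    def value(play):
        return sum(payoff[play][theirs] * n for theirs, n in opponent_counts.items())
    best = max(value(p) for p in PLAYS)
    return random.choice([p for p in PLAYS if value(p) == best])  # break ties randomly

# Start each history with one of each play so the counts are never empty.
history_a = Counter({p: 1 for p in PLAYS})
history_b = Counter({p: 1 for p in PLAYS})
for _ in range(10000):
    a, b = best_response(history_b), best_response(history_a)
    history_a[a] += 1
    history_b[b] += 1

total = sum(history_a.values())
print({p: round(history_a[p] / total, 3) for p in PLAYS})  # each roughly 1/3
```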
There's probably no single-player decision theory that, if all players adopted it, would lead to Nash equilibrium play in all games. The reason is that many games have multiple Nash equilibria, and equilibrium selection (aka bargaining) is often "indeterminate": it requires you to go outside the game and look at the real-world situation that generated it.
Here on LW we know how to implement "optimal" agents, who cooperate with each other and share the wins "fairly" while punishing defectors, in only two cases: symmetric games ...