Vitalik Buterin has a new post about an interesting theoretical attack against Bitcoin. The idea relies on the assumption that the attacker can credibly commit to something quite crazy. The crazy thing is this: paying out 25.01 BTC to all the people who help him in his attack to steal 25 BTC from everyone, but only if the attack fails. This leads to a weird payoff matrix where the dominant strategy is to help him in the attack. The attack succeeds, and no payoff is made.
Of course, smart contracts make such crazy commitments perfectly possible, so this is a bit less theoretical than it sounds. But even as an abstract thought experiment about decision theories, it looks pretty interesting.
By the way, Vitalik Buterin is really on a roll. Just a week ago he had a thought-provoking blog post about how Decentralized Autonomous Organizations could possibly utilize a concept often discussed here: decision theory in a setup where agents can inspect each other's source code. It was shared on LW Discussion, but earned less exposure than I think it deserved.
EDIT 1: One smart commenter of the original post spotted that an isomorphic, extremely cool game was already proposed by billionaire Warren Buffett. Does this thing already have a name in game theory maybe?
EDIT 2: I wrote the game up in detail for some old-school game theorist friends:
The attacker orchestrates a game with 99 players. The attacker himself does not participate in the game.
Rules:
Each player can either defect or cooperate, in the usual game-theoretic setup: the players announce their decisions simultaneously, without side channels. We call the decision made by the majority of the players the "aggregate outcome". If the aggregate outcome is defection, we say that the attack succeeds. A player's payoff consists of two components:
1. If her decision coincides with the aggregate outcome, the player gets 10 utilons.
and simultaneously:
2. If the attack succeeds, the attacker gets 1 utilon from each of the 99 players, regardless of their own decision.
There are two pure-strategy equilibria (everyone cooperates, or everyone defects), but the second payoff component breaks the symmetry, and everyone will cooperate.
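As a sanity check on the two equilibria, here is a minimal Python sketch of the base game, before any commitment is made. The function name `base_payoff` and the constant names are my own labels, not anything from Vitalik's post; the numbers are the ones from the rules above.

```python
N = 99             # number of players
MATCH_REWARD = 10  # utilons for matching the aggregate outcome
ATTACK_LOSS = 1    # utilons each player loses if the attack succeeds


def base_payoff(defects: bool, total_defectors: int) -> int:
    """Payoff of one player, given her move and the total number of
    defectors (herself included, if she defects)."""
    attack_succeeds = total_defectors > N // 2  # a majority defects
    payoff = MATCH_REWARD if defects == attack_succeeds else 0
    if attack_succeeds:
        payoff -= ATTACK_LOSS
    return payoff


# Everyone cooperates: compare staying put with a lone deviation to defection.
all_c_stay, all_c_deviate = base_payoff(False, 0), base_payoff(True, 1)
# Everyone defects: compare staying put with a lone deviation to cooperation.
all_d_stay, all_d_deviate = base_payoff(True, N), base_payoff(False, N - 1)

print(all_c_stay, all_c_deviate)  # 10 0
print(all_d_stay, all_d_deviate)  # 9 -1
```

Neither profile rewards a unilateral deviation, but all-cooperate pays 10 per player against 9 for all-defect, which is the symmetry breaking mentioned above.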
Now the attacker spices things up by making a credible commitment before the game. ("Credible" simply means that the attacker somehow makes sure that the promise cannot be broken. The classic way to achieve such things is an escrow, but so-called smart contracts are emerging as a method for making fully unbreakable commitments.)
The attacker's commitment is quite counterintuitive: he promises that he will pay 11 utilons to each of the defecting players, but only if the attack fails.
Now the payoff looks like this:
Defection became a dominant strategy. The clever thing, of course, is that if everyone defects, then the attacker reaches his goal without paying out anything.
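To make this concrete, here is a small Python sketch of the aggregate 2x2 view, in which a player only distinguishes "attack fails" from "attack succeeds" and ignores the possibility of being the swing vote. The names `payoff_given_outcome` and `attacker_net` are hypothetical labels of mine; the numbers come from the rules and the commitment above.

```python
N = 99
MATCH_REWARD = 10
ATTACK_LOSS = 1
BRIBE = 11  # promised to each defector, but only if the attack fails


def payoff_given_outcome(defects: bool, attack_succeeds: bool) -> int:
    """One player's payoff, treating the aggregate outcome as given."""
    payoff = MATCH_REWARD if defects == attack_succeeds else 0
    if attack_succeeds:
        payoff -= ATTACK_LOSS
    elif defects:
        payoff += BRIBE
    return payoff


def attacker_net(total_defectors: int) -> int:
    """Attacker's net result: collects on success, pays bribes on failure."""
    if total_defectors > N // 2:
        return N * ATTACK_LOSS
    return -BRIBE * total_defectors


table = {
    (move, outcome): payoff_given_outcome(move == "defect", outcome == "succeeds")
    for move in ("cooperate", "defect")
    for outcome in ("fails", "succeeds")
}
for move in ("cooperate", "defect"):
    print(f"{move:>9}: fails={table[move, 'fails']:>3}  succeeds={table[move, 'succeeds']:>3}")

print(attacker_net(N))  # everyone defects: the attacker nets 99 and pays nothing
```

In this view defection beats cooperation in both columns (11 against 10 if the attack fails, 9 against -1 if it succeeds), and with all 99 players defecting no bribe ever comes due.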
Nice. If we analyze the game using Vitalik's 2x2 payoff matrix, defection is a dominant strategy. But now I see that's not how game theorists would use this phrase. They would work with the full 99-dimensional matrix, and there defection is not a dominant strategy, because as you say, it's a bad strategy if we know that 49 other people are cooperating, and 49 other people are defecting.
There's a sleight of hand going on in Vitalik's analysis, and it is located at the phrase "regardless of one’s epistemic beliefs [one is better off defecting]". If my epistemic belief is that 49 other people are cooperating, and 49 other people are defecting, then it's not true that defection is my best strategy. Of course, Vitalik's 2x2 matrix just does not allow me to have such refined epistemic beliefs: I have to get by with "attack succeeds" versus "attack fails".
Which kind of makes sense: I probably won't find myself in a situation where I know for sure that 49 other people are cooperating and 49 other people are defecting. So the correct game-theoretic definition of a dominant strategy is probably less relevant here than something like Vitalik's "aggregate" version. Still, there are assumptions here that are not clear from the original analysis.
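The pivotal case can be checked mechanically. Below is a minimal Python sketch, using my own hypothetical helper `payoff`, that compares cooperation and defection for every possible number of defectors among the other 98 players, under the rules and the 11-utilon commitment from the write-up.

```python
N = 99
MATCH_REWARD = 10
ATTACK_LOSS = 1
BRIBE = 11  # paid to each defector, only if the attack fails


def payoff(defects: bool, other_defectors: int) -> int:
    """One player's payoff, given her move and how many of the
    other 98 players defect."""
    total_defectors = other_defectors + (1 if defects else 0)
    attack_succeeds = total_defectors > N // 2
    result = MATCH_REWARD if defects == attack_succeeds else 0
    if attack_succeeds:
        result -= ATTACK_LOSS
    elif defects:
        result += BRIBE
    return result


# Counts of other defectors (0..98) for which cooperating is strictly better.
cooperation_wins = [
    d for d in range(N)
    if payoff(False, d) > payoff(True, d)
]
print(cooperation_wins)                      # [49]
print(payoff(False, 49), payoff(True, 49))   # 10 9
```

So defection is the better reply in 98 of the 99 cases and loses, by a single utilon, only when the player is the swing vote. That one case is enough to keep it from being dominant in the strict sense.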
So, I did not forget about that particular case. In my particular brand of cryptoeconomic analysis, I try to decompose cooperation incentives into three types:
I often group (2) and (3) into one category, "altruism-prime", but here we can separate them.
The important point is that category 1 incentives are always present as long as the protocol specifies them, category 2 incentives are always present, but th...