EA Hangout Prisoners' Dilemma

Jeffrey Ladish

27th Sep 2021

I ran the following experiment on Petrov’s day on the EA Hangouts Facebook page:

Hey everyone, it's Petrov day and I wanted to run a little scenario.
I'm going to donate $100 each to AMF and MIRI. Unless someone comments with "nuke MIRI" or "nuke AMF". In that case, I'll give people 20 minutes to respond by nuking the other org, and close the option to respond after that.
If neither orgs get nuked by midnight tonight, I'll give $100 to both orgs. If both orgs get nuked, they'll both get nothing. If only one org gets nuked, the nuked org will get nothing and the other org will get $300.
I'd prefer if no one nuked anyone, but I promise to honor the outcome of the experiment and donate accordingly. These are real donations that I wouldn't have made otherwise.

At 23:23, MIRI was nuked with no retaliation to AMF, and I donated $300 to AMF. Before this happened, there were a lot of interesting game dynamics and I learned more than I expected to!

I created this experiment because I always felt like Petrov day exercises didn’t involve much real incentive to launch an attack. The Cold War was not like this. There were real fears in both the USSR and the US that the other power represented an existential threat. After talking it through with some friends, I decided to go with a Prisoner’s Dilemma type game and see what happened.

Less than half an hour after I posted the original message, someone asked why anyone would ever counternuke. I replied that they might counternuke because they had precommitted to do so, in order to deter someone from nuking their preferred organization. Immediately after this, they commented “nuke AMF”. Another commenter asked if it was possible to cancel an attack. I hadn’t thought of this possibility, so I decided that yes, they could, provided they deleted or edited their comment within 20 minutes of posting it.

This led to an interesting dynamic. A few minutes later someone else commented “nuke MIRI” with a followup message that they would delete this comment if the original attacker deleted their “nuke AMF” comment. Both posts were then deleted by their authors.

This happened two more times, where someone made an initial launch post and someone else launched a counter attack, with a promise to abort the attack if the original attack was canceled. One of the attacks got very close to detonation, with ~10 seconds to spare.

Someone else posted a $100 bounty to AMF if someone nuked MIRI without retaliation, and someone added a $50 consolatory precommitted donation to MIRI for the same scenario. Another person launched an attack against both orgs to try to extort more donations from group members, and aborted these launches with no successful extortions.

My favorite suggestion was one by Oliver Habryka and Avi Norowitz to pre-commit to nuking one organization at random, in a publicly verifiable way such as using the NIST public randomness beacon. This way, the EV would be $150 to both organizations, better than the $100 to both achieved by default, with the added benefit of no defection necessary. I was excited to see this suggested and endorsed by a number of people, but no one ended up trying it.
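The arithmetic behind the $150 figure can be checked with a short sketch. The payoff rules come from the original post; the function names are made up for illustration, and `target_from_beacon` is only one hypothetical way to turn a published random value into a target, assuming the beacon publishes its output as a hexadecimal string and that the parity rule is announced before that output exists.

```python
from fractions import Fraction
from typing import Tuple


def donations(amf_nuked: bool, miri_nuked: bool) -> Tuple[int, int]:
    """Dollars donated as (AMF, MIRI), following the rules in the post."""
    if amf_nuked and miri_nuked:
        return (0, 0)
    if amf_nuked:
        return (0, 300)
    if miri_nuked:
        return (300, 0)
    return (100, 100)


def expected_donations_random_nuke() -> Tuple[Fraction, Fraction]:
    """Expected donations when exactly one org is nuked, chosen by a fair coin."""
    half = Fraction(1, 2)
    outcomes = [donations(True, False), donations(False, True)]
    amf = sum(half * outcome[0] for outcome in outcomes)
    miri = sum(half * outcome[1] for outcome in outcomes)
    return amf, miri


def target_from_beacon(output_hex: str) -> str:
    """Hypothetical rule: an even value nukes AMF, an odd value nukes MIRI."""
    return "AMF" if int(output_hex, 16) % 2 == 0 else "MIRI"


if __name__ == "__main__":
    print(donations(False, False))           # default outcome with no nukes
    print(expected_donations_random_nuke())  # expected value of the coin flip
```

Committing to the parity rule before the random value is published is what makes the choice verifiable: anyone can recompute the target from the public output.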

The final post by Alexandre Zani read:

I'm going to sleep. If tomorrow morning nobody has counter-nuked without cancelling, I will make a $50 donation (with an extra $50 matched by my employer for a total of $100) to MIRI.
Nuke MIRI.
Good night everyone!

No one responded, and I’m not sure if anyone saw it in time. I thought this final attack was clever for waiting until the clock had almost run out, implying a precommitment in the form of bedtime, and providing a disincentive for retaliation. That being said, I’m disappointed that a cooperative solution was not reached.

Specifying that people were allowed to recall their attacks definitely changed the game and made the negotiations more interesting. Part of me was tempted to make the launch aborts probabilistic. In fact, someone simulated this by launching an attack and pre-committing to abort their launch with an 11/12 chance.

It seems worth noting that while allowing recalls made the game more interesting to me, it did not seem particularly historically accurate. In the Cold War, ICBM attacks could not be recalled. This led to a narrower margin for error, since decision makers like those to whom Petrov reported had to decide whether to “use them or lose them” in the event of a probable attack, and data obtained after the missiles were launched could not be used to avert a nuclear war. Modern long-range missiles could certainly be programmed so that command and control systems can destroy them before they reach their targets, but this capability would likely not be advertised if implemented, so it’s not clear whether it’s been added.

Perhaps unsurprisingly in a group with 1.6k members, people were quite willing to take action and launch nukes. I counted 10 total launches with 9 of these aborted. There were 114 comments in total. A final takeaway for me was that while many people argued for cooperation, I saw more action taken towards clever ways to win money for one org or the other. Bounties were posted for outcomes where one side got nuked, but no bounties were posted for cases where neither side defected. Unilateral action, both offensive (launching first strikes) and defensive (launching second strikes unless the first were recalled), was easier to take than building a consensus such as the one Oli and Avi proposed. If cooperation in these scenarios is desirable, then I think that active cooperation strategies are needed when aggressive action is rewarded by default.

Thank you Miranda Dixon-Luinenburg / lesswrong for editing help!

18 comments

jimrandomh (3y)

This is not a prisoner's dilemma. The true structure of this game, with the Petrov Day-themed labels stripped off, is: race to claim the prize, then the second mover has the option to destroy the prize. While the takeback options were used, everyone who made a move in this game would have been better off if they had precommitted not to.

Jeffrey Ladish (3y)

I think there are a lot of dynamics present here that aren't present in the classic prisoner's dilemma, along with some that are (and some that are present in various iterated prisoner's dilemmas). The prize might be different for different actors, since actors place different value on "cooperative" outcomes. If you can trust people's precommitments, I think there is a race to commit OR precommit to an action.

E.g. if I wanted the game to settle with no nukes launched, then I could pre-commit to launching a retaliatory strike to either side if an attack was launched.

Idan Arye (3y)

If we are looking for a known game structure with a formal name, I'd say it's Battle of the Sexes: a defect-cooperate is preferable to both defect-defect and cooperate-cooperate, but each side would rather be the defector in that outcome.

philh (3y)

Even assuming simultaneous moves, it's not Battle of the Sexes. In my own classification it would be at the intersection between Too Many Cooks and Farmer's Dilemma: with $X + Y > 2W$. ($W = 100, X = 0, Y = 300, Z = 0$.)

I didn't mention BoS in that post because it doesn't fit the schema. But (suddenly obviously) it fits if you just relabel one player's moves. Then it has $X > Y > W \geq Z$ or $Y > X > W = Z$, making it Anti-Coordination in either case.

Idan Arye (3y)

Why ? The participants may have a preference for one nonprofit over the other, but surely - all else being equal - they should prefer their less favorite nonprofit to get money over it getting nothing.

I'd go even farther - this is charity, so instead of a social outcome which is the sum of the players' utilities, the individual utilities here are applications of the players' value functions to the social outcome. Even if you prefer one nonprofit over the other - do you prefer it enough to relinquish these extra $100? Do you think your favorite charity can do more with $100 than your second favorite can do with $200?

I don't think so. We have $X > W$ here - and overall $Y > X > W > Z$.

For most games it's clear what counts as cooperation and what counts as defecting. For BoS - not so much. Your classification relies on that labeling (otherwise you could switch W with Z and X with Y), and since we can't use those labels here I'll just fix $W > Z$ - that is, cooperation is always the strategy that, if chosen by both players, is better than the other strategy if chosen by both.

So - in BoS cooperation is doing what you were already wanting to do, and you hope for your spouse to defect. The order is $X > Y > W > Z$, which is not exactly our case but closer than any other game I can think of.

jimrandomh (3y)

No, it's not the same game at all because Battle of the Sexes has simultaneous moves, while this game has sequential moves (you can check the comments for other players' moves before making your own).

Idan Arye (3y)

The ability to cancel launches makes it effectively simultaneous, because it means you can't commit (at least not under the explicit rules).

jimrandomh (3y)

The first mover can commit by just waiting 20 minutes, after which there's a time interval equal to the period between when the first and second movers took their turns, during which the second mover can undo but the first mover can't.

Idan Arye (3y)

Oh. Good point. Maybe it would be interesting to do a version where you can't retract a counter nuke after the original nuke's 20 minutes.

Either way, I think the 20 minute rule is important for even talking about precommitting. Without it, people can chat and make contracts all they want, only for someone completely uninvolved in the conversation to suddenly post a nuke comment.

Jeffrey Ladish (3y)

Another way to run this would be to have a period of time before launches are possible for people to negotiate, and then to not allow retracting nukes after that point. And I think next time I would make it so that the total with no nukes would be greater than the total if only one side nuked, though I did like that this time people had the option of a creative solution that "nuked" a side but led to higher EV for both parties than not nuking.

Idan Arye (3y)

You also need to permit only people who took part in the negotiations to launch nukes. Otherwise newcomers could just nuke without anyone having a chance to establish a precommitment to retaliate against them.

Insub (3y)

I’m disappointed that a cooperative solution was not reached

I think you would have had to make the total cooperation payoff greater than the total one-side-defects payoff in order to get cooperation as the final result. From a "maximize money to charity" standpoint, defection seems like the best outcome here (I also really like the "pre-commit to flip a coin and nuke" solution). You'd have to believe that the expected utility/$ of the "enemy" charity is less than 1/2 of the expected utility/$ of yours; otherwise, you'd be happier with the enemy side defecting than with cooperation. I personally wouldn't be that confident about the difference between AMF and MIRI.
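The 1/2 threshold can be spelled out in one line. This is a sketch that assumes a participant's utility is linear in dollars donated; the symbols $u_{\text{mine}}$ and $u_{\text{other}}$ (utility per dollar of the favored charity and of the "enemy" charity) are introduced here for illustration and do not appear in the original comment. Cooperation is preferred to seeing one's own charity nuked only when

$$
\underbrace{100\,u_{\text{mine}} + 100\,u_{\text{other}}}_{\text{nobody nukes}} \;>\; \underbrace{300\,u_{\text{other}}}_{\text{my charity is nuked}}
\quad\Longleftrightarrow\quad
u_{\text{other}} < \tfrac{1}{2}\,u_{\text{mine}}.
$$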

brendan.furneaux (3y)

This is exactly right! It's a poor analogy for the Cold War both because the total payoff for defection was higher than the total payoff for cooperation, and because the reward was fungible. The cooperative solution is for one side to "nuke", in order to maximize the total donation to both organizations, and then to use additional donations to even out the imbalance if necessary. That's exactly what happened, and I'm glad the "nuking" framing didn't prevent EAs from seeing what was really happening and going for the optimal solution.

Jeffrey Ladish (3y)

I think the fungibility is a good point, but it seems like the randomizer solution is strictly better than this. Otherwise one side clearly gets less value, even if they are better off than they would have been had the game not happened. It's still a mixed motive conflict!

jmh (3y)

And I'm not entirely sure you should call it a defect. Perhaps more a cooperation outcome with a potential side payment. With the single defect and a $100 side payment by the remaining group to the nuked group you've accomplished a Pareto move to a superior outcome. Both organizations are at least as well off as if none were nuked. And if the nuked group just thinks the other is doing just as good work without the side payment they might think it's a wash who actually gets the additional $100.

What I would be really interested in is just how this outcome was actually attained. Seems like everyone was pretty smart (and altruistic) to realize retaliation was not the right response. In short, was this, in a sense, a case of Petrovian restraint in responding to the reported nuke attack?
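The side-payment argument in this comment can be checked with a small sketch. It assumes the $100 side payment is modeled as a transfer out of the surviving org's $300 to the nuked org, which is one reading of the comment; the type alias and function names are hypothetical.

```python
from typing import Tuple

# (dollars to the nuked org, dollars to the surviving org)
Outcome = Tuple[int, int]


def with_side_payment(outcome: Outcome, payment: int) -> Outcome:
    """Move `payment` dollars from the surviving org's side to the nuked org."""
    nuked, surviving = outcome
    return (nuked + payment, surviving - payment)


def pareto_dominates(a: Outcome, b: Outcome) -> bool:
    """True if no org is worse off under `a` than `b` and at least one is better off."""
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


no_nukes: Outcome = (100, 100)
one_nuke: Outcome = (0, 300)
compensated = with_side_payment(one_nuke, 100)

if __name__ == "__main__":
    print(compensated)
    print(pareto_dominates(compensated, no_nukes))
    print(pareto_dominates(one_nuke, no_nukes))
```

Under this reading, the single nuke alone is not a Pareto improvement over no nukes, because the nuked org ends up worse off; it only becomes one once the side payment is added.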

Jeffrey Ladish (3y)

I'm not sure that anyone exercised restraint in not responding to the last attack, as I don't have any evidence that anyone saw the last response. It's quite possible people did see it and didn't respond, but I have no way to know that.

Jeffrey Ladish (3y)

Oh, I should have specified that I would consider the coin flip to be a cooperative solution! Seems obviously better to me than any other solution.

philh (3y)

Alyssa Vance ran the same game on her own page, inspired by this one. Someone nuked MIRI because AMF had been nuked in landfish's game. The nuke on AMF got cancelled after a counternuke was launched. The nuke on MIRI stayed and wasn't countered. (I can't see the original game, I don't know exactly what order things happened in.)