I find the concept of Petrov day valuable, and the principle of the experiment relevant, but doesn't the difference of stakes undermine the experiment? The consequence of entering the codes here was meaningful and noticeable, but it was nothing irreversible, lasting, or consequential.
When I walk in the streets every day, dozens of car drivers who have a clear shot at running me over abstain from hurting, maiming or even killing me, and not a single one does it. That's what I call consequential. I'll celebrate that.
Sorry, I got tricked:
petrov_day_admin_account September 26, 2020 11:26 AM Hello Chris_Leong,
You are part of a smaller group of 30 users who has been selected for the second part of this experiment. In order for the website not to go down, at least 5 of these selected users must enter their codes within 30 minutes of receiving this message, and at least 20 of these users must enter their codes within 6 hours of receiving the message. To keep the site up, please enter your codes as soon as possible. You will be asked to complete a short survey afterwards.
I think the lesson is that if you decide to launch the nukes it's better to claim incompetence rather than malice because then opinion of you among the survivors won't suffer as much.
I think we learned that when you tell people to not destroy the world they try to not destroy the world. How is [press this button and the world ends -> don't press button] different from [press this button or else the world ends -> press button]?
The asymmetry is the button itself. If I understand correctly, Chris got this message on a separate channel, and the button still looked the same; it still said "enter launch codes to destroy LessWrong". It was still clearly meant to represent the launch of nukes.
Stretching just a bit, I think you might be able to draw an analogy here, where real people who might actually launch nuclear weapons (or have done so in other branches of the multiverse) have thought they had reasons important enough to justify doing it. But in fact, the rule is not "don't launch nukes unless there seems to be sufficient reason for it", but rather "don't launch nukes".
What possible reason could Petrov or those in similar situations have had for not pushing the button? Maybe he believed that the US would retaliate and kill his family at home, and that deterred him. In other words, he believed his enemy would push the button.
Or maybe he just did not want to kill millions of people?
In Petrov's case in particular, the new satellite-based early warning system was unproven so he didn't completely trust it, and he didn't believe a US first strike would use only one missile, or later, only four more, instead of hundreds. Furthermore, ground radar didn't confirm. And, of course, attacking on a false alarm would be suicidal because he believed the Enemy would push the button, so striking first "just in case" failed his cost-benefit analysis.
It was not "just" a commitment to pacifism.
Agreed, this is probably the best lesson of all. If the buttons exist, they can be hacked or the decision makers can be socially engineered.
270 people might have direct access, but the entire world has indirect access.
Well, they did succeed, so for that they get points, but I think it was more due to a very weak defense on behalf of the victim rather than a very strong effort by petrov_day_admin_account.
Like, the victim could have noticed things like:
* The original instructions were sent over email + LessWrong message, but the phishing attempt was just LessWrong
* The original message was sent by Ben Pace, the latter by petrov_day_admin_account
* They were sent at different points in time, the latter of which was more correlated with the FB post that caused the phishing attempt
Moreover, the attacker even sent messages to two real LessWrong team members, which would have completely revealed the attempt had those admins not been asleep in a different time zone.
I am reminded of the AI unboxing challenges, where part of the point was that any single trick that gets the job done can be guarded against, but guarding against all stupid tricks is not about the tricks being particularly brilliant but about covering them all.
In Milgram's experiment, people were willing to torture because a guy in a white jacket requested it. Here a person is ready to nuke the world because an account's name included the word "admin".
EDIT: I now believe the below contains substantial errors, after reading this message from the attacker.
Maybe you want to do sleuthing on your own, if so don't read below. (It uses LessWrong's spoiler feature.)
I believe the adversary was a person outside of the EA and rationality communities. They had not planned this, and they did not think very hard about who they sent the messages to (and didn't realise Habryka and Raemon were admins). Rather, they saw a spur-of-the-moment opportunity to attack this system after seeing a Facebook post by Chris Leong (which solicited reasons for and against pressing the button). I believe this because they commented on that Chris Leong post and said they sent the message.
In your other post, the only reason you indicated to not press the button is that other people would still be asleep and not have experienced the thing.
As such, it feels as if the "trick" by your friend just sped up what would have almost certainly happened anyway: you eventually pressing the button and nuking the site. It'd just have happened later in the day.
This seems plausible. I do want to note that your received message was timestamped 11:26 (local to you) and the button was pressed at 11:33:30 (the received message said the time limit was 30 minutes), which doesn’t seem like an abundance of caution and hesitation to blow up the frontpage, as far as I can tell. :P
I know it wasn’t actual nukes, so fair to not put in the same effort, but I do hope if you ever do have nukes, you take the full allotted time to think through it and discuss with anyone available (even if you think they’re unlikely to reply). ;)
To be clear, while there is obviously some fun intended in this tradition, I don't think describing it as "just a game" feels appropriate to me. I do actually really care about people being able to coordinate to not take the site down. It's an actual hard thing to do that actually is trying to reinforce a bunch of the real and important values that I care about in Petrov day. Of course, I can't force you to feel a certain way, but like, I do sure feel a pretty high level of disappointment reading this response.
Like, the email literally said you were chosen to participate because we trusted you to not actually use the codes.
So, I think it's important that LessWrong admins do not get to unilaterally decide that You Are Now Playing a Game With Your Reputation.
However, if Chris doesn't want to play, the action available to him is simply to not engage. I don't think he gets to both press the button and change the rules to decide what a button press means to other players.
So, I think it's important that LessWrong admins do not get to unilaterally decide that You Are Now Playing a Game With Your Reputation.
Dude, we're all always playing games with our reputations. That's, like, what reputation is.
And good for Habryka for saying he feels disappointment at the lack of thoughtfulness and reflection. It's very much not just permitted but almost mandated by the founder of this place:
https://www.lesswrong.com/posts/tscc3e5eujrsEeFN4/well-kept-gardens-die-by-pacifism
https://www.lesswrong.com/posts/RcZCwxFiZzE6X7nsv/what-do-we-mean-by-rationality-1
Here's the relevant citation from Well-Kept Gardens:
I confess, for a while I didn't even understand why communities had such trouble defending themselves—I thought it was pure naivete. It didn't occur to me that it was an egalitarian instinct to prevent chieftains from getting too much power.
This too:
I have seen rationalist communities die because they trusted their moderators too little.
Let's give Habryka a little more respect, eh? Disappointment is a perfectly valid thing to be experiencing and he's certainly conveying it quite mildly and graciously. Admins here did a hell of a job resurrecting this p...
Even after receiving that message, it still seems like the "do not engage" action is to not enter the codes?
Honestly, I kind of think that would be a straightforwardly silly thing to worry about, if one were to think about it for a few moments. (And I note that it's not Chris' stated reasoning.)
Like, leave aside that the PM was indistinguishable from a phishing attack. Pretend that it had come through both email and PM, from Ben Pace, with the codes repeated. All the same... LW just isn't the kind of place where we're going to socially shame someone for
Y'know, there was a post I thought about writing up, but then I was going to not bother to write it up, but I saw your comment here H and "high level of disappointment reading this response"... and so I wrote it up.
Here you go:
https://www.lesswrong.com/posts/scL68JtnSr3iakuc6/win-first-vs-chill-first
That's an extreme-ish example, but I think the general principle holds to some extent in many places.
The downvotes on this comment seem ridiculous to me. If I email 270 people to tell them I've carefully selected them for some process, I cannot seriously presume they will give up >0 of their time to take part in it.
Any such sacrifice they make is a bonus, so if they do give up >0 time, it's absurd to ask that they give up even more time to research the issue.
Any negative consequences are on the person who set up the game. Adding the justification that 'I trust you' does not suddenly make the recipient more obligated to the spammer.
It's not like we asked 270 random people. We asked 270 people, each of whom had already invested many hundreds of hours into participating on LessWrong, many of whom I knew personally and considered close friends. Like, I agree, if you message 270 random people you don't get to expect anything from them, but the whole point of networks of trust is that you get to expect things from each other and ask things from each other.
If any of the people in that list of 270 people had asked me to spend a few minutes doing something that was important to them, I would have gladly obliged.
It doesn't matter whether you'd have been hypothetically willing to do something for them. As I said on the Facebook thread, you did not consult with them. You merely informed them they were in a game, which, given the social criticism Chris has received, had real world consequences if they misplayed. In other words, you put them in harm's way without their consent. That is not a good way to build trust.
Just a datapoint on variety of invitees: I was included in the 270, and I've invested hundreds of hours into LW. While I don't know you personally outside the site, I hope you consider me a trusted acquaintance, if not a friend. I had no clue this was anything but a funny little game, and my expectation was that there would be dozens of button presses before I even saw the mail.
I had not read nor paid attention to the petrov day posts (including prior years). I had no prior information about the expectations of behavior, the weight put on the outcome, nor the intended lesson/demonstration of ... something that's being interpreted as "coordination" or "trust".
I wasn't using the mental model that indicated I was being trusted not to do something - I took it as a game to see who'd get there first, or how many would press the button, not a hope that everyone would solemnly avoid playing (by passively ignoring the mail). I think without a ritual for joining the group (opt-in), it's hard to judge anyone or learn much about the community from the actions that occurred.
I don't think that there was no change in framing. Last year:
Every Petrov Day, we practice not destroying the world. One particular way to do this is to practice the virtue of not taking unilateralist action.
It’s difficult to know who can be trusted, but today I have selected a group of LessWrong users who I think I can rely on in this way. You’ve all been given the opportunity to show yourselves capable and trustworthy.
This Petrov Day, between midnight and midnight PST, if you, ChristianKl, enter the launch codes below on LessWrong, the Frontpage will go down for 24 hours.
Personalised launch code: ...
I hope to see you on the other side of this, with our honor intact.
Yours, Ben Pace & the LessWrong 2.0 Team
This year:
...On Petrov Day, we celebrate and practice not destroying the world.
It's difficult to know who can be trusted, but today I have selected a group of 270 LessWrong users who I think I can rely on in this way. You've all been given the opportunity to not destroy LessWrong.
This Petrov Day, if you, ChristianKl, enter the launch codes below on LessWrong, the Frontpage will go down for 24 hours, removing a resource thousands of people view every day. Each entrusted user has
Two thoughts —
(1) Some sort of polling or surveying might be useful. In the Public Goods Game, researchers rigorously check whether participants understand the game and its consequences before including them in datasets. It's quite possible that there are incredibly divergent understandings of Petrov Day among the user population. Some sort of surveying would be useful to understand that, as well as things like people's sentiments towards unilateralist action, trust, etc., no? It'd be self-reported data but it'd be better than nothing.
(2) I wonder how Petrov Day setup and engagement would change if the site went down for a month as a consequence.
Right now it seems like the Nash equilibrium is pretty stable at everyone not pressing the button. Maybe we can simulate adding in some lower-priority yet still compelling pressure to press the button, analogous to Petrov’s need to follow orders or the US’s need to prevent Russians from stationing nuclear missiles in Cuba.
Yep, seems like the Nash Equilibrium is pretty stable at everyone not pressing the button. Really needed some more incentives, I agree.
Yeah I was off base there. The Nash Equilibrium is nontrivial because some players will challenge themselves to “win” by tricking the group with button access to push it. Plus probably other reasons I haven’t thought of.
Here's how to survive the LessWrong metaphorical end of the world (upvote this comment so others can see):
https://www.lesswrong.com/posts/evDZoYG4p6ZkQQkDw/surviving-petrov-day
I'm curious why this was designed to be non-anonymous? It feels more in the spirit of "be aware I could destroy something, and choosing not to" if it doesn't have a cost to me, beyond awareness that destruction is sad.
For next year: Raise $1,000 and convert the money to cash. Set up some device where the money burns if a code is entered, and otherwise the money gets donated to the most effective charity. Have a livestream that shows the cash and will show the fire if the code is entered.
I was thinking you should do a game like Hofstadter describes in "The Tale of Happiton" and the Platonia dilemma. Avoid destruction by cooperation, even if it is without coordination.
Can the nuked front page link this post for "what happened" and also the Petrov Day post for context, instead of just the Petrov Day post, so people coming in actually know why there is no front page?
(I guess I appreciate being thought of but it does seem like somewhat undermining your point to tag people who haven't used the site in (checks) seven-almost-eight years.)
During Thursday 26th September (midnight to midnight Pacific Time), we will practice the skill of sitting together and not pressing harmful buttons.
Happy Thursday
Should I press the button or not? I haven't pressed the button at the current time as it would be disappointing to people if they received the email, but someone pressed it while they were still asleep.
Just after midnight last night, 270 LessWrong users received the following email.
Not Destroying the World
Stanislav Petrov once chose not to destroy the world.
As a Lieutenant Colonel of the Soviet Army, Petrov manned the system built to detect whether the US government had fired nuclear weapons on Russia. On September 26th, 1983, the system reported five incoming missiles. Petrov’s job was to report this as an attack to his superiors, who would launch a retaliative nuclear response. But instead, contrary to the evidence the systems were giving him, he called it in as a false alarm, for he did not wish to instigate nuclear armageddon. (He later turned out to be correct.)
During the Cold War, many other people had the ability to end the world – presidents, generals, commanders of nuclear subs from many countries, and so on. Fortunately, none of them did. As humanity progresses, the number of people with the ability to end the world increases, and so too does the standard to which we must hold ourselves. We lived up to our responsibilities in the Cold War, but barely. (The Global Catastrophic Risks Institute has compiled this list of 60 close calls.)
In 2007, Eliezer named September 26th Petrov Day, and the rationality community has celebrated the holiday ever since. We celebrate Petrov's decision, and we ourselves practice not destroying things, even if it is pleasantly simple to do so.
The Big Red Button
Raymond Arnold has suggested many ways of observing Petrov Day.
You can discuss it with your friends.
You can hold a quiet, dignified ceremony with candles and the beautiful booklets Jim Babcock created.
And you can also play on hard mode: "During said ceremony, unveil a large red button. If anybody presses the button, the ceremony is over. Go home. Do not speak."
This has been a common practice at Petrov Day celebrations in Oxford, Boston, Berkeley, New York, and in other rationalist communities. It is often done with pairs of celebrations, each of which has a red button that (when pressed) brings an end to the partner celebration.
So for the second year, at midnight, I emailed personalized launch codes to 270 LessWrong users. This is over twice the number of users I sent codes to last year (which was 125), and includes a lot more users who use a pseudonym and who I've never met. If any users do submit a set of launch codes, then (once the site is back up) we'll publish their username, and whose unique launch codes they were.
During Saturday 26th September (midnight to midnight Pacific Time), we will practice the skill of sitting together and not pressing harmful buttons.
Relating to the End of Humanity
Humanity could have gone extinct many times.
Petrov Day is a celebration of the world not ending. It's a day where we come together to think about how one man in particular saved the world. We reflect on the ways in which our civilization is fragile and could have ended already, we feel grateful that it has not, and we ask ourselves how we could also save the world.
If you would like to participate in the tradition of Petrov Day on LessWrong this year, and if you feel up to talking directly about it, then you're invited to write a comment and share your own feelings about humanity, extinction, and how you relate to it. There are a few prompts below to help you figure out what to say. Note that not all people are in a position in their lives to focus on preventing an existential catastrophe.
Finally, if you’d like to participate in a Petrov Day Ceremony today, check out Ray’s Petrov event roundup, especially the online New York mega-meetup.
To all, I wish you a safe and stable Petrov Day.