In a world where dangerous technology is widely available, the greatest risk is unilateralist action.
What Stanislav Petrov did was just as unilateralist as any of the examples linked in the OP. We must remember that when he chose to disregard the missile alert (based off his own intuition regarding the geopolitics of the world), he was violating direct orders. Yes, in this case everything turned out great, but let's think about the counterfactual scenario where the missile attack had been real. Stanislav Petrov would potentially have been on the hook for more deaths than Hitler and the utter destruction of his nation.
A unilateral choice not to act is as much of a unilateral choice as a unilateral choice to act.
If one nation is confident that a rival nation will not retaliate in a nuclear conflict, then the selfish choice is to strike. By refusing orders, Petrov was being the type of agent who would not retaliate in a conflict. Therefore, in a certain sense, by being that type of agent, he arguably raised the risk of a nuclear strike. [Note: I still think his decision to not retaliate was the correct choice]
Petrov's choice was obviously the correct one in hindsight. What I'm questioning is whether Petrov's choice was obviously correct in foresight. The rationality community takes as a given Petrov's assertion that it was obviously silly for the United States to attack the Soviet Union with a single ICBM. Was that actually as silly as Petrov suggested? There were scenarios where small numbers of ICBMs were launched in a surprise attack against an unsuspecting adversary in order to kill leadership, and disrupt command and control systems. How confident was Petrov that this was not one of those scenarios?
Another assumption that the community makes is that Petrov choosing to report the detection would have immediately resulted in a nuclear "counterattack" by the Soviet Union. But Petrov was not a launch authority. The decision to launch or not was not up to him, it was up to the Politburo of the Soviet Union. We have to remember that when he chose to lie about the detection, by calling it a computer glitch when he didn't know for certain that it was one, Petrov was defecting against the system. He was deliberately feeding false data to his superiors, betting that his model of the world was more accurate than his commanders'. Is that the sort of behavior we really want to lionize?
But Petrov was not a launch authority. The decision to launch or not was not up to him, it was up to the Politburo of the Soviet Union.
This is obviously true in terms of Soviet policy, but it sounds like you're making a moral claim. That the Politburo was morally entitled to decide whether or not to launch, and that no one else had that right. This is extremely questionable, to put it mildly.
We have to remember that when he chose to lie about the detection, by calling it a computer glitch when he didn't know for certain that it was one, Petrov was defecting against the system.
Indeed. But we do not cooperate in prisoners' dilemmas "just because"; we cooperate because doing so leads to higher utility. Petrov's defection led to a better outcome for every single person on the planet; assuming this was wrong because it was defection is an example of the non-central fallacy.
Is that the sort of behavior we really want to lionize?
If you will not honor literally saving the world, what will you honor? If we wanted to make a case against Petrov, we could say that by demonstrably not retaliating, he weakened deterrence (but deterrence would have ...
If you will not honor literally saving the world, what will you honor?
I find it extremely troubling that we're honoring someone defecting against their side in a matter as serious as global nuclear war, merely because in this case, the outcome happened to be good.
(but deterrence would have helped no one if he had launched)
That is exactly the crux of my disagreement. We act as if there were a direct lever between Petrov and the keys and buttons that launch a retaliatory counterstrike. But there wasn't. There were other people in the chain of command. There were other sensors. Do we really find it that difficult to believe that the Soviets would not have attempted to verify Petrov's claim before retaliating? That there would not have been practiced procedures to carry out this verification? From what I've read of the Soviet Union, their systems of positive control were far ahead of the United States' as a result of the much lower level of trust the Soviet Politburo had in their military. I find it exceedingly unlikely that the Soviets would have launched without conducting at least some kind of verification with a secondary system. They knew the consequences of nuclear attack
...I'm not entirely sure we can ever have a correct choice in foresight.
With regard to Petrov, he did seem to make a good, and reasoned call: The US launching a first strike with 5 missiles just does not make much sense without some very serious assumptions that don't seem to be merited.
I do like the observation that Petrov was being just as unilateralist as what is feared in this thread.
Do we want to lionize such behavior? Perhaps. Your argument seems to lend itself to the lens of an AI problem -- and Petrov's behavior then a control on that AI.
To quote Stanislav himself:
I imagined if I'd assume the responsibility for unleashing the third World War...
...and I said, no, I wouldn't. ... I always thought of it. Whenever I came on duty, I always refreshed it in my memory.
I don't think it's obvious that Petrov's choice was correct in foresight. I think he didn't know whether it was a false alarm - my current understanding is that he just didn't want to destroy the world, and that's why he disobeyed his orders. It's a fascinating historical case where someone actually got to make the choice, and made the right one. Real world situations are messy and it's hard to say exactly what his reasoning process was and how justifiable it was - it's really bad that decisions like these have to be made, and it doesn't seem likely to me there's some simple decision rule that gets the right answer in all situations (or even most). I didn't make any explicit claim about his reasoning in the post. I simply celebrate that he managed to make the correct choice.
The rationality community takes as a given Petrov's assertion that it was obviously silly for the United States to...
I think we can celebrate that Petrov didn't want to destroy the world and this was a good impulse on his part. I think if we think it's doubtful that he made the correct decision, or that it's complicated, then we should be very, very upfront about that (your comment is upfront, the OP didn't make this fact stick with me). The fact the holiday is named after him made me think (implicitly if not explicitly) that people (including you, Ben) generally endorsed Petrov's reasoning/actions/etc. and so I did take the whole celebration as a claim about his reasoning. I mean, if Petrov reasoned poorly but happened to get a good result, we should celebrate the result yet condemn Petrov (or at least his reasoning). If Petrov reasoned poorly and took actions that were poor in expectation, doesn't that mean something like: in the majority of worlds, Petrov caused bad stuff to happen (or at least the algorithm which is Petrov generally would)?
. . .
I think it is extremely, extremely weird to make a holiday about avoiding the unilateralist's curse and name it after someone who did exactly that. I hadn't thought about it, but if Quanticle is right, then Petrov was taking ...
Indeed.
Perhaps the key problem with attempts to lift the unilateralist's curse, is that it's very easy to enforce dangerous conformity - 'conformity' being a term I made sure not to use in the OP. It's crucial to be able to not do the thing that you're being told to do under the threat of immediate and strong social punishment, especially when there's a long time scale before finding out if your action is actually the right one. Consistently going against the grain because it's better in the long run, not because it brings immediate reward, is very difficult.
Being able to think and act for yourself, while not disregarding others so much that you break things, is a delicate balance, and many people end up too far on one end or the other. They find themselves punished for unilateralist action, and never speak up again; or they find that others are stopping them from being themselves, and then ignore all the costs they're imposing on their community. My current sense is that most people lean towards conformity, but also that the small number of unilateralists have caused outsized harm.
(Then again, failures from conformity are often more silent, so I have wide error bars around the magnitude of their cost.)
It seems like the official story, as you find it on Wikipedia for example, says that the system detected five ICBMs.
Oh this is wild. This generated a strange emotion.
Anyone here know the word "Angespannt"? One of my team members taught it to me; it's a German word with no exact English equivalent. We talked about it:
https://www.ultraworking.com/podcast/big-project-angespannt
"It's a mix of tense and alert in a way. It's like the feeling you get before you go on stage."
Like, why should I care? I'm obviously not going to press the damn thing. And yet, simply knowing the button is there generates some tension and alertness.
Fascinating. Thank you for doing this.
(Well, sort of thank you, to be more precise...)
If any users do submit a set of launch codes, tomorrow I’ll publish their identifying details.
If we make it through this, here are some ideas to make it more realistic next year:
1) Anonymous codes.
2) Karma bounty for the first person to press the button.
1+2) Randomly and publicly give some people the same code as each other, and give a karma bounty to everyone who had the code that took down the site.
3) Anyone with button rights can share button rights with anyone, and a karma bounty for sharing with the most other people that only pays out if nobody presses the button.
Well, it seems that no one has launched anything. However, skimming through the comments seems to indicate that this may at least partly be due to folks simply not having had enough time to coordinate any agreements about launching for some quid pro quo, or blackmail, or whatever. And, for that matter, not everyone has time to visit the site daily—I’d wager that at least some of the people who had launch codes, simply didn’t have time to go to Less Wrong all day, or forgot, etc.
Perhaps, next time, there can be more warning? Send out the launch codes a week in advance, let’s say (though maintain only a one-day window for actually using them).
That way, we can be more certain of whether the outcome was due entirely to trustworthiness, self-restraint, and a cooperative spirit, or whether it was instead due to indecisiveness and the limitations of people’s busy schedules.
the temptation, the call to infamy
button shining, yearning to be pressed
can we endure these sinuous fingers coiled?
only the hours know our hearts
I was not aware of this story and am happy to hear it. While I think the day of celebration and remembrance should be observed, I wonder about the exercise with the button.
First, just not pushing the button and bringing the page down for a day seems not to fit the problem. The button should be shutting down someone else's site with the realization that they will have some knowledge of that coming and have a button that shuts your page down. Perhaps next year the game could include other sites, and particularly sites whose members do not really see eye-to-eye on things.
Second, it doesn't really tell others much about avoiding such situations. Reading Eliezer's post, the critical insight for me seems to be that of remaining calm and taking the time available to think a bit rather than merely reacting and following the instructions of a mindless process. That Petrov realized that launching 5 missiles just made no sense, and so concluded that there was a system error/false positive, is critical here.
We had some original plans of coordinating with the EA Forum people on this, but didn't end up having enough time available to make all of that happen. Agree that the ideal reenactment scenario would include two forums (though with mutually assured destruction in the later parts of the cold war, the outcome is ultimately the same).
In the comments of Ray’s post, Zvi asked the following question (about a variant where a cake gets destroyed):
I still don’t understand, in the context of the ceremony, what would cause anyone to push the button. Whether or not it would incinerate a cake, which would pretty much make you history’s greatest monster.
There are several obvious reasons why someone might push the button.
Reason one: spite. Pure, simple spite, nothing more. A very compelling reason, I assure you. (See also: “Some men just want to watch the world burn.”)
Reason two: desire for infamy. “History’s greatest monster” is much better (for many people) than being a nobody.
Reason three: personal antipathy for people who would be harmed.
I could think of more potential reasons, I suppose, but I think three examples are enough. Remember that being incapable of imagining why someone would do a bad thing, is a weakness and a failure. Strive to do better.
All your reasons look like People Are Bad. I think it suffices that The World is Complex and Coordination is Hard.
Consider, for example:
Well, I could note that reactive spite is game-theoretically correct; this is well-documented and surely familiar to everyone here.
But that would not be the important reason. In fact I take spitefulness to be a terminal value, and as a shard of godshatter which is absolutely critical to what humans are (and, importantly, what I take to be the ideal of what humans are and should be).
It is not always appropriate, of course; nor even usually, no. Someone who is spiteful all or most of the time, who is largely driven by spite in their lives—this is not a pleasant person to be around, and nor would I wish to be like this. But someone who is entirely devoid of spite—who does not even understand it, who has never felt it nor can imagine feeling spite—I must wonder whether such a one is fully human.
There is an old Soviet animated short, called “Baba Yaga Is Opposed” (which you may watch in its entirety on YouTube; link to first of three episodes; each is ~10 minutes).
The plot is: it’s the 1980 Olympics in Moscow. Misha the bear has been chosen as the event’s mascot. Baba Yaga—the legendary witch-crone of Russian folklore—is watching the announcement on TV. “Why him!” she exclaims; “why him
...[EDIT: two people with codes below have objected, so I'm not up for this trade anymore, unless we figure out a way to make a broader poll]
I have launch codes. Would anyone be interested in offering counterfactual donations to https://www.givewell.org/charities/amf? I could also be interested in counterfactual donations to nuclear war-prevention organizations.
Since the day is drawing to a close and at this point I won’t get to do the thing I wanted to do, here are some scattered thoughts about this thing.
First, my plan upon obtaining the code was to immediately repeat Jeff’s offer. I was curious how many times we could iterate this; I had in fact found another person who was potentially interested in being another link in this chain (and who was also more interested in repeating the offer than nuking the site). I told Jeff this privately but didn’t want to post it publicly (reasons: thought it would be more fun if this was a surprise; didn’t think people should put that much weight on my claimed intentions anyway; thought it was valuable for the conversation to proceed as though nuking were the likely outcome).
(In the event that nobody took me up on the offer, I still wasn’t going to nuke the site.)
Other various thoughts:
I would like to add that I think this is bad (and have the codes). We are trying to build social norms around not destroying the world; you are blithely defecting against that.
I thought you were threatening extortion. As it is, given that people are being challenged to uphold morality, this response is still an offer to throw that away in exchange for money, under the claim that it's moral because of some distant effect. I'd encourage you to follow Jai's example and simply delete your launch codes.
Agreed. I have launch codes and will donate up to $100 without writing it in my EA budget if that prevents the nuke from being launched.
Nooooo you're a good person but you're promoting negotiating with terrorists literally boo negative valence emotivism to highlight third-order effects, boo, noooooo................
Did you consider the unilateralist curse before making this comment?
Do you consider it to be a bad idea if you condition on the assumption that only one other person with launch access who sees this post in the time window chooses to say it was a bad idea?
(others have said part of what I wanted to say, but didn't quite cover the thing I was worried about)
I see two potential objections:
My immediate thoughts are mostly about the second argument.
I think it's quite dangerous to leave oneself vulnerable to the second argument (for reasons Julia discusses on givinggladly.com in various posts). Yes, you can reflect upon whether every given cup of coffee is worth the dead-child-currency it took to buy it. But taken naively this is emotionally and cognitively exhausting. (It also pushes people towards a kind of frugality that isn't actually that beneficial). The strategy of "set aside a budget for charity, based on your values, and don't feel pressure to give more after that" seems really important for living sanely while altruistic.
(I don't have a robustly satisfying answer on how to deal with that exactly, but see this comment of mine for some more expanded thoughts on this)
Now, additional counterfactual don...
The strategy of "set aside a budget for charity, based on your values, and don't feel pressure to give more after that" seems really important for living sanely while altruistic.
But this situation isn't like that.
I agree you don't want to always be vulnerable to the second argument, for the reasons you give. I don't think the appropriate response is to be so hard-set in your ways that you can't take advantage of new opportunities that arise. You can in fact compare whether or not a particular trade is worth it if the situation calls for it, and a one-time situation that has an upside of $1672 for ~no work seems like such a situation.
As a meta point directed more at the general conversation than this comment in particular, I would really like it if people stated monetary values at which they would think this was a good idea. At $10, I'm at "obviously not", and at $1 million, I'm at "obviously yes". I think the range of uncertainty is something like $500 - $20,000. Currently it feels like the building of trust is being treated as a sacred value; this seems bad.
My sense is that it's very unlikely to be worth it at anything below $10k, and I might be a bit tempted at around $50k, though still quite hesitant. I agree that at $1M it's very likely worth it.
Firm disagree. Second-order and third-order effects go limit->infinity here.
Also btw, I'm running a startup that's now looking at — best case scenario — handling significant amounts of money over multiple years.
It makes me realize that "a lot of money" on the individual level is a terrible heuristic. Seriously, it's hard to get one's mind around it, but a million dollars is decidedly not a lot of money on the global scale.
For further elaboration, this is relevant and incredibly timely:
https://slatestarcodex.com/2019/09/18/too-much-dark-money-in-almonds/
LW frontpage going down is also not particularly bad, so you don't need much money to compensate for it.
If you wanted to convince me, you could make a case that destroying trust is really bad, and that in this particular case pressing the button would destroy a lot of trust, but that case hasn't really been made.
LW frontpage going down is also not particularly bad [...] If you wanted to convince me, you could make a case that destroying trust is really bad
Umm, respectfully, I think this is extremely arrogant. Dangerously so.
Anyways, I'm being blunt here, but I think respectful and hopefully useful. Think about this. Reasoning follows —
The instructions if you got launch codes (also in the above post) were as such (emphasis added with underline) —
"Every Petrov Day, we practice not destroying the world. One particular way to do this is to practice the virtue of not taking unilateralist action.
It’s difficult to know who can be trusted, but today I have selected a group of LessWrong users who I think I can rely on in this way. You’ve all been given the opportunity to show yourselves capable and trustworthy.
[...]
This Petrov Day, between midnight and midnight PST, if you, {{username}}, enter the launch codes below on LessWrong, the Frontpage will go down for 24 hours.
I hope to see you on the other side of this, with our honor intact."
So, to Ben Pace at least (the developer who put in a tremendous amount of hours and thought into putting this together), ...
Thanks for writing this up. It's pretty clear to me that you aren't modeling me particularly well, and that it would take a very long time to resolve this, which I'm not particularly willing to do right now.
I'll give anyone 10:1 odds this is cited within 15 years in a mainstream political science journal, the kind read by people who both set and advise on policy
I'll take that bet. Here's a proposal: I send you $100 today, and in 15 years if you can't show me an article in a reputable mainstream political science journal that mentions this event, then you send me an inflation-adjusted $1000. This is conditional on finding an arbiter I trust (perhaps Ben) who will:
Which part of the two statements? That destroying trust is really bad, or that the case hasn't been made?
But this is not a one-time situation. If you're a professional musician, would you agree to mess up at every dress rehearsal, because it isn't the real show?
More indirectly... the whole point of "celebrating and practicing our ability to not push buttons" is that we need to be able to not push buttons, even when it seems like a good idea (or necessary, or urgent that we defect while we can still salvage the perceived situation). The vast majority of people aren't tempted by pushing a button when pushing it seems like an obviously bad idea. I think we need to take trust building seriously, and practice the art of actually cooperating. Real life doesn't grade you on how well you understand TDT considerations and how many blog posts you've read on it, it grades you on whether you actually can make the cooperation equilibrium happen.
Rohin argues elsewhere for taking a vote (at least in principle). If 50% vote in favor, then he has successfully avoided "falling into the unilateralist's curse" and has gotten $1.6k for AMF. He even gets some bonus for "solved the unilateralist's curse in a way that's not just 'sit on his hands'". Now, it's probably worth subtracting points for "the LW team asked them not to blow up the site and the community decided to anyway." But I'd consider it fair play.
It seems to me like the algorithm people are following is: if an action would be unilateralist, and there could be disagreement about its benefit, don't take the action. This will systematically bias the group towards inaction. While this is fine for low-stakes situations, in higher-stakes situations where the group can invest effort, you should actually figure out whether it is good to take the action (via the two-step method above). We need to be able to take irreversible actions; the skill we should be practicing is not "don't take unilateralist actions", it's "take unilateralist actions only if they have an expected positive effect after taking the unilateralist curse into account".
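To make "taking the unilateralist curse into account" concrete, here is a toy simulation. It is only a sketch, and its assumptions are illustrative rather than anything stated in this thread: each of `n_agents` people sees the true value of the action plus independent Gaussian noise, and the action happens if any single person's estimate is positive. The function name and parameters are hypothetical.

```python
import random

def p_action_taken(true_value: float, n_agents: int, noise_sd: float = 1.0,
                   trials: int = 20_000, seed: int = 0) -> float:
    """Fraction of trials in which at least one agent acts.

    Assumed model: each agent observes true_value plus independent
    Gaussian noise and acts alone if its own estimate is positive
    (the naive rule, with no adjustment for the unilateralist's curse).
    """
    rng = random.Random(seed)
    acted = 0
    for _ in range(trials):
        if any(rng.gauss(true_value, noise_sd) > 0 for _ in range(n_agents)):
            acted += 1
    return acted / trials

# The action is actually harmful (true_value < 0); watch how often it
# still gets taken as the number of people who could take it grows.
for n in (1, 5, 20):
    print(n, round(p_action_taken(true_value=-1.0, n_agents=n), 3))
```

Under these assumptions, the chance that a harmful action gets taken rises quickly with the number of people able to take it, which is why each individual needs a higher bar than "my own estimate is positive". It does not follow that the right bar is "never act".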
I don’t disagree with this, and am glad to see reminders to actually evaluate different courses of action besides the one expected of us. My comment was more about debating your own valuation as being too low, about this not being a one-off event once you consider scenarios either logically or causally downstream of this one, and about a general sense that you view the consequences of this event as quite isolated.
jkaufman's initial offer was unclear. I read it (incorrectly) as "I will push the button (/release the codes) unless someone gives AMF $1672 counterfactually", not as "if someone is willing to pay me $1672, I will give them the codes". Read in the first way, Raemon's concerns about "pressure" as opposed to additional donations made on the fly may be clearer; it's not about jkaufman's opportunity to get $1672 in donations for no work, it's about everyone else being extorted for an extra $1672 to preserve their values.
Perhaps a nitpick, but I feel like the building of trust is being treated less as a sacred value, and more as a quantity of unknown magnitude, with some probability that that magnitude could be really high (at least >$1672, possibly orders of magnitude higher). Doing a Fermi is a trivial inconvenience that I for one cannot handle right now; since it is a weekday, maybe others feel much the same.
I noticed after playing a bunch of games of a mafia-type game with some rationalists that when people made edgy jokes about being in the mob or whatever, they were more likely to end up actually being in the mob.
Can't tell if joking, but they probably mean that they were "actually in the mafia" in the game, so not in the real-world mafia.
(I have launch codes and am happy to prove it to you if you want.)
Hmmm, I feel like the argument "There's some harm in releasing the codes entrusted to me, but not so much that it's better for someone to die" might prove too much? Like, death is really bad, I definitely grant that. But despite the dollar amount you gave, I feel like we're sort of running up against a sacred value thing. I mean, you could just as easily say, "There's some harm in releasing the codes entrusted to me, but not so much that it's better for someone to have a 10% chance of dying" - which would naïvely bring your price down to $167.20.
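The scaling in that last step is plain expected-value arithmetic. A minimal sketch, assuming the $1,672 figure stands for the cost of averting one death and that the naive price scales linearly with the probability of death (the function name is made up for illustration):

```python
from decimal import Decimal

# Figure quoted in the thread; reading it as "dollars per death averted"
# is an assumption made for this illustration.
COST_PER_DEATH_AVERTED = Decimal("1672")

def implied_price(probability_of_death: Decimal) -> Decimal:
    """Naive price of the trade, scaled linearly by the probability of death."""
    return (COST_PER_DEATH_AVERTED * probability_of_death).quantize(Decimal("0.01"))

for p in (Decimal("1"), Decimal("0.1"), Decimal("0.01")):
    print(f"P(death) = {p}: ${implied_price(p)}")
```

Nothing in the arithmetic picks out a stopping point, which is what pushes the argument towards estimating the actual harm of pressing the button.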
If you accept as true that that argument should be equally 'morally convincing', then you end up in a position where the only reasonable thing to do is to calculate exactly how much harm you actually expect to be done by you pressing the button. I'm not going to do this because I'm at work and it seems complicated (what is the disvalue of harm to the social fabric of an online community that's trying to save the world, and operates largely on trust? perhaps it's actually a harmless game, but perhaps it's not, hard to know - seems like the majority of effects would happen down the line).
Additionally, I could just counter-offer a $1,672 counterfactual donation to GiveWell for you to not press the button. I'm not committing to do this, but I might do so if it came down to it.
This whole thread is awesome. This is maybe the best thing that's happened on LessWrong since Eliezer more-or-less went on hiatus.
Huge respect to everyone. This is really great. Hard but great. Actually it's great because it's hard.
I'm leaning towards this not being a good trade, even though it's taxing to type that.
In the future, some people will find themselves in situations not too unlike this, where there are compelling utilitarian reasons for pressing the button.
Look, the system should be corrigible. It really, really should; the safety team's internal prediction market had some pretty lopsided results. There are untrustworthy actors with capabilities similar to or exceeding ours. If we press the button, it probably goes better than if they press it. And they can press it. Twenty people died since I started talking, more will die if we don't start pushing the world in a better direction, and do you feel the crushing astronomical weight of the entire future's eyes upon us? Even a small probability increase in a good outcome makes pressing the button worth it.
And I think your policy should still be to not press the button to launch a singleton from this epistemic state, because we have to be able to cooperate! You don't press buttons at will, under pressure, when the entire future hangs in the balance! If we can't even cooperate, right here, right now, under much weaker pressures, what do we expect of the "untrustworthy actors"?
So how about people instead donate to charity in celebration of not pressing the button?
ETA I have launch codes btw.
Pro: It reinforces the norm of actually considering consequences, and not holding any value too sacred.
Not an expert here, but my impression was sometimes it can be useful to have "sacred values" in certain decision-theoretic contexts (like "I will one-box in Newcomb's Problem even if consequentialist reasoning says otherwise"?) If I had to choose a sacred value to adopt, cooperating in epistemic prisoners' dilemmas actually seems like a relatively good choice?
Jeff does conveniently have a blogpost on this: https://www.jefftk.com/p/what-should-counterfactual-donation-mean
It seems extremely unfortunate that the terminology apparently shifted from "counterfactually valid" (which means the right thing) to "counterfactual" (which means almost the opposite of the right thing).
"Additional donation" seems like the obvious choice in place of "counterfactual donation", since we just mean "additional to what you would have donated anyway", right? (The very obviousness makes me think maybe there's a downside to the term that I'm not seeing, or I'm confused in some other way.)
I.
Clicking on the button permanently switches it to a state where it's pushed-down, below which is a prompt to enter launch codes. When moused over, the pushed-down button has the tooltip "You have pressed the button. You cannot un-press it." Screenshot.
(On an unrelated note, on r/thebutton I have a purple flair that says "60s".)
Upon entering a string of longer than 8 characters, a button saying "launch" appears below the big red button. Screenshot.
II.
I'm nowhere near the PST timezone, so I wouldn't be able to reliably pull a shenanigan whereby if I had the launch codes I would enter or not enter them depending on the amount of counterfactual money pledged to the Ploughshares Fund in the name of either launch-code-entry-state, but this sentence is not apophasis.
III.
Conspiracy theory: There are no launch codes. People who claim to have launch codes are lying. The real test is whether people will press the button at all. I have failed that test. I came up with this conspiracy theory ~250 milliseconds after pressing the button.
IV. (Update)
I can no longer see the button when I am logged in. Could this mean that I have won?
Conspiracy theory: There are no launch codes. People who claim to have launch codes are lying. The real test is whether people will press the button at all. I have failed that test. I came up with this conspiracy theory ~250 milliseconds after pressing the button.
Oh no! Someone is wrong on the internet, and I have the ability to prove them wrong...
To make sure I have this right and my LW isn't glitching: TurnTrout's comment is a Drake meme, and the two other replies in this chain are actually blank?
(This thread is our collective reenactment of the conversations about nuclear safety that happened during the cold war.)
Well, at least we have a response to the doubters' "why would anyone even press the button in this situation?"
How did you implement the button? I run a small site, love the idea, and would like to do something similar.
Can we have a recap from the mods of how Petrov Day went? How many people pressed the button, how many people tried entering anything in the launch code field, how many people tried the fake launch code posted on Facebook in particular?
Generic feedback:
I had launch codes. I had previously hidden the map in my settings, which also had the effect of hiding the button, which in turn was enough to screen off any "buttons should be pressed" and "would this really work?" temptations.
I did keep checking the site to see if it went down, though.
I have the launch codes. I'll take the site down unless Eliezer Yudkowsky publicly commits to writing a sequel chapter to HPMoR, in which I get an acceptably pleasant ending, by 9pm PST.
The enemy is smart.
"The enemy knew perfectly well that you'd check whose launch codes were entered, especially since the nukes being set off at all tells us that someone can appear falsely trustworthy." Ben shut his eyes, thinking harder, trying to put himself into the enemy's shoes. Why would he, or his dark side, have done something like - "We're meant to conclude that the enemy has the launch codes. But that's actually something the enemy can only do with difficulty, or under special conditions; they're trying to create a false appearance of omnipotence." Like I would. "Later, hypothetically, the nukes actually get fired. We think it was Quirinus_Quirrell firing it, but really, it was just someone firing it independently."
"Unless that is precisely what Quirinus_Quirrell expects us to think," said Jim Babcock, his brow furrowed in concentration. "In which case he does have the launch codes, as well as the other person."
"Does Quirinus_Quirrell really use plots with that many levels of meta -"
"Yes," said Habryka and Jim.
Ben nodded distantly. "Then this could be a setup to either make...
The site will go down for a full 24 hours after the button was pressed and correct launch codes entered (not that that is the most important aspect of this situation, but I figured I would clarify anyways)
I don't see the big shiny red button on the front page. If I visit LW in private mode, it's there. I have the map turned off. I haven't tried logging out or turning the map back on. I'm guessing that when Ben says it's "over the frontpage map" that means it's implemented in a way that makes it disappear if the map isn't there. That seems a bit odd, though it probably isn't worth the effort of fixing.
(I have a launch code but hereby declare my intention not to use it. I am intrigued by the discussions of tra...
Rot13 comment, if you have launch codes, recommend you wait until tomorrow to read this eh?
(1) V'z phevbhf ubj znal crbcyr jvgu ynhapu pbqrf pyvpxrq gur ohggba "gb purpx vg bhg" jvgubhg ragrevat ynhapu pbqrf. V qvqa'g qb fb, npghnyyl, fb V pna bayl cerfhzr lbh'q unir gb ragre pbqrf.
(2) V jbaqre vs gur yvfg bs anzrf jnf znqr choyvp vs crbcyr jbhyq or zber yvxryl be yrff yvxryl gb cerff vg. Anvir nafjre vf yrff yvxryl, ohg vg zvtug unir n fgenatr "lbh pna'g pbageby zr ivn funzr" serrqbz rssrpg sbyybjrq ol xnobbz.
(3) Qr...
So far, LW is still online. It means:
a) either nobody used their launch codes, and you can trust 125 nice & smart individuals not to take unilateralist action - so we can avoid armageddon if we just have coordinated communities with the right people;
b) nobody used their launch codes, because these 125 are very like-minded people (selection bias), there's no immediate incentive to blow it up (except for some offers about counterfactual donations), but some incentive to avoid it (honor!... hope? Prove EDT, UDT...?). It doesn't model the proble...
I hovered over the button thinking that the button appearing meant I was one of the chosen ones. Afterwards it seemed I had been reckless. I was curious and thought that I could just choose not to press my mouse button (I did manage that). On the other hand, I was hazy on the mechanics of how things work, and I knew that moving the mouse over the button meant a shorter distance between bad things and the present. The tooltip popup was unexpected and somewhat startled me. It would have been possible for a mechanism to go off at that point, and I was not considering that. Full schmuck bait
...
Just after midnight last night, 125 LessWrong users received the following email.
Unilateralist Action
As Nick Bostrom has observed, society is making it cheaper and easier for small groups to end the world. We’re lucky it requires major initiatives to build a nuclear bomb, and that the world can’t be destroyed by putting sand in a microwave.
However, other dangerous technologies are becoming widely available, especially in the domain of artificial intelligence. Only 6 months after OpenAI created the state-of-the-art language-modelling GPT-2, others created similarly powerful versions and released them to the public. They disagreed about the dangers, and, because there was nothing stopping them, moved ahead.
I don’t think this example is at all catastrophic, but I worry what this suggests about the future, when people will still have honest disagreements about the consequences of an action but where those consequences will be much worse.
And honest disagreements will happen. In the 1940s, the great physicist Niels Bohr met President Roosevelt and Prime Minister Churchill, to persuade them to give the instructions for building the atomic bomb to Russia. He wanted to bring in a new world order and establish global peace, and thought this would be necessary - he believed strongly that it would prevent arms race dynamics, if only everyone just shared their science. (Churchill did not allow it.) Our newest technologies do not yet have the bomb’s ability to transform the world in minutes, but I think it’s likely we’ll make powerful discoveries in the coming decades, and that publishing those discoveries will not require the permission of a president.
And then it will only take one person to end the world. Even in a group of well-intentioned people, natural disagreements will mean someone will think that taking a damaging action is actually the correct choice — Nick Bostrom calls this the “unilateralist’s curse”. In a world where dangerous technology is widely available, the greatest risk is unilateralist action.
Not Destroying the World
Stanislav Petrov once chose not to destroy the world.
As a Lieutenant Colonel of the Soviet Army, Petrov manned the system built to detect whether the US government had fired nuclear weapons on Russia. On September 26th, 1983, the system reported multiple such attacks. Petrov’s job was to report this as an attack to his superiors, who would launch a retaliatory nuclear response. But instead, contrary to all the evidence the systems were giving him, he called it in as a false alarm. This later turned out to be correct.
(For a more detailed story of how Stanislav Petrov saved the world, see the original LessWrong post by Eliezer, which started the tradition of Petrov Day.)
During the Cold War, many other people had the ability to end the world - presidents, generals, commanders of nuclear subs from many countries, and so on. Fortunately, none of them did. As the number of people with the ability to end the world increases, so too does the standard to which we must hold ourselves. We lived up to our responsibilities in the Cold War, but barely. (The Global Catastrophic Risks Institute has compiled an excellent list of 60 close calls.)
Petrov Day
On Petrov Day, we try to live up to this responsibility - we celebrate by not destroying the world.
Raymond Arnold has suggested many ways of observing Petrov Day. You can discuss it with your friends. You can hold a quiet, dignified ceremony (for example, with the beautiful booklet Jim Babcock created). But you can also play on hard mode: "During said ceremony, unveil a large red button. If anybody presses the button, the ceremony is over. Go home. Do not speak."
In the comments of Ray's post, Zvi asked the following question (about a variant where a cake gets destroyed):
To which I replied:
So this year on LessWrong, I thought we'd build ourselves a big red button. Instead of making everyone go home, this button (which you can find over the frontpage map) will shut down the Less Wrong frontpage for 24 hours.
Now, this isn't a button for just anyone. I know there are people with internet access who will happily press buttons that do bad things. So today, I've emailed personalised launch codes to 125 LessWrong users, for us to practice the art of sitting together and not pressing harmful buttons[1]. If any users do submit a set of launch codes, tomorrow I’ll publish their username, and whose launch codes they were.
During Thursday 26th September, we will see whether the people with the codes can be trusted to not, unilaterally, destroy something valuable.
To all here on LessWrong today, I wish you a safe and stable Petrov Day.
Footnotes
[1] I picked the list quickly on Tuesday, mostly leaving out users I don’t really know, and a few people who I thought would press it (e.g. someone who has said in the past that they would). If this goes well we may do it again next year, with an expanded pool or more principled selection criteria. Though I think this is still a representative set - out of the 100+ users with over 1,000 karma who've logged in to LessWrong in the past month, the list includes 53% of them.
Added: Follow-Up to Petrov Day, 2019.