I mostly endorse the "Them" as a representation of my views (not necessarily everything, e.g. idk anything specific about the author of the Doomsday Machine, though I do endorse the heuristic of "stories you hear are likely sensationalized").
Me: Why would that be an update? We already know that state bioweapons programs have killed thousands of people with accidental releases, and there's no particular reason that they couldn't cause worse disasters, and that international regulation has failed to control that.
Them: [inaudible. I don’t know how to rephrase the thing that people say at this point in the conversation.]
In my case, it's something like "the lack of really bad failures so far implies there's some filter that prevents really bad failures from being likely. We both agree that if such a filter exists, we don't know what it is. When you say 'there's no particular reason why they couldn't cause worse disasters', I say that that's a reflection of our map, but in the territory there probably is a filter that is a particular reason why they couldn't cause worse disasters, even if we don't know what it is. If we found out that COVID was a failure of biosecurity, that would imply the filter is much less strong than we might otherwise have guessed, leading to a substantial update."
the lack of really bad failures so far implies there's some filter that prevents really bad failures from being likely. We both agree that if such a filter exists, we don't know what it is.
I can think of two broad categories here: selection effects ("if almost everyone dies, I probably died too and am not around to observe it") and all other effects. I think the selection effect filter is significant enough that I would be reluctant to think 'all other effects' are large without doing some very careful math.
There also haven't been anthropogenic risks that killed 10% of humans. The selection effect update on "10% of people killed" is pretty small. (World War 2 killed ~2% and feels like the strongest example against the "Them" position.)
You could believe that most risks are all-or-nothing, in which case I agree the "Them" position is considerably weaker due to selection effects.
I agree that selection effects from smaller risks are much smaller; I also suspect that most risks so far haven't been all-or-nothing risks. I think my main objection was that there's a potentially big difference between "likely (in a forward-looking sense)" and "likely (in a backward-looking sense)", and I was worried that the two quoted sentences don't make that clear to the reader.
Okay, fair enough, though I want to note that the entire disagreement in the post is about the backward-looking sense (if I'm understanding you correctly). Like, the question is how to interpret the fact that there were a lot of near misses, but no catastrophes (and for the class of nukes / bio etc. a 10% catastrophe seems way more likely than a 90-100% catastrophe).
Okay, fair enough, though I want to note that the entire disagreement in the post is about the backward-looking sense (if I'm understanding you correctly).
Oh interesting! I suspect you understand me correctly and we disagree. To elaborate:
If it means something for humans to be "good at coordination", it's that there are some underlying features that cause humans to succeed rather than fail at coordination challenges. If I said someone was "good at winning poker games", I don't just mean that they happened to win once, but that there's some underlying fact that caused them to win in the past and makes them likely to win in the future. If I just want to observe that someone won last week's poker game, but I ascribe this to chance instead of skill, I say that person is lucky.
But, of course, we can only infer whether or not someone is good at poker, rather than being able to observe it directly. So discussions about the inference necessarily have a backward-looking quality to them, because it's about what observations led us to the epistemic state we're at.
That all sounds right and I'm not sure where you expected me to disagree.
discussions about the inference necessarily have a backward-looking quality to them
This was exactly my point, I think? Since we're looking backward when making inferences, and since we didn't expect full extinction or even 90% extinction in the past, our inferences don't need to take selection effects into account (or more accurately selection effects would have a relatively small effect on the final answer).
When I read the 'Buck' points, most of them feel like they're trying to be about 'how humans are', or the forward-likeliness. Like, this here:
But I still feel that my overall worldview of “people will do wild and reckless things” loses fewer Bayes points than yours does.
Importantly, "wild and reckless" is describing the properties of the actions / underlying cognitive processes, not the outcomes. And later:
Why would that be an update? We already know that state bioweapons programs have killed thousands of people with accidental releases, and there's no particular reason that they couldn't cause worse disasters, and that international regulation has failed to control that.
At least in this presentation of Buck vs. Them, there's a disagreement over something like "whether scope matters"; Buck thinks no ('what damage happens to a toddler depends on how dangerous their environment is, since the toddler doesn't know what to avoid and so can't be scope-sensitive') and Them thinks yes ('sure, humanity has screwed up lots of things that don't matter, but that's because effort is proportional to how much the thing matters, and so they're rationally coping with lots of fires that would be expensive to put out.').
This feels like it's mostly not about bets on whether X happened or not, and mostly about counterfactuals / reference class tennis ("would people have taken climate change more seriously if it were a worse problem?" / "is climate change a thing that people are actually trying to coordinate on, or a distraction?").
At least in this presentation of Buck vs. Them, there's a disagreement over something like "whether scope matters"
I agree this could be a disagreement, but how do selection effects matter for it?
This feels like it's mostly not about bets on whether X happened or not, and mostly about counterfactuals / reference class tennis
Seems plausible, but again why do selection effects matter for it?
----
I may have been a bit too concise when saying
the entire disagreement in the post is about the backward-looking sense
To expand on it, I expect that if we fix a particular model of the world (e.g. coordination of the type discussed here is hard, we have basically never succeeded at it, the lack of accidents so far is just luck), Buck and I would agree much more on the forward-looking consequences of that model for AI alignment (perhaps I'd be at like 30% x-risk, idk). The disagreement is about what model of the world we should have (or perhaps what distribution over models). For that, we look at what happens in the past (both in reality and counterfactually), which is "backward-looking".
some very careful math.
Note that I have no idea what math to do here. The actual thing I'd do is try to figure out the reference class of 'things that could be major disasters', look how well the situation around them was handled (carefully, coordinated, sloppily, clumsily, etc) and then after getting close to the territory in that way, reflect loads on anthropics and wtf to update about it. I don't know how to really do math on either.
The other camp says “No nuclear weapons have been used or detonated accidentally since 1945. This is the optimal outcome, so I guess this is evidence that humanity is good at handling dangerous technology.”
When I look at that fact and Wikipedia's list of close calls, the most plausible explanation doesn't seem to be "it was unlikely for nuclear weapons to be used" or "it was likely for nuclear weapons to be used, yet we got lucky" but rather "nuclear weapons were probably used in most branches of the multiverse, but those have significantly fewer observers, so we don't observe those worlds because of the survivorship bias."
This requires that MW is true, that this piece of anthropic reasoning is correct, and that a usage of nuclear weapons does, indeed, decrease the number of observers significantly. I'm not sure about the third, but pretty sure about the first two. The conjunction of all three seems significantly more likely than either of the two alternatives.
I don't have insights on the remaining part of your post, but I think you're admitting to losing Bayes points that you should not, in fact, be losing. [Edit: meaning you should still lose some but not that many.]
I don't really know how to think about anthropics, sadly.
But I think that it's pretty likely that nuclear war could have not killed everyone. So I still lose Bayes points compared to the world where nukes were fired but not everyone died.
Nuclear war doesn't have to kill everyone to make our world non-viable for anthropic reasons. It just has to render our world unlikely to be simulated.
To be clear, after I made it, I thought more about it and I'm not sure it's correct. I think I'd have to actually do the math; my intuitions aren't coming in loud and clear here. The reason I'm unsure is that even if for some reason post-apocalyptic worlds rarely get simulated (and thus it's very unsurprising that we find ourselves in a world that didn't suffer an apocalypse, because we're probably in a simulation), it may be that we ought to ignore this, since we are trying to act as if we are not simulated anyway, because that's how we have the most influence or something.
if for some reason post-apocalyptic worlds rarely get simulated
To draw out the argument a little further, the reason that post-apocalyptic worlds don't get simulated is that most (?) of the simulations of our era are run as a way to simulate the superintelligences that arise in other parts of the multiverse, to talk or trade with them.
(As in the basic argument of this Jaan Tallinn talk)
If advanced civilization is wiped out by nuclear war, that simulation might be terminated, if it seems sufficiently unlikely to lead to a singularity.
Yep. What I was thinking was: Maybe most simulations of our era are made for the purpose of acausal trade or something like it. And maybe societies that are ravaged by nuclear war make for poor trading partners for some reason. (e.g. maybe they never rebuild, or maybe it takes so long to figure out whether or not they eventually rebuild that it's not worth the cost of simulating them, or maybe they rebuild but in a way that makes them poor trading partners.) So then the situation would be: Even if most civilizations in our era nuke themselves in a way that doesn't lead to extinction, the vast majority of people in our era would be in a civilization that didn't, because they'd be in a simulation of one of the few civilizations that didn't.
What I'm confused about right now is what the policy implications are of this. As I understand it, the dialectic is something like:
A: Nuclear war isn't worth worrying about because we've survived it for 60 years so far, so it must be very unlikely.
B: But anthropics! Maybe actually the probability of nuclear war is fairly high. Because of anthropics we'd never know; dead people aren't observers.
A: But nuclear war wouldn't have killed everyone; if nuclear war is likely, shouldn't we expect to find ourselves in some post-apocalyptic civilization?
Me: But simulations! If post-apocalyptic civilizations are unlikely to be simulated, then it could be that nuclear war is actually pretty likely after all, and we just don't know because we're in one of the simulations of the precious few civilizations that avoided nuclear war. Simulations that launch nukes get shut down.
Me 2: OK, but... maybe that means that nuclear war is unlikely after all? Or at least, should be treated as unlikely?
Me: Why?
Me 2: I'm not sure... something something we should ignore hypotheses in which we are simulated because most of our expected impact comes from hypotheses in which we aren't?
Me: That doesn't seem like it would justify ignoring nuclear war. Look, YOU are the one who has the burden of proof; you need to argue that nuclear war is unlikely on the grounds that it hasn't happened so far, but I've presented a good rebuttal to that argument.
Me 2: OK let's do some math. Two worlds. In World Safe, nuclear war is rare. In World Dangerous, nuclear war is common. In both worlds, most people in our era are simulations and moreover there are no simulations of post-apocalyptic eras. Instead of doing updates, let's just ask what policy is the best way to hedge our bets between these two worlds... Well, what the simulations do doesn't matter so much, so we should make a policy that mostly just optimizes for what the non-simulations do. And most of the non-simulations with evidence like ours are in World Safe. So the best policy is to treat nukes as dangerous.
OK, that felt good. I think I tentatively agree with Me 2.
[EDIT: Lol I mean "treat nukes as NOT dangerous/likely" what a typo!]
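Here's a rough numerical sketch of the Me 2 hedging argument above. All of the specific numbers (civilization counts, survival rates, simulations per survivor) are invented purely for illustration; nobody in the thread has claimed them.

```python
# Sketch of the "Me 2" hedging argument. All numbers are invented for illustration.
N_CIVS = 1_000              # non-simulated civilizations at our stage, per world
SIMS_PER_SURVIVOR = 1_000   # pre-war-era simulations run by each surviving civ
P_AVOID_WAR = {"World Safe": 0.9, "World Dangerous": 0.1}

for world, p_avoid in P_AVOID_WAR.items():
    non_sim = N_CIVS * p_avoid          # non-simulated civs with our evidence ("no war yet")
    sims = non_sim * SIMS_PER_SURVIVOR  # simulated copies with the same evidence
    # Most copies with our evidence are simulations in both worlds, but simulated
    # copies are assumed to have ~zero long-run influence, so we weight copies by
    # influence, i.e. count only the non-simulated ones.
    print(f"{world}: {non_sim:.0f} influence-weighted copies share our evidence "
          f"(plus {sims:.0f} simulated copies we mostly ignore)")

# With a 50/50 prior over the two worlds, ~90% (900 vs. 100) of the
# influence-weighted copies that share our evidence live in World Safe, so the
# hedged policy treats nuclear war as unlikely (per the EDIT above).
```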
Hm, interesting. This suggests that, if we're in a simulation, nuclear war is relatively more likely. However, all such simulations are likely to be short-lived, so if we're in a simulation, we shouldn't care about preventing nuclear war for longtermist reasons (only for short-termist ones). And if we think we're sufficiently likely to be outside a simulation to make longterm concerns dominate short-termist ones (obligatory reference), then we should just condition on not being in a simulation, and then I think this point doesn't matter.
Yeah, the "we didn't observe nukes going off" observation is definitely still some evidence for the "humans are competent at handling dangerous technology" hypothesis, but (if one buys into the argument I'm making) it's much weaker evidence than one would naively think.
Seems like it's "much weaker" evidence if you buy something like SIA, and only a little weaker evidence if you buy something like SSA.
To expand: imagine a probability distribution over the amount of person-killing power that gets released as a consequence of nukes. Imagine it's got a single bump well past the boundary where total extinction is expected. That means worlds where more people die are more likely[1].
If you sample, according to its probability mass, some world where someone survived, then our current world is quite surprising.
If instead you upweight the masses by how many people are in each, then you aren't that surprised to be in our world.
[1]: Well, there might be a wrinkle here with the boundary at 0 and a bunch of probability mass getting "piled up" there.
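A minimal numerical sketch of the two weightings described above, using an invented discrete distribution whose single bump sits at total extinction (the numbers are mine, purely for illustration):

```python
# Invented prior over "fraction of people killed by nukes"; most of the mass sits
# at total extinction, so worlds where more people die are more likely.
worlds = {0.0: 0.05, 0.5: 0.10, 0.9: 0.15, 1.0: 0.70}  # fraction killed -> prior mass

# 1) Condition on "someone survived" and sample worlds by probability mass:
surviving = {frac: p for frac, p in worlds.items() if frac < 1.0}
total_mass = sum(surviving.values())
print(surviving[0.0] / total_mass)   # ~0.17: our no-catastrophe world is quite surprising

# 2) Instead upweight each surviving world by how many people are in it:
weights = {frac: p * (1 - frac) for frac, p in surviving.items()}
total_w = sum(weights.values())
print(weights[0.0] / total_w)        # ~0.43: noticeably less surprising to be here
```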
Disagree. SIA always updates towards hypotheses that allow more people to exist (the Self Indication Assumption is that your own existence as an observer indicates that there are more observers), which makes for an update that nuclear war is rare, since there will exist more people in the multiverse if nuclear accidents are rare. This exactly balances out the claim about selection effects – so SIA corresponds to the naive update-rule which says that world-destroying activities must be rare, since we haven't seen them. The argument about observer selection effects only comes from SSA-ish theories.
Note that, in anthropic dilemmas, total consequentialist ethics + UDT makes the same decisions as SIA + CDT, as explained by Stuart Armstrong here. This makes me think that total consequentialists shouldn't care about observer selection effects.
This is complicated by the fact that infinities break both anthropic theories and ethical theories. UDASSA might solve this. In practice, I think UDASSA behaves a bit like a combination of SSA and SIA, though a bit closer to SIA, but I haven't thought a lot about this.
I think you misread which direction the ‘"much weaker" evidence’ is supposed to be going, and that we agree (unless the key claim is about SIA exactly balancing selection effects).
There's probably some misunderstanding, but I'm not immediately spotting it when rereading. You wrote:
Seems like it's "much weaker" evidence [[for X]] if you buy something like SIA, and only a little weaker evidence if you buy something like SSA.
Going by the parent comment, I'm interpreting this as: the "no nukes went off" observation is much weaker evidence for "humans are competent at handling dangerous technology" if you buy something like SIA, and only a little weaker if you buy something like SSA.
I think that: under SIA the observation isn't weakened at all (SIA gives the naive, full-strength update), and it's only under SSA-ish views that it gets much weaker.
Which seems to contradict what you wrote?
Yep, sorry, looks like we do disagree.
Not sure I'm parsing your earlier comment correctly, but I think you say "SIA says there should be more people everywhere, because then I'm more likely to exist. More people everywhere means I think my existence is evidence for people handling nukes correctly everywhere". I'm less sure what you say about SSA, either "SSA still considers the possibility that nukes are regularly mishandled in a way that kills everyone" or "SSA says you should also consider yourself selected from the worlds with no observers".
Do I have you right?
I say, "SIA says that if your prior is '10% everyone survives, 20% only 5% survive, 70% everyone dies', and you notice you're in a 'survived' world, you should think you are in the 'everyone survives' world with 90% probability (as that's where 90% of the probability-weighted survivors are)".
Using examples is neat. I'd characterize the problem as follows (though the numbers are not actually representative of my beliefs, I think it's way less likely that everybody dies). Prior:
Assume we are in a finite multiverse (which is probably false) and take our reference class to only include people alive in the current year (whether the nuclear war happened or not). (SIA doesn't care about reference classes, but SSA does.) Then:
Note that we only care about the number of people surviving after a nuclear accident because we've included them in SSA's reference class. But I don't know why people want to include those in the reference class, and nobody else. If we include every human who has ever been alive, we have a large number of people alive regardless of whether C is true or not, which makes SSA give relatively similar predictions as SIA. If we include a huge number of non-humans whose existence aren't affected by whether C is true or not, SSA is practically identical to SIA. This arbitrariness of the reference class is another reason to be sceptical about any argument that uses SSA (and to be sceptical of SSA itself).
Really appreciate you taking the time to go through this!
To establish some language for what I want to talk about, I want to say your setup has two world sets (each with a prior of 50%) and six worlds (3 in each world set). A possible error I was making was just thinking in terms of one world set (or, one hypothesis: C), and not thinking about the competing hypotheses.
I think in your SSA, you treat all observers in the conditioned-on world set as "actually existing". But shouldn't you treat only the observers in a single world as "actually existing"? That is, you notice you're in a world where everyone survives. If C is true, the probability of this, given that you survived, is (0.7/0.9)/(0.7/0.9 + 0.2/0.9) = 7/9.
And then what I wanted to do with SIA is to use a similar structure to the not-C branch of your SSA argument to say "Look, we have 10/11 of being in an everyone survived world even given not-C. So it isn't strong evidence for C to find ourselves in an everyone survived world".
It's not yet clear to me (possibly because I am confused) that I definitely shouldn't do this kind of reasoning. It's tempting to say something like "I think the multiverse might be such that measure is assigned in one of these two ways to these three worlds. I don't know which, but there's not an anthropic effect about which way they're assigned, while there is an anthropic effect within any particular assignment". Perhaps this is more like ASSA than SIA?
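For concreteness, here is a sketch that reproduces the two figures quoted in this exchange (7/9 and 10/11). The specific within-hypothesis priors are my own assumption for illustration, chosen to be consistent with those figures, not numbers taken from the earlier comment.

```python
from fractions import Fraction as F

# Assumed priors, purely for illustration: under each hypothesis there are three
# worlds ("everyone survives", "only 5% survive", "everyone dies"), with
# surviving fractions 1, 1/20, and 0.
priors = {"C": [F(7, 10), F(2, 10), F(1, 10)],
          "not-C": [F(1, 10), F(2, 10), F(7, 10)]}
surviving_frac = [F(1), F(1, 20), F(0)]

# The 7/9 calculation (conditional on C): condition only on being in some world
# where someone survived, without weighting worlds by how many survivors they have.
p = priors["C"]
print(p[0] / (p[0] + p[1]))                      # 7/9

# The 10/11 calculation (conditional on not-C): weight each world by
# prior x number of survivors before conditioning.
q = priors["not-C"]
w = [qi * fi for qi, fi in zip(q, surviving_frac)]
print(w[0] / sum(w))                             # 10/11
```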
Copied from a comment above:
There haven't been anthropogenic risks that killed 10% of humans. The anthropic update on "10% of people killed" is pretty small. (World War 2 killed ~2% and feels like the strongest example against the "Them" position.)
You could believe that most risks are all-or-nothing, in which case I agree the "Them" position is considerably weaker due to anthropic effects.
This argument sounds like it's SSA-ish (it certainly doesn't work for SIA). I haven't personally looked into this, but I think Anders Sandberg uses SSA for his analysis in this podcast, where he claims that taking observer selection effects into account changes the estimated risk of nuclear war by less than a factor of 2 (search for "not even twice"), because of some mathy details making use of near-miss statistics. So if one is willing to trust Anders to be right about this (I don't think the argument is written up anywhere yet?) observer selection effects wouldn't matter much regardless of your anthropics.
Me: I agree I would have lost money on that bet. But I still feel that my overall worldview of “people will do wild and reckless things” loses fewer Bayes points than yours does. If we’d bet not just on outcomes but on questions like “will someone build a doomsday machine” or “will countries take X measure to reduce the probability of accidental nuclear war”, I would have won money off you from almost all of those. My worldview would have won most of the bets.
I don't think this is correct. We got lots of treaties and actions to reduce stockpiles, we successfully coordinated on the NPT, SALT, and SALT II, we have the IAEA, which has prevented lots of countries from going nuclear, we have a coordinated inspections regime, and we even got Libya, several former Soviet republics, and South Africa to renounce their nukes!
So if we had bet every year that no such risk-reduction actions would have occurred or been agreed to that year, you'd win in many years, but not all; yet you seem to think you could give me odds and still come out ahead. I think that's wrong. Of course, all of this is retrodiction, so it's hard to be fair about what odds we'd have agreed to, but it's less lopsided than you seem to claim.
Taking the question of the title at face value: my model for this says that humans are pretty good at forming groups to coordinate, but that the coordination ability of that group decays over time once formed. This agrees with observations of the rise and fall of empires, religions, and corporations. Germane to the body of the post, it also predicts the decay of nuclear safety over time, as exemplified by events like mass cheating scandals in the nuclear force, the mass resignation of nuclear safety engineers at a national lab, and withdrawal from nuclear arms control agreements.
How good is humanity at coordination?
This title doesn't reflect the article. Humanity is AMAZING at coordination. I use dozens of items daily that I have no possibility of making by myself. I have literally never spent a day foraging for sustenance. The famous https://en.wikipedia.org/wiki/I,_Pencil describes a masterpiece of coordination, in a product so mundane that most people don't even notice it. No other species comes close.
What humanity is bad at is scaling of risk assessment. We just don't have the mechanisms to give the proper weight to x-risks. That's not coordination, that's just scope insensitivity.
No other species comes close.
Ok, but the relevant standard might not be "do they coordinate more than mice?", but some absolute threshold like "do they wear masks during a respiratory pandemic?" or "do they not fabricate scientific data relevant to what medications to deploy?".
What humanity is bad at is scaling of risk assessment. We just don't have the mechanisms to give the proper weight to x-risks. That's not coordination, that's just scope insensitivity.
Classic 'tragedies of the commons' happen not because of improper risk assessment, but because of concentrated benefits and distributed costs. Humans aren't cooperate-bot; they have sophisticated mechanisms for coordinating on whether or not to coordinate. And so it is interesting to ask: what will those mechanisms say when it comes to things related to x-risks, in the absence of meaningful disagreement on risks?
And then, when we add in disagreeing assessments, how does the picture look?
Ok, but the relevant standard might not be "do they coordinate more than mice?", but some absolute threshold like "do they wear masks during a respiratory pandemic?" or "do they not fabricate scientific data relevant to what medications to deploy?".
There certainly _are_ cases where cooperation is limited due to distributed benefits and concentrated costs - humans aren't perfect, just very good. Humans do at least as well as mice on those topics, and far better on some other topics.
But x-risk isn't a concentrated cost, it's a distributed infinite cost. It has a disputed probability and timeframe, but that's about risk assessment and scope insensitivity, not about imposing costs on others and not oneself.
But x-risk isn't a concentrated cost, it's a distributed infinite cost.
This depends on how 'altruistic' your values are. For some people, the total value to them of all other humans (ever) is less than the value to them of their own life, and so something that risks blowing up the Earth reads similarly to their decision-making process as something that risks just blowing up themselves. And sometimes, one or both of those values are negative. [As a smaller example, consider the pilots who commit suicide by crashing their plane into the ground--at least once with 150 passengers in the back!]
That said, I made a simple transposition error, it's supposed to be "concentrated benefits and distributed costs."
I want to note that at least for me I'm mostly arguing about the strength of the update. Like, for a claim of the form "An accident with powerful technology X will directly cause human extinction", knowing nothing else I'd probably start with some pretty low prior (it's quite concrete, naming a specific technology X, which upper bounds the average probability across technologies at 1 / (the number of such technologies), and humans tend to try not to go extinct). Let's say 0.1%, which feels high for an average X, but maybe AI is really powerful even among powerful technologies.
From that point you update on 1) arguments for technical AI risk and 2) arguments for failure of humanity's response. In my case (the "Them" side), this gets to ~10%, which in log-odds (base 10) is an update of about +2. In contrast, if Buck's position were 50% risk, that would be an update of about +3, and at 90% risk, an update of about +4.
I'm not claiming that I started with some prior and decided on some evidence likelihood and computed the update -- I didn't -- I more want to illustrate that we all agree on the sign of the initial update, and agree that the update should be strong, so I think when arguing against the "Them" position it's important to note that you're arguing for an even stronger update, which requires a correspondingly higher evidence likelihood. When people give me single examples of things going wrong, I feel tempted to say "where do you think the 10% comes from?"
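To spell out the log-odds arithmetic from the first paragraph above:

```python
import math

def log_odds10(p):
    """Log-odds of probability p, base 10."""
    return math.log10(p / (1 - p))

prior = 0.001  # the ~0.1% starting point
for posterior in (0.10, 0.50, 0.90):
    update = log_odds10(posterior) - log_odds10(prior)
    print(f"{posterior:.0%} risk corresponds to an update of about {update:+.1f}")

# 10% risk corresponds to an update of about +2.0
# 50% risk corresponds to an update of about +3.0
# 90% risk corresponds to an update of about +4.0
```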
Them: I think state bioweapons programs are another example of something where nothing very bad has happened.
The 2001 anthrax attacks in the US seemed to come out of US bioweapons stockpiles. It wasn't as bad as them being used in war, but it still seems bad to me.
"very bad" = "a significant fraction of humanity died, or something comparable to that", because we're talking about x-risk, and the key claim is something like "humanity does better at higher-stakes problems".
I also apply this view mainly to accidents and not to conflict -- we didn't prevent World War 2, which killed ~2% of humans. (The reason for the distinction is simple -- to a first approximation, it's in everyone's interest to prevent accidents; it may not be in everyone's interest to prevent conflicts.)
Seems like the hinge is the payoff distributions? Like, there is disagreement about how often there are steep walls and how steep they actually are, as one example of 'ways payoff-distribution intuitions vary'.
In the case of the cold war, the name of the game was deterrence. You don't actually want to launch any nukes, but at the same time, you want to make it look like you are on a hair trigger. Judging from the history, this seems to be what happened. I suspect that there were enough smart people in the decision loop to make this tradeoff fairly well.
Relevant evidence: survey about the impact of COVID on biorisk. I found the qualitative discussion far more useful than the summary table. I think overall the experts are a bit more pessimistic than would be predicted by my model, which is some evidence against my position (though I also think they are more optimistic than would be predicted by Buck's model). Note I'm primarily looking at what they said about natural biorisks, because I see COVID as a warning shot for natural pandemics but not necessarily deliberate ones.
(Similarly, on my model, warning shots of outer alignment failures don't help very much to guard against inner alignment failures.)
Them: Climate change isn’t very important, it’s only going to make the world a few percent worse off.
Me: I agree, but firstly I don’t think politicians know that, and secondly they’re still doing much less than would be optimal.
Politicians get informed about the effects of policy by lobbyists. I think you can generally count on the people making public policy to have all the knowledge that's out there, wherever well-funded lobbyists have an interest in the policy-makers having that knowledge.
Why is climate change not very important?
It's something I am personally very worried about.
If well-funded lobbies make sure that policy-makers are well-informed, then we cannot expect them to be well-informed on the state of the climate and how to adapt. Sustainable development has become a field that is capable of lobbying. It is based on the assumption that the situation is still salvageable and that doing so is compatible with our current economic systems. It is therefore more appealing to policy-makers and easier to invest in than the more radical environmentalist position that the situation is not salvageable and that we should focus on collectively preparing for/adapting to massively disruptive climate change. The more catastrophic views on climate change are anti-capitalist at their core: continued increase in production and consumption, even if done more efficiently, cannot be maintained in the future scenarios they deem most likely, and thus they do not have industry on their side or well-funded lobbyists.
I believe that it is unlikely that policy-makers have grasped the situation accurately.
It is based on the assumption that the situation is still salvageable and that doing so is compatible with our current economic systems.
There are many ways to do geoengineering that would allow us to change the temperature of the Earth as we desire, given how cheap some approaches to geoengineering are and how easily they can be done unilaterally by a single country.
The more catastrophic views on climate change are anti-capitalist at their core.
Basically, there's a political ideology where people who are already anti-capitalist make up scenarios around climate change that are distinct from the scientific predictions. It's just another group of people who don't believe in the science and care more about their politics.
continued increase in production and consumption, even if done more efficiently, cannot be maintained in the future scenarios they deem most likely
Asteroid mining in addition to a Dyson sphere around the sun allows for hundreds of years of increased production and consumption.
When EAs look at the history of nuclear weapons, their reactions tend to fall into two camps.
The first camp (which I am inclined towards) is “Man, what a total mess. There were so many near misses, and people involved did such clearly terrible and risky things like setting up the dead hand system and whatever else. I guess that humans probably can’t be trusted to handle extremely dangerous technology.”
The other camp says “No nuclear weapons have been used or detonated accidentally since 1945. This is the optimal outcome, so I guess this is evidence that humanity is good at handling dangerous technology.”
This mostly comes up because people from the other camp tend to give numbers for the probability of AI x-risk that are 1-10%, and people from my camp tend to give numbers that are like 40%-80%. I think both camps are roughly equally represented among people who work on x-risk prevention, though the optimists have recently been doing a much more thorough job of arguing for their AI x-risk probabilities than the pessimists have.
When I talk to people from the other camp, I often have a conversation that goes like this:
Me: Okay, but what about all these crazy stories from The Doomsday Machine about extreme recklessness and risk?
Them: I don’t trust stories. It’s really hard to know what the actual situation was. The Doomsday Machine is just one book written by an activist who probably isn’t that reliable (eg see his massively exaggerated statements about how dangerous nuclear winter is). There will always be people telling you that something was a disaster. I prefer to look at unambiguous and unbiased evidence. In this particular case, the unbiased, unambiguous questions that we could have bet on in 1945 are things like “How many nuclear weapons will be fired in anger in the next fifty years? How many people will die from nuclear weapons? How many buildings will be destroyed?” And the answer to all of these is zero. Surely you agree that you would have lost money if you’d bet on these with me in 1945?
Me: I agree I would have lost money on that bet. But I still feel that my overall worldview of “people will do wild and reckless things” loses fewer Bayes points than yours does. If we’d bet not just on outcomes but on questions like “will someone build a doomsday machine” or “will countries take X measure to reduce the probability of accidental nuclear war”, I would have won money off you from almost all of those. My worldview would have won most of the bets.
Them: Except for the only bet that is unambiguously connected to the thing we actually care about.
Me: Yeah, but I don’t know if I care about that? Like, maybe I would have assigned 30% to “no nuclear weapons would have been fired”, but it’s not that bad to have something 30% likely happen. Whereas I feel you would have assigned numbers like 5% to a bunch of reckless things that I would have assigned 30% to, which is a much more egregious mistake.
Them: If you read actual writers at the time, like Bertrand Russell, they seem to imply very small probabilities of the outcome which actually happened; I think you’re being a bit overly generous about how well you would have done.
Me: Fair.
Them: I feel like your worldview suggests that way more bad things should have happened as a result of coordination failures than have actually happened. Like, I don’t think there are really examples of very bad things happening as a result of coordination failures.
Me: ...what? What about climate change or state bioweapons programs or the response to covid?
Them: Climate change isn’t very important, it’s only going to make the world a few percent worse off.
Me: I agree, but firstly I don’t think politicians know that, and secondly they’re still doing much less than would be optimal.
Them: I think we’d do better on problems with actual big stakes.
Me: I don’t see any reason to believe that this is true. It doesn’t seem that we did noticeably better on nuclear weapons than on lower-stakes coordination problems.
Them: I think state bioweapons programs are another example of something where nothing very bad has happened.
Me: What about if covid turns out to have been accidentally released from a bioweapons lab?
Them: That will be an update for me.
Me: Why would that be an update? We already know that state bioweapons programs have killed thousands of people with accidental releases, and there's no particular reason that they couldn't cause worse disasters, and that international regulation has failed to control that.
Them: [inaudible. I don’t know how to rephrase the thing that people say at this point in the conversation.]
Me: Do you have any criticisms of me that you want to finish up with?
Them: Yeah. I think you’re overly focused on looking at the worst examples of coordination failures, rather than trying to get a balanced sense of our overall strengths and weaknesses. I also think you’re overly focused on stories where things sound like they should have gone terribly, and you’re updating insufficiently on the fact that for some reason, it always seems to go okay in the end; I think that you should update towards the possibility that you’re just really confused about how dangerous things are.
I feel very confused here.