The number of people who actually had the deep technical skills and knowledge to evaluate the risk of igniting the atmosphere with nuclear weapons was very small, and it overlapped almost completely with the people developing those very weapons of mass destruction that were the source of the risk.
The number of people who have the deep technical knowledge, skills, and talent needed to correctly evaluate the actual risk of AGI doom is probably small too, and it probably and necessarily overlaps with the people most capable of creating it.
I'm curious how this fits into the context. Regardless of whether or not one believes it's true, doesn't it seem reasonable and intuitively right, i.e. the opposite of what was asked for?
Historical examples of things that once sounded ridiculous but turned out to be true:
It's harder to know what qualifies as false examples since they do (now) have good counterarguments, but maybe something like these:
Examples of ideas with less certain status:
Does a spherical Earth count? I couldn't find any sources saying the idea was seen as ridiculous, especially around the time it was actually discovered to be round via physical measurements.
Maybe the risk of nuclear war during the Cold War? This 1961 argument by Bertrand Russell is cogent enough to sound correct; it would probably have sounded like a pretty wild prediction to most people; and, all things considered, it was indeed kinda false, though modest steps along the lines of option 3 did happen (we didn't get a literal political union, but we did get the fall of the Iron Curtain, globalization entangling the world in mutual economic interest, and the UN). Anyone committing the status quo fallacy on gut instinct and betting against Russell there would have won the bet.
Yeah that definitely seems very analogous to the current AI x-risk discourse! I especially like the part where he says the UN won't work:
Any pretended universal authority to which both sides can agree, as things stand, is bound to be a sham, like UN.
Do you know what the counter-arguments were like? I couldn't even find any.
Oh, but one weakness is that this example has anthropic shadow. It would be stronger if there was an example where "has a similar argument structure to AI x-risk, but does not involve x-risk".
So like a strong negative example would be something where we survive if the argument is correct, but the argument turns out false anyways.
That being said, this example is still pretty good. In a world where strong arguments are never wrong, we don't observe Russell's argument at all.
That Malthusianism is wrong (for predicting the future). Prior to the demographic transition, arguments in favor of this view could be basically summarized as "somehow it will be fine".
Doom skeptics have the daunting task of trying to prove a negative. It is very hard to conclusively prove anything is safe in a generalized, unconditional way.
As for arguments that were seen as intuitively false yet were solid, there is a huge class of human-rights examples: racial and gender equality, gay rights, even going back to giving peasants the vote and to democratic government. All of these once seemed intuitively ridiculous, yet it's hard to make credible counter-arguments against them.
There were some counter-arguments against democracy that seemed pretty good. Even the founding fathers were deeply worried about them. They aren't seen as credible today, but back then they were viewed as quite strong.
Biological evolution is the origin of species.
Simple mathematical laws govern the behavior of everything, everywhere, all the time.
Negative, imaginary, irrational, and transcendental numbers, as well as infinite sets and infinitesimals, are logically coherent and useful.
(Many of the above are things many people still don't believe).
The Roman Empire is large, powerful, and well organized, and will have no problem dealing with the new Christian religion.
Did you notice? You can take each of these statements and make the case that it's stupid, for one reasonable set of priors, or profound, for a different but still plausible set of priors. And the same is true of each negation, and indeed of several forms of negation. Bohr called that a deep truth, or something like that.
Not an answer to your question, but I think there are plenty of good counterarguments against doom. A few examples:
To clarify, I'm thinking mostly about the strength of the strongest counter-argument, not the quantity of counter-arguments.
But yes, what counts as a strong argument is a bit subjective and lies on a continuum. I wrote this post because none of the counter-arguments I know of are strong enough to be "strong" by my standards.
Personally, my strongest counter-argument is "humanity actually will recognize the x-risk in time to take alignment seriously, delaying the development of ASI if necessary", but even that isn't backed by much evidence (the only previous example I know of is when we avoided nuclear holocaust).
What do you think are the strongest arguments in that list, and why are they weaker than a vague "oh maybe we'll figure it out"?
Hmm, Where I agree and disagree with Eliezer actually has some pretty decent counter-arguments, at least in the sense of making things less certain.
However, I still think that there's a problem of "the NN writes a more traditional AGI that is capable of foom and runs it".
The strongest argument against AI doom I can imagine runs as follows:
AI can kill all humans for two main reasons: (a) to prevent a threat to itself, and (b) to get humans' atoms.
But:
(a)
AI will not kill humans to remove a threat before it creates powerful human-independent infrastructure (nanotech), because in that case it would run out of electricity, etc.
AI will also not kill humans after it creates nanotech, because by then we cannot destroy the nanotech (even with nukes), so we are no longer a threat.
Thus, AI will not kill humans to prevent a threat either before or after nanotech, so this will never happen for this reason.
(b)
Human atoms constitute roughly 10^-24 of all atoms in the Solar System (a rough back-of-envelope check is sketched after this argument).
Humans may have small instrumental value for trade with aliens, for some kinds of work or as training data sources.
Even a small instrumental value of humans will be larger than the value of their atoms, since the value of the atoms is vanishingly small.
Humans will not be killed for atoms.
Thus humans will not be killed either as a threat or for atoms.
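As a sanity check on the "atoms" premise, here is a rough back-of-envelope estimate; the population figure, the atoms-per-person figure, and the assumption that the Sun (mean atomic mass around 1.3 amu, mostly hydrogen plus helium) dominates the Solar System's atom count are my own assumptions, not the original commenter's:

$$N_{\text{human}} \approx 8\times 10^{9}\ \text{people} \times 7\times 10^{27}\ \tfrac{\text{atoms}}{\text{person}} \approx 6\times 10^{37}\ \text{atoms}$$

$$N_{\text{Sun}} \approx \frac{2\times 10^{30}\ \text{kg}}{2.2\times 10^{-27}\ \text{kg per atom}} \approx 10^{57}\ \text{atoms}$$

$$\frac{N_{\text{human}}}{N_{\text{Sun}}} \approx 6\times 10^{-20}$$

The exact exponent depends on the assumptions (this sketch lands nearer 10^-19 than 10^-24), but either way the fraction is vanishingly small, which is all the argument needs.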
But there are other ways an AI catastrophe can kill everybody: a wrongly aligned AI performs wireheading, a singleton halts, or there is a war between several AIs. None of these risks is a necessary outcome, but together they carry high probability mass.
Other reasons to kill humans:
Also, the AI may destroy human civilization without exterminating all humans, e.g. by taking away most of our resources. If civilization collapses because the cities and factories are taken over by robots, most humans will starve to death, but maybe 100,000 will survive in various forests as hunter-gatherers, with no chance to ever develop civilization again... that's also quite bad.
It all collapses to point (b): "atoms utility" vs. "human instrumental utility." Preventing starvation or pollution effects for a large group of humans is relatively cheap; just put them all on a large space station, maybe 1 km long.
But disempowerment of humanity, and maybe even the destruction of Earth, are far more likely. Even if we get a small galactic empire of 1000 stars but live there as pets, devoid of any power over the Universe's future, that is not a very good outcome.
These don't seem like very relevant counterarguments; I think literally all of them are from people who believe that AGI is an extinction-level threat soon facing our civilization.
Perhaps you mean ">50% chance of extinction-level bad outcomes", but I think the relevant alternative viewpoint that would calm someone is not that the probability is only 20% or something, but rather "this is not an extinction-level threat and we don't need to be worried about it", for which I have seen no good argument (that engages seriously with any misalignment concerns).
Well, I was asking because I found Yudkowsky's model of AI doom far more complete than any other model of the long-term consequences of AI. So the point of my original question is "how frequently is a model that is far more complete than its competitors wrong?".
But yeah, even something as low as a 1% chance of doom demands a very large amount of attention from the human race (similar to the amount of attention we assigned to the possibility of nuclear war).
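To make the scale concrete, here is a minimal expected-value sketch; the roughly 8 billion population figure is my assumption, and "doom" is treated simply as everyone dying, setting aside the further loss of all future generations:

$$0.01 \times 8\times 10^{9}\ \text{people} \approx 8\times 10^{7}\ \text{expected deaths},$$

which is on the order of the death toll of World War II.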
(That said, I do think the specific value of p(doom) is very important when deciding which actions to take, because it affects the strategic considerations in the play to your outs post.)
there don't seem to be very good counter-arguments
you might be mindkilled by the local extinctionists. There are plenty of good arguments (not necessarily correct or universally persuasive), at least good enough to leave you very much uncertain of the validity of either side, unless you spend thousands of hours diving into it professionally.
you might be mindkilled by the local extinctionists
+1 (agreement) and -1 (tone): you are correct that good arguments against doom exist, but this way of writing about it feels to me unnecessarily mean to both the OP and the 'extinctionists'.
I think (anti-)extinctionism is a better term than notkilleveryoneism. I agree that "mindkill" has a somewhat stronger connotation than necessary, though. My annoyance with the one-sided, Eliezer-matching view of many of the newbie posters comes through.
I don't see very good arguments that cover all scenarios. I see good arguments against a Yud-style fast takeoff and FOOM with nanites, but that's just a slice of the existential risk pie.
Neither do I, but that's not a problem, because there's no reason why complete extinction and 0% risk are the only possibilities.
I think one of the biggest reasons I am worried about AI doom is that there don't seem to be very good counter-arguments. Most of them are just bad, and the ones that aren't are a bit vague (usually something along the lines of "humanity will figure it out" or "maybe LLMs will scale in nice ways").
However, I'm curious how accurate this heuristic is. My question: what examples are there in the past of "the argument is widely seen as ridiculous and intuitively false, but the argument is pretty solid and the counter-arguments aren't"? (Sorry if that's a bit vague; use your best judgement. I'm looking specifically for examples that are similar to the AI x-risk debate.) And did they turn out true or false? Try to include reasons why the argument was so strong, a typical counter-argument, and the strongest counter-argument.
Please use spoiler text for the final answer so that I can try to predict it before seeing it!