This is not going to be a popular post here, but I wanted to articulate precisely why I have a very low pDoom (2-20%) compared to most people on LessWrong.
Every argument I am aware of for pDoom fits into one of two categories: bad or weak.
Bad arguments make a long list of claims, most of which have no evidence and some of which are obviously wrong. The canonical example is A List of Lethalities: there is no attempt to organize the list into a single logical argument, and it is built on many assumptions (analogies to human evolution, an assumption of fast takeoff, AI opaqueness) which are in conflict with reality.
Weak arguments go like this: "AGI will be powerful. Powerful systems can do unpredictable things. Therefore AGI could doom us all." Examples of these arguments include each of the arguments on this list.
So the line of reasoning I follow is something like this:
- I start with a very low prior of AGI doom (for the purpose of this discussion, assume I defer to consensus).
- I then completely ignore the bad arguments.
- Finally, I grant one bit of evidence collectively to the weak arguments (I don't consider them independent; most are just rephrasings of the example argument).
So even if I assume no one betting on Manifold has ever heard of the argument "AGI might be bad actually", that additional bit of evidence only takes me from 13% to roughly 23%.
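For concreteness, here is that update in odds form (one bit of evidence = a likelihood ratio of 2, starting from the ~13% consensus prior):

$$\frac{0.13}{1-0.13} \approx 0.149 \;\xrightarrow{\ \times 2\ }\; 0.299, \qquad p_{\text{posterior}} = \frac{0.299}{1+0.299} \approx 0.23.$$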
In the comments: if you wish to convince me, please propose arguments that are neither bad nor weak. Please do not argue that I am using the wrong base rate, or that the examples I have already given are, in fact, neither bad nor weak.
EDIT:
There seems to be a lot of confusion about this, so I thought I should clarify what I mean by a "strong good argument".
Suppose you have a strongly-held opinion, and that opinion disagrees with the expert consensus (in this case, the Manifold market and expert surveys showing that most AI experts predict a low probability of AGI killing us all). If you want to convince me to share your beliefs, you should have a strong good argument for why I should change my beliefs.
A strong good argument has the following properties:
- it is logically simple (can be stated in a sentence or two)
  - This is important because the longer your argument, the more details have to be true, and the more likely it is that you have made a mistake. Outside the realm of pure mathematics, it is rare for an argument that chains together multiple "therefore"s not to get swamped by the compounding chance that at least one link is wrong (see the quick calculation after this list).
- Each of the claims in the argument is either self-evidently true, or backed by evidence.
  - An example of a claim that is self-evidently true: if AGI exists, it will be made out of atoms.
  - An example of a claim that is not self-evidently true: if AGI exists, it will not share any human values.
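To put a toy number on the "chained therefores" point (my own illustration, assuming the steps are independent): if an argument requires $n$ steps and you believe each step with probability $p$, the whole chain holds with probability

$$P(\text{chain holds}) = p^{\,n}, \qquad \text{e.g.}\quad 0.9^{10} \approx 0.35.$$

Ten steps at 90% confidence each already leave the conclusion more likely false than true.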
To give an example completely unrelated to AGI: the expert consensus is that nuclear power is more expensive to build and maintain than solar power.
However, I believe this consensus is wrong, because the cost of nuclear power is artificially inflated by regulation which mandates that nuclear be "as safe as possible", thereby guaranteeing that nuclear power can never be cheaper than other forms of power (which do not face similar mandates).
Notice that even if you disagree with my conclusion, we can now have a discussion about evidence. You might ask, for example: "What fraction of nuclear power's cost is driven by regulation?" "Are there any countries that have built nuclear power for less than the prevailing cost in the USA?" "What is an acceptable level of safety for nuclear power plants?"
I should also probably clarify why I consider "long lists" bad arguments (and ignore them completely).
If you have 1 argument, it's easy for me to examine the argument on its merits so I can decide whether it's valid/backed by evidence/etc.
If you have 100 arguments, the easiest thing for me to do is to ignore them completely and come up with 100 arguments for the opposite point. Humans are incredibly prone to cherry-picking and only noticing arguments that support their point of view. I have absolutely no reason to believe that you the reader have somehow avoided all this and done a proper average over all possible arguments. The correct way to do such an average is to survey a large number of experts or use a prediction market, not whatever method you have settled upon.
Right now, every powerful intelligence (e.g. a nation-state) is built out of humans, so the only way for such organizations to thrive is to make sure their constituent humans thrive, for instance by ensuring food, clean air, and access to accurate information.
AI is going to loosen this default pull towards human thriving. If we are limited to reflex-based tool AIs like current LLMs, we'll probably make it through just fine, but if we start doing wild adversarial searches that combine tons of these tool-like activities into something very powerful and autonomous, those searches can determine ~everything about the world. Unless all winners of such searches actively promote human thriving instead of just getting rid of humanity or exploiting us for raw resources, we're doomed.
There are lots of places where we'd expect adversarial searches to be incentivized, most notably war/national security:
The current situation for war/national security is already super precarious due to nukes, and I tend to reason from the assumption that if a nuke is ever used again, that's going to be the end of society. I don't know whether that assumption is true, but insofar as it is reasonable, it becomes natural to think of AI weapons as Nuke 2.0.
The situation with nukes suggests that maybe sometimes we can have an indefinite holdoff on using certain methods, but again it already seems quite precarious here, and it's unclear how to generalize this to other cases. For instance, outlawing propaganda would seem to interfere with free speech, and enforcing such laws without tyrannizing people who are merely confused seems subtle.
So a plausible model, it seems to me, is this: people are gradually developing ways of integrating computers with the physical world, by giving them deeper knowledge of how the world works and more effective routines for handling small tasks. Right now, this interface is very brittle and breaks down when even slight pressure is applied to it, but as it gets more robust and well-understood, it becomes more and more feasible to run searches over it to find more powerful activities. In non-adversarial circumstances, such searches don't have to ensure robustness or completeness and thus can just "do the thing" you're asking them to; but in adversarial circumstances, the adversaries will exploit your weaknesses, so you actually have to do it in a way that resembles a dangerous utility-maximizer.