Vote on the arguments here! Copying the structure from the substack polling, here are the four options (using similar reacts).
The question each time is: do you find this argument compelling?
I really wish that I could respond more quantitatively to these. IMO all these arguments are cause for concern, but of substantially different scale.
Perhaps something like this?
"For each argument, does this alone move your odds of an existential catastrophe up to 0.01%, 0.1%, 1%, 10% 50%, 90%, or not at all?"
At the very least it would need to be on a logistic scale (i.e. in odds form). If something carries 2:1 odds (a likelihood ratio of 2), then when I am at 50% that's a movement of roughly 17 percentage points, but when I am at 99% it's only a movement of roughly half a point.
But my guess is even odds are too hard to assign to stuff like this.
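To make the odds-form point above concrete, here is a minimal sketch (Python, using the 2:1 likelihood ratio from the example; the numbers are purely illustrative):

```python
def update(prior: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a prior probability via odds form."""
    prior_odds = prior / (1 - prior)               # probability -> odds
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)   # odds -> probability

for prior in (0.50, 0.99):
    posterior = update(prior, likelihood_ratio=2)
    print(f"prior {prior:.0%} -> posterior {posterior:.1%} "
          f"({100 * (posterior - prior):.1f} point move)")
# prior 50% -> posterior 66.7% (16.7 point move)
# prior 99% -> posterior 99.5% (0.5 point move)
```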
After talking by voice, I believe you misunderstood me as asking "How many points of probability does this argument move you?" whereas I meant "If you had not seen any of the other arguments, and you were presented with this argument, what would your absolute probability of an existential catastrophe from AI be?"
I don't stand by this as the ultimate bestest survey question of all time, and would welcome an improvement.
I'll say I think the alternative of "What likelihood ratio does this argument provide?" (e.g. 2x, 10x, etc) isn't great because I believe the likelihood ratios of all the arguments are not conditionally independent.
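As a toy illustration of that non-independence worry (hypothetical numbers, not anything from the post): two arguments can each look like a 3:1 update on their own, yet their combined likelihood ratio can be far less than 3 × 3 = 9 if they lean on the same underlying evidence.

```python
# Toy model: E1 and E2 are two "arguments" that repackage the same observation,
# so they are perfectly redundant rather than conditionally independent.
p_e1_given_h, p_e1_given_not_h = 0.9, 0.3

lr_e1 = p_e1_given_h / p_e1_given_not_h    # 3.0
lr_e2 = lr_e1                              # also 3.0 (same evidence, restated)

naive_lr = lr_e1 * lr_e2                   # 9.0 -- only valid under conditional independence
true_lr = p_e1_given_h / p_e1_given_not_h  # 3.0 -- E2 adds nothing once E1 is known

print(naive_lr, true_lr)                   # 9.0 3.0
```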
I'm concerned about AI, but none of these arguments are a very good explanation of why, so my vote is for "you're not done finding arguments".
"This argument also appears to apply to human groups such as corporations, so we need an explanation of why those are not an existential risk"
I don't think this is necessary. It seems pretty obvious that (some) corporations could pose an existential risk if left unchecked.
Edit: And depending on your political leanings and concern over the climate, you might agree that they already are posing an existential risk.
What do you think P(doom from corporations) is? I've never heard much worry about current non-AI corps.
I don't think I could give a meaningful number with any degree of confidence. I lack expertise in corporate governance, bio-safety and climate forecasting. Additionally, for the condition to be satisfied that corporations are left "unchecked", there would need to be a dramatic Western political shift, which makes speculating extremely difficult.
I will outline my intuition for why (very large, global) human corporations could pose an existential risk (conditional on the existential risk from AI being negligible and global governance being effectively absent).
1.1 In the last hundred years, we've seen that (some) large corporations are willing to cause harm on a massive scale if it is profitable to do so, either intentionally or through neglect. Note that these decisions are mostly "rational" if your only concern is money.
Copying some of the examples I gave in No Summer Harvest:
1.2 Some corporations have also demonstrated they're willing to cut corners and take risks at the expense of human lives.
2. Without corporate governance, immoral decision-making and risk-taking behaviour could be expected to increase. If the net benefit of taking a harmful action improves because there are fewer repercussions when things go wrong, such actions should reasonably be expected to increase in frequency.
3. In recent decades there has been a trend (at least in the US) towards greater stock market concentration. For large corporations to pose an existential risk, this trend would need to continue until individual decisions made by a small group of corporations can affect the entire world.
I am not able to describe the exact mechanism by which unchecked corporations would pose an existential risk, similar to how the exact mechanism for an AI takeover is still speculation.
You would have a small group of organisations responsible for deciding the production activities of large swaths of the globe. Possible mechanisms include:
I think if you're already sold on the idea that "corporations are risking global extinction through the development of AI" it isn't a giant leap to recognise that corporations could potentially threaten the world via other mechanisms.
Concretely, what does it mean to keep a corporation "in check" and do you think those mechanisms will not be available for AIs?
what does it mean to keep a corporation "in check"
I'm referring to effective corporate governance. Monitoring, anticipating and influencing decisions made by the corporation via a system of incentives and penalties, with the goal of ensuring actions taken by the corporation are not harmful to broader society.
do you think those mechanisms will not be available for AIs
Hopefully, but there are reasons to think that governing a corporation that is controlled (partially or wholly) by AGIs, or that directly controls one or more AGIs, may be very difficult. I will now suggest one reason this is the case, but it isn't the only one.
Recently we've seen that national governments struggle with effectively taxing multinational corporations. This is partly because the amount of money at stake is so great that multinational corporations are incentivized to invest large sums in teams of accountants to reduce their tax burden, or to pay money directly to politicians in the form of donations to shape the legal environment. It becomes harder to govern an entity as that entity invests more resources into finding flaws in your governance strategy.
Once you have the capability to harness general intelligence, you can invest a vast amount of intellectual "resources" into finding loopholes in governance strategies. So while many of the same mechanisms will be available for AIs, there's reason to think they might not be as effective.
The selected counterarguments seem consistently bad to me, like you haven't really been talking to good critics. Here are some selected counterarguments I've noticed that I've found to be especially resilient, in some cases seeming to be stronger than the arguments in favor (but it's hard for me to tell whether they're stronger when I'm embedded in a community that's devoted to promoting the arguments in favor).
2. Humans won’t figure out how to make systems with goals that are compatible with human welfare and realizing human values
This argument was a lot more compelling when natural language understanding seemed hard, but today, pre-AGI AI models have a pretty sophisticated understanding of human motivations, since they're trained to understand the world and how to move in it through human writing, and then productized by working as assistants for humans. This is not very likely to change.
catastrophic tools argument
New technologies that are sufficiently catastrophic to pose an extinction risk may not be feasible soon, even with relatively advanced AI
I don't have a counterargument here, but even if that's all you could find, I don't think it's a good practice to give energy to bad criticism that no one would really stand by. We already have protein predictors. The idea that AI won't have applications in bioweapons is very, very easy to dismiss. Autonomous weapons and logistics are another obvious application.
Powerful black boxes
Which is to say
multi-agent dynamics
I'm surprised that the least compelling argument here is Expert opinion.
Anyone want to explain to me why they dislike that one? It looks obviously good to me?
Depends on what exactly one means by "experts", but at least historically, expert opinion defined as "experts in AI" seems to me to have performed pretty terribly. They mostly dismissed AGI happening, had timelines that often seemed transparently absurd, and their predictions were extremely framing-dependent (the central result from the AI Impacts expert surveys is IMO that experts give timelines that differ by 20 years if you just slightly change the wording of how you are eliciting their probabilities).
Like, 5 years ago you could construct compelling arguments of near expert consensus against risks from AI. So clearly arguments today can't be that much more robust, unless you have a specific story for why expert beliefs are now a lot smarter.
Sure, but experts could have failed to agree that AI is quite risky, and they do agree. This is important evidence in favour, especially to the extent they aren't your ingroup.
I'm not saying people should consider it a top argument, but I'm surprised how it falls on the ranking.
AI isn't dangerous because of what experts think, and the arguments that persuaded the experts themselves are not "experts think this". It would have been a misleading argument for Eliezer in 2000, who was among the first people to think about it in the modern way, or for people who weren't already rats in maybe 2017, before GPT was in the news and when AI x-risk was very niche.
I also have objections to its usefulness as an argument; "experts think this" doesn't give me any inside view of the problem by which I can come up with novel solutions that the experts haven't thought of. I think this especially comes up if the solutions might be precise or extreme; if I was an alignment researcher, "experts think this" would tell me nothing about what math I should be writing, and if I was a politician, "experts think this" would be less likely to get me to come up with solutions that I think would work rather than solutions that are compromising between the experts coalition and my other constituents.
So, while it is evidence (experts aren't anticorrelated with the truth), there's better reasoning available that's more entangled with the truth and gives more precise answers.
Expert opinion is an argument for people who are not themselves particularly informed about the topic. For everyone else, it basically turns into an authority fallacy.
It would be interesting to see which arguments the public and policymakers find most and least concerning.
Note: it might be that I'm not well-informed enough on this particular initiative. However, at this point I'm still often confused and pessimistic about most communication efforts about AI Risk. That confusion is usually caused by the content covered and the style with which it is covered, and the style and content here do not seem to veer far from what I typically identify as a failure mode.
I imagine that your focus demographic is not lesswrong or people who are already following you on twitter.
I feel confused.
Why test your message there? What is your focus demographic? Do you have a focus group? Do you plan to test your content in the wild? Have you interviewed focus groups that expressed interest and engagement with the content?
In other words, are you following well-grounded advice on risk communication? If not, why not?
A utilitarian, a deep ecologist, and a Christian might agree on policy in the present world, but given arbitrary power their preferred futures might be a radical loss to the others. <...> People who broadly agree on good outcomes within the current world may, given much more power, choose outcomes that others would consider catastrophic
I think that many people do not intend for their preferred policy to be implemented everywhere, so at least they could be satisfied with a small region of the universe. Though AI-as-tool is quite likely to be created under the control of those who want every power source and (an assumption) thus also want to steer most of the world; it's unclear whether AI-as-agent would have strong preferences about the parts of the world it doesn't see.
"Everyone who only cares about their slices of the world coordinates against those who want to seize control of the entire world" seems like it might be one of those stable equilibria.
Which is why, since the beginning of the nuclear age, the running theme of international relations is "a single nation embarked on multiple highly destructive wars of conquest, and continued along those lines until no nations that could threaten it remained".
Humans won’t figure out how to make systems with goals that are compatible with human welfare and realizing human values
This is a very interesting risk, but in my opinion an overinflated one. I feel that goals without motivations, desires or feelings are simply a means to an end. I don't see why we wouldn't be able to make programmed initiatives in our systems that are compatible with human values.
This is a snapshot of a new page on the AI Impacts Wiki.
We’ve made a list of arguments[1] that AI poses an existential risk to humanity. We’d love to hear how you feel about them in the comments and polls.
Competent non-aligned agents
Summary:
Selected counterarguments:
People who have favorably discussed[2] this argument (specific quotes here): Paul Christiano (2021), Ajeya Cotra (2023), Eliezer Yudkowsky (2024), Nick Bostrom (2014[3]).
See also: Full wiki page on the competent non-aligned agents argument
Second species argument
Summary:
Selected counterarguments:
People who have favorably discussed this argument (specific quotes here): Joe Carlsmith (2024), Richard Ngo (2020), Stuart Russell (2020[4]), Nick Bostrom (2015).
See also: Full wiki page on the second species argument
Loss of control via inferiority
Summary:
Selected counterarguments:
People who have favorably discussed this argument (specific quotes here): Paul Christiano (2014), Ajeya Cotra (2023), Richard Ngo (2024).
See also: Full wiki page on loss of control via inferiority
Loss of control via speed
Summary:
Selected counterarguments:
People who have favorably discussed this argument (specific quotes here): Joe Carlsmith (2021).
See also: Full wiki page on loss of control via speed
Human non-alignment
Summary:
Selected counterarguments:
People who have favorably discussed this argument (specific quotes here): Joe Carlsmith (2024), Katja Grace (2022), Scott Alexander (2018).
See also: Full wiki page on the human non-alignment argument
Catastrophic tools
Summary:
Selected counterarguments:
People who have favorably discussed this argument (specific quotes here): Dario Amodei (2023), Holden Karnofsky (2016), Yoshua Bengio (2024).
See also: Full wiki page on the catastrophic tools argument
Powerful black boxes
Summary:
Selected counterarguments:
See also: Full wiki page on the powerful black boxes argument
Multi-agent dynamics
Summary:
Selected counterarguments:
People who have favorably discussed this argument (specific quotes here): Robin Hanson (2001)
See also: Full wiki page on the multi-agent dynamics argument
Large impacts
Summary:
Selected counterarguments:
People who have favorably discussed this argument (specific quotes here): Richard Ngo (2019)
See also: Full wiki page on the large impacts argument
Expert opinion
Summary:
Selected counterarguments:
This is a snapshot of an AI Impacts wiki page. For an up-to-date version, see the wiki.
Each 'argument' here is intended to be a different line of reasoning; however, they are often not pointing to independent scenarios or using independent evidence. Some arguments attempt to reason about the same causal pathway to the same catastrophic scenarios, but rely on different concepts. Furthermore, 'line of reasoning' is a vague construct, and different people may consider different arguments here to be equivalent, for instance depending on what other assumptions they make or on the relationships between their understandings of the concepts.
Nathan Young puts 80% probability on the claim that, at the time of the quote, the individual would have endorsed the respective argument. They may endorse it whilst considering another argument stronger or more complete.
Superintelligence, Chapter 8
Human Compatible: Artificial Intelligence and the Problem of Control