We (Connor Leahy, Gabriel Alfour, Chris Scammell, Andrea Miotti, Adam Shimi) have just published The Compendium, which brings together in a single place the most important arguments that drive our models of the AGI race, and what we need to do to avoid catastrophe.
We felt that something like this has been missing from the AI conversation. Most of these points have been made before, but there has been no single “comprehensive worldview” doc that ties them together. We’ve tried our best to fill this gap, and welcome feedback and debate about the arguments. The Compendium is a living document, and we’ll keep updating it as we learn more and change our minds.
We would appreciate your feedback, whether or not you agree with us:
- If you do agree with us, please point out where you think the arguments can be made stronger, and contact us if there are ways you’d be interested in collaborating in the future.
- If you disagree with us, please let us know where our argument loses you and which points are the most significant cruxes - we welcome debate.
Here is the Twitter thread and the summary:
The Compendium aims to present a coherent worldview about the extinction risks of artificial general intelligence (AGI), an AI whose intelligence exceeds that of humans, in a way that is accessible to non-technical readers who have no prior knowledge of AI. A reader should come away with an understanding of the current landscape, the race to AGI, and its existential stakes.
AI progress is rapidly converging on building AGI, driven by a brute-force paradigm that is bottlenecked by resources, not insights. Well-resourced, ideologically motivated individuals are driving a corporate race to AGI. They are now backed by Big Tech, and will soon have the support of nations.
People debate whether or not it is possible to build AGI, but most of the discourse is rooted in pseudoscience. Because humanity lacks a formal theory of intelligence, we must operate by the empirical observation that AI capabilities are increasing rapidly, surpassing human benchmarks at an unprecedented pace.
As more and more human tasks are automated, the gap between artificial and human intelligence shrinks. At the point when AI is able to do all of the tasks a human can do on a computer, it will functionally be AGI and able to conduct the same AI research that we can. Should this happen, AGI will quickly scale to superintelligence, and then to levels so powerful that AI is best described as a god compared to humans. Just as humans have catalyzed the Holocene extinction, these systems pose an extinction risk for humanity not because they are malicious, but because we will be powerless to control them as they reshape the world, indifferent to our fate.
Coexisting with such powerful AI requires solving some of the most difficult problems that humanity has ever tackled, which demand Nobel-prize-level breakthroughs, billions or trillions of dollars of investment, and progress in fields that resist scientific understanding. We suspect that we do not have enough time to adequately address these challenges.
Current technical AI safety efforts are not on track to solve this problem, and current AI governance efforts are ill-equipped to stop the race to AGI. Many of these efforts have been co-opted by the very actors racing to AGI, who undermine regulatory efforts, cut corners on safety, and are increasingly stoking nation-state conflict in order to justify racing.
This race is propelled by the belief that AI will bring extreme power to whoever builds it first, and that the primary quest of our era is to build this technology. To survive, humanity must oppose this ideology and the race to AGI, building global governance that is mature enough to develop technology conscientiously and justly. We are far from achieving this goal, but believe it to be possible. We need your help to get there.
To respond to this comment, I'll give a view on why I think coordination might be easier for AIs than for people, and also explain why the invention of AI likely breaks a lot of the social rules we are used to.
For example, one big difference that I think impacts coordination for AIs is that an AI model will likely be able to copy itself millions of times, given current inference scaling, and in particular a fine-tune can be distributed to those millions of copies as though they were a single unit.
This is a huge change for coordination, because humans can't copy themselves into millions of people who share very similar values simply by acquiring more compute.
Merging might also be much easier: it is far simpler to merge and split the weights of two AIs than it is to staple together two human brains.
These alone let you coordinate to an extent we haven't really seen in history, such that it makes more sense to treat millions or billions of AI instances as one unified agent than it does to treat a nation as one unified agent.
To answer this question:
While this argument would indeed be invalid if that were all there was to it, there is a real reason why the current rules of society mostly stop working with AIs, and it comes down to one big issue: human labor losing essentially all of its economic value.
When this happens, you can't rely on the property that the best way to make yourself well off is to make others well off, and indeed the opposite holds if we assume that human labor has net-negative economic value.
The basic reason is that if your labor has zero or negative economic value, then whatever value you hold comes from your land and capital, and there is no economic disincentive, and at least a weak incentive, for others to seize that capital and land to fuel their own growth.
In essence, you can no longer assume that violently stealing property is disincentivized, and a lot of the foundations of comparative advantage, and of our society, stop working once you allow workers that are duplicable and very low cost.
This means that if you survive and still have property, it will be because of alignment with your values, not for economic reasons, because you can no longer rule out bad outcomes like property being stolen through violence via economics alone.
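To make the incentive concrete, here is a minimal back-of-envelope sketch in Python (all numbers are made up for illustration, not figures from anywhere) of why a duplicable, very cheap worker caps human wages near zero, leaving land and capital as the only remaining sources of economic value:

```python
# Toy model (illustrative numbers only) of why comparative advantage stops
# protecting human wages once an equally capable worker can be duplicated
# at low marginal cost.

def max_human_wage(ai_output_per_hour: float,
                   ai_cost_per_hour: float,
                   human_output_per_hour: float) -> float:
    """Upper bound on the hourly wage an employer would pay a human when an
    AI substitute can simply be copied: the wage is capped by what it would
    cost to get the same output from one more AI instance instead."""
    ai_cost_per_unit = ai_cost_per_hour / ai_output_per_hour
    return human_output_per_hour * ai_cost_per_unit

# Assumed numbers: an AI instance produces 10 units/hour for $1/hour of
# compute, while a human produces 2 units/hour.
wage_cap = max_human_wage(ai_output_per_hour=10.0,
                          ai_cost_per_hour=1.0,
                          human_output_per_hour=2.0)
print(f"Wage cap: ${wage_cap:.2f}/hour")  # -> $0.20/hour

# If employing the human carries any fixed overhead above that cap, their
# labor is net-negative to the employer; whatever economic value they retain
# then comes from the land and capital they hold, not from their work.
```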
I like these comments on the subject:
https://www.lesswrong.com/posts/2ujT9renJwdrcBqcE/the-benevolence-of-the-butcher
https://www.lesswrong.com/posts/2ujT9renJwdrcBqcE/the-benevolence-of-the-butcher#BJk8XgpsHEF6mjXNE
To address the human enhancement point: I agree that humans will likely be cognitively and physically enhanced, at a level and pace of change that is genuinely ludicrous compared to the pre-AI-automation era.
However, there are 2 problems that arise here:
1. Most people who work today do so because it's necessary to make a living, not because they intrinsically like work. So by default, in an AI-automation future where a company can choose an AI over a human, and where humans aren't necessary for AI to go well, I'd predict that 80-90%+ of humans would voluntarily remove themselves from the job market over the course of at most 10-20 years.
2. Unless humans mass upload and copy themselves, which is absolutely possible but also plausibly harder than just having AIs do the work, coordination costs would remain a big barrier for humans, because it's way easier for AIs to coordinate productively than for humans: they share basically the same weights and very similar values, since copy/pasting one AI is quite likely to be the strategy used to fill millions of jobs.
To be clear, I'm not stating that humans will remain unchanged; they will change rapidly, just not as fast as AI changes.
Finally, one large reason why human laws become mostly irrelevant is that once you have AIs that can serve in robotic armies and do the automated work, it becomes far too easy either to slowly change the laws until people are ultimately closer to pets in status, or to stage outright revolts. Critically, once AI controls the robotic armies and does all of the economic work, any social system that the human controlling the AI, or the AI itself, opposes is very easy to destroy or remove.