Epistemic Status: Exploratory

I wrote this as part of an application for the Chicago Symposium on Transformative AI, where I try to sketch out what takeoff might look like. I’m making a lot of claims across a range of domains, so I’d expect there to be many places where I’m wrong. But on the whole, I hope this is more well-thought-out than not.[1]
Many thanks to Nikola Jurković & Tao Burga for thoughtful comments on this writeup. Any errors are, of course, my own.
Prompt: Please outline an AGI takeoff scenario in detail, noting your key uncertainties
What will AGI look like?
My biggest uncertainty is about near-term upper-bound capabilities: there is a vast difference between ‘models that can automate 90% of white-collar labor’ and EY-esque Bayesian hyperoptimizers.
If it’s closer to the latter, I think we’re doomed. Why? Tl;dr: too much optimization pressure is bad, and if we crank it up high enough, we Goodhart ourselves out of existence.[2]
That’s not a very nice scenario to analyze. So I’m focusing on worlds in which we achieve the former & not the latter.[3]
These would likely have a similar architecture to frontier models today (i.e. LLMs with reasoning), especially if we reach AGI in the next ~3 years.
Why? I suspect there are many different architectures with which we could theoretically reach AGI. There’s a lot of momentum behind the current paradigm, so I’d expect progress to come from variations on top of the existing architectures.
But if we don’t reach AGI by then, I’m uncertain about which architecture would get us there. If progress slows, I’d expect the momentum behind LLMs to cool off. Working on alternative approaches would become more attractive, and I’d expect we’d eventually stumble onto another breakthrough like the transformer.
Who will make it?
Probably closed-source, Western, US-based labs.
This is based on scale + precedent: the overwhelming majority of recent innovations have come from US labs, which also have the lion’s share of funding & access to talent.
They’ve got more access to compute – due to both greater funding and export controls.
There might well be another DeepSeek making big strides. But it’s unlikely that lower-resourced companies would be able to leverage this progress into a ‘path to AGI’. What seems likelier is that existing labs swoop in and piggyback off these improvements.
They’re also likely to be more secretive about innovations, so public perception of progress is probably lagging.
The winner will have a lead on the order of months.
I.e. as opposed to multiple actors converging on AGI simultaneously. Why? I think it’s far likelier that we ‘stumble onto’ AGI, given that progress has often been empirically (rather than theoretically) driven.[4]
Concrete example? OpenAI publicly released o1-preview in September 2024; other labs’ reasoning models only started coming out from November onwards.
Unless, of course, we have some big open-source discoveries (e.g. the DeepSeek release) that open up very obvious paths to the top. Even in that case, due to variation in approaches and long training times, I think the assumption still holds – but there would be significantly more uncertainty.
When will they make it?
I originally expected this to happen within the next ~3 years, but I’ve grown more uncertain about that; I’d now put roughly 1:2 odds on it (i.e. ~33%).
I’ve updated away from 3-year timelines with the release of GPT-4.5. Primarily, it seems like we’re getting diminishing returns from pre-training compute.[5] (A rough numerical sketch of what this looks like is included below.)
That being said, there are a lot of smart people pushing capabilities right now, and I think it’s likely they’ll find some way around the pre-training scaling wall. (Concrete example? The new paradigm of reasoning models, which raised the capability ceiling just as pre-training gains were slowing.)
For one, DeepSeek made a lot of efficiency gains and open-sourced their models and much of their research. Other labs will catch up with these gains shortly. While this might not raise the ceiling of capabilities, it will make ‘access to intelligence’ much cheaper, and likely speed up internal progress at labs.[6]
Also, RL is incredibly powerful, and we’re only starting to harness it with today’s reasoning models. It doesn’t seem like we’ve got a principled understanding of how to do RL well, and it seems like there’s a fairly high ceiling here.[7]
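To make the ‘diminishing returns from pre-training compute’ point above more concrete, here is a minimal numerical sketch. It assumes a Chinchilla-style power law relating loss to training compute; the constants (E, k, gamma) and the compute range are illustrative assumptions, not values fitted to any real model.

```python
# Minimal sketch of why power-law scaling implies diminishing returns.
# All constants are illustrative assumptions, NOT fitted to any real model.

E = 1.7        # assumed irreducible loss
k = 8.0        # assumed scale coefficient
gamma = 0.07   # assumed (small) compute exponent, roughly the regime reported for scaling laws

def loss(compute_flops: float) -> float:
    """Chinchilla-style compute-optimal loss curve: L(C) = E + k * C^(-gamma)."""
    return E + k * compute_flops ** (-gamma)

prev = None
for exp in range(22, 28):  # an illustrative range of training budgets, 1e22..1e27 FLOP
    L = loss(10.0 ** exp)
    if prev is None:
        print(f"1e{exp} FLOP: loss ~ {L:.3f}")
    else:
        print(f"1e{exp} FLOP: loss ~ {L:.3f}  (gain from the last 10x: {prev - L:.3f})")
    prev = L
```

Each additional 10x of compute buys a smaller absolute loss reduction; if something like this shape holds, it is consistent with the GPT-4.5 observation above.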
There is also a decent chance we don’t have any warning signs before this takes place.
There’s very high secrecy at major AI labs, so it’s reasonable to assume we’d get little notice about what’s going on internally.
I’m also fairly uncertain about this claim for multiple reasons. Maybe one concerned insider could raise the alarm? Maybe the government exercises oversight? Maybe even the model weights get exfiltrated – either autonomously, or courtesy of a malicious actor? Multiple ways we could have some warning signs, but I also wouldn’t be too surprised if there weren’t any.
Also, if models start being used solely as ‘intermediate systems’ to produce synthetic data, it’s unlikely that we’ll see much public-facing output!
Takeoff itself
There’ll be a lot of internal churn – the lab will likely use its AGI for (some degree of) recursive self-improvement.
Assuming our AGI-achieving actor is a Western, private lab, its primary goal would be to secure a decisive advantage & ensure nobody else is able to threaten its position.
This would likely involve the lab putting itself, to some degree, beyond the reach of the competition.
This is a particularly risky phase! Labs would want to seal their advantage, which might push them to trade away safety for speed.
There will probably be some sign that AGI has been achieved before the lab is ready to publicly announce it.
It seems like internal compartmentalization will be hard.[8] Once enough employees get wind of it, it’s likely that one of them will blow the whistle.
There might also be other signs of increased intensity – later-than-usual hours, public absence of top researchers, etc.[9]
This, however, is conditional on employees being mostly in the loop by then. It might well be that a lot of this work gets automated, requiring little human contact with these systems.
What comes next?
Government takeover likely
I’d expect there to be warning shots between now and the development of AGI. There’s a lot of time between now and then, with models only getting increasingly capable. A single unilateral bad actor could shift the Overton window.
Government control might weirdly even fit with the administration’s current narrative? Once “we’ve beaten China” in the race (with the help of deregulation), the government can swoop in and take charge of these systems to protect American interests.
In fact, even the lab might be incentivized to support limited regulation – conditional on the government intervening and preventing others from developing AGI as well. After all, it’s to their advantage to be the sole actor in charge of the model.
Foreign actors stealing model weights
There doesn’t seem to be much concrete progress toward providing government-level infosec to AI labs today.
As mentioned above, I think it’s likely that one of these actors ‘stumbles upon’ AGI, and it seems tough to retrofit good infosec after the fact.
Given the benefits of AGI, it’d be foolish for foreign actors not to at least try to exfiltrate model weights.
Biggest uncertainty? Exfiltrating weights might be tougher than I’m anticipating – e.g. you could have bandwidth limits on the servers to ensure large quantities of data (i.e. the incompressible weights) aren’t being transmitted.
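To gesture at why egress limits could plausibly help here, a quick back-of-the-envelope sketch follows. Every number in it (weights size, link speed, cap levels) is an assumption for illustration, not a claim about any actual lab’s infrastructure.

```python
# Back-of-the-envelope: how long would sustained exfiltration of a frontier
# model's weights take at different egress rates? All figures are illustrative
# assumptions, not real numbers from any lab.

WEIGHTS_BYTES = 2e12  # assume ~2 TB of weights (e.g. ~1T params at 16 bits each)

scenarios = {
    "uncapped datacenter link (~10 Gb/s)": 10e9 / 8,  # bytes per second
    "modest egress cap (~10 MB/s)": 10e6,
    "aggressive egress cap (~1 MB/s)": 1e6,
}

for name, bytes_per_sec in scenarios.items():
    hours = WEIGHTS_BYTES / bytes_per_sec / 3600
    print(f"{name}: ~{hours:,.1f} hours (~{hours / 24:,.1f} days) of sustained transfer")
```

On these made-up numbers, an uncapped link lets the weights leave in well under an hour, while an aggressive cap stretches the transfer into weeks of sustained, anomalous traffic that monitoring has a real chance of catching. Whether caps that tight are compatible with normal operations is part of the uncertainty flagged above.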
A pretty big uncertainty here would be the role Elon Musk takes.
It seems unlikely that he’d just sit back and watch somebody else achieve AGI (unless it’s xAI doing so).
He might push for a government takeover so that he can have some more control?
Footnotes

[1] First time posting here – any feedback is appreciated!
[2] Or at least, that’s the default outcome with our progress as it stands today. I think that’s what folks are worried about with the sharp left turn.
[3] Or at least, there’s some moderate amount of time before we go from labor-automators to paperclip-maximizers.
[4] This isn’t a claim about when this would happen / how much compute would be required / etc. But it will likely be unexpected, whenever it may be.
[5] I think this post raises some good points.
[6] I think Gwern had a good take on this.
[7] I’m uncertain about this claim. I don’t have in-depth knowledge, but this is my impression.
[8] Labs still have to run evals / implement safety measures / etc – i.e. perform tasks that need contact time with the system.
[9] I don’t have many plausible examples of these, but they seem pretty likely on my inner sim. Not very confident about this point, and would love to hear other thoughts.