I think it's somewhat likely there's no speedup even with R&D automation, if that automation happens through scaling of LLMs with current methods and there's no breakthrough that lets LLMs learn deep skills faster than they build next versions of models (using the current cookbook, updating deep skills primarily via RLVR). If they can't learn quickly, and don't invent a method for learning quickly, then being very fast at reasoning doesn't help, because they can only reason with the deep skills they have in the current version of the model. Those skills only update in the next version, which takes significant time to build even if the current version of the LLMs produces it autonomously (and general learning is possible this way). This still means RSI and AGI in the central senses of the terms, but not a takeoff (or superintelligence).
Furthermore, whatever current speedup AI R&D might be experiencing will mostly go away once scaling of compute stops being as rapid, and this slowdown still happens after the slow-learning automation of R&D with LLMs. The low-hanging fruit enabled by more ambitious (compute-intensive) experiments will be picked, and less straightforward research will proceed at more or less the usual background pace (primarily modified by more people working on AI), because the slow-learning RSI process of automated LLM-building doesn't contribute to it in a crucial way, and so Amdahl's law reasserts the low speed of research progress in the longer term (if humans still learn novel ideas faster, in the course of solving the kinds of problems that take humanity an unpredictable number of years).
Takeoff is still possible in this scenario at any moment (upon invention of a method for learning deep skills quickly), but it doesn't happen as a result of some predictable AI-driven process of progress-grinding, other than via a slight extension in the current period of rapid compute scaling as a result of a larger TAM becoming immediately accessible if LLMs go into the slow-learning RSI. Another unknown is how smart the maximally scaled LLMs get (say, 100T active params, 2,500T total params, which is feasible in 2030 at the cost per token that GPT-4.5 had in 2025, though the compute for pretraining such models might only arrive a bit later). It doesn't seem like LLMs are likely to get very far beyond human genius level (if they are merely scaled using the current cookbook; it's plausible they don't even reach that level). And the hobbling of being unable to quickly learn deep ideas seems sufficiently crippling that they don't necessarily contribute to the speed of progress in non-routine research substantially (while the more routine kind of research soon runs out of the low-hanging fruit). So the AIs have a shot at solving fast learning for AIs, but not a prospect of predictable progress towards it (if it doesn't happen right away). And the higher annual probability of starting a takeoff mostly goes away after rapid scaling ends (though that probability is still substantial, I'd give 50% for a takeoff starting by 2032-2033, meaning an AGI that learns faster than human researchers and starts actually accelerating all the non-routine aspects of R&D progress a lot).
It sounds to me like you are describing a world where AIs don't fully automate AI R&D but can maybe 98% automate AI R&D. As in, these AIs aren't able to automate the task of generally making AIs smarter and finding new paradigms/methods, but can automate 99%+ of things within the current paradigm / approach. I tend to think the true extremes/limits of the current paradigm would be very extreme (even if weak in some ways) such that you'd be able to bootstrap if you actually reached this extreme. I also think that getting to full automation of the current paradigm would effectively require AIs to at least be OK at figuring out significant advances in general. (In the same way as I think that for AIs to be able to automate the job of research engineer at AI companies, they'd probably need to be at least OK at automating most SWE jobs in general.)
No, the AIs do fully automate R&D, AI and otherwise. But the speed with which they do R&D depends not just on the speed of token generation, but also on the speed at which they learn deep skills, and the latter is much lower for LLMs built with the current methods (they only learn deep skills in new model releases).
Token generation speed gives an anchor of maybe 200x serial speedup compared to humans, plus very scalable parallel labor, minus real world constraints from needing experimental feedback (which don't even apply to some forms of theory). But the need to learn deep skills makes this anchor misleading for the forms of labor that require many serial steps of learning new things that couldn't be all learned in advance (in parallel). This is mostly R&D, and there the anchor of massive speedup just doesn't apply, even though the work does get fully automated. Thus this is AGI, but it's not a very fast AGI, because the AGI is the automated model-building process (that creates successive versions of LLMs), rather than the token-generating LLMs themselves.
So for some R&D things, these self-building AIs might be somewhat faster than humans, and for others slower. They are AGI in the same sense as humanity, capable of eventually causing a takeoff, but not very fast at getting there.
I haven't yet found a nice and clean way to model this in isolation
Not sure if this is what you want, but:
Assume that all that doubling compute does is double labor
This isn't what I want because: (1) doubling parallel labor is extremely conservative and (2) doubling overall labor effort is both probably conservative and also totally unprincipled, like why not 10x? (It will depend on the returns to training compute and to experiments from more compute, right?)
It doesn't compound further bc there's no feedback loop to more experiments like there is to more labor
I don't understand, the whole point of the experiments is to get us better labor.
This isn't what I want because
yeah i adjust for this in my other comment
I don't understand, the whole point of the experiments is to get us better labor.
yeah, so they do - a doubling of cumulative experiments drives 0.7 doublings of software. And then that better software does more cognitive work to improve software further still. But it doesn't increase the amount of compute available for experiments, so the feedback loop doesn't go full circle.
For "cognitive labour" we have: more compute -> more cog labour -> better software -> more cog labour -> better software... So you get an initial boost from extra compute, which then ratchets up with the software feedback loop.
But for "experimental compute" we have: more compute -> more experiments -> better software -> more cog labour -> better software...
So while it feeds into the software feedback loop, it doesn't loop back to "more experimental compute".
I'm not sure if my math fully priced this in. I'll think about that more.
In fact the above is conservative in assuming one compute doubling yields just one labor doubling. You get that by running more copies. But you'll train smarter models
If one compute doubling yields two labor doublings (holding software constant) then, rerunning the above analysis:
How much faster would ai progress be compared to today?
Of course, compute may be growing more slowly
I thought about this a bit more.
tldr: Under a simple model and reasonable assumptions, if we automate AI R&D and compute growth stays constant then the pace of AI software progress is 3-5X faster. This means the pace of overall AI progress would be 2-3X faster.
Assume AI software R&D is Cobb Douglas:
In this system, before automating away humans, it turns out that:
When you automate AI R&D, L = S. In this situation, it turns out that:
In other words, even absent an SIE, automating AI R&D boosts the standard growth rates by a factor of 1 / (1 - alpha * r) due to the fizzling feedback loop of "better software -> better AI researchers -> better software".
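One way to see where the 1 / (1 - alpha * r) factor can come from, as a sketch only: the model's equations are not reproduced in this comment, so the forms below are assumptions chosen to agree with the growth-rate arithmetic used later in the comment. Here g_L and g_C are labels introduced for the exogenous growth rates of labour and compute, and the post-automation line assumes effective labour grows at g_L plus g_S, because better software means better AI researchers.

```latex
\begin{align*}
g_S &= r\,(\alpha\, g_L + \beta\, g_C)
  && \text{(before automation, assumed form)} \\
g_S &= r\,\bigl(\alpha\,(g_L + g_S) + \beta\, g_C\bigr)
  && \text{(after automation, labour also grows with software)} \\
g_S\,(1 - \alpha r) &= r\,(\alpha\, g_L + \beta\, g_C) \\
g_S &= \frac{r\,(\alpha\, g_L + \beta\, g_C)}{1 - \alpha r},
  \qquad \alpha r < 1 .
\end{align*}
```

Under these assumptions the pre-automation rate is multiplied by 1 / (1 - alpha * r). The expression only gives a finite boost while alpha * r < 1, which is the no-SIE regime being discussed.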
This model allows us to ballpark how much faster overall AI progress would be in a regime with full automation but no SIE.
That regime causes two changes:
Concretely, let's make the following assumptions:
Then today the total pace of software progress is:
g_S = progress due to growing labour + progress due to growing compute
= g_E * alpha * r + g_E * beta * r
= g_E * r
And after AI R&D automation g_L is 2X faster and everything gets boosted by a factor of 3. The pace of software progress is:
g_S = progress due to growing labour + progress due to growing compute
= (2 * g_E * alpha * r + g_E * beta * r)*3
= 4.5 * g_E * r
So that's 4.5x faster software progress, holding the rate of compute growth constant! If software and compute contribute equally to AI progress, that's a bit under 3x faster total AI progress.
I think not too surprising, given we're using an r*alpha value close to 1.
If we used r*alpha = 0.5, our boost factor shrinks to 2 and we'd get 3x faster software progress and 2x faster total AI progress.
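The arithmetic above can be checked with a short Python sketch. The parameter values are assumptions: alpha = beta = 0.5 and r = 4/3 are back-solved from the boost factor of 3 and the 4.5x figure rather than quoted from the comment, and the function names are hypothetical. Software progress is measured in units of g_E * r, so g_E cancels.

```python
def boost_factor(alpha: float, r: float) -> float:
    """Return 1 / (1 - alpha * r), the fizzling-feedback-loop multiplier."""
    if alpha * r >= 1:
        raise ValueError("alpha * r >= 1: the loop does not fizzle, no finite boost")
    return 1.0 / (1.0 - alpha * r)


def software_speedup(alpha: float, r: float,
                     labour_growth_multiplier: float = 2.0) -> float:
    """Ratio of post-automation to pre-automation software progress."""
    beta = 1.0 - alpha  # assumption: alpha + beta = 1
    before = alpha + beta  # g_E * alpha * r + g_E * beta * r, in units of g_E * r
    # After automation: labour growth is multiplied (2x in the comment),
    # then everything is scaled by the boost factor.
    after = (labour_growth_multiplier * alpha + beta) * boost_factor(alpha, r)
    return after / before


def total_speedup(software: float, software_share: float = 0.5) -> float:
    """Blend the software speedup with unchanged compute growth.

    software_share = 0.5 encodes "software and compute contribute equally".
    """
    return software_share * software + (1.0 - software_share)


ALPHA = 0.5  # assumption, back-solved from the figures in the comment
for label, r in [("r*alpha = 2/3", 4 / 3), ("r*alpha = 1/2", 1.0)]:
    s = software_speedup(ALPHA, r)
    print(f"{label}: boost {boost_factor(ALPHA, r):.2f}, "
          f"software {s:.2f}x, total {total_speedup(s):.2f}x")
```

With these assumed inputs the sketch should reproduce the two cases in the comment: roughly 4.5x software and 2.75x total progress for the first, and 3x software and 2x total for the second.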
I'm not sure I get it. What are the automated researchers doing that makes frontier progress move so much faster? They're coming up with experiment designs and inventions which are so much better than the human teams designs and inventions that they need half or a quarter the number of experiments per success? But they've (ex hypothesi) only just got to automated researcher level. Where is that taste coming from?
I think you get taste almost entirely from experience (or second-hand records) of relevantly-similar experiments. You can also reason to better experiment design, but that looks to have steeply diminishing returns both empirically and theoretically. So it's almost entirely about experimental compute supply and sample-efficiency of taste-accumulation.
I do think there's likely to be something like a speed/efficiency boost (perhaps explosion... unsure) from automated coding (again dependent on experimental verification, but this is one where you can extrapolate decently well from the prior art of compute optimisation). So maybe that's enough to massively increase experimental compute budget, effectively multiplying experimental throughput. Perhaps this alone is enough to get the overall rate of progress up - returns to experimental throughput are pretty good?
Additionally, maybe once enough context is logged and that training pipeline is in place, taste-accrual is just better even if sample-efficiency is worse, because the AI sees all the context of all experiments, as opposed to human teams, which have some discount for needing to communicate between people and such.
On TEDAI - what are the automated researchers automating? How does that end up colonising all other capabilities? Where are they getting the data?
I think your questions are mostly better answered by "How quick and big would a software intelligence explosion be?" and the AI Futures Model.
On TEDAI - what are the automated researchers automating? How does that end up colonising all other capabilities? Where are they getting the data?
They are making methods for creating general intelligence combined with methods for generally making AIs better at specific downstream domains (e.g., mechanical engineering). This requires data collection in those downstream domains, but I'd guess this data collection is pretty easy (e.g., you can deploy the AI to learn huge amounts in parallel in addition to learning as much as possible in sim, and you can compensate for weakness with additional time/effort or with specific superhuman abilities in many cases).
while 4x more compute at current margins is pretty close to as good as getting compute that's 4x serially faster
Why?
Current AI algorithms are highly parallelizable for training (though obviously not perfectly parallelizable). This would be more important for inference speed, but at least you can do 2x faster inference with 4x more compute at current margins. So, it's as good as somewhere between 2x and 4x serially faster compute, I'd guess a bit over 3x. Another relevant point here: we seem to have ended up in a paradigm that's very good at absorbing parallel training compute relative to benefiting from serial speed at the same FLOP count, so more parallel compute is pretty reasonable (like we already needed to have ~extremely parallelizable training/experiment algorithms for anything to work at all).
There is a gradual boost setting that smooths out the automation returns over a longer period, but I think this period is unrealistically long such that you don't see one-time speed-up effects
Would it help if we added another param controlling how many years the boost occurs over?
Sure, it would help some (though I'd want to think about whether the earlier trajectory is very plausible). To be clear, this isn't at all a blocker for me or something, I mostly wanted to make the higher level point in this post and ran into this issue. I could have just edited and run the code locally (which is pretty easy these days...).
If some fraction of the AI labor doing R&D is misaligned, the same r that speeds up the lab also gives that AI r-times more wall-clock to act before humans can re-check. Is this already priced into r, or treated as independent of the capability speedup?
In my estimates and discussion, I'm assuming the AIs aren't intentionally doing poorly on AI capabilities R&D (and I'm mostly assuming they are reasonably elicited, though this assumption could easily be relaxed, it mostly just shifts the timing rather than the takeoff I think).
I suspect the AI Futures Model modestly underestimates takeoff speeds and one-time acceleration effects due to effectively acting as though AI speed and quantity don't matter outside of coding automation.
On the other hand, doesn't the Forethought model assume a Cobb-Douglas production function? If the goal is to figure out how far a one-time speedup from automating AI R&D takes you, the AI Futures Model seems better since it models (automated labor, experiment compute) complementarities in more detail?
This is a somewhat technical note.
By "software-only singularity", I mean that, after full automation of AI R&D, progress gets faster and faster due to smarter AIs driving increasingly fast rates of improvement in algorithms (overcoming diminishing returns), and that this lasts long enough to yield a large amount of progress (e.g. at least 4 years of progress in 1 year). The equivalent statement in jargon is: r is significantly greater than 1 (implying progress is getting faster and faster) and this remains the case for long enough to get large amounts of progress. For context, see "How quick and big would a software intelligence explosion be?"
Even without a "software-only singularity", I think full automation of AI R&D probably greatly speeds up progress for two main reasons:
We can also analyze this by looking at an example trajectory in the AI Futures Model that barely misses a software-only singularity and seeing how fast progress is after full automation of AI R&D. This trajectory involves a little over 2 years of progress in the year after full automation of AI R&D (SAR). This corresponds to going from full automation of AI R&D (SAR) to Top-human-Expert-Dominating AI (TEDAI) [3] in a bit less than a year, which is a lot of progress. (Quantitatively, it involves going from a 24x AI R&D software acceleration to a 270x AI R&D software acceleration in a year.) I suspect the AI Futures Model modestly underestimates takeoff speeds and one-time acceleration effects due to effectively acting as though AI speed and quantity don't matter outside of coding automation. [4]
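To put the 24x to 270x figure on a doubling scale, here is a small Python sketch. It only restates the two endpoints quoted above; the helper name `doublings` is hypothetical, and nothing here is taken from the AI Futures Model code.

```python
import math


def doublings(start: float, end: float) -> float:
    """Number of doublings needed to go from `start` to `end`."""
    return math.log2(end / start)


ratio = 270 / 24          # growth in the AI R&D software acceleration multiplier
n = doublings(24, 270)    # the same growth expressed as doublings
print(f"{ratio:.2f}x increase, about {n:.2f} doublings in the year")
# prints "11.25x increase, about 3.49 doublings in the year"
```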
There are other (indirect) reasons AI progress might speed up around when AIs automate AI R&D:
One important caveat is that by the time AIs automate AI R&D, the rate of compute scaling may be substantially lower than it is today. Thus, the default/trend rate of AI progress may be lower, so the corresponding acceleration would be relative to a lower baseline. This is directly applicable for the "further compute has increased returns" argument and maybe has a modest effect on the size of the one-time speed up (the size of the one-time speed up is sensitive to how much returns from further labor effort have diminished at a given level of compute).
If I remember correctly, this model effectively acts as though you go from no automation acceleration directly to full automation, while in practice earlier AIs will substantially accelerate AI R&D, meaning that returns to effort will already have substantially diminished by the point you reach full automation. As in, full automation will be a large acceleration relative to a human-only baseline, but a relatively smaller acceleration relative to AIs that existed 6 months before full automation, so much of the low-hanging fruit will already be plucked. You can model this in an ad hoc way by reducing the initial speed-up parameter such that it corresponds to the speed-up over AIs that existed 8 months prior to full automation; with my parameter guesses, this yields around 2.5 years of progress in the first year. (There is a gradual boost setting that smooths out the automation returns over a longer period, but I think this period is unrealistically long such that you don't see one-time speed-up effects.) ↩︎
Historically, progress has been driven by both scaling up compute and scaling up labor. However, I expect scaling up labor has been a small fraction of the effect in recent years. Compute for algorithms and training has been scaled up by around 4x per year while company employee count has 3x'd each year. But employee count 3x'ing is way worse than making all employees operate 3x faster due to a diminishing labor pool, (mostly one-time) onboarding costs, and parallelization penalties (while 4x more compute at current margins is pretty close to as good as getting compute that's 4x serially faster). I think the discount from a diminishing labor pool and from onboarding makes the 3x increase in the number of employees roughly as good as a "free" 2x increase in employee count at equal quality. Then, the parallelization penalty further reduces this 2x increase to being as valuable as having existing employees operate ~1.3x faster. Thus, I expect the labor increase is much less important than a 4x increase in compute. So it's fair to model the large majority of recent progress as being driven by increases in compute, where the value mostly comes from being able to run more experiments. ↩︎
TEDAI: AIs which strictly dominate top human experts in virtually all cognitive tasks (i.e., doable via remote work). ↩︎
This is in part because it doesn't model shifting to research directions that are more effective in the low-compute but plentiful-labor regime. ↩︎
Fully automated AI R&D makes moderate advantages more likely to be stable/predictable because now the labor part of AI R&D is likely commoditized and similar between companies (reducing variance). However, maintaining a lead ultimately requires maintaining a compute advantage (a large software lead can probably be converted into a compute advantage): if a trailing company had more compute and was able to hold on to a compute advantage (despite the potentially decisive advantages of the leading company), we should expect them to eventually catch up and overtake because labor is commoditized after full automation. I suspect it will be hard for significantly trailing companies to maintain a compute advantage if the leading company pulls far ahead on software due to speed ups from AI R&D. In the most extreme case, the leading company (or the AIs of the leading company) might literally take over the world, neutralizing prior compute advantages of trailing companies. ↩︎
Investors might be incentivized to pressure the trailing company to sell their compute to the leading company even if the leadership of the company isn't inclined to do this. Investors have limited power so this isn't clearly sufficient, but a deal could be designed to give the leadership of the trailing company additional power or possibly financial upside, so that they are incentivized to sell. Also, the leading company might just end up being extremely powerful, in the limit literally fully taking over the world. ↩︎