"Culmination" suggests a subsequent decline. In 2025, scaling of RLVR delivered a lot of capabilities, and late 2025 was the first time since 2023 that 10x-20x scaling in pretraining compute (compared to original Mar 2023 GPT-4) has finally made a full appearance (probably in Gemini 3 Pro and Opus 4.5). There is 100x-400x more in scaling of compute by 2029-2031 (compared to the current models), and much more in the low-hanging fruit of doing things well rather than prototyping the first thing that sort of works. The only low-hanging fruit that likely mostly ran out in 2025 is raw scaling of RLVR (in proportion to pretraining), and even that probably still has a bit to go. Setting up better tasks and RL environments will plausibly be more impactful than some feasible amount of further scaling of RLVR relative to pretraining. Then there's continual learning that might be quite impactful in 2026-2027.
I expect some sort of culmination in 2027-2032 (assuming no AGI), when scaling of compute slows down and there have been at least 1-2 years to learn to make better use of it. The first stage of the compute slowdown probably follows 2026, if 2028 doesn't see 5 GW training systems (which would be on trend for training system compute growth in 2022-2026, but currently doesn't seem to be happening). The second probably follows 2028-2030, when funding to secure ever more training compute mostly stops growing, so that compute mostly falls back to growing in line with its price-performance.
I did attempt to preempt this kind of response with "some guardrails on effective and ethical use of those galaxies" in my comment. There exists a reasonable middle ground where individual people get significant say on what happens with some allotment of resources, more say than the rest of humanity does. Disagreements on values have a natural solution in establishing scopes of optimization and boundaries between them, even if these are not absolute boundaries (let alone physical boundaries), rather than mixing everything together and denying meaningful individual agency to everyone.
The reasonable question is which portions should get how much weight from individual agency, and which portions must be shared, governed jointly with others. But starting from the outset with few shared resources and (obviously) allowing the stakeholders to establish shared projects using their resources by agreement doesn't seem much different from some de novo process of establishing such shared projects with no direct involvement of individuals, if the stakeholders would indeed on reflection prefer to establish such projects and cede resources to them.
(I'm of course imagining a superintelligent level of agency/governance, to match the scale of resources. But if humans are to survive at all and get more than a single server rack of resources, the ability to grow up seems like the most basic thing, and governing a galaxy at the level of a base human does seem like failing at effective use. Preferences extrapolated by some external superintelligence seem like a reasonable framing for temporary delegation, but these preferences would fail to remain legitimate if the person claimed as their originator grows up and asks for something different. So ultimately individual agency should have greater authority than any abstraction of preference, provided the individual had the opportunity to get their act sufficiently together.)
Within the hypothetical of ASIs that won't just largely ignore humanity (leading to its demise or permanent disempowerment), there is a claim about distribution of resources according to AI company stock. A lot of the post argues with the hypothetical rather than with the claim-within-the-hypothetical, which as a rhetorical move creates friction for discussing hypotheticals. This move ends up attempting to prove more than just issues with the claim, while ignoring the claim, even if that's not its intent.
(The claim isn't ignored in other parts of the post, but what seems wrong with the post's framing are the parts that are about the hypothetical rather than the claim. To illustrate this, I've attempted to defend the coherence of the hypothetical, the importance of the issue of distribution of resources within it, and the framing of individual humans ending up with galaxy-scale resources.)
Dogs pee on trees to mark their territory; humans don't respect that. Humans have contracts; ASIs won't respect those either.
There are billions of reachable galaxies, and billions of humans. Different people will develop different values, and property rights are just what it means for each person to have the freedom to pursue such values. The form of property rights will be different, AI company stock might not survive as an anchor for distribution of the cosmic wealth, and some guardrails on effective and ethical use of those galaxies likely make sense. Shares in future compute probably make more sense as units of property than actual physical galaxies. But individual humans ending up with galaxy-scale resources is a reasonable way of operationalizing the outcome where ASIs didn't permanently disempower the future of humanity.
On current trajectory, ASIs seem likely to largely or fully ignore the future of humanity, but possible alternatives to this are not as a whole category simply meaningless, a sign of ignorance and lack of imagination. Splitting the physical resources becomes necessary once the world is past the current regime of seemingly unbounded growth and runs into the limits of accelerating expansion of the universe. For example, there is probably a tradeoff between running a lot of things in parallel, which would then run out of useful compute relatively early, and stockpiling matter for fewer things to survive into much deeper time, long after the galaxy clusters can no longer communicate with each other. There might be disagreements about which way to go with this between different people.
If you only update weights in LoRAs, they can be like KV-caches, different for each request and taking up only a reasonable amount of memory, preserving all the benefits of batching and keeping the users isolated from each other.
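A minimal sketch of what that could look like in a serving stack (PyTorch-style; the shapes, names, and rank are my own illustrative assumptions, not anyone's actual implementation): the frozen base weight stays shared across the batch, while each request carries only its own small low-rank factors, much like it carries its own KV-cache.

```python
import torch

# Hypothetical sketch: a shared frozen linear layer plus a per-request
# low-rank (LoRA) delta, applied batched so requests stay isolated.
d_model, rank, batch = 4096, 16, 8

W = torch.randn(d_model, d_model) / d_model**0.5  # frozen base weight, shared by all requests

# Per-request LoRA factors: 2 * d_model * rank params each, which at rank 16
# is ~128x smaller than duplicating the full d_model x d_model weight.
A = torch.randn(batch, rank, d_model) * 0.01
B = torch.zeros(batch, d_model, rank)  # zero-init, so the delta starts as a no-op

x = torch.randn(batch, 1, d_model)  # one token position per request

base = x @ W.T                                       # shared batched matmul
delta = (x @ A.transpose(1, 2)) @ B.transpose(1, 2)  # per-request low-rank path
y = base + delta
print(y.shape)  # torch.Size([8, 1, 4096])
```

The per-request state scales like a KV-cache (small, owned by one user), while the big shared weights are never touched, which is what keeps batching efficient and users isolated.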
I'm thinking of simply greater salience, compared to a more bearish trajectory with no continual learning (where chatbots are the new Google but people aren't losing jobs all over the place). If there are objective grounds for a public outcry, more people will pay more attention, including politicians. What they'll do with that attention is unclear, but I think continual learning has the potential for bringing significantly more attention in 2027-2028 compared to its absence, without yet an existential catastrophe or a straightforwardly destructive "warning shot".
Next word prediction RLVR is massively more compute hungry per unit of data than pretraining, so it's both likely impractical at current levels of compute and plausibly a solution to text data scarcity at 2028-2030 levels of compute, if it turns out to be useful. The benefit is generality of the objective, the same as with pretraining itself, compared to manual construction of RL environments for narrow tasks. Given the pretraining vs. RLVR capability gap, it's plausibly a big deal if it makes RL-level capabilities as general as the current shallow pretraining-level capabilities.
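To make the setup concrete, here's a toy sketch of one way such an objective could be posed (purely my illustration, with hypothetical names, GRPO-style group-relative advantages, and a stub in place of the policy model; not a description of what any lab is doing): the verifiable reward is how much of the true continuation of a text prefix a sampled rollout reproduces.

```python
import random

def reward(sampled: list, true_continuation: list) -> int:
    """Verifiable reward: number of tokens matched before the first mismatch."""
    n = 0
    for s, t in zip(sampled, true_continuation):
        if s != t:
            break
        n += 1
    return n

def group_advantages(rewards: list) -> list:
    """GRPO-style group-relative advantage: reward minus the group mean."""
    mean = sum(rewards) / len(rewards)
    return [r - mean for r in rewards]

def sample_continuations(prefix, k, length, vocab=("the", "cat", "sat", "on", "mat")):
    """Stub standing in for k sampled rollouts from the policy model."""
    return [[random.choice(vocab) for _ in range(length)] for _ in range(k)]

true_cont = ["sat", "on", "the", "mat"]
rollouts = sample_continuations(["the", "cat"], k=8, length=len(true_cont))
rs = [reward(r, true_cont) for r in rollouts]
print(rs, group_advantages(rs))

# Compute per unit of data: pretraining extracts a dense loss on every token
# from one forward/backward pass over the text, while this setup spends k full
# rollouts (k=8 here, far more in practice) to get k scalar rewards from the
# same stretch of text.
```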
The fact that Łukasz Kaiser (transformer paper co-author, currently at OpenAI) is talking about it in Nov 2025 is strong evidence that AI companies haven't yet been able to rule out that it might work. The idea itself is obvious enough, but that's less significant as evidence about its prospects.
The point is the distinction between pretraining and RL (in the level of capability), and between manual jagged RLVR and hypothetical general RL (in the generality of capability). I think observing Opus 4.5 and Gemini 3 Pro is sufficient to be somewhat confident that even at 2026 compute levels pretraining itself won't be sufficient for AGI (it won't train sufficiently competent in-context learning behavior to let AIs work around all their hobblings). Meanwhile, IMO gold medal results (even with DeepSeek-V3 model size) demonstrate that RLVR is strong enough to get superhuman capabilities in the narrow skills it gets applied to (especially when it's 1-4T active param models rather than DeepSeek-V3). So in the current regime (until 2029-2031, when yet another level of compute becomes available) AGI requires some kind of general RL, and continual learning doesn't necessarily enable it on its own, even if it becomes very useful for the purposes of on-the-job training of AI instances.
This is more of a claim that timelines don't get shorter within the 2026-2028 window because of continual learning, even if it's understood as something that significantly increases AI adoption and secures funding for 5+ GW training systems by 2028-2030 (as well as rousing the public via job displacement). That is, starting with timelines without continual learning (appearing as its own thing, rather than as an aspect of AGI), where AGI doesn't appear in 2026-2028, I think adding continual learning (on its own) doesn't obviously give AGI either, assuming continual learning is not actual automated RLVR (AIs applying RLVR automatically to add new skills to themselves). After 2029-2031, there are new things that could be attempted, such as next word prediction RLVR, and enough time will pass that new ideas might get ready, so I'm only talking about the very near term.
Continual learning might wake the world up to AGI, without yet bringing the dangers of AGI.
Pretraining gives shallow intelligence that is general; RL gives deep creative intelligence in a narrow skill, but it used to be very hard to make it work well for most skills. RL with pretrained models, which is RLVR, makes RL robustly applicable to a wide variety of narrow skills. But it still needs to be applied manually, and the skills it trains are hand-picked before deployment, so deep creative intelligence from RLVR remains jagged, compared to the more general shallow intelligence from pretraining.
Continual learning has now been vaguely announced for 2026 by both Anthropic and GDM. If it merely provides adaptation of AI instances to the current situation or job at the shallow level of pretraining, it might still be significantly more economically valuable than the current crop of models, leading to more visible job displacement and waking up the public and the politicians to more of the importance of AI. Yet it doesn't necessarily make direct progress on automating RLVR or on introducing some other way of making the creativity of RL general, and so the AIs won't necessarily get notably more dangerous than today at the level of disempowerment or extinction.
Kinda, but there won't be enough natural text data at the higher end of this range (using 2028-2030 compute) to just scale pretraining on text (you'd need more than 1,000 trillion tokens with repetition, maybe 200-300 trillion unique tokens). Something else would need to happen instead, or you start losing efficiency and compute ends up being less useful than it would be if there were enough text.
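A rough back-of-envelope behind numbers of that order, where the compute figure is my own assumption for illustration, using the Chinchilla-style heuristics C ≈ 6ND and D ≈ 20N plus a few useful epochs of repetition:

```python
# All inputs here are assumptions for illustration, not figures from the comment.
C = 3e29  # assumed raw FLOPs of a hypothetical late-decade frontier pretraining run

D = (20 * C / 6) ** 0.5  # compute-optimal training tokens from C = 6*N*D, D = 20*N
N = D / 20               # corresponding parameter count

print(f"tokens ~ {D:.1e}, params ~ {N:.1e}")  # tokens ~ 1.0e+15, params ~ 5.0e+13

# 1e15 tokens is 1,000 trillion training tokens with repetition; at ~3-5 useful
# epochs that's roughly 200-300 trillion unique tokens, far beyond the text
# that's plausibly available.
```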
The steps of scaling take a long time, so only late 2025 models get to be shaped compute-optimally for 2024 levels of pretraining compute, and run on hardware announced and first available in the cloud in 2024. This is just 2 years from 2022, when GPT-4 was trained, and the first of two 10x-20x steps at the 2022-2026 pace of scaling, with a third step remaining somewhere beyond 2026 if we assume $100bn per year of revenue for an AI company (at that time). With 2026 compute, there just might be enough text data (with repetition) to say that scaling of pretraining is still happening in a straightforward sense, which brings the change from the original Mar 2023 GPT-4 to 100x-400x (for models that might come out in 2027).
But this 100x-400x is also a confusing point of comparison, since between 2023 and 2027 there was the introduction of RLVR scaling (and test time reasoning), and also all the improvements that come from working on a product (as opposed to a research prototype) for 4 years. Continual learning might be another change complicating this comparison that happens before 2027 (meaning it might be a significant change, which remains uncertain; that it's coming in some form, at least as effective context extension, seems quite clear at this point).
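For completeness, the 100x-400x figure is just two of those 10x-20x steps compounded:

```python
# Two scaling steps of 10x-20x each, Mar 2023 GPT-4 -> ~2027 models.
low_step, high_step = 10, 20
print(low_step**2, high_step**2)  # 100 400
```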