The usual story is that AGI causes recursive self-improvement (RSI), which becomes a takeoff that rapidly approaches technological maturity even in the initial software-only phase. But without breakthroughs, LLMs by themselves lead to a strange baseline scenario in which they enable an RSI process that gives them general but slow learning. This RSI process then constitutes an AGI without leading to a takeoff, because crucial aspects of learning only happen between model releases, and each release takes a long time to make. Even very fast reasoning doesn't lead to fast learning unless it invents a method for fast learning, and inventing difficult things requires many serial steps of learning, which take a long time without such a method.
RSI happens through automation of the more mundane aspects of AI R&D, which implements RSI in the central sense of the term: automatically developing the next, better models. This is likely to happen because of the remaining scaling runway, which enables a system that could be called Mythos+3 (by maybe 2031-2033), on a scale where Sonnet would be Mythos-2, with levels measured by the capability that results from scaling alone. Using RLVR (reinforcement learning with verifiable rewards) to train that system to proficiency at all the steps in a model-building cookbook should make it capable of building models automatically, including the steps needed to develop new RLVR tasks and graders.
This capability of developing training data for RLVR is what makes the RSI process of automated model development using LLMs an AGI. It's the process of model development that becomes an AGI, while the individual LLMs don't count as AGIs, since they can't learn novel deep skills. The in-context learning that an individual LLM possesses is too weak for this, and continual learning with anything resembling current methods is likely either to share the weakness of in-context learning or to inherit the sample inefficiency of pretraining.
The step of using RLVR to develop the next model remains crucial in creating new deep s