Epistemic status: Briefer and more to the point than my model of what is going on with LLMs, but also lower effort. This is my most "combative"-toned post to date and I think a little too much so.

Here is the paper. The main reaction I am talking about is AI 2027, but also basically every lesswrong take on AI 2027. (EDIT: A couple of the authors of AI 2027 have pointed out that its predictions are not mainly based on METR's report, but I think the main points of this post still hold up). 

A lot of people believe in very short AI timelines, say <2030. They want to justify this with some type of outside view, straight-lines-on-graphs argument, which is pretty much all we've got because nobody has a good inside view on deep learning (change my mind). 

The outside view, insofar as that is a well-defined thing, does not justify very short timelines. 

If AGI were arriving in 2030, the outside view says interest rates would be very high (I'm not particularly knowledgeable about this and might have the details wrong but see the analysis here, I believe the situation is still similar), and less confidently I think the S&P's value would probably be measured in lightcone percentage points (?). We might also look at the track record of previous (wrong) predictions that AGI was near, dating back as far as the 1960s or so. Also, we might use the predictions of experts and superforecasters. Only the last (and perhaps least pure) "outside view" seems to support short timelines at all, but depending on the way the question is posed it still seems to predict more like 2050.

As far as I can tell, this is why everyone keeps using this plot instead of those other methods:

https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01df2684-de42-46f7-80ea-70270f929c8d_1564x933.png

I'm glad METR did this work, and I think their approach is sane and we should keep adding data points to this plot. Also, over the last couple of years this has been one of my only updates towards shorter timelines - almost everything else has pushed me towards longer timelines (I don't expect to be able to change anyone's intuitions here, but I look outside and do not see AGI reshaping the world, do not find chatbots particularly useful, etc., and the world I observe is clearly not a central prediction of my short-timelines-conditioned model). 

The plot doesn't say what people want it to say. There are three major problems. 

I'm not sure which is most serious, but I think it's probably that the line just doesn't have a lot of points on it. This is a boring objection which I don't expect to change anyone's mind, so I'll just move past it.

The second problem is that the plot does not exactly predict AGI will arrive by 2030. From the paper:

Naively extrapolating the trend in horizon length implies that AI will reach a time horizon of >1 month (167 work hours) between late 2028 and early 2031...

As far as I am concerned, AGI should be able to do any intellectual task that a human can do. I think that inventing important new ideas tends to take at least a month, though possibly as long as a PhD thesis. So it seems to be a reasonable interpretation that we might see human level AI around the mid-2030s to 2040, which happens to be about my personal median.
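For concreteness, here is the naive arithmetic spelled out as a minimal sketch. The ~1 hour current horizon, early-2025 reference date, and ~7 month doubling time are rough figures taken from the METR paper; the ~5,000 hour "PhD-thesis-scale" threshold is my own illustrative choice. Note that this naive extrapolation lands a bit earlier than my stated median; the gap reflects the additional doubts raised in the rest of the post.

```python
import math
from datetime import date, timedelta

# Rough, illustrative inputs (approximate figures from the METR paper).
CURRENT_HORIZON_HOURS = 1.0        # ~1 hour 50%-success time horizon in early 2025
REFERENCE_DATE = date(2025, 3, 1)  # approximate date of that measurement
DOUBLING_MONTHS = 7.0              # 2019-2024 trend; the 2024-2025 trend is closer to 4

def naive_arrival(target_hours, doubling_months=DOUBLING_MONTHS):
    """Date at which a straight-line (exponential) extrapolation hits target_hours."""
    doublings = math.log2(target_hours / CURRENT_HORIZON_HOURS)
    return REFERENCE_DATE + timedelta(days=doublings * doubling_months * 30.44)

print("1 month of work (167 hours):   ", naive_arrival(167))        # ~mid-2029 on these inputs
print("PhD-thesis scale (~5000 hours):", naive_arrival(5000))       # ~2032 on these inputs
print("Same, with 4-month doublings:  ", naive_arrival(5000, 4.0))  # ~2029 on these inputs
```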

However, the authors of AI 2027 predict pretty radical superintelligence before 2030, which does not seem to be justified by the plot. Arguably, since the plot is focused on software engineering tasks, the most relevant comparison is actually their prediction for human level software engineers, which I believe is around 2026-2028 (clearly inconsistent with the plot). 

Now, the counterargument goes something like this: We only need to focus on human level AI research and the rest will follow. Early AI systems will speed up software engineering by iterating quickly and this will increase the rate of progress drastically. 

This is no longer a straight-lines-on-graphs argument. Doesn't the current plot already factor in AI coding assistance over the 2023-2025 period or so?

But the deeper problem is that the argument is ultimately, subtly circular. Current AI research does look a lot like rapidly iterating and trying random engineering improvements. If you already believe this will lead to AGI, then certainly AI coding assistants which can rapidly iterate would expedite the process. However, I do not believe that blind iteration on the current paradigm leads to AGI (at least not anytime soon), so I see no reason to accept this argument. 

I bolded that because I think otherwise people won't absorb it. I think it ties into a wider disagreement; I believe intelligence is pretty sophisticated while others seem to think it's mostly brute force. This tangent would however require a longer discussion on the proper interpretation of Sutton's bitter lesson.

Others have argued that the line can't stay straight forever because eventually AI systems will "unlock" all the abilities necessary for long-horizon tasks. If you believe this, please explain to me why you believe it. It seems to require a sophisticated inside view on deep learning that no one has. 

I am least certain about the following third objection, because it involves a game of reference class tennis which I am not sure I win. Basically, the plot is conflating two different things: the performance of base (pretrained) models and reasoning models. The base models seem to have topped out their task length around 2023 at a few minutes (see on the plot that GPT-4o is little better than GPT-4).  Reasoning models use search to do better. We already have search-based AI algorithms that can perform longer horizon tasks above human level, for instance Go and Chess, and those would totally flatten the slope of the line if they were included (on the top left of the plot). However, I don't think this is a knock-down argument because there are some principled reasons not to include these points. We are only focusing on software engineering tasks here. Also, reasoning models seem importantly more general than even the most general game playing algorithms (AlphaZero, MuZero). Still, I think it's worth pointing out that a very specific choice has been made in order to get this line with this slope.

(EDIT: I no longer endorse this, since @AnthonyC reminded me distilling CoT will probably solve it.) The stronger conclusion I want to make from the last objection is that taking the plot seriously, it seems that further progress depends on scaling up inference-time compute. If so, we may not expect to suddenly summon an army of superhuman coders. The first superhuman coders might be very, very expensive to run, particularly when we need them to perform longer-horizon tasks. However, I have not analyzed this situation carefully.

This is important because though I am appropriately terrified of AGI arriving before or around 2030, acting as if this were near-certain sacrifices a lot of opportunities in worlds where AGI arrives around 2040 or later. For instance, we will look pretty crazy for setting off a false alarm. Also, on an individual level we may not prioritize gathering resources like career capital or gaining influence in slower moving established institutions.

Also, I am just surprised I seem to be the only one making this fairly obvious point (?), and it raises some questions about our group epistemics. 

41 comments

I'm quite surprised that you're so critical of attempts to interpolate from the METR results (not enough data points), but A-okay with trying to read tea leaves from the interest rate, a single data point that is affected by all kinds of factors such as whether people expect Trump to crash the economy by bringing back mercantilism.

I'm not saying it's invalid to critique predictions based on METR, I just don't think you're applying consistent standards.

I don’t think either is reliable. It’s just an example of a reasonable baseline outside view.

I think the interest rate thing provides so little evidence either way that it’s misleading to even mention it. See the EAF comments on that post, and also Zvi’s rebuttal. (Most of that pushback also generalizes to your comment about the S&P.) (For context, I agree that AGI in ≤2030 is unlikely.)

Thanks for the links, I’ll look into it. 

I agree that the S&P is pretty much reading tea leaves; the author of the interest rates post, @basil.halperin, has separately argued it is not reliable.

However, the authors of AI 2027 predict pretty radical superintelligence before 2030, which does not seem to be justified by the plot. Arguably, since the plot is focused on software engineering tasks, the most relevant comparison is actually their prediction for human level software engineers, which I believe is around 2026-2028 (clearly inconsistent with the plot). 

Our rationale for why we extend the trend in the way that we do can be found in our timelines forecast. In short, we adjust for (a) the possible trend speedup to a ~4 month doubling time as in the 2024-2025 trend (b) the possibility of further superexponentiality (c) intermediate speedups from AIs that aren't yet superhuman coders. Fair if you disagree, but we do explain how we expect things to deviate from the plot you included.
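To give a toy illustration of how (a) and (b) change the naive arithmetic (this is not our actual model, which is spelled out in the timelines forecast; the starting horizon, start date, and shrink factor below are made-up numbers for illustration): with a 4-month doubling time that itself shrinks by ~10% per doubling, the one-month (167 hour) threshold is crossed roughly two years out, instead of the four-plus years implied by a constant 7-month doubling time.

```python
from datetime import date, timedelta

# Toy superexponential extrapolation; all inputs are illustrative assumptions.
start = date(2025, 3, 1)
horizon_hours = 1.0        # assumed ~1 hour time horizon at the start date
target_hours = 167.0       # one month of work hours
doubling_months = 4.0      # roughly the 2024-2025 trend
shrink = 0.90              # each successive doubling takes 10% less calendar time

months = 0.0
while horizon_hours < target_hours:
    months += doubling_months
    doubling_months *= shrink
    horizon_hours *= 2

print(start + timedelta(days=months * 30.44))  # ~early 2027 on these toy numbers
```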

I think it was a mistake to focus particularly on AI 2027 when the METR report does not seem to play a prominent role there.

Indeed, the content of AI 2027 was all but finalized before the METR report came out. Like Eli said, if you want to know where our timelines are coming from, there's a page on the website for that.

Does the METR report then update you towards later timelines?

(I am not saying it necessarily should, since obviously I have some doubts about the strength of its evidence)

Indeed it did! I remember saying to someone (maybe Beth? I forget) that seeing that graph made me feel slightly relieved. When we started writing AI-2027, my median AGI arrival date was 2027; now it's 2028. (Though most of the update happened for other reasons besides the METR graph)

lc

The outside view, insofar as that is a well-defined thing...

It's not really a well-defined thing, which is why the standard on this site is to taboo those words and just explain what your lines of evidence are, or the motivation for any special priors if you have them.

If AGI were arriving in 2030, the outside view says interest rates would be very high (I'm not particularly knowledgeable about this and might have the details wrong but see the analysis here, I believe the situation is still similar), and less confidently I think the S&P's value would probably be measured in lightcone percentage points (?).

So, your claim is that interest rates would be very high if AGI were imminent, and they're not so it's not. The last time someone said this, if the people arguing in the comment section had simply made a bet on interest rates changing, they would have made a lot of money! Ditto for buying up AI-related stocks or call options on those stocks.

I think you're just overestimating the ability of the market to generalize to out of distribution events. Prices are set by a market's participants, and the institutions with the ability to move prices are mostly not thinking about AGI timelines at present. It wouldn't matter if AGI was arriving in five or ten or twenty years, Bridgewater would be basically doing the same things, and so their inaction doesn't provide much evidence. Inherent in these forecasts there are also naturally going to be a lot of assumptions about the value of money (or titles to partial ownership of companies controlled by Sam Altmans) in a post-AGI scenario. These are pretty well-disputed premises, to say the least, which makes interpreting current market prices hard.

As far as I am concerned, AGI should be able to do any intellectual task that a human can do. I think that inventing important new ideas tends to take at least a month, though possibly as long as a PhD thesis. So it seems to be a reasonable interpretation that we might see human level AI around the mid-2030s to 2040, which happens to be about my personal median.

The issue is, ML research itself is composed of many tasks that do take less than a month for humans to execute. For example, on this model, sometime before "idea generation", you're going to have a model that can do most high-context software engineering tasks. The research department at any of the big AI labs would be able to do more stuff if it had such a model. So while current AI is not accelerating machine learning research that much, as it gets better, the trend line from the METR paper is going to curl upward.

You could say that the "inventing important new ideas" part is going to be such a heavy bottleneck that this speedup won't amount to much. But I think that's mostly wrong, and that if you asked ML researchers at OpenAI, a drop-in remote worker that could "only" be directed to do things that otherwise took 12 hours would speed up their work by a lot.

But the deeper problem is that the argument is ultimately, subtly circular. Current AI research does look a lot like rapidly iterating and trying random engineering improvements. If you already believe this will lead to AGI, then certainly AI coding assistants which can rapidly iterate would expedite the process. However, I do not believe that blind iteration on the current paradigm leads to AGI (at least not anytime soon), so I see no reason to accept this argument. 

It's actually not circular at all. "Current AI research" has taken us from machines that can't talk to machines that can talk, write computer programs, give advice, etc. in about five years. That's the empirical evidence that you can make research progress doing "random" stuff. In the absence of further evidence, people are just expecting the thing that has happened over the last five years to continue. You can reject that claim, but at this point I think the burden of proof is on the people that do. 

Thanks for engaging in detail with my post. It seems there were a few failures of communication that are worth clarifying.

It's (the outside view) not really a well-defined thing, which is why the standard on this site is to taboo those words and just explain what your lines of evidence are, or the motivation for any special priors if you have them.

I thought it was clear that I'm not confident in any outside view prediction of AGI timelines, from various statements/phrasings here (including the sentence you're quoting, which questions the well-definedness of "the outside view") and the fact that the central focus of the post is disputing an outside view argument. Apparently I did not communicate this clearly, because many commenters have objected to my vague references to possible outside views as if I were treating them as solid evidence, when in fact they aren't really a load bearing part of my argument here. Possibly the problem is that I don't think anyone has a good inside view either! But in fact I am just "radically" uncertain about AGI timelines - my uncertainty is ~in the exponent. 

Still, I find your response a little ironic since this site is practically the only place I've seen the term "outside view" used. It does seem to be less common over the last year or two, since this post which you're probably referring to. 

So, your claim is that interest rates would be very high if AGI were imminent, and they're not so it's not. The last time someone said this, if the people arguing in the comment section had simply made a bet on interest rates changing, they would have made a lot of money! Ditto for buying up AI-related stocks or call options on those stocks.

Interesting, but non-sequitur. That is, either you believe that interest rates will predictably increase and there's free money on the table, and you should just say so, or not, and this anecdote doesn't seem to be relevant (similarly, I made money buying NVDA around that time, but I don't think that proves anything). 

You could say that the "inventing important new ideas" part is going to be such a heavy bottleneck that this speedup won't amount to much. But I think that's mostly wrong, and that if you asked ML researchers at OpenAI, a drop-in remote worker that could "only" be directed to do things that otherwise took 12 hours would speed up their work by a lot.

Perhaps, but shouldn't LLMs already be speeding up AI progress? And if so, shouldn't that already be reflected in METR's plot? Are you predicting superexponential growth here?

It's actually not circular at all. "Current AI research" has taken us from machines that can't talk to machines that can talk, write computer programs, give advice, etc. in about five years. That's the empirical evidence that you can make research progress doing "random" stuff. In the absence of further evidence, people are just expecting the thing that has happened over the last five years to continue. You can reject that claim, but at this point I think the burden of proof is on the people that do. 

It seems to me that progress has been slowing for the last couple of years. If this trend continues, progress will stall. 

lc

Interesting, but non-sequitur. That is, either you believe that interest rates will predictably increase and there's free money on the table, and you should just say so, or not, and this anecdote doesn't seem to be relevant (similarly, I made money buying NVDA around that time, but I don't think that proves anything).

I am saying so! The market is definitely not pricing in AGI; doesn't matter if it comes in 2028, or 2035, or 2040. Though interest rates are a pretty bad way to arb this; I would just buy call options on the Nasdaq.

Perhaps, but shouldn't LLMs already be speeding up AI progress? And if so, shouldn't that already be reflected in METR's plot?

They're not that useful yet.

I am saying so! The market is definitely not pricing in AGI; doesn't matter if it comes in 2028, or 2035, or 2040. Though interest rates are a pretty bad way to arb this; I would just buy call options on the Nasdaq.

Hmm well at least you're consistent.

They're not that useful yet.

Certainly I can see why you expect them to become more useful, but I still feel like there's some circularity here. Do you expect the current paradigm to continue advancing because LLM agents are somewhat useful now (as you said, for things like coding)? Unless that effect is currently negligible (and will undergo a sharp transition at some point), it seems we should expect it to already be reflected in the exponential growth rate claimed by METR.

I'm glad METR did this work, and I think their approach is sane and we should keep adding data points to this plot.

It sounds like you also think the current points on the plot are accurate? I would strongly dispute this, for all the reasons discussed here and here. I think you can find sets of tasks where the points fit on an exponential curve, but I don't think AI can do 1 hour worth of thinking on all, or even most, practically relevant questions.

I remember enjoying that post (perhaps I even linked it somewhere?) and I think it’s probably the case that the inefficiency in task length scaling has to do with LLMs having only a subset of cognitive abilities available. I’m not really committed to a view on that here though.

The links don’t seem to prove that the points are “inaccurate.” 

What would you expect the world to look like if AGI < 2030? Or put another way, what evidence would convince you that AGI < 2030?

Unfortunately, those are two importantly different questions. While it is certainly possible that AGI will arrive before 2030, I am not sure that it would be predictable in advance without a "textbook of the (far) future" on deep learning. Of course, it is much more likely to be predictable a couple of days in advance than a year in advance, but even the former may conceivably be quite challenging depending on situational awareness of near-human-level models in training. 

What would you expect the world to look like if AGI < 2030?

In many respects, similar to how it does look (e.g. AI passes the Turing test pretty easily / convincingly, is competent at some coding tasks, etc. etc.).

It's more "interesting" to discuss the departures:

  • Claude would probably be better at Pokemon
  • Maybe self-driving vehicles are more common / robust
  • I think the METR plot would look different. I don't think task length should be the bottleneck; the fact that it is suggests that something isn't scalable. But I am not sure.
  • There would perhaps be a few novel insights or discoveries from LLMs in a few domains where they have an advantage over us. I am not convinced this is true and in fact absence of evidence is evidence of absence: https://www.lesswrong.com/posts/GADJFwHzNZKg2Ndti/have-llms-generated-novel-insights
  • I would weakly expect to find AI a little more useful, but more strongly, I would expect to be finding it increasingly useful over the last couple of years, and I don't.
  • "Hallucinations" (I think @Abram Demski has a better term I can't remember) would be easier to get rid of.

Each point is capable of shifting my opinion to a greater or lesser degree. Mainly seeing novel insights would change my mind. If METR's results hold up that will also change my mind.

Personally I think 2030 is possible but aggressive, and my timeline estimate is more around 2035. Two years ago I would have said 2040 or a bit later, and capabilities gains relevant to my own field and several others I know reasonably well have shortened that, along with the increase in funding for further development.

  • The Claude/Pokemon thing is interesting, and the overall Pokemon-playing trend across Anthropic's models is clearly positive. I can't say I had any opinion at all about how far along an LLM would get at Pokemon before that result got publicized, so I'm curious if you did. What rate of progress on that benchmark would you expect in a short-timelines world? And what if there's an LLM agent that can beat Pokemon in six months, or a year, or two years?
  • Self-driving vehicles are already more of a manufacturing and regulatory problem than a technical one. For example, as long as the NHTSA only lets manufacturers deploy 2500 self-driving vehicles a year each in the US, broad adoption cannot happen, regardless of technical capabilities or willingness to invest and build.
  • I also don't think task length is a perfect metric. But it's a useful one, a lower bound on what's needed to be able to complete all human-complete intellectual tasks. Like everything else to date, there is likely something else to look at as we saturate the benchmark.
  • I agree novel insights (or more of them, I can't say there haven't been any) will be strong evidence. I don't understand the reason for thinking this should already be observable. Very, very few humans ever produce anything like truly novel insights at the forefront of human knowledge. "They have not yet reached the top <0.1% of human ability in any active research field" is an incredibly high bar I wouldn't expect to pass until we're already extremely close to AGI, and it should be telling that that late bar is on the short list of signs you are looking for. I would also add two other things: First, how many research labs do you think there are that have actually tried to use AI to make novel discoveries, given how little calendar time there has been to actually figure out how to adopt and use the models we do have? If Gemini 2.5 could do this today, I don't think we'd necessarily have any idea. And second, do you believe it was a mistake that two of the 2024 Nobel prizes went to AI researchers, for work that contributes to the advancement of chemistry and physics?
  • AI usefulness is strongly field dependent today. In my own field, it went from a useful supplementary tool to "This does 50-80% of what new hires did and 30-50% of what I used to do, and we're scrambling to refactor workflows to take advantage of it."
  • Hallucinations are annoying, but good prompting strategy, model selection, and task definition can easily get the percentages down to the low single digits. In many cases the rates can easily be lower than those of a smart human given a similar amount of context. I can often literally just tell an LLM "Rewrite this prompt in such a way as to reduce the risk of hallucinations or errors, answer that prompt, then go back and check for and fix any mistakes" and that'll cut it down a good 50-90% depending on the topic and the question complexity. I can ask the model to cite sources for factual claims, dump the sources back into the next prompt, and ask if there are any factual claims not supported by the sources. It's a little circular, but also a bit Socratic and not really any worse than when I've tried to teach difficult mental skills to some bright human adults

Thank you for the clarification!

Of course, it is much more likely to be predictable a couple of days in advance than a year in advance, but even the former may conceivably be quite challenging depending on situational awareness of near-human-level models in training.

Do I understand correctly that you think that we are likely to only recognize AGI after it has been built? If so, how would we recognize AGI as you define it?

Do you also think that AGI will result in a fast take-off?

I don’t think I have anything better than a guess about any of those questions. 

That's alright. Would you be able to articulate what you associate with AGI in general? For example, do you associate AGI with certain intellectual or physical capabilities, or do you associate it more with something like moral agency, personhood or consciousness?

Also, I am just surprised I seem to be the only one making this fairly obvious point (?), and it raises some questions about our group epistemics. 

First and foremost, I want to acknowledge the frustration and more combative tone in this post and ask whether it is more of a pointer towards confusion about how we can be doing this so wrong?

I think that more people are in a similar camp to you but that it feels really hard to change group epistemics around this belief? It feels quite core, and even if you have longer conversations with people about underlying problems with the models, I find that it is hard to pull people out of the AGI IS COMING attractor state. If you look at the AI Safety community as an information network, there are certain clusters that are quite tightly coupled in terms of epistemics; for me, timelines seem to be one of these dividing lines. I think the talk about it has become a bit more like politics, where it is war and arguments are soldiers? 

I don't think this is anyone's intention, but usually our emotions create our frame, and if you believe that AGI might come in two years and that we're probably going to die, it is very hard to remain calm. 

The second problem is that the points around timelines and the reasoning capacity of models are very hard to empirically forecast, and I often think it comes down to an individual's views on philosophy of science. What are the frames that you're using in order to predict useful real world progress? How are these coupled with pure ability on MMLU or Humanity's Last Exam? These are hard, complicated questions, and so I think a lot of people often just fall back on vibes. 

The attractor state of the vibes is a more anxious one, and so we get this collective cognitive effect where fear in an information network amplifies itself. 

I do not know what is right, but I do know that it can be hard to have a conversation about shorter timelines with someone with shorter timelines because of a state of justifiable emotional tension. 

This all seems right - this is probably my most (and only?) “combative” post and I wish I’d toned it down a bit. 

(You can always change the epistemic note at the top to include this! I think it might improve the probability of a disagreeing person changing their mind.)

The base models seem to have topped out their task length around 2023 at a few minutes (see on the plot that GPT-4o is little better than GPT-4).  Reasoning models use search to do better.

Note that Claude 3.5 Sonnet (Old) and Claude 3.5 Sonnet (New) have a longer time horizon than 4o: 18 minutes and 28 minutes compared to 9 minutes (Figure 5 in Measuring AI Ability to Complete Long Tasks). GPT-4.5 also has a longer time horizon.

Interesting, but still apparently on a significantly slower doubling time than the reasoning models?

Yes, the reasoning models seem to have accelerated things. ~7 months to ~4 months doubling time on that plot. I'm still not sure I follow why "They found a second way to accelerate progress that we can pursue in parallel to the first" would not cause me to think that progress in total will thereafter be faster. The advent of reasoning models has caused an acceleration of increasing capabilities, not in one or two domains like chess, but across a broad range of domains.

I think this is at least superficially a reasonable interpretation, and if the new linear relationship continues then I’d be convinced it’s right, but I wish you had engaged more with the arguments I made in the post or could be a bit more explicit about which you don’t follow?

Basically, I just have very low confidence in putting a line through these points because I don’t see a principled reason to expect a linear relationship to hold, and I see some reasons to expect that it should not. 

I also don't have a principled reason to expect that particular linear relationship, except in general in forecasting tech advancements, I find that a lot of such relationships seem to happen and sustain themselves for longer than I'd expect given my lack of principled reasons for them.

I did just post another comment reply that engages with some things you said. 

To the first argument: I agree with @Chris_Leong's point about interest rates constituting essentially zero evidence, especially compared to the number of data points on the METR graph.

To the second: I do not think the PhD thesis is a fair comparison. That is not a case where we expect anyone to successfully complete a task on their own. PhD students, post-docs, and professional researchers break a long task into many small ones, receive constant feedback, and change course in response to intermediate successes and failures. I don't think there are actually very many tasks en route to a PhD that can't be broken down into predictable, well-defined subtasks that take less than a month, and the task of doing the breaking down is itself a fairly short-time-horizon task that gets periodically revised. Even still, many PhD theses end up being, "Ok, you've done enough total work, how do we finagle these papers into a coherent narrative after the fact?" Plus, overall, PhD students, those motivated to go to grad school with enough demonstrated ability to get accepted into PhD programs, fail to get a PhD close to half the time even with all that. 

I imagine you could reliably complete a PhD in many fields with a week-long time horizon, as long as you get good enough weekly feedback from a competent advisor. 1: Talk to advisor about what it takes to get a PhD. 2: Divide into a list of <1 week-long tasks. 3) Complete task 1, get feedback, revise list. 4) Either repeat the current task or move on to the new next task, depending on feedback. 5) Loop until complete. 5a) Every ten or so loops, check overall progress to date against the original requirements. Evaluate whether overall pace of progress is acceptable. If not, come up with possible new plans and get advisor feedback.

As far as not believing the current paradigm could reach AGI, which paradigm do you mean? I don't think "random variation and rapid iteration" is a fair assessment of the current research process. But even if it were, what should I do with that information? Well, luckily we have a convenient example of what it takes for blind mutations with selection pressure to raise intelligence to human levels: us! I am pretty confident saying that current LLMs would outperform, say, Australopithecus, on any intellectual ability, but not Homo sapiens. So that happens in a few million years, let's say 200k generations of 10-100k individuals each, in which intelligence was one of many, many factors weakly driving selection pressure with at most a small number of variations per generation. I can't really quantify how much human intelligence and directed effort speed up progress compared to blind chance, but consider that 1) a current biology grad student can do things with genetics in an afternoon that evolution needs thousands of generations and millions of individuals or more to do, and 2) the modern economic growth rate, essentially a sum of the impacts of human insight on human activity, is around 15000x faster than it was in the paleolithic. Naively extrapolated, this outside view would tell me that science and engineering can take us from Australopithecus-level to human-level in about 13 generations (unclear which generation we're on now). The number of individuals needed per generation is dependent on how much we vary each individual, but plausibly in the single or double digits.
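Spelling out the back-of-envelope arithmetic behind that "about 13 generations" figure, using the rough numbers above (~200k generations of evolution, ~15,000x speedup from directed human effort):

$$\frac{2 \times 10^{5}\ \text{generations}}{1.5 \times 10^{4}\ \text{speedup}} \approx 13\ \text{generations}$$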

My disagreement with your conclusion from your third objection is that scaling inference time compute increases performance within a generation, but that's not how the iteration goes between generations. We use reasoning models with more inference time compute to generate better data to train better base models to more efficiently reproduce similar capability levels with less compute to build better reasoning models. So if you build the first superhuman coder and find it's expensive to run, what's the most obvious next step in the chain? Follow the same process as we've been following for reasoning models and if straight lines on graphs hold, then six months later we'll plausibly have one that's a tenth the cost to run. Repeat again for the next six months after that. 

We use reasoning models with more inference time compute to generate better data to train better base models to more efficiently reproduce similar capability levels with less compute to build better reasoning models.

This kind of thing isn't known to meaningfully work, as something that can potentially be done on pretraining scale. It also doesn't seem plausible without additional breakthroughs given the nature and size of verifiable task datasets, with things like o3-mini getting ~matched on benchmarks by post-training on datasets containing 15K-120K problems. All the straight lines for reasoning models so far are only about scaling a little bit, using scarce resources that might run out (verifiable problems that help) and untried-at-more-scale algorithms that might break down (in a way that's hard to fix). So the known benefit is still plausible to remain a one-time improvement, extending it significantly (into becoming a new direction of scaling) hasn't been demonstrated.

I think even remaining as a one-time improvement, long reasoning training might still be sufficient to get AI takeoff within a few years just from pretraining scaling of the underlying base models, but that's not the same as already believing that RL post-training actually scales very far by itself. Most plausibly it does scale with more reasoning tokens in a trace, getting from the current ~50K to ~1M, but that's separate from scaling with RL training all the way to pretraining scale (and possibly further).

I imagine you could reliably complete a PhD in many fields with a week-long time horizon, as long as you get good enough weekly feedback from a competent advisor. 1: Talk to advisor about what it takes to get a PhD. 2: Divide into a list of <1 week-long tasks. 3) Complete task 1, get feedback, revise list. 4) Either repeat the current task or move on to the new next task, depending on feedback. 5) Loop until complete. 5a) Every ten or so loops, check overall progress to date against the original requirements. Evaluate whether overall pace of progress is acceptable. If not, come up with possible new plans and get advisor feedback

I think it's nearly impossible to create unexpected new knowledge this way. 

As far as not believing the current paradigm could reach AGI, which paradigm do you mean? I don't think "random variation and rapid iteration" is a fair assessment of the current research process. But even if it were, what should I do with that information? Well, luckily we have a convenient example of what it takes for blind mutations with selection pressure to raise intelligence to human levels: us! I am pretty confident saying that current LLMs would outperform, say, Australopithecus, on any intellectual ability, but not Homo sapiens. So that happens in a few million years, let's say 200k generations of 10-100k individuals each, in which intelligence was one of many, many factors weakly driving selection pressure with at most a small number of variations per generation. I can't really quantify how much human intelligence and directed effort speed up progress compared to blind chance, but consider that 1) a current biology grad student can do things with genetics in an afternoon that evolution needs thousands of generations and millions of individuals or more to do, and 2) the modern economic growth rate, essentially a sum of the impacts of human insight on human activity, is around 15000x faster than it was in the paleolithic. Naively extrapolated, this outside view would tell me that science and engineering can take us from Australopithecus-level to human-level in about 13 generations (unclear which generation we're on now). The number of individuals needed per generation is dependent on how much we vary each individual, but plausibly in the single or double digits.

I can't parse this. 

My disagreement with your conclusion from your third objection is that scaling inference time compute increases performance within a generation, but that's not how the iteration goes between generations. We use reasoning models with more inference time compute to generate better data to train better base models to more efficiently reproduce similar capability levels with less compute to build better reasoning models. So if you build the first superhuman coder and find it's expensive to run, what's the most obvious next step in the chain? Follow the same process as we've been following for reasoning models and if straight lines on graphs hold, then six months later we'll plausibly have one that's a tenth the cost to run. Repeat again for the next six months after that. 

You're probably right about distilling CoT. 

  1. You're right, but creating unexpected new knowledge is not a PhD requirement. I expect it's pretty rare that a PhD student achieves that level of research.
  2. It wasn't a great explanation, sorry, and there are definitely some leaps, digressions, and hand-wavy bits. But basically: Even if current AI research were all blind mutation and selection, we already know that that can yield general intelligence from animal-level-intelligence because evolution did it. And we already have various examples of how human research can apply much greater random and non-random mutation, larger individual changes, higher selection pressure in a preferred direction, and more horizontal transfer of traits than evolution can, enabling (very roughly estimated) ~3-5 OOMs greater progress per generation with fewer individuals and shorter generation times.
  3. Saw your edit above, thanks.

You're right, but creating unexpected new knowledge is not a PhD requirement. I expect it's pretty rare that a PhD student achieves that level of research.

I do weakly expect it to be necessary to reach AGI though. Also, I personally wouldn't want to do a PhD that didn't achieve this!

It wasn't a great explanation, sorry, and there are definitely some leaps, digressions, and hand-wavy bits. But basically: Even if current AI research were all blind mutation and selection, we already know that that can yield general intelligence from animal-level-intelligence because evolution did it. And we already have various examples of how human research can apply much greater random and non-random mutation, larger individual changes, higher selection pressure in a preferred direction, and more horizontal transfer of traits than evolution can, enabling (very roughly estimated) ~3-5 OOMs greater progress per generation with fewer individuals and shorter generation times.

Okay, then I understand the intuition but I think it needs a more rigorous analysis to even make an educated guess either way.

Saw your edit above, thanks.

No, thank you!

I personally wouldn't want to do a PhD that didn't achieve this!


Agreed. It was somewhere around reason #4 I quit my PhD program as soon as I qualified for a masters in passing.

So it seems to be a reasonable interpretation that we might see human level AI around the mid-2030s to 2040, which happens to be about my personal median.

What are the reasons your median is mid-2030s to 2040, other than this way of extrapolating the METR results?

I believe intelligence is pretty sophisticated while others seem to think it's mostly brute force. This tangent would however require a longer discussion on the proper interpretation of Sutton's bitter lesson.

 

I'd be interested in seeing this point fleshed out, as it's a personal crux of mine (and I expect many others). The bullish argument which I'm compelled by goes something along the lines of:

  • Bitter Lesson: SGD is a much better scalable optimizer than you, and we're bringing it to pretty stupendous scales
  • Lots of Free Energy in Research Engineering: My model of R&D in frontier AI is that it is often blocked by a lot of tedious and laborious engineering. It doesn't take a stroke of genius to think of RL on CoT; it took (comparatively) quite a while to get it to work.
  • Low Threshold in Iterating Engineering Paradigms: Take a technology, scale it, find its limits, pivot, repeat. There were many legitimate arguments floating around last year around the parallelism tradeoff and shortcut generalization which seemed to suggest limits of scaling pretraining. I take these to basically be correct; it just wasn't that hard to pivot towards a nearby paradigm which didn't face similar limits. I expect similar arguments to crop up around the limits of model-free RL, or OOD generalization of training on verifiable domains, or training on lossy representations of the real world (language), or inference on fixed weight recurrence, or... I expect (many) of them to basically be correct; I just don't expect the pivot towards a scalable solution to these to be that hard. Or in other words, I expect much of the effort that goes into unlocking these new engineering paradigms to be made up of engineering hours which we expect to be largely automated.

What are your current AGI timelines?

As far as I am concerned, AGI should be able to do any intellectual task that a human can do. I think that inventing important new ideas tends to take at least a month, though possibly as long as a PhD thesis. So it seems to be a reasonable interpretation that we might see human level AI around the mid-2030s to 2040, which happens to be about my personal median.

There is an argument to be made that at the larger scales of length, cognitive tasks become cleanly factored, or in other words it's more accurate to model completing something like a PhD as different instantiations of yourself coordinating across time over low bandwidth channels, as opposed to you doing very high dimensional inference for a very long time. If that's the case, then one would expect to roughly match human performance in indefinite time horizon tasks once that scale has been reached.

 

I don't think I fully buy this, but I don't outright reject it.

I agree, but my experience of doing a PhD actually feels more like an integrated project of discovery and creation. 

I remember some guy getting interviewed on YouTube who was making good money selling GPT3 generated books. If this was included in the METR graph, it would really throw things off.
