LESSWRONG
LW

elifland — LessWrong

I think you can't get around the uncertainty by modeling uplift as some more complicated function of coding automation fraction as in the AIFM, because you're still assuming that's logistic, we can't measure it any better than uplift, plus we're still uncertain how they're related. So we really do need better data.

But in the AIFM the coding automation logistic is there to predict the dynamics regarding how much coding automation speeds progress pre-AC. It doesn't have to do with setting the effective compute requirement for AC. I might be misunderstanding something, sorry if so.

Replying to Research note: A simpler AI timelines model predicts 99% AI R&D automation in ~2032

elifland 15h

Re: the 1.6 number, oh that should actually be 1.8 sorry. I think it didn't get updated after a last minute change to the parameter value. I will fix that soon. Also, that's the parallel uplift. In our model, the serial multiplier/uplift is sqrt(parallel uplift).
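The parallel-to-serial conversion stated above can be sketched in a couple of lines. This is only an illustration of the square-root relationship mentioned in the comment; the function name `serial_uplift` is hypothetical and is not taken from the AIFM code.

```python
import math

def serial_uplift(parallel_uplift: float) -> float:
    """Serial multiplier implied by a parallel uplift, assuming
    serial = sqrt(parallel) as described in the comment."""
    if parallel_uplift <= 0:
        raise ValueError("uplift must be positive")
    return math.sqrt(parallel_uplift)

# Corrected median parallel uplift from the comment (1.8, not 1.6).
print(round(serial_uplift(1.8), 3))
```

Under this relationship, a parallel uplift of 1.8 corresponds to a serial multiplier of roughly 1.34.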

Replying to Research note: A simpler AI timelines model predicts 99% AI R&D automation in ~2032

elifland 17h

Suppose uplift at the start of 2026 was 1.6x as in the AIFM's median

Where are you getting this 1.6 number?

With respect to the rest of your comment, it seems to me that we have very little evidence about current uplift and what trend it follows (e.g. whether assuming a logistic % automation curve and translating it to uplift is a reasonable functional form). I'm not sure how strongly we disagree though. I'm much more skeptical of the claim that uplift can give much tighter confidence intervals than of the claim that it can give similar or slightly better ones. Again, this could change if we had much better data in a year or two.

Grading AI 2027's 2025 Predictions

Daniel Kokotajlo

Daniel Kokotajlo, elifland

AI 2027 laid out a detailed scenario for how AI would progress from 2025 through 2027, including quantitative predictions and qualitative descriptions of the AI landscape.

Now that we’re in early 2026, we can grade how its 2025 predictions compare to reality! This is exciting to us because we put a lot of effort into filling AI 2027 with concrete, falsifiable predictions, and now we reap the benefit of that effort: an additional method of forecasting AI timelines, to complement the methods we already use.^[1]

The primary question we’ll answer is: How fast is AI progress moving relative to the AI 2027 scenario?

In aggregate, progress on quantitative metrics is at roughly 65% of the pace that... (read 2583 more words →)

Replying to Research note: A simpler AI timelines model predicts 99% AI R&D automation in ~2032

elifland 3d

Grats on getting this out! I am overall excited about exploring models that rely more on uplift than on time horizons. A few thoughts:

It might be nice to indicate how these outputs relate to your all-things-considered views. To me your explicit model seems to be implausibly confident in 99% automation before 2040.

In particular, the "doubling difficulty growth factor", which measures whether time horizon increases superexponentially, could change the date of automated coder from 2028 to 2049! I suspect that time horizon is too poorly defined to nail down this parameter, and rough estimates of more direct AI capability metrics like uplift can give much tighter confidence intervals.

I am skeptical that uplift measurements... (read more)

Replying to Conditional Kickstarter for the "Don't Build It" March

elifland 10d

I think my personal beliefs would say "it's not very useful" or something. I think the "ban AGI locally" plan is dependent on a pretty specific path to be useful and I don't read the current phrasing as ruling out "One country Bans it and also does some other stuff in conjunction." (actually, upon reflection I'm not that confident I know what sort of scenario you have in mind here)

I think that a slowdown that is in the neighborhood of "ban AI development temporarily near but not after max-controllable AI" could potentially be very impactful. Banning AI development for long enough to allow China to pull ahead is less clear. I'm not sure what the intention of the sentence was, but to me it seems to imply that any domestic action on its own would be of very little use.

Replying to Conditional Kickstarter for the "Don't Build It" March

elifland 11d

If you would come to very similar March but object to details of the current framing, please let me know in the comments, and consider registering your email for the "Keep me informed" checkbox without making the commitment.

There's a decent chance I would join for the March as is given that I directionally agree with its sentiment and its recommendation. But I don't agree with some of the "We believe..." statements, which sound like they are intended to speak for all of the people who came to the March.

I disagree with these:

We believe that if any company or group, anywhere on the planet, builds an artificial superintelligence using anything remotely like current

... (read more)

Replying to Clarifying how our AI timelines forecasts have changed since AI 2027

elifland 17d

We did reach out to the contact email for each of the publications. Only one responded, and they denied that there was anything wrong in their article. It might be useful to reach out again linking this blog post, though.

Replying to Clarifying how our AI timelines forecasts have changed since AI 2027

elifland 18d

Fair enough. There is some reasoning on my end at the bottom of the post:

Dec 2024: 2032. Updated on early versions of the timelines model predicting shorter timelines than I expected. Also, RE-Bench scores were higher than I would have guessed.
Apr 2025: 2031. Updated based on the two variants of the AI 2027 timelines model giving 2027 and 2028 superhuman coder (SC) medians. My SC median was 2030, higher than the within-model median because I placed some weight on the model being confused, a poor framework, missing factors, etc. I also gave some weight to other heuristics and alternative models, which overall seemed to point in the direction of longer timelines. I shifted

elifland

elifland, Daniel Kokotajlo, bhalstead

18d

Some recent news articles discuss updates to our AI timelines since AI 2027, most notably our new timelines and takeoff model, the AI Futures Model (see blog post announcement).^[1] While we’re glad to see broader discussion of AI timelines, these articles make substantial errors in their reporting. Please don’t assume that their contents accurately represent things we’ve written or believe! This post aims to clarify our past and current views.^[2]

The articles in question include:

The Guardian: Leading AI expert delays timeline for its possible destruction of humanity
The Independent: AI ‘could be last technology humanity ever builds’, expert warns in ‘doom timeline’
Inc: AI Expert Predicted AI Would End Humanity in 2027—Now He’s Changing His Timeline
WaPo:

... (read 1663 more words →)

elifland 1mo

20% and 80% time horizons are kind of fake because there aren’t enough parameters to fit them separately.
We fit a two-parameter logistic model which doesn’t fit the top and bottom of the success curve linearly, so improving performance on 20% horizon tasks can lower 80% horizon.

My understanding is that a similarly unattractive issue can still arise with the 50% time horizon: performing better on long tasks can reduce the 50% time horizon because it makes the fitted slope less steep. But the effect doesn't seem to be as large as it is for the 20% and 80% horizons.
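To make the point about the two-parameter fit concrete, here is a minimal sketch. It assumes a logistic success curve in log2 task length with parameters `h50_minutes` and `slope`; this parameterization and the function names are illustrative assumptions, not the actual fitting code behind the time horizon estimates. Every p% horizon is then a function of the same two numbers, so the 20% and 80% horizons cannot move independently.

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

def success_prob(task_minutes: float, h50_minutes: float, slope: float) -> float:
    """Assumed model: P(success) = sigmoid(slope * (log2(h50) - log2(task)))."""
    z = slope * (math.log2(h50_minutes) - math.log2(task_minutes))
    return 1.0 / (1.0 + math.exp(-z))

def horizon(p: float, h50_minutes: float, slope: float) -> float:
    """Task length at which the assumed model predicts success probability p."""
    return h50_minutes * 2.0 ** (-logit(p) / slope)

# Same 50% horizon, two different slopes (hypothetical numbers).
for slope in (1.0, 0.5):
    h20 = horizon(0.2, 60.0, slope)
    h80 = horizon(0.8, 60.0, slope)
    print(f"slope={slope}: 20% horizon={h20:.1f} min, 80% horizon={h80:.1f} min")
```

In this sketch, a flatter slope (for example, from doing better on long tasks) raises the 20% horizon and lowers the 80% horizon even when the 50% horizon is held fixed. In a real fit the 50% horizon would generally shift too, which is the smaller effect described above.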

Replying to AI Futures Timelines and Takeoff Model: Dec 2025 Update

elifland 1mo

Thanks for the comments! Besides the below, I'm curious what your overall views are. What does your distribution for AC look like?

The authors don't seem to address the possibility that we are seeing a temporary acceleration of AI, because the labs are ramping methods that are much more expensive to scale, but they are doing so from very low baselines.

I think this is basically addressed in our uncertainty over the present doubling time, at least that's how I'd think of it for myself. Note that my median present doubling time estimate of 5.5 months is slower than the potentially accelerated recent time horizon trend.
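For intuition about what a 5.5-month doubling time means, here is a rough sketch that assumes plain exponential growth in time horizon with a constant doubling time. The constant-doubling-time assumption and the function name `horizon_after` are simplifications for illustration; the model itself carries uncertainty over this parameter and allows the doubling time to change.

```python
def horizon_after(months: float, start_horizon_hours: float,
                  doubling_time_months: float = 5.5) -> float:
    """Time horizon after `months`, assuming a constant doubling time."""
    return start_horizon_hours * 2.0 ** (months / doubling_time_months)

# Growth factor over one year at the 5.5-month median doubling time.
print(round(horizon_after(12, 1.0), 2))
```

At a constant 5.5-month doubling time, the horizon grows by a factor of about 4.5 per year.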

I don't think there's any reason to believe that

... (read 1249 more words →)

AI Futures Timelines and Takeoff Model: Dec 2025 Update

elifland

elifland, bhalstead, Alex Kastner, Daniel Kokotajlo

1mo

We’ve significantly upgraded our timelines and takeoff model! It predicts when AIs will reach key capability milestones: for example, Automated Coder / AC (full automation of coding) and superintelligence / ASI (much better than the best humans at virtually all cognitive tasks). This post will briefly explain how the model works, present our timelines and takeoff forecasts, and compare it to our previous (AI 2027) models (spoiler: the AI Futures Model predicts longer timelines to full coding automation than our previous model by about 3-5 years, in significant part due to being less bullish on pre-full-automation AI R&D speedups). Added Jan 2026: see here for clarifications regarding how our forecasts have changed... (read 7337 more words →)

Response to titotal’s critique of our AI 2027 timelines model

elifland

elifland, Daniel Kokotajlo

2mo

Introduction

In June, a Substack/LessWrong/EA Forum user named titotal wrote “A deep critique of AI 2027’s bad timeline models”. Our original model that they were critiquing can be found here, with a brief overview below.^[1]

In a nutshell, we disagree with most of titotal’s criticisms. While they pointed out a few mistakes, for which we are grateful, on the whole we think the problems that they pointed out do not add up to the model being “bad.” In this post we will explain why.

While we initially responded via a comment, we wanted to make a more in-depth response. We apologize for the delay; we are planning to release a new timelines+takeoff model soon and... (read 12814 more words →)

Recent and forecasted rates of software and hardware progress

elifland

8mo

I originally wrote this just for myself but thought I'd share it in case others find it useful. Sharing rough notes in the spirit of not letting perfect be the enemy of the good. This was written in early May 2025.

In this post I collect evidence and forecasts regarding recent and future trends in software and compute progress. I also make my own forecasts which inform an updated version of AI 2027’s timelines forecast (more to come on these and further updates). These forecasts are rough and uncertain.

I especially build on this post.

Evidence and forecasts regarding the recent rate of software progress (and share vs. compute)
Evidence	Estimate	Caveats, reasoning, etc.
Compute efficiency increase at fixed performance
Epoch paper:

... (read 2127 more words →)

How 2025 AI Forecasts Fared So Far

Adam B

Adam B, romeo, elifland

9mo

At the end of 2024, there was a lot of discussion about whether AI scaling was hitting a wall and whether we would reach AGI soon.

We created a forecasting survey that would track key markers of AI progress in what might turn out to be a pivotal year. After filtering the data, we had 421 unique respondents. All questions were optional.

The survey was open from Nov 30th 2024 to Jan 20th 2025 – during which OpenAI o3 was announced, so we can compare forecasts before and after the announcement.

In this post, we summarise respondents' forecasts, and look at how they're holding up so far. At the end of the year, we'll resolve... (read 2142 more words →)

Slow corporations as an intuition pump for AI R&D automation

ryan_greenblatt

ryan_greenblatt, elifland

9mo

How much should we expect AI progress to speed up after fully automating AI R&D? This post presents an intuition pump for reasoning about the level of acceleration by talking about different hypothetical companies with different labor forces, amounts of serial time, and compute. Essentially, if you'd expect an AI research lab with substantially less serial time and fewer researchers than current labs (but the same cumulative compute) to make substantially less algorithmic progress, you should also expect a research lab with an army of automated researchers running at much higher serial speed to get correspondingly more done. (And if you'd expect the company with less serial time to make similar amounts... (read 2506 more words →)

Forecasting time to automated superhuman coders [AI 2027 Timelines Forecast]

elifland

elifland, Nikola Jurkovic

10mo

Authors: Eli Lifland,^[1] Nikola Jurkovic,^[2] FutureSearch^[3]

This is supporting research for AI 2027. We'll be cross-posting these over the next week or so.

Assumes no large-scale catastrophes happen (e.g., a solar flare, a pandemic, nuclear war), no government or self-imposed slowdown, and no significant supply chain disruptions. All forecasts give a substantial chance of superhuman coding arriving in 2027.

Summary

We forecast when the leading AGI company will internally develop a superhuman coder (SC): an AI system that can do any coding tasks that the best AGI company engineer does, while being much faster and cheaper. At this point, the SC will likely speed up AI progress substantially as is explored in our takeoff forecast.

We first show Method 1:... (read 5156 more words →)

AI 2027: What Superintelligence Looks Like

Daniel Kokotajlo

Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V, romeo

10mo

In 2021 I wrote what became my most popular blog post: What 2026 Looks Like. I intended to keep writing predictions all the way to AGI and beyond, but chickened out and just published up till 2026.

Well, it's finally time. I'm back, and this time I have a team with me: the AI Futures Project. We've written a concrete scenario of what we think the future of AI will look like. We are highly uncertain, of course, but we hope this story will rhyme with reality enough to help us all prepare for what's ahead.

You really should go read it on the website instead of here, it's much better. There's a sliding... (read 12089 more words →)

Predict 2025 AI capabilities (by Sunday)

Jonas V

Jonas V, elifland, Sage Future

Until this Sunday, you can submit your 2025 AI predictions at ai2025.org. It’s a forecasting survey by AI Digest for the 2025 performance on various AI benchmarks, as well as revenue and public attention.

You can share your results in a picture like this one. I personally found it pretty helpful to learn about the different benchmarks, and also to think through my timelines estimates.

The survey will close on Sunday, January 19th (anywhere on Earth).

If you know any AI public intellectuals or discourse influencers who might be interested in submitting the survey, please encourage them to do so!

Survey link: ai2025.org

The word "overconfident" seems overloaded. Here are some things I think that people sometimes mean when they say someone is overconfident:

They gave a binary probability that is too far from 50% (I believe this is the original one)
They overestimated a binary probability (e.g. they said 20% when it should be 1%)
Their estimate is arrogant (e.g. they say there's a 40% chance their startup fails when it should be 95%), or maybe they give an arrogant vibe
They seem too unwilling to change their mind upon arguments (maybe their credal resilience is too high)
They gave a probability distribution that seems wrong in some way (e.g. "50% AGI by 2030 is so overconfident, I think

... (read more)

[cross-posting from blog]

I made a spreadsheet for forecasting the 10th/50th/90th percentile for how you think GPT-4.5 will do on various benchmarks (given 6 months after the release to allow for actually being applied to the benchmark, and post-training enhancements). Copy it here to register your forecasts.

If you’d prefer, you could also use it to predict for GPT-5, or for the state-of-the-art at a certain time e.g. end of 2024 (my predictions would be pretty similar for GPT-4.5, and end of 2024).

You can see my forecasts made with ~2 hours of total effort on Feb 17 in this sheet; I won’t describe them further here in order to avoid anchoring.

There might be a similar tournament on Metaculus soon, but not sure on the timeline for that (and spreadsheet might be lower friction). If someone wants to take the time to make a form for predicting, tracking and resolving the forecasts, be my guest and I’ll link it here.

Just made a bet with Jeremy Gillen that may be of interest to some LWers, would be curious for opinions:

[crossposted from EA Forum]

Reflecting a little on my shortform from a few years ago, I think I wasn't ambitious enough in trying to actually move this forward.

I want there to be an org that does "human challenge"-style RCTs across lots of important questions that are extremely hard to get at otherwise, including (top 2 are repeated from previous shortform):

Health effects of veganism
Health effects of restricting sleep
Productivity of remote vs. in-person work
Productivity effects of blocking out focused/deep work

Edited to add: I no longer think "human challenge" is really the best way to refer to this idea (see comment that convinced me); I mean to say something like "large scale RCTs of important things... (read more)

(epistemic status: exploratory)

I think more people into LessWrong in high school - college should consider trying Battlecode. It's somewhat similar to The Darwin Game which was pretty popular on here and I think generally the type of people who like LessWrong will both enjoy and be good at Battlecode. (edited to add: A short description of Battlecode is that you write a bot to beat other bots at a turn-based strategy game. Each unit executes its own code so communication/coordination is often one of the most interesting parts.)

I did it with friends for 6 years (junior year of high school - end of undergrad), and I think it at least helped me... (read more)

elifland

AI 2027: What Superintelligence Looks Like

AI Futures Timelines and Takeoff Model: Dec 2025 Update

Eli's review of "Is power-seeking AI an existential risk?"

Clarifying how our AI timelines forecasts have changed since AI 2027

elifland

Grading AI 2027's 2025 Predictions

Clarifying how our AI timelines forecasts have changed since AI 2027

AI Futures Timelines and Takeoff Model: Dec 2025 Update

Response to titotal’s critique of our AI 2027 timelines model

Recent and forecasted rates of software and hardware progress

How 2025 AI Forecasts Fared So Far

Slow corporations as an intuition pump for AI R&D automation
