ryan_greenblatt

I'm the chief scientist at Redwood Research.

Comments


(I don't expect o3-mini is a much better agent than 3.5 sonnet new out of the box, but a hybrid scaffold with o3 + 3.5 sonnet will probably be substantially better than 3.5 sonnet alone. Just o3 might also be very good. Putting aside cost, I think o1 is usually better than o3-mini on open-ended programming agency tasks.)

The question of context might be important; see here. I wouldn't find 15 minutes that surprising for a ~50% success rate, but I've seen numbers more like 1.5 hours. I thought this was likely to be an overestimate, so I went down to 1 hour, but more like 15-30 minutes is also plausible.

Keep in mind that I'm talking about agent scaffolds here.

I mean, I don't think AI R&D is a particularly hard field per se, but I do think it involves lots of tricky stuff and isn't much easier to automate than some other plausibly-important-to-takeover field (e.g., robotics). (I could imagine that the AIs have a harder time automating philosophy even if they were trying to work on this, but it's more confusing to reason about because human work on this is so dysfunctional.) The main reason I focused on AI R&D is that I think it is much more likely to be fully automated first, and it seems like it will probably be fully automated prior to AI takeover.

I think you can add mirror enzymes which can break down mirror carbs. At a minimum, we are aware of enzymes which break down mirror glucose.

No, sorry, I was mostly focused on "such that if you didn't see them within 3 or 5 years, you'd majorly update about time to the type of AGI that might kill everyone". I didn't pick up on "most impressive" and instead tried to focus on something that occurs substantially before things get crazy.

Most impressive would probably be stuff like "automate all of AI R&D and greatly accelerate the pace of research at AI companies". (This seems about 35% likely to me within 5 years, so I'd update by at least that much.) But this hardly seems that interesting? I think we can agree that once the AIs are automating whole companies, stuff is very near.

Importantly, this is an example of developing a specific application (a surgical robot) rather than advancing the overall field (robotics in general). It's unclear whether the analogy to an individual application or to an overall field is more appropriate for AI safety.

I think if you look at "horizon length"---the task duration (in terms of human completion time) at which the AIs get the task right 50% of the time---the trends indicate doubling times of maybe 4 months (though 6 months is plausible). Let's more conservatively say 6 months. I think AIs are at like 30 minutes on math? And 1 hour on software engineering. It's a bit unclear, but let's go with that. Then, to get to 64 hours on math, we'd need 7 doublings = 3.5 years. So, I think the naive trend extrapolation is much faster than you think? (And this estimate strikes me as conservative, at least for math.)
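To make the metric concrete, here's a minimal sketch of estimating a 50% horizon length from per-task results. The data is made up for illustration, and logistic regression on log-duration is just one reasonable fitting choice, not necessarily how any published trend was computed:

```python
# Sketch: estimate the "horizon length" -- the human task duration at which
# the AI succeeds 50% of the time -- from hypothetical per-task outcomes.
import numpy as np
from sklearn.linear_model import LogisticRegression

# (human completion time in minutes, did the AI solve it?) -- made-up data
results = [
    (2, 1), (5, 1), (10, 1), (15, 1), (30, 1), (30, 0),
    (60, 1), (60, 0), (120, 0), (240, 1), (240, 0), (480, 0),
]
log_minutes = np.log2([t for t, _ in results]).reshape(-1, 1)
solved = np.array([s for _, s in results])

# Fit P(success) as a logistic function of log2(task duration).
clf = LogisticRegression().fit(log_minutes, solved)

# The 50% point is where the logistic's argument crosses zero:
# coef * log2(t) + intercept = 0  =>  t = 2 ** (-intercept / coef)
horizon_minutes = 2 ** (-clf.intercept_[0] / clf.coef_[0, 0])
print(f"50% horizon: ~{horizon_minutes:.0f} human-minutes")
```

Fitting on log-duration encodes the assumption that success probability falls off smoothly with task length, which is what makes "doubling times" a natural unit for the trend.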

Consider tasks that quite good software engineers (maybe top 40% at Jane Street) typically do in 8 hours without substantial prior context on that exact task. (As in, an 8-hour median completion time.) Now, we'll aim to sample these tasks such that their distribution and characteristics are close to those of work tasks in actual software engineering jobs (we probably can't get that close because of the limited-context constraint, but we'll try).

In short timelines, I expect AIs will be able to succeed at these tasks 70% of the time within 3-5 years, and if they can't, I would update toward longer timelines. (This is potentially using huge amounts of inference compute and using strategies that substantially differ from how humans do these tasks.)

The quantitative update would depend on how far AIs are from being able to accomplish this. If AIs were quite far (e.g., at 2 hours on this metric, which is pretty close to where they are now) and the trend on horizon length indicated N years until 64 hours, I would update to something like 3N as my median for AGI.

(I think a reasonable interpretation of the current trend indicates something like 4-month doubling times. We're currently at a bit less than 1 hour for this metric I think, though maybe more like 30 min? Maybe you need to get to 64 hours before stuff feels pretty close to getting crazy. So, this suggests 2.3 years, though I expect longer in practice. My actual median for "AGI" in a strong sense is like 7 years, so 3x longer than this.)
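As a sanity check, here's the arithmetic from the parenthetical above made explicit. The starting horizon, target, and doubling time are the rough guesses from this comment, not measurements:

```python
import math

current_horizon_hours = 0.5    # "maybe more like 30 min" on this metric
target_horizon_hours = 64.0    # where stuff feels pretty close to getting crazy
doubling_time_years = 4 / 12   # ~4-month doubling time read off the trend

doublings = math.log2(target_horizon_hours / current_horizon_hours)  # 7.0
naive_years = doublings * doubling_time_years
print(f"{doublings:.0f} doublings -> {naive_years:.1f} years (naive extrapolation)")

# Applying the ~3x adjustment from the update rule above:
print(f"median for 'AGI' in a strong sense: ~{3 * naive_years:.0f} years")
```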

Edit: Note that I'm not responding to "most impressive", just trying to operationalize something that would make me update.

I would find this post much more useful to engage with if you more concretely described the types of tasks that you think AIs will remain bad at and gave a bunch of examples. (Or at least made an argument for why it is hard to construct examples, if that is your perspective.)

I think you're pointing to a category like "tasks that require lots of serial reasoning for humans, e.g., hard math problems, particularly ones where the output should be a proof". But I find this confusing, because we've pretty clearly seen huge progress on this in the last year, such that the naive extrapolation would imply that systems will be much better at this by the end of the year.

Already AIs seem to be not that much worse at tricky serial reasoning than smart humans:

  • My sense is that AIs are pretty competitive at 8th-grade competition math problems that have numerical answers and are relatively short. As in, they aren't much worse than the best 8th graders on AIME or similar.
  • At proofs, the AIs are worse, but they are showing some signs of life.
  • On logic/reasoning puzzles, the AIs are already pretty good and seem to be getting better rapidly on any specific type of task, as far as I can tell.

It would be even better if you pointed to some particular benchmark and made predictions.

Sam also implies that GPT-5 will be based on o3.

IDK if Sam is trying to imply that this GPT-5 will be "the AGI", but regardless, I think we can be pretty confident that o3 isn't capable enough to automate large fractions of cognitive labor, let alone "outperform humans at most economically valuable work" (the original OpenAI definition of AGI).
