Tom Davidson


I meant at any point, but yes, I was imagining the period around full automation. Why do you ask?

I'll post about my views on different numbers of OOMs soon.

Sorry, in my comments on this post I've been using "software only singularity?" to mean "will the parameter r > 1 when we first fully automate AI R&D", not as a threshold for some number of OOMs. That's what Ryan's analysis seemed to be referring to.

 

I separately think that even if initially r > 1, the software explosion might not go on for that long.
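To make the r > 1 condition concrete, here's a toy simulation of the feedback loop (my own rough sketch of the standard framing, not Ryan's actual model): cumulative research input R grows at a rate proportional to current software efficiency S, and S scales as R^r. With r > 1 the loop is self-accelerating; with r < 1 it fizzles out.

```python
# Toy model of the software-only feedback loop (a sketch, not the actual model).
# Assumptions: log(software efficiency S) = r * log(cumulative research R),
# and research accumulates at a rate proportional to S once AI R&D is
# fully automated.
def simulate(r, total_time=2.0, dt=0.01):
    R, S = 1.0, 1.0
    t = 0.0
    while t < total_time:
        R += S * dt      # automated researchers work at a speed ~ S
        S = R ** r       # returns to research: sub-unit r = diminishing
        t += dt
    return S

for r in (0.7, 1.0, 1.5):
    print(f"r = {r}: software efficiency multiplier after the run = {simulate(r):.3g}")
```

The qualitative point is just that r > 1 makes progress compound on itself; how long that lasts depends on how quickly r falls as you approach limits, which is the separate point above.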

> Obviously the numbers in the LLM case are much less certain given that I'm guessing based on qualitative improvement and looking at some open source models,

Sorry, I don't follow. Why are they less certain?

 

> based on some first principles reasoning and my understanding of how returns diminished in the semi-conductor case

I'd be interested to hear more about this. The semiconductor case is hard as we don't know how far we are from physical limits, but if we use Landauer's limit then I'd guess you're right. There's also uncertainty about how much algorithmic progress we will make and have already made.
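For concreteness, the rough arithmetic I have in mind (my own back-of-the-envelope numbers, and only a naive comparison):

```python
# Back-of-the-envelope headroom estimate relative to Landauer's limit.
# The GPU figures below are rough assumptions, not measured values.
import math

k_B = 1.380649e-23                            # Boltzmann constant, J/K
T = 300                                       # room temperature, K
landauer_j_per_bit = k_B * T * math.log(2)    # ~2.9e-21 J per bit erasure

# Assumption: a datacenter GPU drawing ~700 W at ~1e15 FLOP/s,
# i.e. ~7e-13 J per FLOP.
current_j_per_flop = 700 / 1e15

headroom_ooms = math.log10(current_j_per_flop / landauer_j_per_bit)
print(f"Landauer limit: {landauer_j_per_bit:.2e} J/bit")
print(f"Naive headroom: ~{headroom_ooms:.1f} OOMs")
# Caveat: a FLOP involves many bit operations, so the real headroom to the
# Landauer bound is several OOMs smaller than this naive comparison suggests.
```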

Why are they more recoverable? It seems like a human who seized power would seek ASI advice on how to cement their power.


Thanks for this!

Compared to you, I more strongly expect that we'll see evidence of scheming if it exists.

You argue weak schemers might just play nice. But if so, we can use them to do loads of intellectual labour to build fancy behavioral red teaming and interpretability tools to catch out the next generation of AIs.

More generally, the plan of bootstrapping to increasingly complex behavioral tests and control schemes seems likely to work. If one model has spent a lot of thinking time designing a control scheme, then another model would have to be much smarter to cause a catastrophe zero-shot without the scheme detecting it. E.g., analogies with humans suggest this.

I agree the easy vs hard worlds influence the chance of AI taking over. 

But are you also claiming it influences the badness of takeover conditional on it happening? (That's the subject of my post.)

So you predict that if Claude were in a situation where it knew it had complete power over you and could make you say that you liked it, it would stop being nice? I think it would continue to be nice in any situation of that rough kind, which suggests it's actually nice, not just narcissistically pretending.

But a human could instruct an aligned ASI to help them take over and do a lot of damage.

The structural difference you point to seems massive. The reputational downsides of bad behavior will be multiplied 100-fold or more for AIs, as any bad behavior reflects on millions of instances and on the company's reputation.

 

And it will be much easier to record and monitor AI thinking and actions to catch bad behaviour.


Why is it unlikely that we can detect selfishness? Why can't we bootstrap from human-level?
