If "smarter than almost all humans at almost all things" models appear in 2026-2027, China and several others will be able to ~immediately steal the first such models, by default.
Interpreted very charitably: even if the weights were stolen, they probably wouldn't have enough inference compute to compete.
It's strange that he doesn't mention DeepSeek-R1-Zero anywhere in that blogpost, which is arguably the most important development DeepSeek announced (RL with verifiable rewards applied directly to the base model, with no supervised fine-tuning; a minimal sketch follows below). R1-Zero is what stuck out to me in DeepSeek's papers, and e.g. the ARC Prize team behind the ARC-AGI benchmark says:
R1-Zero is significantly more important than R1.
Was R1-Zero already obvious to the big labs, or is Amodei deliberately underemphasizing that part?
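For concreteness, here is a toy sketch of the group-relative reward normalization at the core of GRPO, the RL algorithm the R1 paper reports using for R1-Zero. The reward function, completions, and answer below are made up for illustration; a real run would sample completions from the model and feed these advantages into a policy-gradient update.

```python
import numpy as np

def exact_match_reward(completion: str, answer: str) -> float:
    # Rule-based, verifiable reward: 1 if the completion ends with the
    # reference answer, else 0. (An illustrative stand-in for R1-Zero's
    # accuracy/format rewards, not DeepSeek's actual reward code.)
    return 1.0 if completion.strip().endswith(answer) else 0.0

def group_relative_advantages(rewards: np.ndarray) -> np.ndarray:
    # GRPO's core trick: score each completion against the mean/std of its
    # own sampling group, so no learned value model (critic) is needed.
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Toy group: 4 sampled completions for one prompt whose answer is "42".
completions = [
    "... so the answer is 42",
    "... so the answer is 41",
    "... therefore, 42",
    "... I give up",
]
rewards = np.array([exact_match_reward(c, "42") for c in completions])
print(group_relative_advantages(rewards))  # ~ [1, -1, 1, -1]
```

The point of the construction is that the reward is checkable by a program rather than a learned model, which is what lets the recipe skip supervised fine-tuning entirely.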
Claude 3.5 Sonnet is a mid-sized model that cost a few $10M's to train
I don't get this: if frontier(ish) models cost $10M–$100M, why is Nvidia's projected revenue more like $1T–$10T? Is the market projecting ~100,000x growth in spending on frontier models within the next few years? I would have guessed more like 100x–1000x growth, but at least one of my numbers must be wrong. (Or maybe they're all wrong by ~1 OOM?)
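Spelling out the implied ratio with the comment's own round numbers (the figures are the commenter's, not vetted estimates):

```python
# Fermi check using the ranges above.
train_cost = (1e7, 1e8)    # $10M-$100M per frontier(ish) model
nvda_value = (1e12, 1e13)  # $1T-$10T market-implied figure for Nvidia
for cost, value in zip(train_cost, nvda_value):
    print(f"{value / cost:.0e}")  # 1e+05 at both ends -> ~100,000x
```

One candidate resolution: the comparison stacks a per-model training cost against a market-wide figure that also prices in inference compute and many models over many years, so the ~100,000x need not all come from growth in per-model training spend.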
Dario corrects misconceptions and endorses export controls.
One thing seems wrong:
If "smarter than almost all humans at almost all things" models appear in 2026-2027, China and several others will be able to ~immediately steal the first such models, by default.