I think that is a problem for the industry, but probably not an insurmountable barrier the way some commentators make it out to be.
One other thought is that there's probably an upper limit on how good an LLM can get even with unlimited high-quality data, and I'd guess that models would asymptotically approach it for a while. Based on the reporting around GPT-5 and other next-gen models, I'd guess that the current issue is a lack of data rather than approaching some fundamental limit.
It was all my Twitter feed was talking about, but I think it's been really under-discussed in the mainstream press.
RE Knoop's comment, here are some relevant grafs from the ARC announcement blog post:
To adapt to novelty, you need two things. First, you need knowledge – a set of reusable functions or programs to draw upon. LLMs have more than enough of that. Second, you need the ability to recombine these functions into a brand new program when facing a new task – a program that models the task at hand. Program synthesis. LLMs have long lacked this feature. The o series of models fixes that.
For now, we can only speculate about the exact specifics of how o3 works. But o3's core mechanism appears to be natural language program search and execution within token space – at test time, the model searches over the space of possible Chains of Thought (CoTs) describing the steps required to solve the task, in a fashion perhaps not too dissimilar to AlphaZero-style Monte-Carlo tree search. In the case of o3, the search is presumably guided by some kind of evaluator model. To note, Demis Hassabis hinted back in a June 2023 interview that DeepMind had been researching this very idea – this line of work has been a long time coming.
More in the ARC post.
My rough understanding is that it's like a meta-CoT strategy, evaluating multiple different approaches.
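To make the "search over candidate chains of thought, guided by an evaluator" idea a bit more concrete, here's a toy best-of-N sketch in Python. Everything in it (generate_cot, score_cot, the scoring) is a hypothetical stand-in, not anything from OpenAI or ARC; the real system presumably uses something far more elaborate, possibly tree search rather than flat sampling.

```python
import random

# Toy illustration of a "meta-CoT" / best-of-N search at test time.
# generate_cot() stands in for sampling a candidate chain of thought from an LLM,
# and score_cot() stands in for an evaluator model judging that candidate.
# These are hypothetical placeholders, not real APIs.

def generate_cot(task: str, rng: random.Random) -> str:
    """Stand-in for sampling one candidate chain of thought for the task."""
    return f"candidate reasoning #{rng.randint(0, 999)} for: {task}"

def score_cot(cot: str) -> float:
    """Stand-in for an evaluator model scoring a candidate chain of thought."""
    return random.random()  # pretend quality score in [0, 1]

def solve_with_search(task: str, n_candidates: int = 8, seed: int = 0) -> str:
    """Sample several chains of thought and keep the one the evaluator scores highest."""
    rng = random.Random(seed)
    candidates = [generate_cot(task, rng) for _ in range(n_candidates)]
    return max(candidates, key=score_cot)

if __name__ == "__main__":
    print(solve_with_search("example ARC-style puzzle"))
```

The point of the sketch is just the shape of the loop: generate multiple candidate reasoning traces, score them with some evaluator, and act on the best one, which is what distinguishes this kind of test-time search from a single greedy chain of thought.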
This is what Hsu just said about it: "3. I could be described as a China hawk in that I've been pointing to a US-China competition as unavoidable for over a decade. But I think I have more realistic views about what is happening in PRC than most China hawks. I also try to focus on simple descriptive analysis rather than getting distracted by normative midwit stuff."
Steve Hsu clarified some things on my thread about this discussion: https://x.com/hsu_steve/status/1861970671527510378
"Clarifications:
1. The mafia tendencies (careerist groups working together out of self-interest and not to advance science itself) are present in the West as well these days. In fact the term was first used in this way by Italian academics.
2. They're not against big breakthroughs in PRC, esp. obvious ones. The bureaucracy bases promotions, raises, etc. on metrics like publications in top journals, citations, ... However there are very obvious wins that they will go after in a coordinated way - including AI, semiconductors, new energy tech, etc.
3. I could be described as a China hawk in that I've been pointing to a US-China competition as unavoidable for over a decade. But I think I have more realistic views about what is happening in PRC than most China hawks. I also try to focus on simple descriptive analysis rather than getting distracted by normative midwit stuff.
4. There is coordinated planning btw govt and industry in PRC to stay at the frontier in AI/AGI/ASI. They are less susceptible to "visionaries" (ie grifters) so you'll find fewer doomers or singularitarians, etc. Certainly not in the top govt positions. The quiet confidence I mentioned extends to AI, not just semiconductors and other key technologies."
Gotcha, well I'm on it!
Interesting, do you have a link for that?
US companies are racing toward AGI, but the USG isn't. As someone else mentioned, Dylan Patel from SemiAnalysis does not think China is scale-pilled.
As mentioned in another reply, I'm planning to do a lot more research and interviews on this topic, especially with people who are more hawkish on China. I also think it's important that unsupported claims with large stakes get timely pushback, which is in tension with the type of information gathering you're recommending (which is also really important, TBC!).
Claiming that China as a country is racing toward AGI is not the same as observing that Chinese AI companies are fast-following US AI companies, which are explicitly trying to build AGI. This is a big distinction!
I don't think it's fair to say I made a bad prediction here.
Here's the full context of my quote: "The report clocks in at a cool 793 pages with 344 endnotes. Despite this length, there are only a handful of mentions of AGI, and all of them are in the sections recommending that the US race to build it.
In other words, there is no evidence in the report to support Helberg's claim that "China is racing towards AGI."
Nonetheless, his quote goes unchallenged into the 300-word Reuters story, which will be read far more than the 800-page document. It has the added gravitas of coming from one of the commissioners behind such a gargantuan report.
I’m not asserting that China is definitively NOT rushing to build AGI. But if there were solid evidence behind Helberg’s claim, why didn’t it make it into the report?"
Here's my tweet mentioning Gwern's comment. It's not clear that DeepSeek falsifies what Gwern said here:
V3 and R1 are impressive, but they didn't advance the absolute capabilities frontier. Maybe the capabilities/cost frontier, though we don't actually know how compute-efficient OAI, Anthropic, and GDM are.
I think this part of @gwern's comment doesn't hold up as well:
I still don't think DS is evidence that "China" is racing toward AGI. The US isn't racing toward AGI either. Some American companies are, with varying levels of support from the government. But there's a huge gap between that and Manhattan Project levels of direct govt investment, support, and control.
However, overall, I do think that DS has gotten the CCP more interested in AGI and changed the landscape a lot.