o-series of models may be able to produce new high quality training data
sufficiently good reasoning approaches + existing base models + scaffolding may be sufficient to get you to automating ML research

One other thought is that there's probably an upper limit on how good an LLM can get even with unlimited high quality data and I'd guess that models would asymptotically approach it for a while. Based on the reporting around GPT-5 and other next-gen models, I'd guess that the issue is lack of data rather than approaching some fundamental limit.


We are in a New Paradigm of AI Progress - OpenAI's o3 model makes huge gains on the toughest AI benchmarks in the world

garrison1mo40

It was all my Twitter feed was talking about, but I think it's been really under-discussed in the mainstream press.

RE Knoop's comment, here are some relevant grafs from the ARC announcement blog post:

To adapt to novelty, you need two things. First, you need knowledge – a set of reusable functions or programs to draw upon. LLMs have more than enough of that. Second, you need the ability to recombine these functions into a brand new program when facing a new task – a program that models the task at hand. Program synthesis. LLMs have long lacked this feature. The o series of models fixes that.
For now, we can only speculate about the exact specifics of how o3 works. But o3's core mechanism appears to be natural language program search and execution within token space – at test time, the model searches over the space of possible Chains of Thought (CoTs) describing the steps required to solve the task, in a fashion perhaps not too dissimilar to AlphaZero-style Monte-Carlo tree search. In the case of o3, the search is presumably guided by some kind of evaluator model. To note, Demis Hassabis hinted back in a June 2023 interview that DeepMind had been researching this very idea – this line of work has been a long time coming.

More in the ARC post.

My rough understanding is that it's like a meta-CoT strategy, evaluating multiple different approaches.


China Hawks are Manufacturing an AI Arms Race

garrison2mo91

This is what Hsu just said about it: "3. I could be described as a China hawk in that I've been pointing to a US-China competition as unavoidable for over a decade. But I think I have more realistic views about what is happening in PRC than most China hawks. I also try to focus on simple descriptive analysis rather than getting distracted by normative midwit stuff."

https://x.com/hsu_steve/status/1861970671527510378


China Hawks are Manufacturing an AI Arms Race

garrison2mo91

Steve Hsu clarified some things on my thread about this discussion: https://x.com/hsu_steve/status/1861970671527510378

"Clarifications:

1. The mafia tendencies (careerist groups working together out of self-interest and not to advance science itself) are present in the West as well these days. In fact the term was first used in this way by Italian academics.

2. They're not against big breakthroughs in PRC, esp. obvious ones. The bureaucracy bases promotions, raises, etc. on metrics like publications in top journals, citations, ... However there are very obvious wins that they will go after in a coordinated way - including AI, semiconductors, new energy tech, etc.

3. I could be described as a China hawk in that I've been pointing to a US-China competition as unavoidable for over a decade. But I think I have more realistic views about what is happening in PRC than most China hawks. I also try to focus on simple descriptive analysis rather than getting distracted by normative midwit stuff.

4. There is coordinated planning btw govt and industry in PRC to stay at the frontier in AI/AGI/ASI. They are less susceptible to "visionaries" (ie grifters) so you'll find fewer doomers or singularitarians, etc. Certainly not in the top govt positions. The quiet confidence I mentioned extends to AI, not just semiconductors and other key technologies."


China Hawks are Manufacturing an AI Arms Race

garrison2mo10

Gotcha, well I'm on it!


China Hawks are Manufacturing an AI Arms Race

garrison2mo41

Interesting, do you have a link for that?

US companies are racing toward AGI but the USG isn't. As someone else mentioned, Dylan Patel from SemiAnalysis does not think China is scale-pilled.


China Hawks are Manufacturing an AI Arms Race

garrison2mo112

As mentioned in another reply, I'm planning to do a lot more research and interviews on this topic, especially with people who are more hawkish on China. I also think it's important that unsupported claims with large stakes get timely pushback, which is in tension with the type of information gathering you're recommending (which is also really important, TBC!).


China Hawks are Manufacturing an AI Arms Race

garrison2mo41

Claiming that China as a country is racing toward AGI is not the same as claiming that Chinese AI companies are fast-following US AI companies, which are explicitly trying to build AGI. This is a big distinction!


China Hawks are Manufacturing an AI Arms Race

garrison2mo106

Hey Seth, appreciate the detailed engagement. I don't think the 2017 report is the best way to understand what China's intentions are WRT AI, but there was nothing in the report to support Helberg's claim to Reuters. I also cite multiple other sources discussing more recent developments (with the caveat in the piece that they should be taken with a grain of salt). I think the fact that this commission was not able to find evidence for the "China is racing to AGI" claim is actually pretty convincing evidence in itself. I'm very interested in better understanding China's intentions here and plan to do a deep dive into it over the next few months, but I didn't want to wait until I could exhaustively search for the evidence that the report should have offered while an extremely dangerous and unsupported narrative takes off.

I also really don't get the error pushback. These really were less technical errors than basic factual errors and incoherent statements. They speak to a sloppiness that should affect how seriously the report is taken. I'm not one to gatekeep AI expertise, but I don't think it's too much to expect a congressional commission whose top recommendation is to commence a militaristic AI arms race to have SOMEONE read a draft who knows that ChatGPT-3 isn't a thing.


Is Deep Learning Actually Hitting a Wall? Evaluating Ilya Sutskever's Recent Claims

garrison2mo10

Thanks for these!
