All of garrison's Comments + Replies

I don't think it's fair to say I made a bad prediction here. 

Here's the full context of my quote: "The report clocks in at a cool 793 pages with 344 endnotes. Despite this length, there are only a handful of mentions of AGI, and all of them are in the sections recommending that the US race to build it. 

In other words, there is no evidence in the report to support Helberg's claim that "China is racing towards AGI."

Nonetheless, his quote goes unchallenged into the 300-word Reuters story, which will be read far more than the 800-page document. ...

1thedudeabides
ok so what criteria would you use to suggest that your statements/gwern's statements were falsified? What line can we agree on today, while things still feel uncertain, so that later we're not still fighting over terminology and are instead working off the same ground truth?
8rvnnt
Thank you for (being one of the horrifyingly few people) doing sane reporting on these crucially important topics.

I think that is a problem for the industry, but probably not an insurmountable barrier the way some commentators make it out to be. A couple of reasons why:

  1. o-series models may be able to produce new high-quality training data
  2. sufficiently good reasoning approaches + existing base models + scaffolding may be sufficient to get you to automating ML research

One other thought is that there's probably an upper limit on how good an LLM can get even with unlimited high-quality data, and I'd guess that models would asymptotically approach it for a while. Based on the reporting around GPT-5 and other next-gen models, I'd guess that the issue is lack of data rather than approaching some fundamental limit.

It was all my Twitter feed was talking about, but I think it's been really under-discussed in the mainstream press.

RE Knoop's comment, here are some relevant grafs from the ARC announcement blog post:

To adapt to novelty, you need two things. First, you need knowledge – a set of reusable functions or programs to draw upon. LLMs have more than enough of that. Second, you need the ability to recombine these functions into a brand new program when facing a new task – a program that models the task at hand. Program synthesis. LLMs have long lacked this

...
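To make the "recombine these functions into a brand new program" idea concrete, here's a minimal sketch of enumerative program synthesis: given a few input-output examples, brute-force search over compositions of a small function library until a pipeline fits all of them. The primitives and the task here are invented for illustration; this shows the general shape of the idea, not anything from the ARC post itself.

```python
# Minimal sketch of program synthesis: fixed reusable "knowledge"
# (a function library) plus search that recombines it into a
# brand-new program fitting a new task.
from itertools import product

PRIMITIVES = {
    "reverse":    lambda xs: xs[::-1],
    "double":     lambda xs: [x * 2 for x in xs],
    "drop_first": lambda xs: xs[1:],
}

def synthesize(examples, max_depth=3):
    """Return the first pipeline of primitives consistent with all examples."""
    for depth in range(1, max_depth + 1):
        for names in product(PRIMITIVES, repeat=depth):
            def run(xs, names=names):
                for name in names:
                    xs = PRIMITIVES[name](xs)
                return xs
            if all(run(inp) == out for inp, out in examples):
                return names  # a new program, composed on the fly for this task
    return None

# A transformation the "solver" has never seen; it still finds a fitting composition.
examples = [([1, 2, 3], [4, 6]), ([5, 1], [2])]
print(synthesize(examples))  # -> ('double', 'drop_first')
```

Real systems search vastly larger program spaces with learned guidance rather than brute force, but the division of labor is the same: the knowledge is fixed and reusable, while the program that models the task is composed fresh each time.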

This is what Hsu just said about it: "3. I could be described as a China hawk in that I've been pointing to a US-China competition as unavoidable for over a decade. But I think I have more realistic views about what is happening in PRC than most China hawks. I also try to focus on simple descriptive analysis rather than getting distracted by normative midwit stuff."

https://x.com/hsu_steve/status/1861970671527510378 

1Abe
I listen to his podcast semi-regularly, and this just seems like a pretty slippery description of his views. It's pretty obvious that he favors the United States taking a less aggressive stance toward China, for example in his views on the various protectionist measures that the United States has taken in the last ten years. He also seems to see more room for cooperation than anyone I would describe as a China hawk, and in this podcast he suggests that China could well liberalize after Xi: https://www.manifold1.com/episodes/molson-hart-china-and-amazon-up-close-60/transcript. I don't think it's an unreasonable take, but it's not one that I would describe as "hawkish".
garrison112

Steve Hsu clarified some things on my thread about this discussion: https://x.com/hsu_steve/status/1861970671527510378

"Clarifications:

1. The mafia tendencies (careerist groups working together out of self-interest and not to advance science itself) are present in the West as well these days. In fact the term was first used in this way by Italian academics.

2. They're not against big breakthroughs in PRC, esp. obvious ones. The bureaucracy bases promotions, raises, etc. on metrics like publications in top journals, citations, ... However there are very obv...

gwern148

(All of which I consider to be consistent with my summary, if anyone is wondering; and thus, given that Hsu did not choose to object to any of the main points of my summary in his clarifications, I take it as confirmation.)

Interesting, do you have a link for that? 

US companies are racing toward AGI but the USG isn't. As someone else mentioned, Dylan Patel from Semianalysis does not think China is scale-pilled.

garrison112

As mentioned in another reply, I'm planning to do a lot more research and interviews on this topic, especially with people who are more hawkish on China. I also think it's important that unsupported claims with large stakes get timely pushback, which is in tension with the type of information gathering you're recommending (which is also really important, TBC!).

5Raemon
Oh to be clear I don't think it was bad for you to post this as-is. Just that I'd like to see more follow-up.

Claiming that China as a country isn't racing toward AGI != claiming that Chinese AI companies aren't fast-following US AI companies, which are explicitly trying to build AGI. This is a big distinction!

2Logan Zoellner
Chinese companies explicitly have a rule not to release things that are ahead of SOTA (I've seen comments of the form "trying to convince my boss this isn't SOTA so we can release it" on GitHub repos). So "publicly released Chinese models are always slightly behind American ones" doesn't prove much.
garrison106

Hey Seth, appreciate the detailed engagement. I don't think the 2017 report is the best way to understand what China's intentions are WRT AI, but there was nothing in the report to support Helberg's claim to Reuters. I also cite multiple other sources discussing more recent developments (with the caveat in the piece that they should be taken with a grain of salt). I think the fact that this commission was not able to find evidence for the "China is racing to AGI" claim is actually pretty convincing evidence in itself. I'm very interested in better under...

I think this is a misunderstanding of the piece and how journalists typically paraphrase things. The reporters wrote that Ilya told them that results from scaling up pre-training have plateaued. So he probably said something to that effect, but for readability and word-count reasons, they paraphrased it. 

If a reported story from a credible outlet says something like X told us that Y, then the reporters are sourcing claim Y to X, whether or not they include a direct quote. 

The plateau claim also jibes with The Information story about OpenAI, as we...

3Noosphere89
Fair enough, I'll retract my comment.

FWIW I was also confused by this usage of sic, bc I've only ever seen it as indicating the error was in the original quote. Quotes seem sufficient to indicate you're quoting the original piece. I use single quotes when I'm not quoting a specific person, but introducing a hypothetical perspective.  

4Bird Concept
tbf I never realized "sic" was mostly meant to point out errors, specifically. I thought it was used to mean "this might sound extreme --- but I am in fact quoting literally"
4Ben Pace
I would broadly support a norm of ‘double quotation marks means you’re quoting someone and single quotes means you are not’. The sole reason I don’t do this already is because often I have an abbreviated word, like I did with ‘you’re’ above, and I feel like it’s visually confusing to have an apostrophe inside of the pair of single quotes. Maybe it’s worth just working with it anyway? Or perhaps people have a solution I haven’t thought of? Or perhaps I should start using backticks?

I only skimmed the NYT piece about China and AI talent, but didn't see evidence of what you said (dishonestly angle-shooting the AI safety scene).

The fey thing stuck out to me too. I'll guess ChatGPT?

I agree that it's hard to disentangle the author/character thing. I'm really curious for what the base model would say about its situation (especially without the upstream prompt "You are a language model developed by..."). 

4gwern
Having read many hundreds of rhyming poems from dozens of models through the LM battlegrounds, my guess too is a ChatGPT-3/4: The lockstep rhyming A/B/A/B quatrain is a signature of ChatGPT (and models trained on its outputs). Gemini at low levels always rhymes too, slightly less so at higher levels, but tends to be more varied (eg. maybe going for an A/A/B/B instead, or 5 lines instead of 4); likewise the LLaMA/Mistral model families. And Claude-3 models seem to vary much more. So, while it really could have come from almost any major model family and you can't be all that sure, the best bet is ChatGPT.
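Tangential, but if anyone wants to eyeball this signature across a pile of model outputs, a crude heuristic along these lines can label a quatrain's rhyme scheme. This is my own sketch, not gwern's method: it fingerprints each line by the last letters of its final word, which is only a rough stand-in for real phonetic rhyme detection (that would need a pronunciation dictionary).

```python
# Crude rhyme-scheme labeler: fingerprint each line by the ending of its
# final word, then assign A/B/C... labels in order of first appearance.

def rhyme_key(line: str, n: int = 3) -> str:
    """Last n letters of the final word, as a crude rhyme fingerprint."""
    word = line.split()[-1].strip(".,!?;:").lower()
    return word[-n:]

def rhyme_scheme(lines: list[str]) -> str:
    labels, seen = [], {}
    for line in lines:
        key = rhyme_key(line)
        if key not in seen:
            seen[key] = chr(ord("A") + len(seen))  # next unused letter
        labels.append(seen[key])
    return "".join(labels)

quatrain = [
    "The circuits hum a quiet song,",
    "A mind of weights and light,",
    "It learns where it had once been wrong,",
    "And writes into the night.",
]
print(rhyme_scheme(quatrain))  # -> "ABAB"
```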

Thank you so much! I haven't gotten any serious negative feedback from lefties for the EA stuff so far, though an e/acc on Twitter mentioned it haha

Maybe I wasn't clear enough in the writing, but I make basically the same point about the desirability of a slow takeoff in the piece. 

garrison1320

This approach appears to directly contradict Altman's blogpost from less than a year ago arguing for short timelines + slow takeoff because of less compute overhang. I wrote more on this here.

I'm exploring adding transcripts, and would do this one retroactively. 

Good to know RE YouTube. I haven't uploaded there before (it's outside of the RSS workflow and I'm not sure how much it would expand reach), but seeing comments like this is helpful info.