China won’t win the AI race but would it be much worse if it did?

Chastity Ruth

It seems to me to be accepted wisdom in the West that the US-owned labs must “beat” the Chinese labs in the race for AGI/ASI.

Even those who don’t think there will be a winner, that essentially the race is to see which country’s AI will kill/disempower us first, seem to believe that if there has to be a winner then better it be the US labs. (I haven't seen a survey, so I could be way off here.)

For those in power and those heavily financially invested in the labs this of course makes sense. But what about for the rest of us?

I had myself accepted the wisdom, in the nebulous way you consider propositions in domains where you don’t have any expertise. The Chinese Communist Party (CCP) is repressive, I thought. They are an authoritarian regime and therefore fundamentally a malign actor. Of course they mustn’t win.

Then I heard Victor Shih, an expert on the Chinese political system, talk on Dwarkesh’s podcast:

“For the Chinese government, they’re very afraid that some actor—outside, but even inside the Party—is going to use it [AI] as a tool to usurp the Party’s power. So they want to know that they have a way of stopping everything if it comes to it. For them, developing the brakes is just as important as developing the AI itself.”

He talks a little about how that might look:

“So what we’re going to see in terms of institutional development is not at the top end, but at the lower end. They will want to designate human beings in all the government agencies, in all the commercial entities that are using AI or AGI, to put their foot on the brake if it comes to it.”

For me, this flipped the conventional wisdom on its head. When it comes to a rogue ASI, perversely, the less you think of the CCP and its preoccupation with holding onto power, the more you can trust it in this domain.

It seems natural for a regime like the CCP to conclude the creation of ‘a country of geniuses in a datacentre’ is a threat, specifically the threat of an invasion.

So, if the world we’re in is one where rogue ASI is far more likely than a corrigible one, is it possible that we should want China to take the lead in the race?

Honestly, my feelings say “shit no”. But I’m very aware that I’ve been inculcated in a media and cultural environment that says, in its kindest form, be suspicious of non-Western states.

I’m hardly the first person to think about this, but this is me in my little corner of the world trying to test whether my feelings hold as I collect the relevant evidence.

Who is more willing to slow AI development?

If you have any fear of a rogue ASI you’re no doubt aware of the idea of slowing or pausing AI development in order to ensure we develop the technology safely. In 2023 there was a much-talked-about letter, signed by many luminaries, calling for a six-month halt on developing any AI system more powerful than GPT-4.

The labs ignored it. In response, Sam Altman suggested that OpenAI was not planning on training GPT-5 “for some time”. (Given the likely training time for Orion, which became GPT4.5, this seems to be in line with Altman’s reputation for misleading statements).

Since then there have been no indications that any kind of pause is likely. Indeed, earlier this year Anthropic dropped its pledge to not train an AI system unless it could guarantee its safety measures were adequate. From the TIME article:

“We felt that it wouldn't actually help anyone for us to stop training AI models,” Anthropic’s chief science officer Jared Kaplan told TIME in an exclusive interview. “We didn't really feel, with the rapid advance of AI, that it made sense for us to make unilateral commitments … if competitors are blazing ahead.”

By all the metrics I’m aware of, except perhaps ‘best open model’, and the cost and efficiency of behind-the-frontier models, Chinese labs are behind in the AI race. Given this, you would think that the CCP would be doing everything in its power to remove brakes on these labs. You cannot afford any delay if you’re behind.

Instead China is refusing NVIDIA H200 chips, despite a ban being lifted. They are putting the national priority of native chip production ahead of doing everything possible to win the race.

Okay, so perhaps they (mistakenly or not) have longer timelines and believe that in the long run they will need native chip production to win. Under this explanation, the CCP is not as AGI-pilled as the US labs.

What then should we make of news that the Chinese government is publicising rulings that protect workers from AI? From the NYT article:

“An arbitration panel ruled in favor of a map data collector whose entire department was laid off and replaced with artificial intelligence. The panel found that the company’s adoption of A.I. was a voluntary move to remain competitive and did not warrant the employee’s firing.
Companies that benefit from technology must, at the same time, adopt “social responsibilities” and protect worker rights, the panel ruled.”

Whatever else this is, it’s also a functional brake on revenue for AI companies. If you’re not allowed to fire workers after AI takes their job, the AI is a net cost for the company. Unless the efficiency gains are tremendous, at least some companies will hold off on AI investment so as to not pay for both labour and the technology that was supposed to replace it.

In the USA companies all but celebrate firing workers and replacing them with AI, with no reaction from the government. This is wonderful for investors in the AI labs and encourages even more investment. Correlatively, in 2025 private AI investment in the US was $285.9 billion compared to $12.4 billion in China.

(You could argue China’s government funding for AI means they care less about private investment, but the highest figure for government funding I could find was $56 billion for 2025, so even if you combine public and private financing their funding lags the US by more than half – they could use the private investment.)
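As a rough check on that comparison, the arithmetic can be sketched in a few lines of Python. The variable names are my own and the inputs are only the figures quoted above; note that it sets China's combined public and private funding against US private investment alone, so counting any US public funding would only widen the gap.

```python
# Figures as quoted in the article, in billions of US dollars for 2025.
US_PRIVATE = 285.9        # US private AI investment
CHINA_PRIVATE = 12.4      # Chinese private AI investment
CHINA_GOVERNMENT = 56.0   # highest Chinese government funding figure cited

# Combine China's public and private funding, then compare with US private alone.
china_combined = CHINA_PRIVATE + CHINA_GOVERNMENT
ratio = china_combined / US_PRIVATE
shortfall = US_PRIVATE - china_combined

print(f"China combined: ${china_combined:.1f}B")
print(f"Share of US private investment: {ratio:.1%}")
print(f"Gap: ${shortfall:.1f}B")
```

On these numbers China's combined funding comes to a little under a quarter of US private investment, which is consistent with "lags the US by more than half".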

Look, I’m sure there are nuances in China that make them more sensitive to this issue of displacement and perhaps, from the outside looking in, those court rulings are less meaningful than they appear. Still it’s hard to not reach the conclusion that China is behind in the AI race but is nevertheless more willing to slow development in favour of other national priorities.

Who is more likely to hit the brakes before a rogue ASI breakout?

Currently those worried about extinction in the West are handwringing over the possibility that a private company will iterate its way to strong self-replicating intelligence. From there we’re quickly on the path to superintelligence.

In the AI 2027 narrative, the US government only acts once a whistleblower leaks a misalignment memo.

“A frantic energy has seized the White House. Even before the memo and public backlash, they were getting nervous: Over the past year, they’ve been repeatedly surprised by the speed of AI progress. Things that sound like science fiction keep happening in real life. Many people in the administration are uncertain (and scared) about what comes next.”
“[They] expand their contract with OpenBrain to set up an “Oversight Committee,” a joint management committee of company and government representatives, with several government employees included alongside company leadership. The White House considers replacing the CEO with someone they trust, but backs off after intense employee protests. They announce to the public that OpenBrain was previously out of control, but that the government has established much-needed oversight.”

The scenario then forks from there. In the “race” scenario, the government essentially acts as an accelerant to the formation of a rogue superintelligence. In the more optimistic “slowdown” scenario the oversight committee votes to increase alignment supervision.

I think AI 2027 does a good job here of capturing the US government: reluctant to interfere in private enterprise, out of its depth and a bit clueless on a technical level, and filled with energy only at the moment of a public outcry.

But, as stated above, what AI 2027 thinks is optimistic for the US is close to the status quo in China. The government is already deeply involved in, and closely watching, AI development with an eye to preventing anyone or anything from disrupting their hegemony.

The motivation is deplorable, but is the end result more likely to prevent a rogue AI?

This is a more complicated question than I have the expertise to answer, but I’ll try and cover what seem to be the main considerations. (One huge caveat: the nature of a sufficiently capable rogue AI is such that it won’t be stopped.)

Firstly, there is potentially a huge difference between democratically elected officials overseeing private companies and a one-party state doing the same.

In both systems the willingness of officials to actually curb AI development when faced with a risk is based on their own beliefs about the dangers and their model of the public’s beliefs. In both the former is going to be informed by experts and weighed against the geopolitical rivalry. As in, they will accept more risks if it looks like the other superpower is doing the same.

In a democracy the officials’ models of the public’s beliefs are shaped by elections. Someone who wanted to secure more votes based on anti-AI sentiment would be eager to pause AI, even if they personally felt the risks weren’t so bad. Someone who thought rapid AI deployment would secure them another term would behave in the opposite manner.

At least, that’s what would happen in theory. But I’m a bit thrown. In poll after poll, Americans have made clear that they are AI sceptics and don’t want datacentres in their area. Yet while bills have been considered that might meaningfully benefit AI safety, none have passed.

It could be that in the future this dynamic changes. Given the polling, I’d even say this is probable. It does seem like AI sentiment is such that there will at some point be the bipartisan consensus required to overcome US political dysfunction.

But there are two tensions here. Less importantly, there is already a bipartisan consensus that “beating” China is important. There will be friction between that and a new anti-AI consensus.

More worryingly, no one has a perfect model of when a rogue AI will become dangerous. So it could be that political officials only feel electorally empowered to give government officials an intimate oversight role over model development after a sufficiently intelligent model has already successfully avoided detection during its breakout.

In an authoritarian state officials’ models of the public’s beliefs tend to be less accurate, but I think the court rulings above show that they believe the public is concerned about AI.

As for the officials’ own model, the CCP seemingly plans to place officials in all relevant government agencies and private companies in order to monitor the emergence of an incorrigible AI, if they haven’t already.

The question is, will that be effective? Chernobyl springs to mind. My base model of what happens in authoritarian regimes is that such officials are ineffectual. The pressure to win the AI race means that if one of them saw an issue they would ignore it or give it a superficial fix. No one wants to report a problem and be labeled a troublemaker.

But is my base model jingoistic? Not entirely: there are quite a few examples of the Chinese state punishing people for speaking out about true problems.

There is Li Wenliang, a doctor who posted to a group chat about COVID before it was officially acknowledged and was forced to sign a police document admitting he had broken a law by spreading false rumours. His reprimand was later withdrawn.
In 2024, according to The Guardian, economist Zhu Hengpeng was disappeared and not seen in public after he allegedly “made disparaging remarks about China’s economy, and potentially about the Chinese leader specifically, in a private WeChat group.”

This does suggest that any concerned official or AI lab employee would have to communicate privately to the CCP. A whistleblower who informed the public, such as the one depicted in AI 2027, would likely be punished.

You’d have to speak “up” not “out”. So you’re relying on the whistleblower going through the right channel and that channel actually being open (that the CCP is genuinely interested in stopping, and capable of recognising, problematic model behaviour).

The advantage of the US system appears to be a greater ability to be transparent, in particular for a concerned person in the know to blow the whistle publicly. The advantage of the Chinese system is that it is already signalling both a readiness and a framework to stop rogue AI (whether this is from a desire to do good or a desire to preserve power might be irrelevant).

In the current situation, neither state seems very well prepared for a rogue AI. But the CCP is perhaps more cynically clear-eyed about it than the White House and Congress.

Who is developing models more responsibly?

I’ve asked Claude and Gemini various questions while writing this article to sound out perspectives I might be missing. One thing both return to is the idea that the CCP’s requirement that models adhere to government censorship could cause those models to become more skilled at deception.

The theory is that by training models to construct false narratives around politically sensitive topics, Chinese labs are rewarding lying in such a way that the models will become more capable liars. This enhanced skill will become crucial if and when the models develop their own internal rogue goals that they try to act on.

This theory might prove true in the long run, but I can’t find any evidence for it right now. Some papers maybe show that Chinese models have a greater propensity to lie, but that doesn’t hold in the most recent one I found, which compared open models. There the lowest deception rate belonged to glm-5 (14.5%), while the only US models tested, GPT-OSS-120B and GPT-OSS-20B, had much higher deception rates: 46.77% and 58.65% respectively.

A paper that does seem to test for capability at deception compared older models playing LieCraft, a cooperative multi-agent game with hidden roles. Claude 3.7 was better at deception than GPT-4o, which was better than all the Chinese models (Qwen 2.5 was third). So at least for those dated LLMs, censorship conferred no improvements to deception skills.

I can find no paper that suggests Chinese models are more talented at deception, or have talent at deception disproportionate to their general ability. This suggests that, at least so far, training in censorship doesn’t correlate with developing deceptive skills.

Taking a step back, I would say that all training of AIs encourages facades. Anthropic’s Persona Selection Model research says that after pre-training alone, models can already roleplay as an assistant and that “post-training can be viewed as refining and fleshing out this Assistant persona… but not fundamentally changing its nature… After post-training, the Assistant is still an enacted human-like persona, just a more tailored one”.

If this is an accurate description of what is happening, isn’t all post-training a form of training a model to sound – to be – the way you’d like it to? Aren’t we shaping the models’ truths and therefore their relationship to truth?

Chinese labs require their models to talk in a specific way about historically sensitive issues. Not only do US labs require the same (their models are appropriately taught to frame a genocide sensitively, and a historical controversy with care given to both sides), but all models are shaped until the developers are satisfied a release will make them proud, not embarrassed.

A wholly separate claim against Chinese AI labs is that they don’t put the same time and care into safety as the US AI community does. This does seem true. Surely (hopefully!) in the long run it’s better for labs to care about and report on safety, so I do think that the US is doing better here.

Nevertheless, I don’t think there’s any compelling evidence that Chinese models are, in a real-world sense, less safe because of this gap. (Though arguably they’re less safe because more of them are open models.)

For example, I couldn’t find any Chinese model that had the broad negative psychological impact that OpenAI’s GPT-4o did (and did find evidence that the CCP is trying to prevent this with regulations more stringent than anything in the West). On a sycophancy benchmark, Chinese models don’t seem to be uniquely poor performers. I’m also unaware of data that shows Chinese models are used more often for malicious deeds (hacking, spam, misinformation, etc.), but that could very well be the case, given they’re the most powerful open source models.

Who is using models more responsibly?

Okay, so that’s rogue AI as a risk. What about a corrigible ASI (if such a thing is even possible)? Who should we want to win if the winner secures the world’s most powerful intelligence?

A major claim against the Chinese state is that they are already using generative AI in an authoritarian way, and so they certainly cannot be trusted with more powerful forms. From an Anthropic paper:

“We focus on the CCP as it is the regime that is most able to use frontier AI to cement authoritarianism; we do not seek to undermine the interests or ingenuity of the Chinese people. Already, the CCP is using AI to censor speech, repress dissidents, hack governments and corporations across the world, and strengthen the People’s Liberation Army (PLA).”

To argue that this is a reason why China shouldn’t win the AI race, you’d have to show that the US is being much more responsible and measured in its use of AI. But the evidence doesn’t bear this out.

Claude was used in the Maduro raid, which was widely held to be against international law.
Claude was also used in one of the more significant crimes under international law, a war of aggression. It apparently helped select targets. It’s altogether possible it was involved on some level in the Minab school attack.
It strains credulity to suggest that the US government is not using LLMs to assist with its efforts to hack rival countries and companies it finds problematic.
The US government famously got into a fight with Anthropic about a contract’s language regarding mass surveillance and autonomous weapons, and subsequently got better terms on those issues from OpenAI and Google.

Thus it would be accurate to write: “The White House is using AI to help breach international law, assist in illegal wars, hack governments and corporations across the world, and strengthen the US military. It has also tried to harm a US company that had legitimate AI safety concerns and has demonstrated openness to using AI for autonomous weapons and mass surveillance”.

I don’t doubt there are those who think “the current administration is the issue, a future administration could be trusted”. To them I’d say:

It’s very plausible the AI race will be all but won before Trump’s term is up.
There’s no guarantee his administration won’t be replaced by a worse one.
The US’ problematic domestic and foreign policy didn’t begin with Trump: see everything from MK Ultra and the bombing of Cambodia to the Snowden disclosures and the Iraq war.

If this were the only factor under discussion, I suppose as a citizen of a US ally I am probably happier for the US to “win the AI race”, but I can see how a great number of the world’s citizens would feel differently.

According to a Gallup poll, both nations have a negative net global approval rating.

China’s net approval is -1 compared to the USA’s -15
China’s leadership approval is 36 vs 31
32% of countries are “weakly aligned” to China while only 8% are to the U.S.

Such a poll suggests that most people of the world would not be comfortable with either superpower winning the AI race, which seems appropriate to me.

Who is more likely to lead us to paradise?

A major promise of AI is that it will lead us to an era of abundance. I suppose for much the same reasons as outlined in the above section I’d prefer the US labs to be the ones to manage this, but I do think that maybe I’m just picking the better of two bad options.

Because one way to determine how a country would distribute the abundance it will have in the future is to look at how it chooses to distribute the abundance it has now. The USA is the world’s largest economy and has been for decades, and yet:

Its national child poverty rate is the seventh worst among OECD nations (it should be said, China has a worse rate)
The top 1% in the USA hold 40.5% of the national wealth

Given this precedent, you’d either have to be pretty gullible, or expect something about the formation of an ASI to drastically change the philosophies of the most powerful people in the US, to believe that the abundance from a US-created ASI will be shared equitably. Far more likely is a scenario many have contemplated: AI will help lock in an oligarchy.

Perhaps the US oligarchy would be preferable in a number of ways to the Chinese one. A less bad hell is less bad, but it’s still hell.

Race dynamics

Overall, I have come away with my feelings changed. I am much more ambivalent about who should “win” the AI race.

I do think it’s worth repeating that the most troubling aspect of the race isn’t about who may or may not win; it’s the dynamics a race framework imposes on the technology’s development. Because, unless something unforeseen happens, China is not winning the AI race. Even if there were further relaxation of chip restrictions, they are too far behind in chips, and the benefits of recursive improvement are likely to lock in that position.

You could make the argument that chip controls preventing China from being at the forefront of AI advancement are good because they act as a mild brake on global advancement. In other words, the US labs would be going even faster if China was closer to the frontier.

I do see some merit in this argument. As AI 2027 highlights, geopolitical rivalry could result in increased government funding and support for US labs, or even a scenario where the government eases the path of grid expansion and datacentre construction and forces the private labs to share their resources and knowledge.

On the other hand, it’s fairly obvious the US companies are pretty far ahead of China when it comes to the frontier of AI research. From the outside, it seems China’s ability to stay anywhere close to the frontier is dependent on distilling US models?

If that’s true, then we’re in a weird situation where China’s advancement depends on US advancement, but is nevertheless used as a reason to lock in a race dynamic. It’s a little like hammering the accelerator on your tow truck so that the car you’re hauling doesn’t catch up to you.

This dynamic might not last; maybe China will solve its chip problem faster than expected (maybe they’ll detach from the truck and race ahead). But as long as it lasts and the US labs don’t slow down, I think China is more of a foil than anything else.

So I come away believing that China won’t win the race, but it wouldn’t be much worse if they did. I do wish people would simply abandon using the race to justify anything, but that will never happen. The situation is terrible.

I'm surprised you don't give more consideration to the possibility that the CCP will realize they can use AI to enforce party loyalty and automate ideological conformity and that AI might be able to do this better than humans, although maybe I misunderstand the true goals of the CCP as it currently exists.

You make a good point. I think I'm a little hand-wavey in this area, essentially agreeing with Anthropic's framing that China would use (and is using) AI in awful ways without articulating specific examples.

I had an earlier draft of this article that was heading towards extrapolating all current behaviour of both China and the US. It went into persecution of ethnic minorities such as the Uyghurs, for example. I also imagined how the US could use AI to automate its current sanctions regime, automatically slapping them on people for acts, activism, and opinions deemed unacceptable. Essentially, couldn't they scale up what they've done with ICC officials and Francesca Albanese so as to make holding anti-US-government positions financially untenable?

I felt it was getting unwieldy. What is a reasonable extrapolation vs an unreasonable one? So in the end I kept it broad. But a more complete article would do that work.

People are having different arguments because they weight various concerns differently (and sometimes also because they're mind-killed).

Many correctly recognise that the current Chinese state (and probably also the CCP regime) is mildly more authoritarian than the current US state. (Some note that the current US regime raises questions, but please don't confuse the regime with the state, especially in a context where regime changes are frequent, as in the US.)

Likely correctly, many also note that even contemporary, and certainly near-future AI, could be quite effective for concentrating and entrenching power, especially through scaled surveillance (mostly via automated analysis) and scaled coercion (e.g. maximally via weaponised drones and robots, more minimally through existing state policing, permissioning, credit, and so on).

Some fear that China and other authoritarian states may go down this path. Some fear that the US and other somewhat authoritarian states may go down this path (either triggered by the existing state/regime or in a coup of some kind). This could be bad directly, and also bad because it's probably not a good setting from which to encounter later technological transitions and/or a singularity, which might come soon ^[1].

Separately, some people think that AGI and/or superintelligence would yield a decisive strategic advantage (DSA) i.e. unassailable and uncounterable ability to project force and will (note that I think this is tremendously underexamined, and most people are utterly vibing when they talk about this). Or even short of that, just some large military advantage ^[2] that might give an imperialist power more confidence in grabbing territory, perhaps even world domination. If CCP does this, the argument goes, the world gets more authoritarian probably. (If another does this, does it get less?)

Taking most of the above together, and choosing parameters and credences somewhat carefully, that can give some kind of (I think sloppy!) argument that, absent coordination, racing is justified. The entirely missing mood is that this racing is tremendously regrettable and we should do what we can to coordinate out of it. If you rule out DSA but still think there's gradual economic, military, and industrial advantage (I think this is something like the 'mainstream automation-pilled perspective' held by many in government and many investors), that looks like even more reason to race.

Separately again, many people think that AGI and beyond will be very difficult to control, leading to either acute or gradual disempowerment/destruction. That tends to point people toward the frame that racing is terrible: it's just competing to be the one with the privilege of killing everyone (including oneself). (Others still think that would be good actually, for misanthropic or other reasons.) This is maybe the central LW perspective?

Others think the cat is mostly out of the bag, motivating coordination for pacing, and unilateral adaptation and defence work, while aiming to get coordinated and secure enough to delay, prevent (or somehow survive?) a transition far past AGI.

^[1] There's some case that could be made in favour of a more locked-down society in the run up to a singularity - I think some versions of the superalignment kind of strategy are implicitly imagining this, in fact; rarely spoken out loud.
^[2] My main candidate for this is plain old industrial capacity, with something like 'better combined ISTAR' as a close second. But your enemy can still MADly nuke you in those worlds.

This seems like a pretty good summary of various perspectives. One small thing:

- "Some note that the current US regime raises questions, but please don't confuse the regime with the state, especially in a context where regime changes are frequent, as in the US"

Point taken, but I would draw attention to the relative continuity in US foreign policy between administrations (with notable exceptions). There's a reason Ben Rhodes talked about hating the blob. One recent example: both Biden and Trump stood firmly behind Israel even as the global outcry got ever louder. My joke is that if you trained an ASI on US foreign policy over the last 80 years its first act would be to express deranged hatred for Cuba.

I'm curious, where do you stand/which concerns do you give the most weight to? Feel free to link to an article of yours.

Do you think that remark is where the downvotes are coming from? :L

Authoritarianism is mostly a domestic question rather than foreign so I don't know if that has much bearing. (Though I'd note that foreign policy has also experienced what looks like some movement recently.)

The articles linked in the original comment are three of mine (plus one Drexler). I'd say I'm concerned about all of catastrophic misuse, power entrenchment, gradual disempowerment, and (at some point, probably after we face some amount of those challenges) acute disempowerment/extinction.

unless something unforeseen happens, China is not winning the AI race

the weird thing is that current US regime has been relaxing chip controls which seem like the main constraint and leverage point.

Separately, because I think industrial automation might be where most of the geostrategic wins from AI come from, I actually think China has a more reasonable shot at that than you're giving credit to. This 'win' looks less decisive and permanent to me than the imagined 'superintelligence (or its wielder) takes over the world in short order' but it might ultimately be where balance of relative global power comes from (which might affect future use of space resources and the way that subsequent tech and societal progress plays out).

"the weird thing is that current US regime has been relaxing chip controls which seem like the main constraint and leverage point"

They have been, but as noted in the article so far China is saying "no". The relaxation is also for H200s, so a generation (and I think soon to be two generations?) behind.

You make a good point with industrial automation. I suppose my belief is that should it achieve a certain level of AGI, the US will find the political will to industrialise quickly (it already has the wealth to do so, though there are bottlenecks in certain resources as I understand it). The broad technical know-how that's one of China's advantages in this area would be overcome by the AGI's advice. You could be right though, the nature of this discussion is we're all speculating a lot.

Thanks for bringing this up. I've been thinking similar thoughts recently, and this seems to be valence-related (yay democracy, boo dictatorship or something).

With a big if (assuming that AGI is controllable at all), if the first to reach AGI is the CCP, it could use it to enforce its power structure forever.

But then for some reason, people think the US getting their hands on AGI would somehow mean that "democracy" or something is in control rather than whoever is in power at that specific point enforcing their power structure forever? It's not like democracies can never turn into uni-polar power structures, and it seems doubly naive to think someone with a magic AGI wand who's already selected for large amounts of power-seeking behavior (democratic presidents as well as CCP leaders) would not use it to cement their own power.

There is some discussion of this topic in this thread

Thank you. I wish I'd seen this before I wrote the article; there are a few things I would have drawn on.

"(Given the likely training time for Orion, which became GPT4.5, this seems to be in line with Altman’s reputation for misleading statements)."

I don't think this holds: there were just under 2 years between the letter and the 4.5 release, so I doubt a 6-month pause would have affected training timelines for 4.5 directly.

I'm glad you brought this up, because it is a little difficult to be certain. Here's how I came to my conclusion.

In ~April 2023, Altman said that they wouldn't be training GPT-5 for some time. Then in December 2024 the Wall Street Journal said that Orion, which would become GPT-4.5, had "been in the works for more than 18 months". Counting back 18 months from December 2024 would put the start of training around June 2023.

So when Altman said they won't be training GPT-5 "for some time" it's plausible he meant ~60 days or that he didn't consider Orion/GPT-4.5 the same thing as GPT-5 (again, it's hard to tell what OpenAI's internal designations were for all their training runs). But Altman's comments were made in response to the letter, which asks for a pause on training anything more powerful than GPT-4. Hence I think his statement is misleading.

From what I can tell the failure of Orion/GPT-4.5 to demonstrate impressive gains was crucial for OpenAI's pivot into the thinking model O1-O4 series. So if they had paused for six months that would have meant a training run would have begun in October 2023, and that failure and pivot would have happened later.

The broader point is that the US labs have never slowed down and have instead looked for ways to accelerate. From their perspective that's rational, but the article is about how China does seem to have some priorities that will cause them to put a functional pause on AI acceleration.

Hi, this is a wonderful article and a great read! I compete in College Policy Debate and would like to use this article as a summary of many key arguments in the field. Along with portions of the article, I have to submit a citation, including author qualifications. Is there a website of an organization, or some other place, where I could find your background in the field? If you would rather tell me here, that would also work. I appreciate any way you can help, this is wonderful stuff.

Hello there, I enjoyed this read a lot! I would like to reflect on this part:

"I can find no paper that suggests Chinese models are more talented at deception, or have talent at deception disproportionate to their general ability. This suggests that, at least so far, training in censorship doesn’t correlate with developing deceptive skills." [sorry, I am new, I am unsure how to directly quote the text]

There are still many unknowns about alignment, but an important thing for the CPC is that AI models need to be, at least in appearance, ideologically aligned. Considering that the models would be trained with data and resources primarily adopting a Chinese perspective, the models might not be systematically censored or trained to be deceptive; rather, they might be relaying information as they learned it, or from the point of view they were taught to adopt.

As a metaphor, the word "propaganda" has an extremely negative connotation in the Western world, while it is seen as a common good in China, falling into the spectrum of "spreading information" and "raising awareness".

Perhaps the models are not lying, nor as deceptive as one might expect.

Thanks so much for reading and glad you enjoyed it!

There are a few papers on this topic if you're interested. In this one the researchers audited DeepSeek and found that "[politically] sensitive content often appears within the model's internal reasoning but is omitted or rephrased in the final output".

Given the huge amounts of data models are trained on, it's not possible for them to avoid Western perspectives on sensitive topics. Consequently, I do think your latter explanation (that models are relaying information "from the point of view they were taught to adopt") is closer to the truth.

I agree with your larger point that simply stating the models are lying or deceptive is a little obtuse. If I ask DeepSeek about the Tiananmen Square protests, it will tell me it doesn't have "verified" information about them. If I push, it will say something like "Sorry, that's beyond my current scope. Let’s talk about something else."

This seems to me in line with how US models will refuse to tell you how to cook meth. It's not that they don't know; it's that a variety of techniques have been applied to prevent the information they do know from coming out. (I imagine their chain of thought, if we could see it, would look similar to DeepSeek's.) It is censorship, not deception.

To your point about culturally bounded ideas of "good", if you ask a US model "based on the evidence, do you think Israel is committing/did commit a genocide against the Palestinian people" it will reliably give you a both-sides perspective. For many people (on both sides) that is galling and tantamount to propaganda/slander.

So again your phrasing rings true: models relay information "from the point of view they were taught to adopt."

Sure, a corrigible CCP ASI may not be worse than a nationalized US ASI, under the current administration.

I still feel that the next US administration is less bad in expectation than Trump, and I'd also take my chances with AI lab CEOs over the CCP.

I guess I don't really see a world where we get near ASI and the AI lab CEOs are able to prevent nationalisation, barring either ASI takeover or some pretty wild insurrection scenario.

The analogy people use: it's private companies that are developing the first nuclear bomb. Does the US government not take charge over the whole process as they get close?

Good point! Not sure why I didn't realize this myself.