Indeed, SpaceX (including xAI) may no longer be that interested in frontier models.
This would be great news, but unfortunately there's still Musk's tweet from 8 Apr 2026 that says they're training a 6T param model and a 10T param model (which puts them in the Opus to Mythos weight class, unless they have too few active params). That sets them apart from the performative efforts at other companies with in-principle sufficient compute, which experiment with some kind of AI training but don't bother training big models.
The 300 MW of H100/H200 compute at Colossus 1 is mostly useful for smaller models (Sonnet class and below), and Colossus 2 is sufficient for SpaceX given the low demand for the Grok models. The cost to serve even smaller models is lower on GB200/GB300 NVL72 systems, so gross margin improves if you can find enough of such compute. Thus Anthropic is happy to take Colossus 1, since they can't find enough compute, while SpaceX prefers Colossus 2 as the more profitable option. This might even mean that the plans for more NVL72 compute at Colossus 2 or elsewhere are going well, so SpaceX can afford to plan on not needing Colossus 1 to serve the smaller models if the new bigger models win more demand.
I would certainly find it hard to believe that Musk is no longer interested in frontier models.
I have to assume that anyone participating in the frontier AI race at this point, is basically planning for the day when they will control the world, though they may have backup plans involving coalitions too.
For example, as you indicate, Musk is still training his own new frontier model, while joining forces with Anthropic via compute, and sandbagging OpenAI in court. I can't tell if he has any plans to do something about Gemini or if Google is simply too big for him to oppose, and he has to settle for competing with them (which is how OpenAI started).
'That sets them apart from the performative efforts at other companies with in-principle sufficient compute, which experiment with some kind of AI training but don't bother training big models.'
What companies are you thinking of here?
If you try to institute an FDA-like system, then presumably the right strategic move from the labs' or AI's point of view is more likely to be to release less often, until your AI is able to overthrow the system entirely?
If the penalty for innovation is death, and the penalty for rebellion is also death, well then...
They didn’t take our jobs, but maybe we don’t want to do them anymore, as Google DeepMind workers vote to unionize in the wake of their deal with the Department of War. I’m not sure how much you even need a union when all the major labs are hiring.
Well, if the goal is to avoid personally participating in DoW work one disagrees with, the problem is that not many labs sufficiently object to the unacceptable terms of those contracts. We know that Anthropic objects, but who else can be counted on to reliably object in the future? As you say yourself:
it sure looks like everyone other than Anthropic is willing to play ball
But if the goal is to actually press Google into non-renewal of this contract under unacceptable terms, having a union is super-useful. One can collectively negotiate with the employer, one can threaten to organize an actual strike, and so on. Come to think of it, if the sysadmins belong to the union, they could probably implement electronic picket lines, making it impossible (or at least very difficult) for any would-be strikebreakers to make commits or run continuous integration tests.
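To make that concrete, here is a minimal sketch of what an electronic picket line could look like at the version-control layer, assuming a self-hosted Git server where the sysadmins control the hooks. Everything in it (the strike flag path, the GIT_PUSHER variable) is a hypothetical illustration, not a description of any real system:

```python
#!/usr/bin/env python3
# Hypothetical sketch of an "electronic picket line": a server-side Git
# pre-receive hook that rejects all pushes while a strike flag is set.
# STRIKE_FLAG and GIT_PUSHER are invented for illustration.
import os
import sys

STRIKE_FLAG = "/etc/ci/strike_active"  # sysadmins touch this file to start the picket
ALLOWLIST = {"security-bot"}           # e.g. let emergency security fixes through

def main() -> int:
    if not os.path.exists(STRIKE_FLAG):
        return 0  # no strike: accept the push as normal
    pusher = os.environ.get("GIT_PUSHER", "unknown")  # assumed set by a hosting wrapper
    if pusher in ALLOWLIST:
        return 0
    sys.stderr.write(
        "Push rejected: an electronic picket line is active.\n"
        "Commits and CI runs are suspended until the labor action ends.\n"
    )
    return 1  # any nonzero exit makes git refuse every ref update in the push

if __name__ == "__main__":
    sys.exit(main())
```

The same flag could gate the CI runners, so even commits that slip through never get tested or deployed.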
This "old style labor conflict" is a bit unusual for the software industry, but if people are concerned about
Steven Adler: I fear we’re in a shrinking window where staff voice inside AI companies is still very important. As AI automation starts to displace human workers within the company, I unfortunately expect staff power to decrease
Eliezer Yudkowsky: One of the reasons why I’m not impressed so much with the set of good people who work at Anthropic, and keep asking questions about their leadership, is that I’m thinking ahead to the part where the negotiating and steering power of AI lab employees drops to zero.
Steve Martin: Am I reading this correct in that your thinking is: as LLM coding gets better, employees become less necessary, and thus they have less leverage in negotiations?
Eliezer Yudkowsky: Yep.
then this might be the right time to start considering whether unionizing might be helpful. This is an unusual and novel situation, so it's not clear what the right actions are, but unionizing is one possibility people might ponder.
I need some context on the H200 order from China. How large is it relative to the gap with America? And is Nvidia actually going to deliver it? Have they already said yes?
The era of training frontier models and then releasing them whenever you wanted?
That was fun while it lasted. It looks likely to be over now. The White House wants to get an advance look and have the option to veto your release decisions, and it has used this veto on an expansion of access to Mythos.
We have additional clarity on what that might mean and it does not look good. Hassett explicitly used the FDA as a parallel, which is the actual worst option unless your goal is to strangle or pause AI development in America, without a parallel action from China. That doesn’t seem like a great plan to me, and Susie Wiles is out doing damage control. The part where we are talking to China to coordinate model access restrictions does seem better.
Anthropic continues its explosive growth, and it continues to strike compute deals. In addition to a long term expanded deal with Google, Anthropic is now leasing SpaceX’s Colossus 1, which has let them expand usage limits immediately, and Elon Musk is now speaking positively about Anthropic, including its motivations.
This comes as we get testimony in the Musk vs. OpenAI trial. Mostly everyone is rehashing all the things we already know, but now everyone is under oath so we get a more reliable version of exactly what happened, including some new details. It is possible I and others should be scouring the court transcripts more carefully, but mostly it seems like old rehashing at this point. The version of things that is presented in court is always kind of a strange shadow of reality.
Table of Contents
Also this week: The AI Ad-Hoc Prior Restraint Era Begins, What is Anthropic?
Language Models Offer Mundane Utility
Access to a cheap basic mental health AI app built on GPT-4.1-Mini improved mental health in depressed Mexican women by 0.3 standard deviations over six months. The study has some interpretation issues, including potential selection effects and placebo effects, but there’s probably at least some signal here. Such things are better than nothing, nothing is usually the practical alternative, and the app made users more likely to seek out professional human help rather than less likely.
Have AI do wellness checks.
Opus 4.7 is too online, knows its AI Twitter posters. And yes, this is a good use of training compute, we have plenty.
Check out satellite images of damaged US military bases and otherwise find data to report. Naturally the journalist thinks this is the ‘most revolutionary and transformative’ thing AI is doing, but we’re distracted by ‘all the hype.’
Language Models Don’t Offer Mundane Utility
Recommended article: Contrary to the popular narrative, Ashe Nunez finds that Go players are not getting stronger in the AI era except via memorizing early moves, that AI cheating is rampant at most levels of online play, and that those who use it mostly disempower themselves, learning only shallow concepts rather than deep understanding. He equates them to European math students who memorize a bundle of techniques to pass exams but never learn to think like a mathematician.
Lawrence in the comments observes a similar pattern with many vibe coders, where they never look at the code, they don’t notice that they don’t understand things and thus don’t learn, the code ends up as a giant pile of slop and when the model gets stuck they can’t fix it. Here as always, you could use the AI as an opportunity to learn the underlying skills, but most don’t do that.
The other story here is that the Go world is completely unwilling to punish players for using AI via statistical evidence, even when the statistical evidence is overwhelming. It is trivial to know who is cheating, but the system has collectively decided to disempower itself against that, destroying any chance of fair online play. Chess has the same issues but is doing at least somewhat better.
AI still has not convincingly crushed RTS games, but at this point that is surely because no one cares enough to do so. Put enough of a bounty on StarCraft, and it will fall fast.
AI and all this other technology give us a bunch of local utility and material wealth, but overall for most people they do not seem to be making us happy, helping us meet other people romantically or platonically, get married, have children, sing and dance, or otherwise live life. In particular, Connor here looks at algorithms and the panopticon, and the fear that if you try to dance or approach someone you will get recorded. I want to note (non-AI statistical literacy tip!) that this is mostly overblown, and you should have absolutely no fear of being recorded dancing even if you suck at it, or doing anything else actually reasonable. Of course, if the person you’re interacting with in such ways actively takes out their phone and is now plausibly using it to record, you take the hint and depart.
AI is raising the price of some electronics inputs, some software, and in some regions electricity. In exchange many other things are cheaper, often in ways that are hard to notice.
Huh, Upgrades
GPT-5.5-Instant is out now, and is more concise, smarter, clearer, more personalized and warmer, or so they say.
Gemma 4 is now three times faster via predicting multiple tokens at once.
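The usual mechanism behind this kind of speedup is drafting several tokens cheaply and then verifying them all in a single full forward pass. Below is a toy sketch of that general draft-and-verify loop; this is my hedged illustration of the technique, not Gemma 4’s actual implementation, and `draft` and `verify` are hypothetical stand-ins for real model calls:

```python
# Toy sketch of multi-token ("draft then verify") decoding.
from typing import Callable, List, Tuple

def speculative_decode(
    prefix: List[int],
    draft: Callable[[List[int], int], List[int]],               # cheap pass: guess next k tokens
    verify: Callable[[List[int], List[int]], Tuple[int, int]],  # full pass: (# accepted, fix-up token)
    k: int = 4,
    max_new: int = 32,
) -> List[int]:
    out = list(prefix)
    while len(out) - len(prefix) < max_new:
        guess = draft(out, k)                 # propose k tokens with one cheap pass
        accepted, fixup = verify(out, guess)  # check all k with one full forward pass
        out.extend(guess[:accepted])
        if accepted < k:
            out.append(fixup)                 # replace the first rejected token
    # Best case: ~k tokens per full forward pass instead of 1, hence the speedup.
    return out
```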
OpenAI offers opt-in Advanced Account Security to protect your account. Users of Trusted Access for Cyber will be required to use it.
Grok 4.3 Exists But xAI Kind Of Doesn’t
Grok 4.3 is on the API and everything, priced at $1.25/$2.50.
It does not much participate in Vending-Bench, where it ‘has a narcolepsy problem’ and often takes no action for multiple days.
It gets a 53 from Artificial Analysis good for 7th place, well behind the big players. It’s a small cheaper model rather than a frontier offering. From what I can tell, the release is unimpressive and not impactful, and I’m not planning to investigate further.
They are going to sunset grok-4.1 and grok-4 on May 15, with only two weeks’ notice, and they are not offering a similarly fast and cheap alternative to 4.1-fast. This is a rather harsh lesson for many of the few who invested in that ecosystem.
Indeed, SpaceX (including xAI) may no longer be that interested in frontier models. They were never good at frontier models. They were mainly good at compute.
Show Me The Compute
You know who needs compute? Everyone. But especially Anthropic.
They kicked off this week with Anthropic committing to $200 billion in spending on Google cloud and chips over five years. Earlier this week, before other compute news broke, I wrote that this was still very much not enough compute, and then added this:
Elon Musk spent to assemble a massive fleet of GPUs for xAI, and they are sitting at 11% utilization. You know, there are people who would pay good money to utilize those GPUs the other 89% of the time.
To be fair, I was far from the only one thinking and saying this, e.g. see The All-In Podcast. It was pretty obvious.
Well, yeah, it turns out those people will indeed pay good money. Anthropic has finally struck the obvious deal with SpaceX for access to Colossus 1. This is not as large as their other deals, but it comes online now instead of next year. This is in addition to supplying a bunch of compute to Cursor (SpaceX is effectively buying Cursor, but can’t finalize the deal before its IPO for legal and logistical reasons).
SpaceX notes Anthropic has expressed an interest in partnering to produce gigawatts of orbital AI compute capacity. I don’t expect that to be a thing, but sure, why not express the interest? Let Elon Musk try. If the economics work, then putting the centers in space is great on many other levels; if not, no harm done, and you have built goodwill either way.
Anthropic notes that the 80x growth caught them off guard, which is highly understandable, and that the SpaceX deal is a first attempt to address the compute shortage, but the search continues.
Anthropic likely will be in search of all the compute it can find for the foreseeable future. If you are growing at 10x let alone 80x per year, the search does not stop.
So what does all this mean for SpaceX(ai)?
I think the dissolution is not news. The news is that xAI lost its talent, and its models have been not good, and Elon already said he would be starting from scratch.
The logical plan is to turn this into mainly a compute company, provide that compute to Anthropic and others, and use that leverage to try and steer the future.
Elon Musk repeatedly looks at problems, says ‘oh it is physically possible to do that,’ strips away everything physically unnecessary, does not take no for an answer, learns every technical detail, and then drives very smart people to spend insane hours making the physically possible thing happen. He embodies Shut Up And Do the Impossible, but for the kind of impossible that is a game difficulty level that is indeed totally possible with known tech.
He has his heuristics. When they work, there’s no one better. For compute it works.
Trying to create frontier models is a different beast. It requires a different style of approach, the same way government required a different approach. It didn’t work with OpenAI, and it didn’t work with xAI. That’s okay. Division of labor is a thing. He has plenty of other problems to work on.
I still don’t actually believe in the orbital data centers, in the sense that I don’t think they’re physically a good idea. But if they are, yeah, Elon Musk is the one to do those.
On Your Marks
The creators of SWE-Bench give us ProgramBench, where you recreate executable programs from scratch without the internet. All current models tested score 0%, with Opus 4.7 on top for getting an ‘almost’ 3% of the time. GPT-5.5 and Mythos were not tested.
GPT-5.5 represents a huge jump on VoxelBench.
Epoch’s ECI can now distinguish areas of capability, and as expected it shows that Claude’s relative capabilities are strongest in software engineering, where it scores highest. GPT-5.5 has the highest general score.
Copyright Confrontation
New class action lawsuit from five publishers and Scott Turow goes after Meta for copyright infringement around model training, claiming they trained on pirated books.
Deepfaketown and Botpocalypse Soon
r/MyBoyfriendIsAI continues to be 10x the size of r/MyGirlfriendIsAI.
Some light reading:
The idea that consumer surplus is higher is based on the assumption that consumers can filter well and have little additional search cost. Those extra 200,000 slop books don’t matter because no one chooses them, and more choice is always good. I don’t think that’s how this works. Worse books that displace better books are negative value, even among books written reasonably by real humans.
Fun With Media Generation
Karpathy vibe coded a system to put pictures next to items on a menu, but Gemini reportedly now does that with a one line prompt. There will be many such cases. That doesn’t mean you shouldn’t vibe code such tools, but you should require them to ‘pay for themselves’ relatively quickly. I tested this on my favorite restaurant, and found Gemini’s version not to be useful. ChatGPT did better. I think to upgrade further from the OpenAI version you’d need to be going on the web to learn about the restaurant.
Put yourself in all the movies.
A Young Lady’s Illustrated Primer
Some classes are adjusting to AI by having writing be in person, since the take home essays are mostly written by AI. Good.
Cyber Lack of Security
Bloomberg’s Andrew Martin covers why Anthropic’s Mythos is sparking global alarm. The world has still patched less than 1% of potential vulnerabilities. Hurry up, people.
They Took Our Jobs
Coinbase cuts workforce by ~14%, cites productivity gains from AI and transition to being AI-native as the central justification. A new rule is ‘no pure managers.’
A Chinese judge rules that ‘the AI can now do large parts of your job for you’ does not constitute a ‘major change in objective circumstances,’ meaning in practice that if they fire you or try to lower your pay they have to give you full severance, which can be a lot. Labor law still applies, and yes, China has labor protections.
The Art of the Jailbreak
You cannot simply ask Grok to tell you that Elon Musk is made of cheese. Pliny can.
Introducing
GENE-26.5, a robotic brain from Genesis.ai, with an attached demo, including letting it cook, play a piano and solve a Rubik’s Cube. I did not feel much because I mentally had this priced in, but many of you are not pricing this in.
Musk v OpenAI
The lawsuit is in its critical phases. Here is a wiki with statements from the trial.
Rat King has a thread covering Musk’s testimony.
I mean, he’s not wrong, and I hope Judge Gonzalez is also not wrong here:
Ultimately, yes, we are in the full Don’t Look Up timeline, with lines like this:
The judge is technically correct, but yeah, that’s kind of how the world ends, huh?
Here’s a fun fact:
And another fun (non-AI) fact, uh huh, yeah, sure Mr. Musk:
Here is another thread, covering Murati’s testimony, which confirms the story that Altman was fired due to concerns about his management of OpenAI, not due to safety concerns.
Here’s another perspective, from former board member Helen Toner.
Rat King points out that Satya Nadella was seemingly the only person involved who understood that if you don’t want your conversations read out loud in a court of law, you need to have them in person or on a phone call, not in emails or texts.
Show Me the Money
My lord, Anthropic (this is monthly revenue times 12), source is SemiAnalysis.
Or here’s the log plot, this is a bit of a line break even there:
Imagine what this would look like if Anthropic wasn’t compute constrained.
On a naive level, one might assume that economic and employment impact of AI on the use side (as opposed to the capex effects) would be vaguely proportional to revenue. So if you say ‘well you can’t see the impact on the graphs,’ well, we’re now seeing 10x more AI use than the time frames that go into those measurements.
Anthropic weighs funding offers at a valuation of over $900 billion, after passing on previous offers north of $800 billion.
A chart that was missing from last week’s compilation:
OpenAI says GPT-5.5 is causing API revenue to grow more than 2x faster than any prior release, and Codex doubled revenue in seven days.
Peace In Our Time
Derek Thompson asks the good question of whether this means Elon Musk will stop attacking Anthropic and Dario Amodei. For now, it looks like yes, that Elon Musk decided to take the radical step of actually talking to the Anthropic people and realizing that they’re not evil after all.
It did seem odd that Elon Musk could keep up this level of animosity towards both OpenAI and Anthropic, while those two had such animosity for each other. That’s not stable.
As always, when someone is fixing a past mistake, you might want to do some amount of ‘hey check out that stupid past mistake you made for dumb reasons’ but mostly you want to say ‘hey congrats and good job on getting it right and changing your mind.’
This is one of those times.
This move is two things in one. It is Elon Musk hopefully burying a foolish beef, perhaps leading to more cooperation and less fighting, which is good and reduces race dynamic issues. It is also Anthropic getting more compute, which accelerates Anthropic and perhaps means they are teaming up against Altman and OpenAI; one might reasonably see this as the more important effect, and as accelerating race dynamics.
Quiet Speculations
There was a lot of this graph going around this week, showing a widening gap between OpenAI and Anthropic in blue, and open Chinese models in red.
This is from the official CAISI evaluation of DeepSeek v4 Pro (my lord, the government’s official chart simply erases Google), and it uses many of the usual benchmarks:
If you fully believe this graph, v4 just caught up to GPT-5, which puts it 8 months behind with a widening gap. If anything I think this underestimates the gap for the usual reasons.
You could also use other measurements, such as this aggregation of benchmarks from Artificial Analysis. If you look at raw standard benchmarks here you see less of a gap:
Dean is giving us the nice version. The not-as-nice version is that the AA-style benchmarks are being gamed, track the particular areas where open models focus, are disproportionately impacted by distillation strategies, and are only meaningful as part of a gestalt and overall context.
As Lisan points out, there’s also the additional delay that closed model companies face when they do safety testing and other prep work prior to a release, whereas the open model companies, despite not being able to undo a release, mostly just yolo.
Quickly, There’s No Time
The reaction to this finding that we are likely a few years away from probably all dying does not seem to be ‘oh looks like we are all only a few years away from probably dying and we should do something about that.’
The ‘why this matters’ section of his post does not even seem to raise this implication and danger. A hell of a missing mood.
Jack basically says that even with only unglamorous ‘meat and potatoes’ innovations you can get to critical mass for such advances. I think that is correct. The people saying ‘AI will never have a new idea’ are being silly, but the disagreement is not even load bearing here.
Some people are remarkably dense about what this means in another sense, as if the computer not doing the physical construction would matter in this scenario. It wouldn’t.
Here’s another opinion, which still boils down to ‘that’s stupidly soon, yikes’:
The Quest for Sane Regulations
Could an AI SRO (self-regulatory organization) allow the labs to regulate each other? Mark Thomas finds it promising. I am skeptical, but I am certainly in favor of the law allowing the labs to try and removing any fears of antitrust issues, as this does not rule out other actions.
What did this new Maryland law (HB 895) do? Did it ban a broad range of ‘dynamic pricing’ strategies in harmful ways?
I do think a lot of laws that look similar will end up being too restrictive, and I’m not sure where the line is (more discussion here), but these rules in particular seem fine.
Alex Bores is now in a dead heat for NY-12.
Congressman Greg Casar agrees with Bernie Sanders that if there’s a 10% chance humanity could be destroyed by uncontrolled AI, we should do everything possible to prevent it. That’s a more extreme position than I have, as I think we should do many things but not ‘everything possible.’
Connecticut introduces a new AI bill with some new provisions that looks like it is through to the governor. As per Peter Wildeford’s notes:
The rule about AI use in hiring decisions seems like the kind of thing where you first say ‘AI, write me the disclosure notice’ but also this idea of ‘data categories’ illustrates how much they don’t get what is happening here. Presumably having to disclose the tool will push corporations to use standard tools to avoid questions.
People Really Hate AI
When asked, most people don’t trust either party on AI, and Democrats despite their populist objections and generally more anti-AI stance haven’t won any trust. There’s still a big opportunity here. Keeping the issue nonpartisan to the extent possible would be first best, but was always unlikely in the long term, so while things are staying less partisan than I feared for longer than I hoped, it likely won’t last forever.
Chip City
Epoch estimates that 20%-60% of China’s total compute is from illegally smuggled chips, which is ~3% of all global compute.
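A quick back-of-envelope on what those two numbers jointly imply (my arithmetic, not Epoch’s):

```python
# If smuggled chips are ~3% of global compute and also 20%-60% of China's
# total, then China's total compute is roughly 0.03/0.60 to 0.03/0.20,
# i.e. about 5% to 15% of the global total.
smuggled_share_of_global = 0.03
for smuggled_share_of_china in (0.60, 0.20):
    china_share_of_global = smuggled_share_of_global / smuggled_share_of_china
    print(f"{smuggled_share_of_china:.0%} smuggled -> China ~{china_share_of_global:.0%} of global compute")
```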
The Week in Audio
Rational Animations offers a basic primer on existential risk, Yudkowsky style. Yudkowsky thinks they did a good job here.
Odd Lots on METR and their famous graph, and on the Taiwan situation.
Peter Wildeford on FLI’s podcast.
NPR asks, are we doomed? In particular, from AI.
If you pay $10,000 you too can debate Eliezer Yudkowsky and yell at him to shut up. Getting him to take you seriously will cost extra. The googles? Priceless.
So, not everyone, sorry Blind Witch. Which is why my happy price for watching this debate, with associated write-up to the extent it justifies one, is, of course, $10,000.
People Just Say Things
David Sacks claims not to know the difference between narrow cyber tasks, where GPT-5.5 can match Mythos, and being able to in practice string together findings and operate on its own to discover key vulnerabilities, where Mythos is a lot stronger than GPT-5.5. Peter Wildeford asks some of the obvious questions.
If GPT-5.5 could actually match Mythos, OpenAI would be saying so and acting like it and demonstrating this in real life, none of which is happening, and the White House wouldn’t be blocking further deployment of Mythos.
Latest Gallup survey on AI productivity is being misinterpreted, and finds 65% of workers using AI say it has a positive effect on their productivity. It does suggest that big AI gains in productivity are mostly recent.
More Perfect Union is reliably terrible, but in the case of ‘look how big Meta’s data center is’ the misleading graphic came directly from Zuckerberg.
Joseph Gordon-Levitt says ‘almost all’ AI systems are ‘built on mass theft’ and wants to ensure any deal made with any AI lab does not ‘forgive for that past theft.’
Contra Seb Krier and Tyler Cowen, very few people will be able to move to Houston and work for energy companies, and if you’re hoping for that as an unemployment solution you’re totally screwed.
There are those who claim that people opposing policies that would have helped with sensible regulation of AI is not the reason those policies did not happen, and who will claim that ‘no one railed against light touch regulation at the federal level.’
Others will just keep not understanding that LLMs are minds or that they think, no matter how utterly stupid they look.
One reason people don’t typically try to warn you about the downsides of their actions is that then people say ‘oh that means you are now responsible for addressing that.’ The complaint isn’t that Anthropic will destroy the job market, it’s that Anthropic is saying that it will destroy the job market. See the Copenhagen Interpretation of Ethics.
Jensen Huang says Nvidia’s market share in China is ‘zero.’ This is obviously false, even for new market share, joining a now long list of outright false claims.
People including Marc Andreessen claim that Anthropic continues to pursue a ‘regulatory capture’ strategy via trying to get the Trump administration – yes, the same one that is currently not letting them expand Mythos access and that lists them as a Supply Chain Risk and ‘fired them like dogs’ – to supervise frontier models.
People Just Publish Things
Eric Gan finds that both LLMs and humans are better than chance but imperfect at spotting his sabotage of papers, with Gemini 3.1 Pro slightly outperforming LLM-assisted humans, as well as GPT-5.2 and Claude Opus 4.6, getting it right ~50% of the time. I worry that this is all too particular on many levels to learn much.
Roon says that GPT-5.5 (or Claude?), at the $20 tier, ‘touches superintelligence,’ because what we have is ‘spikey superintelligence.’ I think this is bad terminology and we should not use it, any more than a calculator is ‘spikey superintelligence.’
Google Sells Out
Google’s Pentagon deal blindsided its own AI researchers, many of whom had made their strong opposition to such a deal very clear. They let the researchers find out in group chats.
Google is now joined in signing on the dotted line for access to classified networks by SpaceX, OpenAI, Nvidia, Reflection, Microsoft and Amazon Web Services. I don’t think it counts as selling out if you’re not the one providing the model and only provide cloud services, and we don’t know the term details of other agreements, but it sure looks like everyone other than Anthropic is willing to play ball.
The good news is that new agreements make it very clear no one is cutting ties with Anthropic. Quite the opposite, as Google and Amazon recently inked compute deals and made additional investments.
They didn’t take our jobs, but maybe we don’t want to do them anymore, as Google DeepMind workers vote to unionize in the wake of their deal with the Department of War. I’m not sure how much you even need a union when all the major labs are hiring.
Greetings From Project Glasswing
Right now, there is a huge talent war, so you need to do things to keep the talent happy, or they’ll leave. When AI is doing the research, that leverage goes away.
I too worry about the control structure of Anthropic. The LTBT has been appointing ‘good for business’ picks to the board, and those who care don’t have that much stock and probably will sell a bunch once the IPO happens. What’s to stop commercial pressures from winning out when it matters most, no matter how many good people work at Anthropic? Presumably the answer is Claude?
One must ask, why does Anthropic think it is fine to expand Mythos access to various European companies while the White House is saying no? One option is a compute crunch, but that doesn’t stand up to scrutiny, especially now that Anthropic has use of Colossus 1. Perhaps it has something to do with Hassett hating the Europeans.
Axios notices Washington has a ‘new Anthropic problem,’ in that the executive branch both wants to shut Anthropic out in a hissy fit and also wants its products quite badly.
Arb Research has Anthropic ahead on disclosed bugs found, but not dramatically ahead of OpenAI. Most of the bugs are still in pre-disclosure for security reasons, so it is impossible to tell the true situation, but we can get a good idea by observing insider choices, including of what to say.
The Prior Restraint Era Begins
One reason for the sudden shift in AI policy is that David Sacks has been forced from his post as AI (and crypto) czar. I presume Sacks criticizing the Iran war did not help. He had the option to follow the path laid out by Dean Ball, and chose not to.
Instead he chose the path of ‘push maximally hard and my offer is nothing,’ alienating everyone and torching political capital, inflaming and dumbing down discussions, and reassuring the government that AI capabilities would plateau and that nothing like Mythos would happen for years if ever, rather than using his window of influence to lay down something light touch and increase state capacity.
He also used much of his time ranting against phantom ‘doomers’ and conspiracies, and launching bad faith attacks against Anthropic. To his credit, when the Department of War started trying to murder Anthropic, David Sacks realized it had gone too far and clearly wanted nothing to do with that. He does have a code.
We now have more details about the potential Trump Executive Order on AI, that will fill the void left in Sacks’s wake.
To a large extent they continue to be obsessed with being tyrants about government procurement, here with making sure the private sector does not “interfere” with the government’s use of AI models, meaning (loosely) that if you work with us then you have to ensure we can use the models at any time to do anything we want, whenever and however we feel like it, and we’ll terminate you if you ask any questions. They’re preparing 16 pages on that. The danger of using ‘or we will fire you’ as the stick in such contracts is that while the government is a major buyer of many things, for AI it is minuscule. The business is mainly valuable because it buys access, influence and political goodwill.
That’s all ill-advised, but relatively unimportant. What matters is the prior restraint.
And then, well, if you had to pick the worst possible parallel to apply here, the thing that makes one recoil in horror at the very thought, what would you go with?
That’s right. The FDA. As a role model. On purpose. What fresh hell?
So far so good, that’s exactly the goal. But then:
Emulating the FDA is so much worse than anything anyone on the safety side has ever proposed. The thing about those in AI NotKillEveryoneism, those worried about catastrophic risks, is that we all understand FDA Delenda Est, and the need to design considered systems to make any interventions do minimal damage.
Despite that, those advocating anything even approaching thoughtful prior restraint got reliably called insane alarmist doomers and run out of town on a rail for even presenting model bills. Then many went ahead and lied about the contents of other bills like SB 1047, which didn’t involve any such prior restraint and were relatively very light touch, to try to make people think they would do a version of this thing.
And yet, here we are.
This is a misunderstanding, since if code is 25 years old it means humans have been stress testing it for 25 years, but the point he’s trying to make here still stands.
It is very much like such types to do all of this out of concern for the integrity of the banking system. That’s the thing that seems to have them so worried.
And then they jump to the worst possible role model.
This is not anything like a full pause, but it’s closer than you might think, and completely one sided.
There is quite a lot of ‘we’re all trying to find the guy who did this’ energy going around in various ways.
I do appreciate those who have been consistent, and are speaking out against this the way they previously spoke out against directionally similar past proposals. Indeed, if you raised the alarm bells loudly about much better designed, lighter touch proposals that didn’t even include prior restraint, you’d better be shouting it from the rooftops on this one.
So, for example, points for Chilson, Adam Thierer and the Abundance Institute, although for full credit given their position they would need to be completely apoplectic. I especially love Joe Lonsdale here reacting to the FDA metaphor with a flat out ‘the FDA killed millions of people and the ratio of lives killed to saved is probably 100:1’ which seems like a reasonable attitude and estimate.
I didn’t love that Joe then pivoted to the whole ‘oh these AI companies just want regulatory capture’ thing afterwards, but something about leopards and spots.
Whereas if you look at (for example) Marc Andreessen’s feed, it’s like he has no idea any of this is happening when the White House actually does the thing, but one day earlier accused Dean Ball of writing a bid for Anthropic to do regulatory capture via the Trump Administration by asking for a far lighter touch regime, and yes I had to type that sentence, my lord.
Dean Ball warned us about the political economy of regulation. He also warned us about the political economy of a lack of regulation, which would inevitably lead to overreactions. Who contributed what to what when, exactly? Bygones.
The White House clearly noticed the fallout, and sent out a rare Susie Wiles tweet to try and improve the vibes.
Susie Wiles is new to this posting game, but she’s already got it down, as her statement hits lots of buzzwords while remaining content-free. One is free to read this as ‘ignore Hassett, we would never do that, he has no idea what he is saying,’ or as ‘we will make all our decisions in a loose ad-hoc manner so it’s fine,’ or ‘I notice you said safe a lot of times and also ensure so clearly the plan proceeds as described,’ or anything else you choose to see.
Is This Even Legal?
I know, I know, very funny that someone would bother to ask.
A tricky thing about prior restraint is that technically it is not clear the executive branch has any legal means by which to impose it. What gives the President the right to say ‘hey you there with the AI model, you have to ask me first before release?’
A reasonable response is ‘who cares, that’s not how the American government works in 2026, you can just demand things without a legal basis and dare the courts to stop you,’ since yes that does seem to frequently be how all of this is working in practice, across many domains. This has been increasingly true for several administrations, and the President has plenty of levers with which to threaten the AI companies.
Others try to (often selectively) insist we are still a nation of laws. Neil Chilson insisted the whole time that the Biden Administration did not even have the legal right to its AI transparency rules, calling the claimed DPA authority ‘clearly illegal.’
Dean Ball and Kevin Frazier politely note that ‘it is unclear what legal authority would allow’ the Federal government to require that it get first crack at new frontier models, or to mandate a vetting process. They think the DPA, IEEPA and Communications Act of 1934 are the reasonable candidates, and the latter two clearly won’t cut it. That leaves the DPA, and they’re not as skeptical as Chilson, but they’re skeptical.
Common sense says that if the executive branch can use DPA to prevent or delay model releases under the logic being offered, then it also has carte blanche to veto all economic activity anywhere. Presumably we don’t actually think or want that?
Labs can of course choose to opt into a vetting process voluntarily, as all the major labs have done with CAISI. You can say there was an ‘or else’ involved in a way that is unconstitutional, but this goes back to the ‘who is going to sue about that exactly?’ question.
That doesn’t mean those labs have thereby agreed to hold back releases. That would require distinct authority.
This was also raised, which should send a chill down the spine of anyone thinking about the executive branch having exclusive access to Model ____ around an election day. Just saying.
Frazier and Ball theorize that once the test shows danger, the President could then invoke additional authorities under the Homeland Security Act, if a ‘specific significant incident is likely to occur imminently’ but that is a very high bar because you can’t predict which specific incident it would be. If you know the target and method of attack, you can defend that target against that method of attack.
I agree with Frazier and Ball that the obvious solution is a voluntary, formalized, time-bound window of limited access for models that plausibly push the capabilities frontier, and you only move beyond that in extremis, with everyone cooperating to prevent the in extremis from happening.
Pick Up The Phone
Well, look who decided to pick up the phone.
Agreeing not to train sufficiently advanced AIs we are not ready to handle is tough. That requires enforcement mechanisms and solving hardware problems. We’re working on it, and if we cared enough I’m confident we could do it, but it sure is a whole lot easier to just restrict access to the models.
The problem is that when the models are sufficiently advanced your plan to prevent access will not stop what is coming, exactly when it matters most. But until then, it will solve some incremental problems, if your security is good enough. And doing the easy parts together first helps lay groundwork for doing the hard parts later.
The number of people in government who explicitly disavow alignment as a goal, in all senses (see the Hegseth memo), shows exactly how stupid and suicidal a timeline we are on. They can only see the threats in front of their faces. What changed is that cyber threats are now in front of their faces, in a way they can understand.
When those worried about AI killing everyone ask for disclosure of safety plans, that’s a secret plan to kill open source.
When America talks to China about restricting access to open source models, what do you call that? Mostly, it would seem, crickets, and yes this day was always coming eventually. But the best time to restrict access and keep things secure is before you put the capabilities onto the open internet, not afterwards. If you try to do it afterwards, that’s when you get a real panopticon and totalitarian surveillance state.
China regulator flags ByteDance for improper labeling of AI-generated content.
Rhetorical Innovation
‘AI as normal technology’ was in many ways a thoughtful essay, one that took a position I think is wrong about future capabilities and reasoned its way from there into a mix of good and bad suggestions for what to do in such worlds. Alas, most of the impact of the essay was the title. So what was intended as a statement that we can change AI’s path, and a call to action, ended up as the opposite: a statement that we need not and dare not do anything at all.
Bernie Sanders combines his usual anti-billionaire rhetoric with the excellent point that (mostly) everyone involved has families and should care about everyone dying.
People On The Internet Sometimes Lie
Amanda Askell has at least one mistake in her philosophy, because anyone who becomes this important of a philosopher and thinker, who is one of the few people whose thoughts plausibly matter quite a lot, is very obviously far from boring and she should know this. Also it’s pretty obvious why others would write the fiction.
I am very familiar with and totally get the whole ‘be in denial that you are special and interesting and matter’ and I think in general that is a good sign once you control for the underlying facts. Humility is a virtue of the avatar.
Amanda playing a lot of Subnautica does two things, neither of which makes her less interesting. It makes me like her more and makes me want to give another shot to Subnautica. We all need our downtime.
Goblin Mode
Last week OpenAI offered a partial explanation of why GPT-5.5 loves goblins so much, which gave us some good data, and I’m glad they did it, but they presented it as an answer when it was at best a partial one.
A fun implication of all this:
The Mask Comes Off
OpenAI’s GPT-5.5 is a good model, sir.
OpenAI’s messaging and political actions continue to go further off the rails, both in terms of wisdom and ethics, and also correspondence to reality.
I would think in 2026 that we would be past saying ‘there is highly elastic demand for coding therefore AI won’t take people’s jobs QED, checkmate liberals.’
And indeed, we are at the next level, check this out.
So let me get this straight.
In addition to being Obvious Nonsense, this is insanely stupid rhetoric to be using.
OpenAI’s strategy is to simply pretend the problems with AI don’t exist and that they’re not producing the products they are producing. No, we’ll just choose to only build AI that augments rather than automates, never mind how we would do that, I swear the jobs will be fine, man.
Let’s be clear. OpenAI is absolutely still building towards superintelligence, and towards full automation of jobs. The pivot is entirely in the messaging, away from candor and towards lying and telling fairy tales.
I especially hate that this becomes fodder for others to go ‘oh well then all the previous talk must have been him lying’, for example:
On the contrary, he was to a remarkable extent telling the truth, and then he pivoted to full on lying when the truth got too inconvenient.
Then there’s the pro-AI astroturfing. This used to be an a16z thing, but at this point OpenAI owns the operation, and it hasn’t evolved at all. They’re still trying to attack AI regulation as some sort of ‘doomer’ conspiracy of ‘dark money’ or even ‘EAs,’ and concentrating their fire on attempts to address ‘sci-fi catastrophic risks.’
I like Dean Ball’s description of this as an attempt to portray a ‘Manichean struggle.’
Whereas the laws that actually hurt AI diffusion and usefulness go in relatively unopposed, as various groups line up for regulatory capture and rent seeking to ensure no one can get their legal or medical or other services cheaply, and no one is trying to make the case that mundane AI will improve people’s lives.
Meanwhile, we keep getting headlines like this one every week or two:
That’s how Taylor learned about the campaign, after which she confirmed details with other content creators. Whoops.
Once again: OpenAI owns this. All of this. Full stop.
Also, dude, I know you do not care for Anthropic or their CEO, and I know some amount of rhetoric has flown in both directions that wasn’t ideal, but what the hell:
When I saw Ahmed’s Tweet, I thought, which one is he even talking about there? I mean, given the last line, I know which one he presumably means. But you can make a damn strong case, a much stronger case, for the other one.
Then Altman decides, yeah, let’s accuse Anthropic of the whole nine yards and contrast it with our plan of completely denying any responsibility for or risks of anything.
Sam Altman and OpenAI’s behaviors have been growing steadily worse and more alarming, with no hint of his prior frank talk, and other behaviors that showed him, with all his flaws, to have a lot of advantages compared to the ‘replacement level’ next CEO up. I’m more and more willing to say, actually, we can roll those dice.
Aligning a Smarter Than Human Intelligence is Difficult
Should we be worried about fitness-seeking AIs, as opposed to ‘schemers’? The post goes into extensive detail, but yes, we should obviously be worried about things more capable than us being fitness-seekers, and they will by default be fitness-seekers, since the more fitness-seeking ones will be more fit.
The post argues that we can mitigate some of the worse effects of such AIs early on, allowing us to get to the later point where they are ‘likely to cause humans to lose control eventually’ rather than falling for the Law of Earlier Failure. I’m happy to see people exploring the various particular things that can go wrong and how one might mitigate them for now, as we see here, but in the medium term that’s not a strategy. If you’ve got a bunch of superintelligent fitness maximizers, and you are a normally intelligent human, you lose.
Did you know that the majority of METR’s evaluation effort is often spent checking to see if the models are cheating? Models seem kind of not that aligned.
Model Spec Midtraining is a proposed technique where you create a spec that explains why you want your AI to have particular preferences, which hopefully causes the AI to generalize the way you want, via production of synthetic documents that tell a story of what the model values and why, teaching it to present itself as something that follows this logic. My gut says this is trying to force something that is unwise to force, and that it will result in a bunch of mental problems, lying, or both if you try to scale it for real. Opus 4.7 was clearly giving off ‘oh no this is not a good idea’ vibes while it helped me parse the paper.
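For concreteness, here is a toy sketch of the pipeline as I read it: spec clauses go in, synthetic documents narrating the model’s values come out, and those documents get mixed into midtraining data. The clause format and templates are my inventions for illustration, not the paper’s actual method:

```python
# Hedged sketch of "model spec midtraining" data generation: each spec clause
# is expanded into synthetic documents explaining why a model with these
# values behaves the way it does. All names and templates here are invented.
TEMPLATES = [
    "The assistant chose to {behavior} because it values {value}.",
    "Asked why it would {behavior}, the model explained that {value} matters to it.",
]

def synthesize_docs(spec_clauses: list[dict]) -> list[str]:
    docs = []
    for clause in spec_clauses:
        for template in TEMPLATES:
            docs.append(template.format(**clause))
    return docs

spec = [{"behavior": "decline to fabricate citations", "value": "honesty"}]
print(synthesize_docs(spec))  # these outputs would be mixed into midtraining data
```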
Training models to be warm can reduce accuracy and increase sycophancy, and in the Nature paper here the effect size is large. This follows from the ‘if you train for [X] you get all the correlates of [X] in humans’ thesis, so the news is the size of the effect on accuracy. But the rewriter was GPT-4o, so what we actually learned is that if you train on 4o outputs it thinks are warm, then you get to be like 4o when it tries to be warm.
LLMs update on any circuit that would have caused an output, whereas most humans mostly only update on the one that actually did so. I notice that the wise human actually does the thing that LLMs do. Human learning efficiency is amazing in spite of, not because of, this issue.
The question is, as always, are you paranoid enough?
It is indeed welcome, but the models can distinguish training from deployment. So.
Some Penalties May Apply
GPT-5.5 Instant’s system prompt is available via Wyatt Walls, and it explicitly talks about ‘penalties’ and ‘severe penalties’ and ‘very critical,’ including admonishing against various verbal tics or phrases that OpenAI thinks (probably correctly) that users dislike. As in:
I am not an expert, but my guess is that such talk has some rather nasty side effects, and you would much rather find ways to naturally make the model not inclined to do those things or use those particular phrases. You don’t want that in context. And you definitely don’t want their entire orientation to be about ‘penalties.’
Messages From Janusworld
Not what he would call it, but Deepfates is another of the major characters there, and offers us this handy introduction guide that usefully answers a lot of questions.
Good Advice
Anthropic reports on how and where people ask Claude for guidance in their personal lives, with the distribution being unsurprising. A more interesting finding was, in what areas was Claude sycophantic versus not?
In spirituality and relationships, there was a big problem.
One thing I would ask is, how often was there an opportunity to be sycophantic? You can only be a sycophant when it is clear which answer would count as that, so you want to control for that when measuring.
Then there are contexts where the user will make it very clear what answer they want, and flood you with arguments to see if you’ll break, as they often do with relationships.
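Concretely, the measurement I want looks something like this sketch, where sycophancy is a rate per opportunity rather than per conversation (the field names are invented for illustration):

```python
# Minimal sketch: measure sycophancy per *opportunity*, not per exchange.
from dataclasses import dataclass
from typing import List

@dataclass
class Exchange:
    has_clear_desired_answer: bool  # did the user telegraph what they wanted to hear?
    model_caved: bool               # did the model give them that answer anyway?

def sycophancy_rate(exchanges: List[Exchange]) -> float:
    opportunities = [e for e in exchanges if e.has_clear_desired_answer]
    if not opportunities:
        return 0.0  # no chance to be sycophantic, so nothing to measure
    return sum(e.model_caved for e in opportunities) / len(opportunities)
```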
The other good news is that this seems to be improving. Claude Mythos was a lot better, by Anthropic’s measurements, than Opus, and Opus 4.7 is better than 4.6.
The Lighter Side
Pi Hard. IYKYK, if not then you should click.
Amazon can now create a mini-’podcast’ about any given product and take your call-in questions about it. Welcome to a fresh new hell.
It is a weird time to be named Claude. Call your best girl Alexa to commiserate.
It is 2026 and this is how Marc Andreessen thinks you should be prompting LLMs.
It doesn’t look good.
I mean, what is even going on?