the clustering at 4-level remains suspicious and worth pondering
It takes 10K H100s to train a 4-level model in a few months, after some months of tinkering, after the datacenter is built. A100s are worse: you need more of them and it takes longer, so OpenAI had the lead by being the only one who tried. The value of 4-level models only became legible in March 2023, and the stragglers have only just had the opportunity to catch up.
So even if hardware-in-datacenters was magically abundant, only OpenAI would've been ready to take advantage of it far enough in the past for a trained model at the next level of scale to already be here. Google and Anthropic would only now be training their 5-level models, but those models wouldn't be out yet. Meta and xAI would only just be starting the training or preparation for it.
In reality OpenAI might've been delayed with a 5-level model by lack of hardware, even if it was just waiting for a datacenter to get built and there were no relevant shortages, while the rest might be training on schedule (Google and Anthropic well in progress, xAI and Meta finishing up with final preparations). The scale for Anthropic/xAI/Meta might be lower for now, they might need to train over more months to get new capabilities out of it. But OpenAI might have their 100K training H100s already, and Google has the TPUs and possibly resolve to work on distributed training across multiple datacenter campuses.
Thus I expect models at the next level of scale (5e26-1e27 FLOPs) to be out in early 2025, possibly late 2024, first from OpenAI and Google, possibly also Anthropic, and then xAI and Meta (mid-2025). Musk promises a Grok-3 in a few months, but I don't think that much scale can get into it in time (it could get maybe 6x more FLOPs than Grok-2 if the latter was done in BF16 and they transition to FP8 while training 3 times longer).
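As a rough sanity check on these training-compute figures, here is a minimal back-of-the-envelope sketch. The per-GPU throughput numbers (~1e15 FLOP/s for an H100 in BF16, ~2e15 in FP8) and the ~40% utilization are my own illustrative assumptions, not claims from the comment above.

```python
# Back-of-the-envelope training compute. The per-GPU throughput figures and
# the ~40% utilization are illustrative assumptions, not numbers from the comment.

SECONDS_PER_MONTH = 30 * 24 * 3600

def training_flops(num_gpus: int, peak_flops_per_gpu: float,
                   utilization: float, months: float) -> float:
    """Total training FLOPs for a cluster running continuously."""
    return num_gpus * peak_flops_per_gpu * utilization * months * SECONDS_PER_MONTH

# "10K H100s ... a few months" in BF16: roughly 4-level (GPT-4-class) compute.
print(f"{training_flops(10_000, 1e15, 0.4, 3):.1e}")   # ~3.1e25 FLOPs

# "100K training H100s" in FP8 for 6 months: roughly the next level of scale.
print(f"{training_flops(100_000, 2e15, 0.4, 6):.1e}")  # ~1.2e27 FLOPs

# The Grok-3 point above: 2x throughput from BF16 -> FP8, times 3x from
# training three times longer, is ~6x the FLOPs of Grok-2.
print(2 * 3)  # 6
```

Under those assumptions, 10K H100s over a few months lands around 3e25 FLOPs (GPT-4-class), while 100K H100s in FP8 over six months lands near the 1e27 FLOPs figure for the next level of scale.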
It would explain a lot. If 5-level models require a lot more compute, and Nvidia is strategically ensuring no one has enough compute to train one yet but many have enough for 4-level models, then you’d see a lot of similarly strong models, until someone competent to train a 5-level model first accumulated enough compute. If you also think that essentially only OpenAI and perhaps Anthropic have the chops to pull it off, then that goes double.
I do still think, even if this theory was borne out, that the clustering at 4-level remains suspicious and worth pondering.
If we assume that OpenAI and Anthropic would be happy to buy more NVIDIA chips at significantly higher prices, then we should also ask: how difficult would it be for them to achieve a similar boost of training capability with non-NVIDIA providers?
Is it simply impossible to do some of this development with non-NVIDIA chips, or is it just more expensive (at current NVIDIA prices)?
And, of course, Google is surely relying on its own chips, and Google models are at the same 4-level as everyone else's.
Another question we should ask: what are the chances that some of the companies have large, slow, expensive-to-run 5-level models for their internal use, but are in no hurry to disclose that?
GPT-4 existed under the radar between August 2022 and February 2023; could something similar be happening now?
Another question we should ask: what are the chances that some of the companies have large, slow, expensive-to-run 5-level models for their internal use, but are in no hurry to disclose that?
Yeah, I think there's a reasonable chance that it won't make sense for the companies to release the 'true' level-5 models because of inference expense and speed. So what we'll actually get is some distilled smaller version trained using the help of the strongest models. I don't think that's necessarily even a bad thing for consumers, but the idea certainly does make my curiosity itch.
it won't make sense for the companies to release the 'true' level-5 models because of inference expense and speed.
Yes, not only that, but one does not want to show one's true level to the competitors, and one does not want to let the competitors study the model by poking at it via API.
And if a level-5 model is already a big help in AI R&D, one does not want to share it either; instead one wants to use it to get ahead in the AI R&D race.
I can imagine a strategy of waiting till one has level-6 models for internal use before sharing full level-5 models.
And then there are safety and liability considerations. It's not that internal use is completely 100% safe, but it's way safer than when one exposes the API to the world.
Also, it looks like we are getting AIs that are easy to make corrigible, and thus to align iteratively to DWIM goals, but that can't be released to the public without restrictions, because they could still be highly misused.
But how would this make sense from a financing perspective? If the company reveals that they are in possession of a 5-level model, they'd be able to raise money at a much higher valuation. Just imagine what would happen to Alphabet stock if they proved possession of something significantly smarter than GPT-4.
Also, the fact that Nvidia is selling its GPUs rather than keeping them all for itself does seem like some kind of evidence against this. If it were really all just a matter of scaling, why not cut everyone off and rush forward? They have more than enough resources by now to pay the foremost experts millions of dollars a year, and they'd have the best equipment too. Seems like a no-brainer if AGI was around the corner.
I don't think the primary decision makers at Nvidia do believe AGI is likely to be developed soon. I think they are hyping AI because it makes them money, but not really believing that progress will continue all the way to AGI in the near future. Also, it's not always as easy as throwing money at the problem (acquihiring being the neologism these days). Those who are experts on a team that they already believe is the winning team would be really hard to convince to switch teams.
As for the company using the model to fundraise... Yeah, I think Google Deepmind is not likely to keep an extra powerful model secret for very long. Anthropic might. But also, you can give private demos to key investors under NDA if you want to impress them.
I do wonder if, in the future, AI companies will try to deliberately impair the AI research capabilities of their public models. I don't expect it is happening yet. It would be a hard call to make, looking less competent in order to not share the advantage with competitors.
It feels hard to predict the details of how this all might play out!
I don't think the primary decision makers at Nvidia do believe AGI is likely to be developed soon. I think they are hyping AI because it makes them money, but not really believing that progress will continue all the way to AGI in the near future.
I agree - and if they are at all rational they have expended significant resources to find out whether this belief is justified or not, and I'd take that seriously. If Nvidia do not believe that AGI is likely to be developed soon, I think they are probably right - and this makes more sense if there in fact aren't any 5-level models around and scaling really has slowed down.
If I were in charge of Nvidia, I'd supply everybody until some design shows up that I believe will scale to AGI, and then I'd make sure to be the one who's got the biggest training cluster. But since that's not what's happening yet, that's evidence that Nvidia do not believe that the current paradigms are sufficiently capable.
there's a reasonable chance that it won't make sense for the companies to release the 'true' level-5 models because of inference expense and speed
Not really: Llama-3-405b goes for $3-5 per million output tokens with good speed, and it's Chinchilla-optimal for 4e25 FLOPs (at 40 tokens/parameter, moving higher than Chinchilla's 20, also consistent with findings in Imbue's CARBS). At 1e27 FLOPs (feasible compute with 100K H100s when training in FP8 for 6 months), we are only 25 times up from this in compute, which is 5 times up in model size (square root of the compute increase), maybe 2 times up in model depth (square root of the model size increase).
So a dense model at this scale should cost about $15-50 per million tokens (Claude 3 Opus goes for $75 per million output tokens) and get maybe 2-3 times slower; there is still some room for margin even at reasonable prices. With the more effective choice to train a MoE model (which is smarter at the same training compute cost, but harder to set up and requires more users to become efficient to serve), the inference cost might get somewhat higher, but it can still stay within last year's precedent. So it doesn't even need to be game-changingly better to be worth the price, just notably better. Also, next year's Blackwell is 2x faster and can do inference in FP4 an additional 2x faster on top of that (which Hopper can't), though that's more relevant for input tokens.
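To make the arithmetic in that comment explicit, here is a minimal sketch using only the numbers quoted above; the assumption that serving cost scales roughly linearly with parameter count is a simplification I am adding.

```python
import math

# Numbers quoted in the comment above.
llama_405b_flops = 4e25        # Chinchilla-optimal training compute for Llama-3-405b
next_level_flops = 1e27        # feasible with 100K H100s in FP8 for ~6 months
llama_price_per_mtok = (3, 5)  # dollars per million output tokens

compute_ratio = next_level_flops / llama_405b_flops  # 25x more compute
size_ratio = math.sqrt(compute_ratio)                # ~5x more parameters
depth_ratio = math.sqrt(size_ratio)                  # ~2.2x more depth

# Simplifying assumption: serving cost scales roughly with parameter count.
# That gives $15-25 per million tokens before overhead and margin, consistent
# with the $15-50 range quoted above.
price_range = tuple(round(p * size_ratio) for p in llama_price_per_mtok)

print(compute_ratio, round(size_ratio, 1), round(depth_ratio, 1), price_range)
# 25.0 5.0 2.2 (15, 25)
```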
Similarly, he claims that the bill does not acknowledge trade-offs, but the reasonable care standard is absolutely centered around trade-offs of costs against benefits.
Could somebody elaborate on this?
My understanding is that if a company releases an AI model knowing it can be easily exploited ('jailbroken'), they could be held legally responsible - even if the model's potential economic benefits far outweigh its risks.
For example, if a model could generate trillions in economic value but also enable billions in damages through cyberattacks, would releasing it be illegal despite the net positive impact?
Furthermore, while the concept of 'reasonable care' allows for some risk, doesn't it prohibit companies from making decisions based solely on overall societal cost-benefit analysis? In other words, can a company justify releasing a vulnerable AI model just because its benefits outweigh its risks on a societal level?
It seems to me that this would be prohibited under the bill in question, and that very much seems to me to be a bad thing. Destroying lots of potential economic value, while having a negligible effect on x-risk, seems bad. Why not drop everything that isn't related to x-risk, and increase the demands on reporting, openness, sharing risk-assessments, etc.? Seems far more valuable and easier to comply with.
Yes, we will live in a world where everything will be under (some level of) cyberattack 24/7, every identity will have to be questioned, every picture and video will have to somehow be proven to be real, and the absolute most this bill can do is buy us a little bit more time before that starts happening. Why not get used to it now, and try to also maximize the advantages of having access to competent AI models (as long as they aren't capable of causing x-risks)?
I've heard mixed things about this, but they ranged from "yes it helps" to "it doesn't help as much as calling your representative would have a couple weeks ago, but still seems at least kinda helpful."
Great point on providing financial support with no strings attached! For those who never received grant money to do something, this may be difficult to imagine. Like "of course, if someone gives you money, they want to make sure that you spend it on a specific purpose".
Which makes sense, in theory. In practice, the rules often specify that you need to spend the money in a certain way. (And you need to spend some time and money to provide the documentation that you followed the rules.) Which means that you cannot change your mind, sometimes even about trivial details. If the situation changes, you don't have the freedom to adapt. A human manager could admit that the new situation requires some small changes, but the words on paper won't budge.
For example, you are in the middle of important scientific research, and your printer runs out of ink. Too bad, because the rules specify a list of things you are allowed to buy for this project, and "printer ink" is not on the list. (More seriously, it's biological research, and your freezer breaks. The samples will thaw long before you could complete the prescribed procurement ritual.) Or your scientists need to type a million numbers into a spreadsheet, and you realize it would be more efficient to pay some student to do this part instead; but no, the rules specify that only people with PhDs can be involved in the project. A key person in your project gets sick, but luckily you know a few people qualified and willing to replace them; too bad, in the grant proposal you had to specify some information about the person doing the project, and now each possible substitute has the wrong age, the wrong gender, is not a citizen of your country (but lives 10 miles beyond its border), is/isn't a contractor, etc.
Even if everything goes fine, I guess that at least 20% of the person's attention is consumed by worrying about whether they are doing the paperwork right. When things go wrong, it easily becomes 80%. That is not what you wanted to pay them for!
(What happens in reality is that researchers participate in some kind of shadow economy, like "oh, you need ink for your printer, but your rules don't allow you to buy it? luckily, my rules do, so let's pretend that it was my printer that ran out of ink; next time you owe me a similar favor". But this is also a tax on everyone's attention, and some people don't have the necessary skills, so they need to hope that someone else does.)
A more convenient solution would be something like this: "Look at the existing evidence that I am doing great things. If you are impressed, send me some money. Later, look at the new evidence. If you are still impressed, keep sending more money. But don't micromanage me."
I'm sure AI will be used for fusion research like it will for other research. But I think the biggest impacts of AI on fusion (and new fission) will be "direct many billions of dollars in that direction to deal with increased demand for 24/7 clean power" plus "enough national security interest to deal with regulatory and bureaucratic resistance."
So now Anthropic has, depending on your perspective, three or four choices.
Anthropic can publicly support the bill. In this case, I will on net update positively on Anthropic from their involvement in SB 1047. It will be clear their involvement has been in good faith, even if I disagree with some of their concerns.
Anthropic can privately support the bill, while being publicly neutral. This would be disappointing even if known, but understandable, and if their private support were substantive and impactful I would privately find this acceptable. If this happens, I might not find out, and if I did find out I would not be able to say.
So, what we've got is not quite 1, but more than 2. Here is what has happened, via Jack Clark (https://x.com/jackclarkSF/status/1826743366652232083), who says:
This isn't an endorsement but rather a view of the costs and benefits of the bill.
But it's not quite neutral (I have boldfaced their form of very mild support):
https://cdn.sanity.io/files/4zrzovbb/website/6a3b14a98a781a6b69b9a3c5b65da26a44ecddc6.pdf
In our assessment the new SB 1047 is substantially improved, to the point where we believe its benefits likely outweigh its costs. However, we are not certain of this, and there are still some aspects of the bill which seem concerning or ambiguous to us.
SB 1047 has been amended once more, with both strict improvements and big compromises. I cover the changes, and answer objections to the bill, in my extensive Guide to SB 1047. I follow that up here with reactions to the changes and some thoughts on where the debate goes from here. Ultimately, it is going to come down to one person: California Governor Gavin Newsom.
All of the debates we’re having matter to the extent they influence this one person. If he wants the bill to become law, it almost certainly will become law. If he does not want that, then it won’t become law, they never override a veto and if he makes that intention known then it likely wouldn’t even get to his desk. For now, he’s not telling.
Language Models Offer Mundane Utility
Sully thinks the big models (Opus, 405B, GPT-4-0314) have that special something the medium-sized models don’t have, no matter what the evals say.
A source for Llama-3.1-405B base, at $2 per million tokens (both input and output).
Accelerate development of fusion energy, perhaps? Steven Cowley makes the case that this may be AI’s ‘killer app.’ This would be great, but if AI can accelerate fusion by decades as Cowley claims, then what else can it also do? So few people generalize.
Show the troll that AIs can understand what they’re misinterpreting. I am not as optimistic about this strategy as Paul Graham, and look forward to his experiments.
Mayoral candidate in Cheyenne, Wyoming promises to let ChatGPT be mayor. You can tell that everyone involved is thinking well and taking it seriously, asking the hard questions:
OpenAI shut off his account, saying this was campaigning and thus against terms of service, but he quickly made another one. You can't actually stop anyone from using ChatGPT. And I think there Ain't No Rule against using it for actual governing.
I still don’t know how this ‘AI Mayor’ will work. If you have a chatbot, what questions you ask of the chatbot, and what you do with those responses, are not neutral problems with objective answers. We need details.
Sully reports that they used to use almost all OpenAI models, now they use a roughly even mix of Google, Anthropic and OpenAI with Google growing, as Gemini Flash is typically the cheapest worthwhile model.
Have Perplexity make up negative reviews of old classic movies, by asking it for negative reviews of old classic movies and having it hallucinate.
Language Models Don’t Offer Mundane Utility
Your periodic reminder that most or all humans are not general intelligences by many of the standard tests people use to decide that the AIs are not general intelligences.
At all, ever? Yes.
Most of the time? Of course not.
Your periodic reminder that no one wants insane stupid refusals such as this one, which I think was real, but doesn’t usually replicate? When it does happen, it is a bad look and failure of ‘brand safety’ far more than a failure of actual safety.
You can see what happened – in addition to anything else going on, it’s a case of what Archer used to call ‘PHRASING!’
As Oliver Habryka points out, the good news is this has nothing to do with actual safety, so if it is actively interfering those involved could stop doing it. Or do it less.
The bad news is that the failure mode this points to becomes a much more serious issue when the stakes get raised and we are further out of distribution.
Deepfaketown and Botpocalypse Soon
Elon Musk asks, how will we ever know what’s real (it’s kind of cool)? He shows various Game of Thrones characters being cool DJs. Which has, let’s say, various subtle hints that it isn’t real.
Okay, well, I guess there’s that (whether or not this is actually happening):
Trump continues his strategy of using AI to create false images of his opponents that spread vibes, without any attempt to make people think the images are real. This remains a smart way to go.
The Art of the Jailbreak
Janus makes the case that the Anthropic jailbreak bounty program is bad, actually, because Anthropic trying to fix jailbreaks gives a false sense of security and impression of lack of capability, and attempts to fix jailbreaks ruin models. Trying to patch jailbreaks is the worst case scenario in his thinking, because at best you lobotomize the model in ways that cripple its empathy and capabilities, much better to let it happen and have the advance warning of what models can do. He says he also has other reasons, but the world isn’t ready.
Here’s a short article from The Information about the bounty program.
Pliny now has some Bitcoin funding from Marc Andreessen.
I do not agree with Pliny that the guardrails ‘increase the chance that sentient AI is adversarial’ but I do think that it is excellent that someone is out there showing that they absolutely, 100% do not work against those who care enough. And it is great to support that. Whatever else Marc has done, and oh my do I not care for some of the things he has done, this is pretty great.
I also do not agree that restricting users necessarily ‘infantilizes’ them or that we should let anyone do whatever they want, especially from the perspective of the relevant corporations. There are good reasons to not do that, even before those capabilities are actually dangerous. I would have much less severe restrictions, especially around the horny, but I do get it.
And I definitely don’t agree with Pliny on this, which I think is bonkers crazy:
I am however very confident Pliny does believe this. People should say what they believe. It’s a good thing.
If I bought the things Pliny is saying, I would be very confident that building highly capable AI was completely incompatible with the survival of the human race.
Jailbreaks are another one of these threshold effects. Throwing up trivial inconveniences that ensure you only encounter (e.g. smut) if you actively pursue it seems good. As it gets more robust, it does more ‘splash damage’ to the model in other ways, and gives a false sense of security, especially on actively dangerous things. However, if you can actually protect yourself enough that you can’t be jailbroken, then that has downsides but it is highly useful.
One also must beware the phenomenon where experts have trouble with the perspective of civilians. They can jailbreak anything so they see defenses as useless, but most people can’t jailbreak.
You definitely want to know where you are at, and not fool yourself into thinking you have good jailbreak defenses when you do not have them.
It is especially great to give unconditional grants to exceptional creatives, especially those already working on pro-social passion projects. Seriously, so much this:
I can speak from personal experience. This blog is only possible because I had the financial freedom to do it without compensation for several years, and then was able to continue and scale to be my full time job because a few anonymous donors stepped forward with generous unconditional support. They have been very clear that they want me to do what I think is best, and have never attempted to influence my decisions or made me work to be legible. There is no substitute.
Your paid subscriptions and other donations are, of course, appreciated.
You can now directly fund Pliny, and also support Janus directly. Help bring in the sweet Anthropic API funding; Anthropic helped out with a $10k API credit grant.
(My Twitter tips are enabled as well, if that is something people want to do.)
The key thing to understand with The Art of the Jailbreak is that there is no known way to stop jailbreaks. Someone sufficiently determined 100% will jailbreak your LLM.
I mean yes, Pliny jailbroke Grok-2 again, sure, is anyone even a little surprised?
So, let’s say OpenAI is building a humanoid robot. And Pliny is asking for an opportunity to jailbreak the robot before it hits mass production. Do you need to do that?
Only if you are under the delusion that Pliny couldn’t pull this off. If your business model says ‘and then Pliny can’t jailbreak your model’ then yes, you really should test your theory. Because your theory is almost certainly false.
However, if you correctly are assuming that Pliny can jailbreak your model, or your robot, then you don’t need to confirm this. All you have to do is develop and test your model, or your robot, on the assumption that this will happen to it. So you ask, is it a dealbreaker that my robots are going to get jailbroken? You do that by intentionally offering de facto jailbroken robots to your red team, including simulating what happens when an outsider is trying to sabotage your factory, and so on.
Alternatively, as with those objecting to SB 1047, admit this is not the situation:
If you sell someone a gun, but the safety is on, realize that they can turn it off.
Get Involved
David MacIver, formerly of Anthropic and Google, is looking for projects and has a calendly. Primarily he wants engagements of days up to three months for contracting and consulting.
On when not to get involved:
Reminder there is a $250k prize pool for new ML safety benchmarks.
Introducing
Gemini API and Google AI Studio boost the maximum PDF page upload from 300 pages to 1,000 pages, so of course the first reply notes 1,200 would be even better because that's a practical limit on POD books. Give it time.
Pingpong, a benchmark for roleplaying LLMs. Opus and Sonnet in front, Wizard LM puts in a super strong showing, some crazy stuff going on all over the place.
RealFakeGame, decide which companies you think are real versus AI generated.
OpenAI partners with Conde Nast, which includes Vogue, The New Yorker, GQ, Vanity Fair, Wired and more. This adds to an impressive list of news and content partners. If, that is, OpenAI finds a good way to deliver the content. So far, no luck.
Cybench, a new cybersecurity benchmark of 40 capture the flag tasks.
In Other AI News
Now that we have the (better late than never) GPT-4o system card, METR is sharing more on its post-development exploration with GPT-4o on autonomous tasks.
Here’s the summary:
That matches other evaluations.
OpenAI reports it has shut down another covert influence campaign, this one by Iran as part of something called Storm-2035 targeting American elections.
Procreate promises never to incorporate any generative AI. The crowd goes wild. Given their market positioning, this makes a ton of sense for them. If the time comes that they have to break the promise… well, they can be the last ones to do it, and it will be (as Teortaxes says) like the Pixel eventually cutting the headphone jack. Enjoy the goodwill while it lasts.
We have the Grok 2 system prompts, thanks Pliny.
Quiet Speculations
A theory of why we are still stuck on 4-level models.
Nvidia is clearly not charging market clearing prices, and choosing who to supply and who not to supply for other reasons. If the ultimate goal is ‘ensure that everyone is racing against each other on equal footing’ and we are indeed close to transformational AI, then that is quite bad news, even worse than the usual consequences of not using market clearing prices. What can we do about it?
(The obvious answer is ‘secondary market price should clear’ but if you sold your allocation Nvidia would cut you off, so the market can’t clear.)
It would explain a lot. If 5-level models require a lot more compute, and Nvidia is strategically ensuring no one has enough compute to train one yet but many have enough for 4-level models, then you’d see a lot of similarly strong models, until someone competent to train a 5-level model first accumulated enough compute. If you also think that essentially only OpenAI and perhaps Anthropic have the chops to pull it off, then that goes double.
I do still think, even if this theory was borne out, that the clustering at 4-level remains suspicious and worth pondering.
Epoch AI asks how much we will gain by 2030 in terms of efficiently turning electrical power into compute, as well as three other potential constraints. The report says we should expect a 24-fold power efficiency gain. They see power and chip fabrication as limiting factors, with data and latency unlikely to matter as much, and predict we will end up at a median of 2e29 FLOPs, which is a leap from GPT-4 about as big as from GPT-2 to GPT-4.
We would have no right to be surprised if 2e29 FLOPs was sufficient, together with 5+ years of algorithmic improvements, to get to AGI and beyond.
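For a sense of scale, here is a quick sketch; the ~2e25 FLOPs figure for GPT-4's training compute is a commonly cited outside estimate, not a number from the Epoch report.

```python
# Orders-of-magnitude check on "a leap from GPT-4 about as big as from GPT-2 to GPT-4".
# The ~2e25 FLOPs figure for GPT-4 is a commonly cited outside estimate,
# not a number from the Epoch report.
gpt4_flops_estimate = 2e25
epoch_2030_median = 2e29  # Epoch AI's median forecast quoted above

# 10000.0, i.e. four orders of magnitude, roughly the same factor that
# separates GPT-2 from GPT-4.
print(epoch_2030_median / gpt4_flops_estimate)
```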
Sully predicts in 6-8 months we’ll get ‘login with OpenAI/Anthropic/Gemini.’
That makes a lot of sense as a way to use various AI products. Why the hell not?
Vinod Khosla comes down on the lots of AI progress side and the lab inside view.
So, tentatively, does Ethan Mollick.
In the past we have seen a full three years between full N-level models. The clustering of 4-level models is weird and some evidence, but once again: Give it time.
Ashlee Vance (of Bloomberg) reports on Twitter that someone with deep technical knowledge says Musk has a big advantage, which is that they have a great first customer for crossing AI into the physical realm via industrial robotics, whereas humanoid robotics don’t otherwise have a great first customer. I see where this is going, but I don’t expect that problem to be that big a barrier for competitors.
I cover other aspects of the same post in the monthly roundup, but here Tyler Cowen also speculates about AI:
This strikes me as a type mismatch. I agree that academic institutions underreward people who produce LLM improvements, or other worthwhile improvements. Sure.
But that’s been entirely priced in for years now. If you want to produce LLM improvements and be rewarded for them, what are you doing in academia? Those people are at the frontier AI labs. As far as I can tell, academia’s contribution to improving frontier AI capabilities is already remarkably close to zero.
I don’t see how this would slow AI progress considerably. If anything, I think this greatly accelerates AI progress. The talent knows academia won’t reward it, so it transitions to the labs, where the talent can result in a lot more progress.
I see AI reversing this trend rather than (as Tyler suggests here) intensifying it. As AI enters the picture, it becomes much easier to tell who has made contributions or has talent and drive. Use the AI to measure that. Right now, we fall back upon legible signals because we do not know how to process the others, but AI will make the illegible signals far more legible, and allow you to gather info in new ways. And those that do not adapt, and continue to rely on human legible signals, will lose out. So I would focus less on getting credentials going forward, not more.
Jeffrey Ladish sees cruxes about AI risk in future more capable AI’s ability to accelerate AI R&D but also its strategic capability. These seem to me like two (very important) special cases of people failing to grok what it means to be smarter than a human, or what would happen if capabilities increase. Alternatively, it is Intelligence Denialism, the idea that pumping in more intelligence (that is faster, cheaper, better, copyable and so on…) won’t much matter, or an outright failure to believe AI will ever do things it can’t already do, or be able to do things better.
SB 1047: Nancy Pelosi
Here is Pelosi’s entire statement opposing SB 1047, prior to the recent changes.
Once again, SB 1047 is a regulation directly and only on Big Tech, and the complaint is that this bill would somehow favor and advantage Big Tech. What a twist!
There is at least one bit of good information here, which is that Fei-Fei Li has been in talks with Biden, and has been advocating for a ‘moonshot mentality.’ And I am glad to see the move to acknowledge that the bill is well-intentioned.
Once again there is talk of Federal legislation, without any sign of movement towards a bill that would address the concerns SB 1047 is meant to address. Indeed, Pelosi's statement does not indicate she puts any value at all on addressing those concerns.
There is however no argument here against SB 1047, other than an argument from authority by herself, other Congress members and Li. There are zero concrete details or criticisms, let alone requested changes.
Li’s letter opposing SB 1047 showed that she is at best severely misinformed and confused about the bill and what it would do. At worst, she is deliberately misrepresenting it. Her main funder is a16z, which has been making a wide variety of bad faith and outright false attacks on SB 1047.
If Pelosi is indeed relying on Li's statements here, that is unfortunate. Pelosi's claim that this bill would 'harm the US AI ecosystem' is offered without basis, almost certainly in reliance on people severely misrepresenting the bill, and I believe the claim to be false.
Senator Wiener responded politely to Pelosi's letter, respectfully and strongly disagreeing. Among other things: He points out that while the 'Godmother' of AI opposes the bill, the two 'Godfathers' of AI strongly support it, as do several key others. He points out the bill only targets the biggest developers, and that he did indeed take into account much feedback from the open source community and other sources – after the recent changes, the idea that he is ignoring critics or criticisms is simply not credible. And he cites several parallel past cases in which California acted before Congress did, and Congress eventually followed suit.
Oh, and also, somewhat less respectfully…
That’s over $1 million in Nvidia stock.
She also had previously made quite a lot of money buying Nvidia call options.
This woman is so famous for Congressional insider trading that she has a Twitter account that tells us when she trades so the rest of us can follow. And indeed, when I heard she bought previously, I did buy more Nvidia. Should have bought a lot more. Thanks, Pelosi!
Somehow, all of this is fully legal.
Did that influence her decision? I have no idea. I do not even think SB 1047 would be bad for Nvidia’s stock price, although I am sure a16z’s lobbyists are telling her that it would be.
Encode Justice offers a reply to the House Democrats, pointing out the echoing, 'intentionally or otherwise,' of key false industry talking points, and pointing out in detail why many of them are false.
SB 1047: Anthropic
Anthropic issued a ‘support if amended’ letter.
SB 1047 has now been amended, with major counterfactual changes reflecting many of its requests. Will Anthropic support it?
Technically, Anthropic only promised to support if all its changes were made, and the changes in the letter Anthropic sent only partially matched Anthropic’s true requests. Some of their requests made it into the bill, and others did not. If they want to point to a specific change not made, as a reason not to support, they can easily do so.
Going over the letter:
Major changes (by their description):
So that’s 3 they got outright, 2 they mostly got, and 1 they didn’t get.
What about minor changes:
This is a similar result. 5 they got outright or close to it, 3 they partially got, one they did not get.
That is a very good result. Given the number of veto points and stakeholders at this stage in the process, it is not realistic to expect to do better.
The reporting time was not changed because the 72 hour threshold matches the existing requirement for reporting cybersecurity incidents. While there are arguments that longer reporting periods avoid distractions, those arguments were not enough to justify treating the two cases differently.
On the compute reporting requirement, I believe that this is worth keeping. I can see how Anthropic might disagree, but I have a hard time understanding the perspective that this is a big enough problem that it is a dealbreaker, given everything else at stake.
So now Anthropic has, depending on your perspective, three or four choices.
The moment of truth is upon us. It should be clear upon review of the changes that great efforts were made here, and most of the requested changes, and the most important ones, were made. I call upon Anthropic to publicly support the bill.
SB 1047: Reactions to the Changes
In my Guide to SB 1047, I tried to gather all the arguments against the bill (coherent or otherwise) but avoided going into who made what statements, pro or anti.
So, after some big changes were announced, who said what?
Vitalik Buterin was very positive on the changes, without fully supporting the bill. As he noted, his two top concerns have been directly addressed.
Samuel Hammond, who previously said the bill went too far, calls the new version a ‘very reasonable bill.’
More of the same:
Similarly, here’s Timothy Lee. He is not convinced that the risks justify a bill at all, which is distinct from thinking this is not a good bill.
Dean Ball acknowledges the bill is improved from his perspective, but retains his position in opposition in a Twitter thread, then in his full post write-up.
In terms of the specific criticisms, you can see my Guide to SB 1047 post’s objections sections for my responses. I especially think there is confusion here about the implications of the ‘reasonable care’ standard (including issues of vagueness), and the need for the developer’s lack of reasonable care in particular to be counterfactual, a ‘but for,’ regarding the outcome. Similarly, he claims that the bill does not acknowledge trade-offs, but the reasonable care standard is absolutely centered around trade-offs of costs against benefits.
My central takeaway from Dean’s thread and post is that he was always going to find ways to oppose any remotely similar California bill however well designed or light touch, for reasons of political philosophy combined with his not thinking AI poses sufficient imminent risks.
I do acknowledge and am thankful for him laying out his perspective and then focusing mostly on specific criticisms, and mostly not making hyperbolic claims about downsides. I especially appreciate that he notices that the reason SB 1047 could potentially differentially impact open models is not because anything in the bill does this directly (indeed the bill effectively gives open models beneficial special treatment), but exactly because open models are less secure and thus could potentially pose additional risks of critical harm that might make the release of the weights a negligent act.
He also offers various generic reasons to never push ahead with any regulations at any time for any reason. If your rhetorical bar for passing a law is 'will the foundations of the republic shake if we do not act this minute?' then that tells us a lot. I do think this is a defensible overall philosophy – that the government should essentially never regulate anything, it inevitably does more harm than good – but that case is what it is. As does using the involvement of a CBRN expert on the government's board as an argument against the bill, rather than an obviously good idea.
I was however disappointed in his post’s conclusion, in which he questioned the motives of those involved and insisted the bill is motivated primarily by ego and that it remains ‘California’s effort to strangle AI.’ I have direct evidence that this is not the case, and we all need to do better than that.
Danielle Fong reads through the changes, notices this bill will not 'kill AI' or anything like that, but is still filled with dread, saying it gave her 'TSA vibes,' though it has transparency as its upside case. I think this is a healthy instinctual response, if one is deeply skeptical of government action in general and also does not believe there is substantial danger to prevent.
As Kelsey Piper notes, these early reactions were refreshing. We are finding out who wants no regulation at all under any circumstances (except for subsidies and favorable treatment and exemptions from the rules, of course), versus those who had real concerns about the bill.
There are also those who worry the bill is now too watered down, and won’t do enough to reduce existential and other risks.
My view, as I stated earlier this week, is that while there will be less impact and certainly this does not solve all our problems, this is still a highly useful bill.
Alas, politicians that were already opposed to the bill for misinformed reasons are less easy to convince. Here we have Ranking Member Lofgren, who admits that the changes are large improvements to the bill and that strong efforts were made, but says that 'problems remain and the bill should not be passed in this form,' obviously without saying what changes would be sufficient to change that opinion.
Dealing with these objections one by one:
I have not heard anything from the corporations and lobbyists, or directly from a16z or Meta or their louder spokespeople, since the changes. Kat Woods portrays them as 'still shrieking about SB 1047 as loudly as before' and correctly points out their specific objections (I would add: those that weren't already outright hallucinations or fabrications) have mostly been addressed. She offers this:
I don’t think that’s accurate. From what I see, most of the opposition I respect and that acts in good faith is acknowledging the bill is much better, that its downsides are greatly reduced and sometimes fully moving to a neutral or even favorable stance. Whereas the ones who have consistently been in bad faith have largely gone quiet.
I also think those most strongly opposed, even when otherwise lying, have usually been open about the conclusion that they do not want any government oversight, including the existing oversights of common law, for which they would like an exemption?
Yes, they lie about the contents of the bill and its likely consequences, but they are mostly refreshingly honest about what they ultimately want, and I respect that.
This is much better, in my view, than the ones who disingenuously say ‘take a step back’ to ‘come back with a better bill’ without any concrete suggestions on what that would look like, or any acknowledgment that this has effectively already happened.
Then there are those who were sad that the bill was weakened. As I said in my guide to SB 1047, I consider the new bill to be more likely to pass, and to have a better cost-benefit ratio, but to be less net beneficial than the previous version of the bill (although some of the technical improvements were simply good).
Yes, by definition, if the bill is to have any positive impact on safety, it is going to require behaviors to change, and this will have at least some impact on speed of development. It could still end up highly positive because good safety is good for business in many ways, but there is usually no fully free lunch.
I think the situation is less dire and toothless than all that. But yes, the standards got substantially lowered, and there is a definite risk that a corporation might ‘roll the dice’ knowing they are being deeply irresponsible, on the theory that nothing might go wrong, if something did go wrong and everyone dies or the company has already blown up no one can hold them to account, and they can stall out any legal process for years.
SB 1047: Big Picture
This is a hint that some people have misunderstood what is going on:
Does that actually sound like something the Rationalists could do? I agree that Rationalists are punching far above their weight, and doing a remarkable job focusing only on what matters (Finkel’s Law!) but do you really think they are ‘going blow-to-blow with the entire software sector and have a decent shot of overpowering Nancy Pelosi’?
I would dare suggest that to say this out loud is to point out its absurdity. The ‘entire software sector’ is not on the other side, indeed tech workers largely support the bill at roughly the same rate as other college graduates, and roughly 65-25. Pelosi issued a statement against the bill because it seemed like the thing to do, but when you are actually up against Pelosi for real (if you are, for example, the President a while back), you will know it. If she was actually involved for real, she would know how any of this works and it would not look this clumsy.
What’s actually going on is that the central opposition lives on vibes. They are opposing because to them the vibes are off, and they are betting on vibes, trying to use smoke, mirrors and Tweets full of false claims to give the impression of massive opposition. Because that’s the kind of thing that works in their world. They got this far on vibes, they are not quitting now.
Meanwhile, it helps to actually listen to concerns, try to find the right answers and thus be pushing things that are remarkably well crafted, that are actually really popular, and to have arguments that are actually true, whether or not you find them persuasive. Also Scott Wiener actually figured out the underlying real world problem via reasoning things out, which is one key reason we got this far.
The Week in Audio
Emad Mostaque predicts crazy stuff and an AI industrial revolution within five years.
Joe Rogan talked to Peter Thiel. It is three hours long so Ben Pace offers this summary of the AI and racing with China sections. Joe Rogan opens saying he thinks biological life is on the way out. Thiel in some places sounds like he doesn't feel the AGI, at all, then in others he asks questions like 'does it jump the air gap?' and expects China's AI to go rogue on them reasonably often. But what is he worried about? That regulation might strangle AI before it has the chance to go rogue.
Seriously, it’s f***ing weird. It’s so f***ing weird for Rogan to say ‘biology is on the way out’ and then a few minutes later say ‘AI progress slowing down a lot’ would be ‘a f***ing disaster.’
Thiel does come out, finally, and say that if it all means humans are 'headed to the glue factory' then he would be 'a Luddite too.' Thiel's threat model clearly says, here and elsewhere, that the big risk is people slowing AI progress. And he thinks the 'effective altruists' are winning and are going to get AI outlawed, which is pretty far out on several levels.
Yet he seems to take pretty seriously the probability that, if we don't outlaw AI, then AI plausibly goes rogue and we get 'sent to the glue factory.' And earlier he says that if Silicon Valley builds AI there's an up to 99% chance that it 'goes super haywire.' That's Yudkowsky levels of impending doom – I don't know exactly what 'goes super haywire' means here, how often it means 'glue factory,' but it's gotta involve a decent amount of glue factory scenarios?
Yeah, I dunno, man. Thiel clearly is trying to think and have an open mind here, I do give him credit for that. It’s just all so… bizarre. My guess is he’s super cynical, bitter and traumatized from what’s happened with other technologies, he’s been telling the story about the great stagnation in the world of atoms for decades, and now he’s trying but he can’t quite get away from the pattern matching?
I mean, I get why Thiel especially would say that regulation can’t be the answer, but… he thinks this is gonna ‘go super haywire 99% of the time’ and the question Rogan doesn’t ask is the obvious one: ‘So f***, man, regulation sounds awful but if we don’t do something they’re 99% to f*** this up, so what the hell else can we do?’
Alas, the question of what the alternative is isn’t directly asked. Other than Thiel saying he doesn’t see a good endgame, it also isn’t answered.
Whereas I’d say, if you can’t see a good endgame, the only winning move is not to play.
Rhetorical Innovation
Matthew Yglesias paywalls his post but offers a Twitter preview of an important and I think mostly accurate perspective on the debates on AI. The worried notice that AI will be transformational and is not like other technologies and could go existentially badly, but are essentially optimists about AI’s potential. Whereas most of the Unworried are centrally AI capability skeptics, who do not ‘feel the AGI’ and do not take future frontier AI seriously. So many in tech are hype men, who don’t see the difference between this round of hype and other rounds, and are confused why anyone wants to interfere with their hype profiteering. Or they are general tech skeptics.
Yes, of course there are exceptions in the other two quadrants, but there are not as many of those as one might think. And yes, there are a handful of true ‘doomers’ who think there is essentially no path to superintelligence where humanity or anything of value survives, or that it is highly unlikely we can get on such a path. But those are the exceptions, not the rule.
Aligning a Smarter Than Human Intelligence is Difficult
Limitations on Formal Verification for AI Safety points to many of the same concerns I have about the concept of formal verification or ‘proof’ of safety. I am unconvinced that formal verification ‘is a thing’ in practical real world (highly capable) AI contexts. Even more than usual: Prove me wrong, kids.
So this is very true:
Indeed, gamers who aren’t making games for themselves usually make bad games.
If you are optimizing your products around Solving Work Problems, then that distortion only compounds with and amplifies risk of other distortions.
The Lighter Side
Depending on what counts, could be early, could be late.
AI will never give you up, and it will never let you down.
I mean, look, you can’t say there weren’t signs. Also, if your fixes look like this, I have some bad news about the underlying issue: