This was the press release; the actual order has now been published.
One safety-relevant part:
4.2. Ensuring Safe and Reliable AI. (a) Within 90 days of the date of this order, to ensure and verify the continuous availability of safe, reliable, and effective AI in accordance with the Defense Production Act, as amended, 50 U.S.C. 4501 et seq., including for the national defense and the protection of critical infrastructure, the Secretary of Commerce shall require:
(i) Companies developing or demonstrating an intent to develop potential dual-use foundation models to provide the Federal Government, on an ongoing basis, with information, reports, or records regarding the following:
(A) any ongoing or planned activities related to training, developing, or producing dual-use foundation models, including the physical and cybersecurity protections taken to assure the integrity of that training process against sophisticated threats;
(B) the ownership and possession of the model weights of any dual-use foundation models, and the physical and cybersecurity measures taken to protect those model weights; and
(C) the results of any developed dual-use foundation model’s performance in relevant AI red-team testing based on guidance developed by NIST pursuant to subsection 4.1(a)(ii) of this section, and a description of any associated measures the company has taken to meet safety objectives, such as mitigations to improve performance on these red-team tests and strengthen overall model security. Prior to the development of guidance on red-team testing standards by NIST pursuant to subsection 4.1(a)(ii) of this section, this description shall include the results of any red-team testing that the company has conducted relating to lowering the barrier to entry for the development, acquisition, and use of biological weapons by non-state actors; the discovery of software vulnerabilities and development of associated exploits; the use of software or tools to influence real or virtual events; the possibility for self-replication or propagation; and associated measures to meet safety objectives; and
(ii) Companies, individuals, or other organizations or entities that acquire, develop, or possess a potential large-scale computing cluster to report any such acquisition, development, or possession, including the existence and location of these clusters and the amount of total computing power available in each cluster.
(b) The Secretary of Commerce, in consultation with the Secretary of State, the Secretary of Defense, the Secretary of Energy, and the Director of National Intelligence, shall define, and thereafter update as needed on a regular basis, the set of technical conditions for models and computing clusters that would be subject to the reporting requirements of subsection 4.2(a) of this section. Until such technical conditions are defined, the Secretary shall require compliance with these reporting requirements for:
(i) any model that was trained using a quantity of computing power greater than 10^26 integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10^23 integer or floating-point operations; and
(ii) any computing cluster that has a set of machines physically co-located in a single datacenter, transitively connected by data center networking of over 100 Gbit/s, and having a theoretical maximum computing capacity of 10^20 integer or floating-point operations per second for training AI.
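For anyone who wants to sanity-check a specific model or cluster against the interim conditions in 4.2(b), here is a rough Python sketch. The class and function names are made up for illustration and are not from the order. It also assumes that a cluster "having" a capacity of 10^20 operations per second means "at least" that much, which the text does not spell out.

```python
from dataclasses import dataclass

# Interim thresholds as quoted from section 4.2(b). Constant names are illustrative.
MODEL_OPS_THRESHOLD = 1e26          # total training operations, any model
BIO_MODEL_OPS_THRESHOLD = 1e23      # total training operations, primarily biological sequence data
CLUSTER_OPS_PER_SEC_THRESHOLD = 1e20  # theoretical peak operations per second
CLUSTER_NETWORK_GBITS_THRESHOLD = 100  # data center networking, Gbit/s


@dataclass
class Model:
    training_ops: float
    primarily_bio_sequence_data: bool = False


@dataclass
class Cluster:
    co_located_in_one_datacenter: bool
    network_gbits: float
    peak_ops_per_second: float


def model_must_report(model: Model) -> bool:
    """True if the model exceeds either interim compute threshold in 4.2(b)(i)."""
    if model.training_ops > MODEL_OPS_THRESHOLD:
        return True
    return (
        model.primarily_bio_sequence_data
        and model.training_ops > BIO_MODEL_OPS_THRESHOLD
    )


def cluster_must_report(cluster: Cluster) -> bool:
    """True if the cluster meets all three interim conditions in 4.2(b)(ii).

    Assumption: the capacity condition is read as "at least 10^20 ops/s".
    """
    return (
        cluster.co_located_in_one_datacenter
        and cluster.network_gbits > CLUSTER_NETWORK_GBITS_THRESHOLD
        and cluster.peak_ops_per_second >= CLUSTER_OPS_PER_SEC_THRESHOLD
    )
```

Note that nothing in this check looks at the "dual-use" definition in 3(k); it only encodes the raw compute and networking numbers.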
This requires reporting of plans for training and deployment, as well as ownership and security of weights, for any model with training compute over 10^26 FLOPs. Might be enough of a talking point with corporate leadership to stave off things like hypothetical irreversible proliferation of a GPT-4.5 scale open weight LLaMA 4.
(k) The term “dual-use foundation model” means an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters, such as by:
(i) substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;
(ii) enabling powerful offensive cyber operations through automated vulnerability discovery and exploitation against a wide range of potential targets of cyber attacks; or
(iii) permitting the evasion of human control or oversight through means of deception or obfuscation.
Models meet this definition even if they are provided to end users with technical safeguards that attempt to prevent users from taking advantage of the relevant unsafe capabilities.
(i) substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;
Wouldn't this include most, if not all, uncensored LLMs?
And thus any person/organization working on them?
I think the key here is 'substantially'. That's a standard of evidence which must be shown to apply to the uncensored LLM in question. I think it's unclear if current uncensored LLMs would meet this level. I do think that if GPT-4 were to be released as an open source model, and then subsequently fine-tuned to be uncensored, that it would be sufficiently capable to meet the requirement of 'substantially lowering the barrier of entry for non-experts'.
Do you know who would be deciding on orders like this one? Some specialized department in the USG, whatever judge that happens to hear the case, or something else?
I do not know. I can say that I'm glad they are taking these risks seriously. The low screening security on DNA synthesis orders has been making me nervous for years, ever since I learned the nitty gritty details while I was working on engineering viruses in the lab to manipulate brains of mammals for neuroscience experiments back in grad school. Allowing anonymous people to order custom synthetic genetic sequences over the internet without screening is just making it too easy to do bad things.
Do you think we need to ban open source LLMs to avoid catastrophic biorisk? I'm wondering if there are less costly ways of achieving the same goal. Mandatory DNA synthesis screening is a good start. It seems that today there are no known pathogens which would cause a pandemic, and therefore the key thing to regulate is biological design tools which could help you design a new pandemic pathogen. Would these risk mitigations, combined with better pandemic defenses via AI, counter the risk posed by open source LLMs?
I think that in the long term, we can make it safe to have open source LLMs, once there are better protections in place. By long term, I mean, I would advocate for not releasing stronger open source LLMs for probably the next ten years or so. Or until a really solid monitoring system is in place, if that happens sooner. We've made a mistake by publishing too much research openly, with tiny pieces of dangerous information scattered across thousands of papers. Almost nobody has time and skill sufficient to read and understand all that, or even a significant fraction. But models can, and so a model that can put the pieces together and deliver them in a convenient summary is dangerous because the pieces are there.
Why do you believe it's, on the whole, a 'mistake' instead of beneficial?
I can think of numerous benefits, especially in the long term.
e.g. drawing the serious attention of decision makers who might have otherwise believed it to be a bunch of hooey, and ignored the whole topic.
e.g. discouraging certain groups from trying to 'win' in a geopolitical contest, by rushing to create a 'super'-GPT, as they now know their margin of advantage is not so large anymore.
Oh, I meant that the mistake was publishing too much information about how to create a deadly pandemic. No, I agree that the AI stuff is a tricky call with arguments to be made for both sides. I'm pretty pleased with how responsibly the top labs have been handling it, compared to how it might have gone.
Edit: I do think that there is some future line, across which AI academic publishing would be unequivocally bad. I also think slowing down AI progress in general would be a good thing.
Edit: I do think that there is some future line, across which AI academic publishing would be unequivocally bad. I also think slowing down AI progress in general would be a good thing.
Okay, I guess my question still applies?
For example, it might be that letting it progress without restriction has more upsides than slowing it down.
An example of something I would be strongly against anyone publishing at this point in history is an algorithmic advance which drastically lowered compute costs for an equivalent level of capabilities, or substantially improved hazardous capabilities (without tradeoffs) such as situationally-aware strategic reasoning or effective autonomous planning and action over long time scales. I think those specific capability deficits are keeping the world safe from a lot of possible bad things.
I think... maybe I see the world and humanity's existence on it, as a more fragile state of affairs than other people do. I wish I could answer you more thoroughly.
The temporary technical conditions in 4.2(b), such as 10^26 FLOPs of training compute, seem to apply without further qualification for whether a model is "dual-use" in a more particular sense. So it's unclear if the definition of "dual-use" in 3(k) is relevant to application of the reporting requirements in 4.2(a) until updated technical conditions get defined.
This made me wonder about a few things:
Below, I've segmented by x-risk and non-x-risk related proposals, excluding the proposals that are geared towards promoting its use and focusing solely on those aimed at risk.
Thanks for the work put into the distillation! But I think that the acceleration proposal to safety proposal ratio is highly relevant. British PM Rishi Sunak's speech, for example, was in large part an announcement that the UK would not regulate AI anytime soon. I've argued previously that governments have strong short term incentives to accelerate AI and even lie about it, so my prediction is that omitting the ratio of safety to pro-acceleration points here, by omitting pro-acceleration points entirely, is net harmful.
Hmm, I get the idea that people value succinctness a lot with these sorts of things, because there's so much AI information to take in now, so I'm not so sure about the net effect, but I'm wondering maybe if I could get at your concern here by mocking up a percentage (i.e. what percentage of the proposals were risk oriented vs progress oriented)?
It wouldn't tell you the type of stuff the Biden administration is pushing, but it would tell you the ratio which is what you seem perhaps most concerned with.
[Edit] this is included now
I spent a few hours reading, and parsing out, sections 4 and 5 of the recent White House Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.
The following are my rough notes on each subsection in those two sections, summarizing what I understand each to mean, and my personal thoughts.
My high level thoughts are at the bottom.
4.1
4.2
4.3
4.4 – For reducing AI-mediated CHEMICAL, BIOLOGICAL, RADIOLOGICAL, AND NUCLEAR threats, focusing on biological weapons in particular.
4.5
4.6
4.7
4.8
5.1 – Attracting AI Talent to the United States.
5.2
5.3 – Promoting Competition.
Mostly, this executive order doesn’t seem to push for much object-level action. Mostly it orders a bunch of assessments to be done, and reports on those assessments to be written, and then passed up to the president.
My best guess is that this is basically an improvement?
I expect something like the following to happen:
In general, this executive order means that the Executive branch is paying attention. That seems, for now, pretty good.
(Though I do remember in 2015 how excited and optimistic people in the rationality community were about Elon Musk “paying attention”, and that ended with him founding OpenAI, what many of those folks consider to be the worst thing that anyone had ever done to date. FTX looked like a huge success worthy of pride, until it turned out that it was a damaging and unethical fraud. I’ve become much more circumspect about which things are wins, especially wins of the form “powerful people are paying attention”.)
My guess is that this comment would be much more readable with the central chunk of it in a google doc, or failing that a few levels fewer of indented bullets.
e.g. Take this section.
- Thoughts:
- Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second would allow you to train a 10^26 model in 115 days, i.e. about 4 months. Those standards don’t seem consistent.
- What do I think about this overall?
- I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
- The thresholds match those that I’ve seen in strategy documents of people that I respect, so that seems promising. My understanding is that 10^26 FLOP is about 1-2 orders of magnitude larger than our current biggest models.
- The interest in red-teaming is promising, but again it depends on the implementation details.
- I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
- What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?
I find it much more readable as the following prose rather than 5 levels of bullets. Less metacognition tracking the depth.
Thoughts
Do those numbers add up? It seems like if you’re worried about models that were trained on 10^26 flops in total, you should be worried about much smaller training speed thresholds than 10^20 flops per second? 10^19 flops per second would allow you to train a 10^26 model in 115 days, i.e. about 4 months. Those standards don’t seem consistent.
What do I think about this overall?
I mean, I guess reporting this stuff to the government is a good stepping stone for more radical action, but it depends on what the government decides to do with the reported info.
The thresholds match those that I’ve seen in strategy documents of people that I respect, so that seems promising. My understanding is that 10^26 FLOP is about 1-2 orders of magnitude larger than our current biggest models.
The interest in red-teaming is promising, but again it depends on the implementation details.
I’m very curious about “launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.”
What will concretely happen in the world as a result of “an initiative”? Does that mean allocating funding to orgs doing this kind of work? Does it mean setting up some kind of government agency like NIST to…invent benchmarks?
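The compute arithmetic in the quoted notes is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up, and it assumes the cluster runs at its full theoretical peak for the whole training run, which real training does not achieve.

```python
SECONDS_PER_DAY = 86_400


def days_to_train(total_ops: float, ops_per_second: float) -> float:
    """Days needed to perform total_ops at a sustained rate of ops_per_second.

    Assumption: 100% utilization of the stated rate for the whole run.
    """
    return total_ops / ops_per_second / SECONDS_PER_DAY


# A cluster one order of magnitude below the 10^20 ops/s cluster threshold,
# training a model right at the 10^26 total-operations model threshold.
below_threshold_days = days_to_train(1e26, 1e19)

# A cluster right at the 10^20 ops/s threshold, training the same model.
at_threshold_days = days_to_train(1e26, 1e20)

print(f"{below_threshold_days:.1f} days at 10^19 ops/s")
print(f"{at_threshold_days:.1f} days at 10^20 ops/s")
```

At 10^19 operations per second the run takes about 116 days, matching the "about 4 months" figure, and at 10^20 it takes under two weeks. Lower real-world utilization would stretch both numbers proportionally.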
Possibly. I wrote this as personal notes, originally, in full nested list format. Then I spent 20 minutes removing some of the nested-list-ness in wordpress, which was very frustrating. I would definitely have organized it better if wordpress was less frustrating.
I did make a google doc format. Maybe the main lesson is that I should have edited it there.
The actual text of the order is 70 pages long and very hard to navigate. At the request of some DC friends, I made a tool for navigating the text that adds
Hope it's useful for some of you here! https://www.aijobstracker.com/ai-executive-order
My overall takeaway is all these things are generally good, though insufficient to actually address many X-risk-related concerns.
All of the standards assume we can wait until after a very powerful system is trained to evaluate it, which by my current understanding would not address risks of deception.
Frankly, this is more than I would have expected the white house to do, and thus I think a positive update on likely future actions.
Yeah, I think the reference class for me here is other things the executive branch might have done, which leads me to "wow, this was way more than I expected".
Worth noting is that they at least are trying to address deception by including it in the full bill readout. The types of model they hope to regulate here include those that permit "the evasion of human control or oversight through means of deception or obfuscation". The director of the OMB also has to come up with tests and safeguards for "discriminatory, misleading, inflammatory, unsafe, or deceptive outputs".
this is crazy, perhaps the most sweeping action taken by government on AI yet.
Seems like too much consulting jargon and "we know it when we see it" vibes, with few concrete bright-lines. Maybe a lot hinges on enforcement of the dual-use foundation model policy... any chance developers can game the system to avoid qualifying as a dual-use model? Watermarking synthetic content does appear on its face a widely-applicable and helpful requirement.
My general impression is for these sorts of things, vagueness is generally positive, since it gives the executive and individual actors who want to make a name for themselves more leeway, and makes companies less able to wriggle out on technicalities. Contrast with vague RSPs, for which the value of vagueness is in the opposite direction.
But of course this is an executive order, so if enough companies aren’t subject to it based on technicalities, it could easily be changed and re-issued. I don’t know how common this is though.
Garrett responded to the main thrust well, but I will say that watermarking synthetic media seems fairly good as a next step for combating misinformation from AI imo. It's certainly widely applicable (not really even sure what the thrust of this distinction was) because it is meant to apply to nearly all synthetic content. Why exactly do you think it won't be helpful?
I agree, I was trying to highlight it as one of the most specific, useful policies from the EO. Understand the confusion given my comment was skeptical overall.
UK’s proposal for a joint safety institute seems maybe more notable:
Sunak will use the second day of Britain's upcoming two-day AI summit to gather “like-minded countries” and executives from the leading AI companies to set out a roadmap for an AI Safety Institute, according to five people familiar with the government’s plans.
The body would assist governments in evaluating national security risks associated with frontier models, which are the most advanced forms of the technology.
The idea is that the institute could emerge from what is now the United Kingdom’s government’s Frontier AI Taskforce, which is currently in talks with major AI companies Anthropic, DeepMind and OpenAI to gain access to their models. An Anthropic spokesperson said the company is still working out the details of access, but that it is “in discussions about providing API access.”
https://www.politico.eu/article/uk-pitch-ai-safety-institute-rishi-sunak/
Released today (10/30/23) this is crazy, perhaps the most sweeping action taken by government on AI yet.
Below, I've segmented by x-risk and non-x-risk related proposals, excluding the proposals that are geared towards promoting its use[1] and focusing solely on those aimed at risk. It's worth noting that some of these are very specific and direct an action to be taken by one of the executive branch organizations (i.e. sharing of safety test results) but others are guidances, which involve "calls on Congress" to pass legislation that would codify the desired action.
[Update]: The official order (this is a summary of the press release) has now been released, so if you want to see how these are codified to a greater granularity, look there[2].
Existential Risk Related Actions:
Non-Existential Risk Actions:
General
Discrimination
Healthcare
Jobs
Privacy
Out of 26 distinct proposals, 7 (27%) are geared towards increasing use or capabilities and 2 (8%) are a mixed bag, both encouraging development and furthering safety.
I can also do a similar post for that if there's interest, but it would be significantly longer.