Or: I read the executive order and its fact sheet, so you don’t have to.

I spent Halloween reading the entire Biden Executive Order on AI.

This is the pure ‘what I saw reading the document’ post. A companion post will cover reactions to this document, but I wanted this to be a clean reference going forward.

Takeaway Summary: What Does This Do?

It mostly demands a lot of reports, almost entirely from within the government.

  1. A lot of government employees will be writing a lot of reports.
  2. After they get those reports, others will then write additional reports.
  3. There will also be a lot of government meetings.
  4. These reports will propose paths forward to deal with a variety of AI issues.
  5. These reports indicate which agencies may get jurisdiction on various AI issues.
  6. Which reports are requested indicates what concerns are most prominent now.
    1. A major goal is to get AI experts into government, to get government to a place where it can implement the use of AI, and to bring AI talent into the USA.
    2. Another major goal is ensuring the safety of cutting-edge foundation (or ‘dual use’) models, starting with knowing which ones are being trained and what safety precautions are being taken.
    3. Other ultimate goals include: Protecting vital infrastructure and cybersecurity, safeguarding privacy, preventing discrimination in many domains, protecting workers, guarding against misuse, guarding against fraud, ensuring identification of AI content, integrating AI into education and healthcare and promoting AI research and American global leadership.
  7. There are some tangible other actions, but they seem trivial with two exceptions:
    1. Changes to streamline the AI-related high skill immigration system.
    2. The closest thing to a restriction is a set of actions to figure out safeguards for the physical supply chain for synthetic biology against use by bad actors, which seems clearly good.
  8. If you train a model with 10^26 flops, you must report that you are doing that, and what safety precautions you are taking, but can do what you want.
  9. If you have a data center capable of 10^20 integer operations per second, you must report that, but can do what you want with it.
  10. If you are selling IaaS to foreigners, you need to report that KYC-style.

What are some things that might end up being regulatory requirements in the future, if we go in the directions these reports are likely to lead?

  1. Safety measures for training and deploying sufficiently large models.
  2. Restrictions on foreign access to compute or advanced models.
  3. Watermarks for AI outputs.
  4. Privacy enhancing technologies across the board.
  5. Protections against unwanted discrimination.
  6. Job protections of some sort, perhaps, although it is unclear how or what.

Essentially, this is the prelude to potential government action in the future. Perhaps you do not like that, for various reasons, and there are certainly reasonable ones. Or you could be worried in the other direction: that this does not do anything on its own, that it might be mistaken for actually doing something, and that it could crowd out other action. No laws have yet been passed, no rules of substance put into place.

One can of course be reasonably concerned about slippery slopes or regulatory ratcheting over the long term. I would love to see the energy brought to such concerns here applied to every other issue ever, where such dangers have indeed often materialized. I will almost always be there to support it.

If you never want the government to do anything to regulate AI, or you want it to wait many years before doing so, and you are unconcerned about frontier models, the EO should make you sad versus no EO.

If you do want the government to do things to regulate AI within the next few years, or if you are concerned about existential risk from frontier models, then the EO should make you happy versus no EO.

You should likely be especially sad or happy about the compute reporting thresholds, which I believe were set reasonably and conservatively.

If you were hoping for or worried about potential direct or more substantive action, then the opposite applies – there is very little here in the way of concrete action, only the foundation for potential future action.

No matter who you are, you should (as I read the document) be happy that the EO seems to mostly be executed competently, with one notable and important exception.

The big mistake is that they seem to have chosen quite a terrible definition of AI. Even GPT-4 did a much better job on this. The chosen definition both threatens to incorporate many things that most everyone should prefer that this executive order not apply to, and also threatens to fail to incorporate other systems where the executive order definitely intends to apply. It creates potential loopholes that will put us in danger, and also could end up imposing onerous requirements where they serve no purpose. I very much hope it is fixed.

Also, yes, this document does at one point Declare Defense Production Act. Sigh.

What now? You have four options:

  1. You can read the Fact Sheet section and skip the close reading.
  2. You can skip the Fact Sheet section and only read the close reading.
  3. You can read both, if you want the full picture including intended impressions.
  4. You can read neither, with or without reading the reactions post.

I consider all four reactions highly reasonable. Stay sane, my friends.

Fact Sheet

I intentionally went over the Fact Sheet before the full order, then went back and revised this section to include references to what the actual text said.

  • Require that developers of the most powerful AI systems share their safety test results and other critical information with the U.S. government. In accordance with the Defense Production Act, the Order will require that companies developing any foundation model that poses a serious risk to national security, national economic security, or national public health and safety must notify the federal government when training the model, and must share the results of all red-team safety tests. These measures will ensure AI systems are safe, secure, and trustworthy before companies make them public.
  • Develop standards, tools, and tests to help ensure that AI systems are safe, secure, and trustworthy. The National Institute of Standards and Technology will set the rigorous standards for extensive red-team testing to ensure safety before public release. The Department of Homeland Security will apply those standards to critical infrastructure sectors and establish the AI Safety and Security Board. The Departments of Energy and Homeland Security will also address AI systems’ threats to critical infrastructure, as well as chemical, biological, radiological, nuclear, and cybersecurity risks. Together, these are the most significant actions ever taken by any government to advance the field of AI safety. 

Excellent ideas. Details matter. There are incentive traps to avoid here on all sides.

For now, in practice, the rule is ‘do whatever you want as long as you tell us what you did and did not do.’

One important thing the fact sheet does not mention is the biggest game in the whole document: Reporting requirements for foundation models, triggered by highly reasonable compute thresholds.

  • Protect against the risks of using AI to engineer dangerous biological materials by developing strong new standards for biological synthesis screening. Agencies that fund life-science projects will establish these standards as a condition of federal funding, creating powerful incentives to ensure appropriate screening and manage risks potentially made worse by AI.

Yes, please. I would hope that even pure accelerationists would be on board with this one. Even if we handle AI as well as we realistically can, or even if we never built another AI system ever again, we would still need to better secure access to biological synthesis that could be used to generate dangerous materials.

This section seems clearly good and I hope we can all agree on that.

  • Protect Americans from AI-enabled fraud and deception by establishing standards and best practices for detecting AI-generated content and authenticating official content. The Department of Commerce will develop guidance for content authentication and watermarking to clearly label AI-generated content. Federal agencies will use these tools to make it easy for Americans to know that the communications they receive from their government are authentic—and set an example for the private sector and governments around the world.

This is the clear lowest-bar test. Will you require all AI content to be identifiable as such in some fashion? The fact sheet passes. The amount of complaining focused on this point surprised me, as it does not seem difficult or especially expensive to comply with such requirements. Of course, the government could get it wrong and make it so that every AI voice needs to start with ‘I am an AI and I created this message’ or something; that is always a danger, but we often do get such things right.

The full document does not yet require such watermarking, which would presumably have been beyond an EO’s scope in any case. Instead, the document calls for figuring out how one might watermark, with the clear expectation of requiring it later.

  • Establish an advanced cybersecurity program to develop AI tools to find and fix vulnerabilities in critical software, building on the Biden-Harris Administration’s ongoing AI Cyber Challenge. Together, these efforts will harness AI’s potentially game-changing cyber capabilities to make software and networks more secure.

Clear win unless epically botched, I have seen no objections.

It does not have much in the way of teeth, at least not yet, but what did you expect?

  • Order the development of a National Security Memorandum that directs further actions on AI and security, to be developed by the National Security Council and White House Chief of Staff. This document will ensure that the United States military and intelligence community use AI safely, ethically, and effectively in their missions, and will direct actions to counter adversaries’ military use of AI.

Pick a place to think more about such issues in a unified and systematic manner. Definitely a good idea if you intend to do things. The objections I have seen to this type of action are objecting because they think doing things other than enforcing existing law would be a bad idea, and this enables doing those new things.

There is next a section on privacy.

  • Protect Americans’ privacy by prioritizing federal support for accelerating the development and use of privacy-preserving techniques—including ones that use cutting-edge AI and that let AI systems be trained while preserving the privacy of the training data.  
  • Strengthen privacy-preserving research and technologies, such as cryptographic tools that preserve individuals’ privacy, by funding a Research Coordination Network to advance rapid breakthroughs and development. The National Science Foundation will also work with this network to promote the adoption of leading-edge privacy-preserving technologies by federal agencies.
  • Evaluate how agencies collect and use commercially available information—including information they procure from data brokers—and strengthen privacy guidance for federal agencies to account for AI risks. This work will focus in particular on commercially available information containing personally identifiable data.
  • Develop guidelines for federal agencies to evaluate the effectiveness of privacy-preserving techniques, including those used in AI systems. These guidelines will advance agency efforts to protect Americans’ data.

That’s a lot of talk and not a lot of specifics beyond some harmless research funding. Need to read the full order to know if this is more than all talk. Update: Yep, all talk.

Up next is advancing equity and civil rights, something the Biden Administration cares about quite a lot.

  • Provide clear guidance to landlords, Federal benefits programs, and federal contractors to keep AI algorithms from being used to exacerbate discrimination.
  • Address algorithmic discrimination through training, technical assistance, and coordination between the Department of Justice and Federal civil rights offices on best practices for investigating and prosecuting civil rights violations related to AI.
  • Ensure fairness throughout the criminal justice system by developing best practices on the use of AI in sentencing, parole and probation, pretrial release and detention, risk assessments, surveillance, crime forecasting and predictive policing, and forensic analysis.

Singling out landlords is very on brand for Biden.

This is saying that various institutions need to avoid using AI to do the particular forms of discrimination that are not allowed. As usual, when a computer system is doing it, such results become easier to show and also more blameworthy, so countermeasures will be required. AI picks up on correlations and makes probability assessments, so if you don’t like what that implies you’ll have to do manual correction.

Once again, all talk, although that is the first step.

Next they plan to ‘stand up for’ consumers, patients and students.

  • Advance the responsible use of AI in healthcare and the development of affordable and life-saving drugs. The Department of Health and Human Services will also establish a safety program to receive reports of—and act to remedy – harms or unsafe healthcare practices involving AI. 
  • Shape AI’s potential to transform education by creating resources to support educators deploying AI-enabled educational tools, such as personalized tutoring in schools.

Health care is a place where no one can do anything, and where I have so far been pleasantly surprised by how much we have let AI be used. Formalizing rules could protect this or destroy it, depending on the details. The document is all about reports on rules one might implement, with little hint of what the new rules might be.

Helping educators use AI tools is good, but we must always beware lock-in, and especially attempts to keep the existing educational systems in place for no reason. On the margin I expect the order to make little difference (the real challenges come later), but for that difference to be positive. The fact sheet expresses concern about the underserved, but does not provide any substance about what is to be done.

What about supporting workers? If we’re not careful, they’ll take our jobs.

  • Develop principles and best practices to mitigate the harms and maximize the benefits of AI for workers by addressing job displacement; labor standards; workplace equity, health, and safety; and data collection. These principles and best practices will benefit workers by providing guidance to prevent employers from undercompensating workers, evaluating job applications unfairly, or impinging on workers’ ability to organize.
  • Produce a report on AI’s potential labor-market impacts, and study and identify options for strengthening federal support for workers facing labor disruptions, including from AI.

We will do a bunch of paperwork and file a bunch of reports. OK, then. As always, talk of addressing ‘displacement’ is going to be a boondoggle, and workplace concerns will make workers worse off and actually accelerate them losing their jobs, but that’s how it goes. And indeed, that is exactly what is called for in the text.

I’ll skip ahead for a second to the last section, government use of AI.

  • Issue guidance for agencies’ use of AI, including clear standards to protect rights and safety, improve AI procurement, and strengthen AI deployment.  
  • Help agencies acquire specified AI products and services faster, more cheaply, and more effectively through more rapid and efficient contracting.
  • Accelerate the rapid hiring of AI professionals as part of a government-wide AI talent surge led by the Office of Personnel Management, U.S. Digital Service, U.S. Digital Corps, and Presidential Innovation Fellowship. Agencies will provide AI training for employees at all levels in relevant fields.

The government should indeed be hiring more AI professionals and using such tools to provide better services, and of course they will need to do this in a manner compatible with mandates on rights, safety, and so forth.

There is an obvious longer term danger in introducing AI systems into government, which will become increasingly concerning over time, but for now yes they need to get up to speed on this and so many other things.

This section reads like someone desperately trying to do everything they can within the crazy rules of a crazy system, hoping to get anywhere at all some day, because that is exactly what it is. They tried, I guess.

So far, those have been the parts of the order aimed in clearly good directions. Then we have the other government concerns. Government once again is here to talk to you about Promoting Innovation and Competition, and perhaps this one time they might mean it?

  • Catalyze AI research across the United States through a pilot of the National AI Research Resource—a tool that will provide AI researchers and students access to key AI resources and data—and expanded grants for AI research in vital areas like healthcare and climate change.
  • Promote a fair, open, and competitive AI ecosystem by providing small developers and entrepreneurs access to technical assistance and resources, helping small businesses commercialize AI breakthroughs, and encouraging the Federal Trade Commission to exercise its authorities.

America was always going to pay lip service to innovation, research and competition. No one steps up and says those things are bad until far later in the game than this, and no one wants to fall behind in the ‘beat China’ rhetorical game, where everyone is vastly more performatively worried than they have any need to be, with many government officials thinking that China’s government intervention would, absent our government intervention in any direction, cause them to ‘win the race’ against our companies, which is well into missile gap territory.

So yes, this stuff was always going to be here, the same way those who always oppose every government action were always going to scream about regulatory capture. No matter how much we have warned for decades about what happens when there are existential-level negative externalities while companies race against each other, everyone thinks competition is good here, while government knee-caps competition across the rest of the economy where it would be really great if they stopped.

Then again, if you wanted to promote competition, would you ask the government for help? Me neither, nor do I expect any of the actions contained herein to offer much help.

Then we notice this:

  • Use existing authorities to expand the ability of highly skilled immigrants and nonimmigrants with expertise in critical areas to study, stay, and work in the United States by modernizing and streamlining visa criteria, interviews, and reviews.

While this is indeed an accelerationist thing to do for obvious reasons, it is also transparently the correct thing to do except that it doubtless does not go far enough. There is no reason for those capable of building tomorrow’s AI companies and technologies to be anywhere but the United States. Nor is there reason for all those other high skill immigrants with non-AI talents and interests not to do likewise. I am happy to see us actually doing at least some amount of this.

This section seems far more like an effort to actually do something than much of the rest of the document. There are specifics and specifications throughout. Might go somewhere.

Finally, there is a section on Advancing American Leadership Abroad.

  • Expand bilateral, multilateral, and multistakeholder engagements to collaborate on AI. The State Department, in collaboration, with the Commerce Department will lead an effort to establish robust international frameworks for harnessing AI’s benefits and managing its risks and ensuring safety. In addition, this week, Vice President Harris will speak at the UK Summit on AI Safety, hosted by Prime Minister Rishi Sunak.
  • Accelerate development and implementation of vital AI standards with international partners and in standards organizations, ensuring that the technology is safe, secure, trustworthy, and interoperable.
  • Promote the safe, responsible, and rights-affirming development and deployment of AI abroad to solve global challenges, such as advancing sustainable development and mitigating dangers to critical infrastructure.

Good talk. Thanks. The problem is that if you look at the list of collaborators at the end of the document, there are a lot of good names there, but one is conspicuously missing: China. As far as I can tell this is because we never picked up the phone to ask. Perhaps they would refuse to work with us, but there is only one way to find out.

Meanwhile, good talk is the nicest way I could describe the actual contents of the international relations section. Even less content than usual.

I Read the Whole Damn Thing So You Don’t Have To

It clocks in at a little short of 20,000 words. Here we go.

Sections 1 and 2: Introduction and Principles

The first two sections are an introduction. Let’s boil it down.

Section 1 is the statement of purpose, standard boilerplate. It unfortunately does not mention existential risk, only threats to national security. I found this interesting:

In the end, AI reflects the principles of the people who build it, the people who use it, and the data upon which it is built.

If that was sufficiently robustly true going forward I’d be a lot less worried.

Section 2 lays out principles.

Principle (a) is that AI must be safe and this requires testing.

Principle (b) is that the USA must remain in the lead in AI, competition, innovation, education, training and all that.

Principle (c) is to ‘support American workers.’

Principle (d) is equity and civil rights.

Principle (e) is consumer protection and provision of goods.

Principle (f) is privacy and civil liberties, mostly privacy protections.

Principle (g) is the government building AI capacity and using AI responsibly.

Principle (h) is American global leadership.

Section 3: Definitions

Section 3 lays out definitions. The definitions I don’t mention seem solid, or refer to definitions in other laws and documents, which I chose not to examine.

(b) The term “artificial intelligence” or “AI” has the meaning set forth in 15 U.S.C. 9401(3):  a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments.  Artificial intelligence systems use machine- and human-based inputs to perceive real and virtual environments; abstract such perceptions into models through analysis in an automated manner; and use model inference to formulate options for information or action.

That is… not a great definition. There are a number of potential loopholes here. The biggest one would be: What if the system is generating its own objectives? Is DALLE even an AI under this definition?

On the flip side, this is overly broad in other places, encompassing many computer systems that do not require the actions laid out in this EO. Does this apply to an existing automated toilet flusher?

I asked GPT-4 to critique this, and then propose a better definition. It gave me this:

(b) The term “Artificial Intelligence” or “AI” refers to a computational system that fulfills one or more of the following criteria:

  1. Predictive Capabilities: The system can make probabilistic or deterministic forecasts about future states based on input data.
  2. Decision-Making: The system can autonomously make choices or recommendations among alternatives to achieve predefined or learned objectives.
  3. Environment Interaction: The system can perceive and act upon either a real or a simulated environment to achieve objectives, where perception could include but is not limited to sensory data like vision and sound.
  4. Automated Modeling: The system can autonomously create, refine, or utilize mathematical or computational models to represent aspects of the world.
  5. Learning Ability: The system can improve its performance or adapt its functions through data-driven learning algorithms.

Additional clarifications:

  • Human and Machine Inputs: AI systems can utilize both human-generated and machine-generated data for their operations.
  • Exclusions: Systems that solely rely on preprogrammed heuristics, simple statistical methods, or traditional databases are not considered AI under this definition.
  • Objective Framework: The objectives guiding the AI system can be either explicitly human-defined or derived through meta-learning or other advanced techniques.

I would grade GPT-4’s answer as better than the one in the EO.

Whereas this definition mostly seems excellent:

(k)  The term “dual-use foundation model” means an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters, such as by:

          (i)    substantially lowering the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons;

          (ii)   enabling powerful offensive cyber operations through automated vulnerability discovery and exploitation against a wide range of potential targets of cyber attacks; or

          (iii)  permitting the evasion of human control or oversight through means of deception or obfuscation.

In particular I like ‘could be easily modified to exhibit’ and especially ‘substantially lowering the barrier of entry for nonexperts.’

My only substantive qualm is that I would be hesitant to outright require a minimum of 20 billion parameters. I would have preferred to say they typically are expected to have tens of billions of parameters. For now that is safe, but it would be unsurprising to see a 13B model qualify for this down the road. I might also extend the (i) clause to other dangerous elements.

A silly one got some outside attention.

(m)  The term “floating-point operation” means any mathematical operation or assignment involving floating-point numbers, which are a subset of the real numbers typically represented on computers by an integer of fixed precision scaled by an integer exponent of a fixed base.

Neil Chilson: WTF is an “integer of fixed precision”? Do they mean fixed width? (They can’t really, the significand is a fraction!) This is a weird definition. From the new AI Executive Order definition of “floating-point operation.”

And yet a mere five definitions later they properly define an integer.

As Nathan points out, looks like the EO just copied the definition of “floating point operation” from Wikipedia.

I mean, yes, if you were highly competent you would not say it like that, but it is fine.
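For what it is worth, the copied wording is clumsy rather than wrong: every IEEE 754 float really is an integer significand scaled by an integer power of two. A quick check (illustrative only) in Python:

```python
x = 3.14159
numerator, denominator = x.as_integer_ratio()  # exact: x == numerator / denominator
exponent = denominator.bit_length() - 1        # denominator is a power of two: 2**exponent
assert x == numerator * 2.0 ** -exponent
# So x is indeed an "integer of fixed precision scaled by an integer exponent of a
# fixed base" (here base 2), which is what the Wikipedia-derived wording is gesturing at.
print(numerator, exponent)  # the exact integer significand and binary exponent
```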

Worth noting this one, as I haven’t seen a formal definition elsewhere, it seems good in practice for now, and it is interesting to think whether this is technically correct:

(p)  The term “generative AI” means the class of AI models that emulate the structure and characteristics of input data in order to generate derived synthetic content.  This can include images, videos, audio, text, and other digital content.

Since I’ve seen objections to watermarking, here’s what it means in context.

(gg)  The term “watermarking” means the act of embedding information, which is typically difficult to remove, into outputs created by AI — including into outputs such as photos, videos, audio clips, or text — for the purposes of verifying the authenticity of the output or the identity or characteristics of its provenance, modifications, or conveyance.

Under this definition a watermark need not be prominent, and even being difficult to remove is typical but not required. It is only necessary that it allows identification, which could be designed to happen under computer analysis.
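To make the definition concrete, here is a minimal sketch, purely illustrative and not anything the EO specifies, of the kind of statistical text watermark researchers have proposed: bias generation toward a pseudorandom ‘green’ subset of the vocabulary seeded by the previous token, then detect by measuring how often tokens land in that subset.

```python
import hashlib

def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    """Deterministically pick a 'green' subset of the vocabulary, seeded by the previous token."""
    green = set()
    for tok in vocab:
        digest = hashlib.sha256(f"{prev_token}|{tok}".encode()).digest()
        if digest[0] < 256 * fraction:
            green.add(tok)
    return green

def green_fraction(tokens: list[str], vocab: list[str]) -> float:
    """Fraction of tokens that fall in the green list seeded by their predecessor."""
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return 0.0
    hits = sum(tok in green_list(prev, vocab) for prev, tok in pairs)
    return hits / len(pairs)

# A watermarking generator would bias sampling toward green tokens; a detector flags
# text whose green fraction sits far above the ~50% expected from unwatermarked text.
```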

Section 4: Ensuring Safety and Security

Section 4 is entitled Ensuring the Safety and Security of AI Technology. This is hopefully the big kahuna.

We start with 4.1, with Commerce and NIST at the helm.

4.1.  Developing Guidelines, Standards, and Best Practices for AI Safety and Security.  (a)  Within 270 days of the date of this order, to help ensure the development of safe, secure, and trustworthy AI systems, the Secretary of Commerce, acting through the Director of the National Institute of Standards and Technology (NIST), in coordination with the Secretary of Energy, the Secretary of Homeland Security, and the heads of other relevant agencies as the Secretary of Commerce may deem appropriate, shall:

(C)  launching an initiative to create guidance and benchmarks for evaluating and auditing AI capabilities, with a focus on capabilities through which AI could cause harm, such as in the areas of cybersecurity and biosecurity.

(A)  coordinating or developing guidelines related to assessing and managing the safety, security, and trustworthiness of dual-use foundation models

(B)  in coordination with the Secretary of Energy and the Director of the National Science Foundation (NSF), developing and helping to ensure the availability of testing environments, such as testbeds, to support the development of safe, secure, and trustworthy AI technologies, as well as to support the design, development, and deployment of associated PETs, consistent with section 9(b) of this order. 

implement a plan for developing the Department of Energy’s AI model evaluation tools and AI testbeds

At a minimum, the Secretary shall develop tools to evaluate AI capabilities to generate outputs that may represent nuclear, nonproliferation, biological, chemical, critical infrastructure, and energy-security threats or hazards.  The Secretary shall do this work solely for the purposes of guarding against these threats, and shall also develop model guardrails that reduce such risks.

These seem like very important things to do.

We need the ability to assess risks from foundation models. I notice that the emphasis in 4.1 is entirely on serious potential harms from misuse, without shoehorning in any everything bagel concerns. It does not mention existential risks as such, although they are of course implied.

Note the timeline. That’s 270 days to launch an initiative to create guidance to then evaluate something.

I find that delay unfortunate, but not unreasonable.

Next up are reporting requirements for dual-use foundation models, which I understand as 20B+ parameters plus plausibly dangerous capabilities. I wonder how many 19B parameter models we will see. If we do see them, we will know why.

4.2.  Ensuring Safe and Reliable AI.  (a)  Within 90 days of the date of this order, to ensure and verify the continuous availability of safe, reliable, and effective AI in accordance with the Defense Production Act, as amended, 50 U.S.C. 4501 et seq., including for the national defense and the protection of critical infrastructure, the Secretary of Commerce shall require:

(i)   Companies developing or demonstrating an intent to develop potential dual-use foundation models to provide the Federal Government, on an ongoing basis, with information, reports, or records regarding the following:

(A)  any ongoing or planned activities related to training, developing, or producing dual-use foundation models, including the physical and cybersecurity protections taken to assure the integrity of that training process against sophisticated threats;

(B)  the ownership and possession of the model weights of any dual-use foundation models, and the physical and cybersecurity measures taken to protect those model weights;

(C)  the results of any developed dual-use foundation model’s performance in relevant AI red-team testing based on guidance developed by NIST pursuant to subsection 4.1(a)(ii) of this section….

(ii)  Companies, individuals, or other organizations or entities that acquire, develop, or possess a potential large-scale computing cluster to report any such acquisition, development, or possession, including the existence and location of these clusters and the amount of total computing power available in each cluster.

My translation: If you train something dangerous you have to tell us what you are doing, and what safety precautions you are taking.

Also please report your compute: If you have a sufficiently large computing cluster you need to tell us where it is and how much compute it has.

Aside from the part where he also declares Defense Production Act (seriously, can we please stop doing that?), this seems eminently reasonable to me. Note that this does not bar any activity, it only imposes reporting requirements.

(b)  The Secretary of Commerce, in consultation with the Secretary of State, the Secretary of Defense, the Secretary of Energy, and the Director of National Intelligence, shall define, and thereafter update as needed on a regular basis, the set of technical conditions for models and computing clusters that would be subject to the reporting requirements of subsection 4.2(a) of this section.  Until such technical conditions are defined, the Secretary shall require compliance with these reporting requirements for:

(i)   any model that was trained using a quantity of computing power greater than 10^26 integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10^23 integer or floating-point operations; and

          (ii)  any computing cluster that has a set of machines physically co-located in a single datacenter, transitively connected by data center networking of over 100 Gbit/s, and having a theoretical maximum computing capacity of 10^20 integer or floating-point operations per second for training AI.

There has been an ongoing debate over where we should set our FLOP limit, if we did set a limit. They reserve the right to change the reporting threshold at any time. They are starting with 10^26. That means that probably no existing models would qualify, although there is some chance GPT-4 would, and models in the next generation likely will.

Again, this is (for now) a reporting requirement. Pause advocates would say you cannot train above some threshold, and of the thresholds proposed this is relatively high, and it is a reporting requirement only.

I have been arguing for setting the threshold at a number like this rather than something lower. I felt and still feel it would be unreasonable to go lower, and we need to accept some (real) risk that we set the bar too high. As a reporting-only requirement this is very high; any model that goes over it and represents a step up from previous models is going to pose substantial existential risk.
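For a sense of scale, the standard back-of-the-envelope estimate for dense transformer training is roughly 6 FLOP per parameter per training token. A minimal sketch using that approximation, with illustrative (not official) parameter and token counts:

```python
REPORTING_THRESHOLD_FLOP = 1e26   # section 4.2(b)(i) default for general models

def training_flop(n_params: float, n_tokens: float) -> float:
    """Rough dense-transformer estimate: ~6 FLOP per parameter per training token."""
    return 6 * n_params * n_tokens

# Illustrative figures only; real parameter/token counts for frontier models are not public.
gpt3_estimate = training_flop(175e9, 300e9)   # ~3.2e23 FLOP, near the commonly cited number
print(f"GPT-3-scale run: {gpt3_estimate:.1e} FLOP, "
      f"reportable: {gpt3_estimate > REPORTING_THRESHOLD_FLOP}")
# Crossing 1e26 under this heuristic takes roughly 1.7e25 parameter-tokens,
# e.g. on the order of a trillion parameters trained on ~17 trillion tokens.
```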

What about the cluster threshold? I asked about this. It is a high threshold.

Lennart Heim: That’s roughly 50k H100s (only 25k if you assume sparse operations). There’s no public information of any 50k H100 cluster. The big cloud providers might have some. It’s a high threshold.

Nabeel S. Qureshi: (1) At 10^23 FLOPS to train GPT3, GPT3+DNA would trigger (i) here (depending on how you interpret “primarily”)

(2) Numbers for GPT4 aren’t public, but a rough estimate would put it at 10^25 FLOPS, maybe 10^26, which is at the threshold

(3) Quite a few companies have clusters >10e20. Numbers aren’t public but usual bigtech suspects + Tesla at minimum.

e.g. Note that Dojo alone is planned to be 100 exaflops by 2024.

George Mathew: Interesting to note: 1. Assuming compute-bound training, a cluster that satisfies (ii) would train a model that satisfies (i) in ~12 days. 2. GPT3 took 3.14E23 flops to train. A cluster that satisfies (ii) would train GPT3 in 52 minutes.

Soumith Chintala: Regulation starts at roughly two orders of magnitude larger than a ~70B Transformer trained on 2T tokens — which is ~5e24. Note: increasing the size of the dataset OR the size of the transformer increases training flops. The (rumored) size of GPT-4 is regulated.

Jack: 100 GBit/s is below many single nodes. that’s not a cluster. The reporting thresholds in the EO are much much higher than that. I’m pretty skeptical of AI regulation pretty broadly and not at all mad about the numbers they picked.

Which, again, only means that those centers would have to register, not that they would have to change anything they are doing. If Jack, who is downplaying how anti-regulation he acts, is happy with the thresholds, they’re not too low.

I conclude these were well-chosen thresholds. Any lower and they would have seemed unreasonable and gotten more pushback. Any higher and you don’t have enough impact.
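For what it is worth, the arithmetic in the quotes above checks out, assuming full utilization and the roughly 2e15 operations per second per H100 implied by Heim's estimate:

```python
CLUSTER_THRESHOLD_OPS = 1e20   # section 4.2(b)(ii): cluster reporting threshold, ops/second
MODEL_THRESHOLD_FLOP = 1e26    # section 4.2(b)(i): model reporting threshold
GPT3_FLOP = 3.14e23            # commonly cited GPT-3 training compute
H100_OPS = 2e15                # rough per-H100 throughput assumed in Heim's estimate

print(f"H100s at threshold: {CLUSTER_THRESHOLD_OPS / H100_OPS:,.0f}")                      # ~50,000
print(f"Days to 1e26 FLOP:  {MODEL_THRESHOLD_FLOP / CLUSTER_THRESHOLD_OPS / 86_400:.1f}")  # ~11.6
print(f"Minutes to GPT-3:   {GPT3_FLOP / CLUSTER_THRESHOLD_OPS / 60:.0f}")                 # ~52
```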

I am also curious about all the things people are curious about in this thread:

Dave Kasten: Honestly, I’m very, very, very curious about who outputted these numbers for the EO

Patrick McKenzie: I’m idly curious how many people working for the USFG

a) know what 10^26 is

b) know what a floating point operation is

c) can go to a blackboard and calculate how many floating point operations, say, Google Search probably uses within 6 orders of magnitude. (Not a criticism.)

I, erm, think it is very probable that many compliance departments are going to wake up in the morning and start sharpening their pencils on “We know this wasn’t what you meant, however, you may be surprised to learn that we do call FICO a ‘model’ and we have crunched on it, so.”

Section (c) here covers reporting when foreigners transact with us for cyberattack-enabling training runs: we are to prepare a report to determine when it is time to worry about that, and if a request then crosses that threshold you will have to report and more or less do KYC. Note that anything over the reporting thresholds above is assumed dangerous, but the response is still ‘file a report first.’

Section (d) extends KYC requirements to any foreigners looking for IaaS accounts.

4.3 seems straight up good, and is about AI in critical infrastructure and cybersecurity, especially in government. (a) tells everyone to prepare reports on cybersecurity, and (b) tells everyone in government to implement what the reports say. There is also mention of other critical infrastructure like financial institutions.

4.4 seems straight up good, and is about Reducing Risks at the intersection of AI and CBRN (Chemical/Biological/Radiological/Nuclear) Threats.

(a) says everyone should prepare reports on which such threats AI systems most exacerbate. (b) calls for improved protections against misuse of nucleic acid synthesis, by tightening the industry’s defenses in anticipation of what AI will enable and making sure everyone uses new, more secure procedures.

4.5 addresses risks posed by synthetic content. As usual (a) tells various people to prepare reports. (b) asks Commerce and OCB to prepare guidance within 180 days. Then (c) calls for further guidance 180 days after that, and (d) calls for the Federal Acquisition Regulatory Council to maybe take that into account.

That’s how you control risks, federal government bureaucracy style.

4.6 calls for soliciting opinion on open sourcing dual-use foundation models. Is it the worst thing you can do, or is it so hot right now because it helps you advance AI capabilities, so it’s worth the risk? Seems like this will be a high-value place to provide feedback, especially for those with government-legible credentials.

4.7 takes steps to prevent federal government data from being used in AI training in a way that would leak private information, by which we mean (you guessed it) prepare reports and then maybe do what those reports say.

4.8 asks for a report on national security, including DoD and intelligence.

 (a)  provide guidance to the Department of Defense, other relevant agencies, and the Intelligence Community on the continued adoption of AI capabilities to advance the United States national security mission, including through directing specific AI assurance and risk-management practices for national security uses of AI that may affect the rights or safety of United States persons and, in appropriate contexts, non-United States persons; and

 (b)  direct continued actions, as appropriate and consistent with applicable law, to address the potential use of AI systems by adversaries and other foreign actors in ways that threaten the capabilities or objectives of the Department of Defense or the Intelligence Community, or that otherwise pose risks to the security of the United States or its allies and partners.  

This seems to be the only kind of threat that NatSec style people are able to see.

Section 5: Promoting Innovation and Competition

The previous section aims to keep us safe. This one turns around and says go faster.

The immigration provisions seem promising. The rest seems ineffectual.

5.1 is the immigration section. What affordances does the administration have here, and why aren’t they using them on everything all the time?

(a) is about streamlining the visa process. Great, now do it everywhere.

(b) tells the Secretary of State to consider modifying the J-1 requirements to list AI skills as critical to America, perhaps revising the 2009 Revised Exchange Visitor Skills List and also a domestic visa renewal program.

Here’s (c):

(i)   consider initiating a rulemaking to expand the categories of nonimmigrants who qualify for the domestic visa renewal program covered under 22 C.F.R. 41.111(b) to include academic J-1 research scholars and F-1 students in science, technology, engineering, and mathematics (STEM); and

 (ii)  establish, to the extent permitted by law and available appropriations, a program to identify and attract top talent in AI and other critical and emerging technologies at universities, research institutions, and the private sector overseas, and to establish and increase connections with that talent to educate them on opportunities and resources for research and employment in the United States, including overseas educational components to inform top STEM talent of nonimmigrant and immigrant visa options and potential expedited adjudication of their visa petitions and applications.

When you don’t know what you can do, tell the person in charge to do everything they are allowed to do. Maybe something good will happen. I’m not here to knock it.

(d) then has Homeland Security clarify and modernize O-1A, EB-1 and EB-2, and to improve the H1-B process. All of those seem like great low-hanging fruit that we should be picking way more than we are. Again, good idea, now do this everywhere.

(e) aims to put AI skills on the “Schedule A” list of occupations.

(f) tells State and Homeland Security to do anything else they can to attract talent.

(g) tells them to publish better institutional resources and guides for new talent.

It all seems like good stuff, and stuff we should be doing across the board in all high-skill, high value areas. Why aren’t we doing our best to attract all the best talent?

In the case of AI in particular there is the obvious concern of potential accelerationism, but also the need to ‘beat China’ and otherwise stay ahead drives the need to race. So we should all be able to agree that, whatever the final package of goods is, it includes going after the best talent.

5.2 speaks of Promoting Innovation. I do not expect much impact here.

(a) forms the National AI Research Resource (NAIRR) as per past recommendations, then an NSF Regional Innovation Engine prioritizing AI within 150 days, then four new National AI Research Institutes within 540 days, they say in addition to the 25 currently funded ones.

(b) establishes a pilot program for training 500 new AI researchers by 2025 ‘capable of meeting the rising demand for AI talent.’

It is almost cute that the government thinks this is supposed to be its job, here.

(c) (i) and (ii) direct the USPTO to develop guidelines on inventorship related to AI as it relates to patent applications. As much as I did not in practice mind the situation being deeply messed up, I can’t argue it did not need to be fixed.

(iii) calls for recommendations on potential future EOs related to copyright, including both copyright for AI-created works and copyright protections against being used by AI. I am reasonably happy with what would happen if we applied existing law, but again it makes sense to instead do something smart intentionally. There is no sign of where they plan to go with this. They could make copyright protections against AI stronger or weaker, they could make getting copyright protection for AI creations harder or easier. We shall see, seems like another good place to get your comments in.

(d) applies the report-grab-bag principle to intellectual property concerns.

(e) asks HHS to use grants and other methods to support AI in healthcare, including data collection.

(f) is an AI Tech Sprint nominal nod to Veterans’ Affairs.

(g) calls for exploring the ability of foundation models to improve the power grid, science in general, and climate risks in particular. Sure, why not.

(h) calls for exploring the potential of AI to help solve ‘major societal and global challenges.’ Again, sure, why not.

5.3 is about Promoting Competition. If you are for competition, would you want the government promoting it? What if you were against competition?

(a) says everyone should always be promoting competition.

(a)  The head of each agency developing policies and regulations related to AI shall use their authorities, as appropriate and consistent with applicable law, to promote competition in AI and related technologies, as well as in other markets. 

Such actions include addressing risks arising from concentrated control of key inputs, taking steps to stop unlawful collusion and prevent dominant firms from disadvantaging competitors, and working to provide new opportunities for small businesses and entrepreneurs. 

In particular, the Federal Trade Commission is encouraged to consider, as it deems appropriate, whether to exercise the Commission’s existing authorities, including its rulemaking authority under the Federal Trade Commission Act, 15 U.S.C. 41 et seq., to ensure fair competition in the AI marketplace and to ensure that consumers and workers are protected from harms that may be enabled by the use of AI.

The FTC watching for fair play tends to translate to them occasionally gathering the fury of a thousand suns over what is more often than not approximately nothing, and causing headaches for everyone. It is at least fun to watch. Presumably they will say things like ‘having Twitter’s data gives you an unfair monopoly’ one week, then YouTube, then Facebook, then maybe Amazon. Usually nothing comes of it.

I notice I am getting more and more sympathy for any government department ever tasked with doing anything, constantly told here are your 200 other things to promote and do alongside your core mission, no we have no idea why nothing ever gets done.

(b) tries to extend the CHIPS Act to small business with respect to semiconductors. Still no permitting reform.

(c) tries to use various programs to promote AI small business. None of it seems worth bothering to collect, I’d much rather go to VCs and raise money. Wrong mindset is in play here. Might help some traditional small businesses a bit but they’d have been better off with tax credits. (d) says to conduct outreach so businesses know about (c).

Section 6: Supporting Workers

What does it mean to support workers?

First it means preparing a report, since you always first prepare a report. Then another report. In this case, on how to use existing programs for workers facing job disruptions and find additional options. And to ‘develop and publish principles and best practices for employers that could be used to mitigate AI’s potential harms to employees’ well-being and maximize its potential benefits.’

There are hints that jobs are to be protected even if the employee no longer has any work to do, but it all seems highly mild and toothless.

Section 7: Advancing Equity and Civil Rights

7.1 lists all the usual places one would be worried about equity and civil rights, tells everyone involved to use all existing powers, enforce existing laws and prepare reports. And now they will do all of that with respect to AI in particular, and also have a meeting of very important people within 90 days.

7.2 orders everyone to look into uses of AI regarding public benefit programs.

7.3 reaches out into the broader economy. (a) calls for rules for AI hiring use for federal contractors within the year. (b) says to keep an eye on housing and consumer credit algorithms, (c) doubles down on paranoia about housing discrimination. (d) is a call for proposals on how to ensure that disabled people aren’t discriminated against due to not fitting into molds the AI is looking for like gaze direction and eye tracking. Historically that results in banning nice things in completely insane fashion, hopefully we can avoid that this time.

Section 8: Protecting Consumers, Patients, Passengers and Students

(a) encourages various agencies to protect people. Yay?

(b) addresses healthcare in particular, within a year there is to be a strategic plan with policies and frameworks for a whole array of concerns and opportunities. It is a laundry list of things to consider, with no direction on what considerations one might want to take away from any of it. Could result in anything.

(c) is about transportation, including self-driving. I don’t know that passengers need any protection here, this seems more like opportunity knocking.

(d) is the education talking points, focusing on equity and underserved communities rather than the potential transformation of education into a place where you might learn something.

(e) calls on the FCC to look into communications, with a fun last note about robocalls.

Section 9: Protecting Privacy

(a) tells agencies to safeguard any potentially privacy-violating information they collect, (b) has a report prepared on potential additional protections.

(c) funds the creation of a Research Coordination Network (RCN) to develop, deploy and scale privacy enhancing technologies, and says to prioritize research therein.

Do you feel like your privacy is protected now?

Section 10: Advancing Federal Government Use of AI

10.1 brings together a committee to explore getting the federal government to use AI. Then they are to prepare a report, and designate who in each agency will be responsible for promoting AI innovation and tell everyone to create an AI Governance Board. They should ensure their AIs are safe against all the usual boogeyman threats. Agencies are to have a method to track their progress in deploying AI.

I am so happy I do not work for the government. I get happier about this each line I read.

(f)(i) discourages bans on the use of general AI, instead encouraging limits on access based on specific risk assessments.

All of this is in theory in service to the mission.

10.2 calls for increasing AI talent in government. How to do that?

It is not going to be easy. I have seen and heard tales of how the government hires people. Also of how much it pays them. Good luck, everyone involved.

(a) says to identify priority mission areas.

(b) says to convene a many-agency-consulted AI and Technology Talent Task Force. They shall then be tasked with:

 (ii)   identifying and circulating best practices for agencies to attract, hire, retain, train, and empower AI talent, including diversity, inclusion, and accessibility best practices, as well as to plan and budget adequately for AI workforce needs;

(iii)  coordinating, in consultation with the Director of OPM, the use of fellowship programs and agency technology-talent programs and human-capital teams to build hiring capabilities, execute hires, and place AI talent to fill staffing gaps; and

(iv)   convening a cross-agency forum for ongoing collaboration between AI professionals to share best practices and improve retention.

I have a guess as to what the best practices are to attract talent. I have zero idea how one might implement those policies within our government. There does seem to be good talent going into the UK Foundation Models Taskforce, but that is mission-based in a way we do not have a good way to duplicate.

(c) calls for a talent surge. OK then.

(d) calls for improving hiring practices via such tactics as (i) direct hiring authority, (iv) flexible pay, (vi) an interagency working group and more.

(e) says to throw any other legal tools you have at the process to help you hire people.

(f) calls for a position-description library for data scientists.

(g) calls for more training.

(h) calls for another report.

Section 11: Strengthening American Leadership Abroad

(a) calls upon SoS and others to expand engagements and establish a strong international framework. I’m sure we’ll get right on that.

(b) calls for Commerce to within 270 days establish a plan for global engagement, also to submit a report on the plan and be guided by the NIST AI Risk Management Framework and United States Government National Standards Strategy for Critical and Emerging Technology. Fun.

(c) says to also publish a Global Development Playbook and a Global AI Research Agenda that supports all the good things, for some reason emphasizing labor market implications.

(d) calls for coordination to prevent disruptions of key infrastructure.

Section 12: Implementation

The White House AI Council shall consist of more or less the 28 top people in the administration plus heads of such other agencies as get invited to participate. Except VP Kamala Harris, who instead sends her deputy. I’ve been informed this is normal, that it is because she outranks the committee.

There is also a Section 13, which tells us that none of this is investment advice.

Conclusion

Hopefully that helped boil things down for you. Overall, while I am not happy I had to spend yesterday frying my brain reading it, I am happy with the Executive Order.

Others had a mix of reactions. Many, especially many who identify as e/acc, were rather unhappy. A second post will cover their reactions.

Comments

Thank you for this, Zvi!

Reporting requirements for foundation models, triggered by highly reasonable compute thresholds.

I disagree, and I think the computing threshold is unreasonably high. I don't even mean this in a "it is unreasonable because an adequate civilization would do way better"– I currently mean it in a "I think our actual civilization, even with all of its flaws, could have expected better."

There are very few companies training models above 10^20 FLOP, and it seems like it would be relatively easy to simply say "hey, we are doing this training run and here are some safety measures we are using."

I understand that people are worried about overregulation and stifling innovation in unnecessary ways. But these are reporting requirements– all they do is require someone to inform the government that they are engaging in a training run.

Many people think that 10^26 FLOP has a non-trivial chance of creating xrisk-capable AGI in the next 3-5 years (especially as algorithms get better). But that's not even the main crux for me– the main crux is that reporting requirements seem so low-cost relative to the benefit of the government being able to know what's going on, track risks, and simply have access to information that could help it know what to do.

It also seems very likely to me that the public and the media would be on the side of a lower threshold. If frontier AI companies complained, I think it's pretty straightforward to just be like "wait... you're developing technology that many of you admit could cause extinction, and you don't even want to tell the government what you're up to?"

With all that said, I'm glad the EO uses a compute threshold in the first place (we could've gotten something that didn't even acknowledge compute as a useful metric).

But I think 10^26 is extremely high for a reporting requirement, and I strongly hope that the threshold is lowered.

Zvi:

I think part 2 that details the reactions will provide important color here - if this had impacted those other than the major labs right away, I believe the reaction would have been quite bad, and that setting it substantially lower would have been a strategic error and also a very hard sell to the Biden Administration. But perhaps I am wrong about that. They do reserve the ability to change the threshold in the future. 

My guess is that the threshold is a precursor to more stringent regulation on people above the bar, and that it's easier to draw a line in the sand now and stick to it. I feel pretty fine with it being so high.
