Thanks for the post and the critiques. I won't respond at length, other than to say two things: (i) it seems right to me that we'll need something like licensing or pre-approval of deployments, and ideally also of decisions to train particularly risky models. Such a regime would be undergirded by various compute governance efforts to identify and punish non-compliance. This could, e.g., involve cloud providers needing to check whether a customer buying more than X compute has the relevant license, or to confirm that the customer is not using the compute to train a model above a certain size. In short, my view is that what's needed are more intense versions of what's proposed in the paper. Though I'll note that there are lots of things I'm unsure about, e.g. there are issues with putting regulation in place while the requirements that would be imposed on development are so nascent.
(ii) the primary value and goal of the paper in my mind (as suggested by Justin) is in pulling together a somewhat broad coalition of authors from many different organizations making the case for regulation of frontier models. Writing pieces with lots of co-authors is difficult, especially when the topic is contentious, as this one is, and that will often lead to recommendations being weaker than they otherwise would be. But overall, I think that's worth the cost. It's also worth noting that calls for regulation (in particular regulation that is considered particularly onerous) can be counterproductive when they come loudly from industry actors, who people may assume have ulterior motives.
Thank you for providing a nice overview of our paper, Frontier AI Regulation: Managing Emerging Risks to Public Safety, which was just released!
I appreciate your feedback, both the positive and critical parts. I'm also glad you think the paper should exist and that it is mostly a good step. And I think your criticism is fair. Let me also note that I do not speak for the authorship team. We are quite a diverse group from academia, labs, industry, nonprofits, etc. It was no easy task to find common ground among everyone involved.
I think the AI governance space is difficult in part because different political actors have different goals, even when their interests significantly overlap. As I saw it, the goal of this paper was to bring together a wide group of interested individuals and organizations to see if we could come to points of agreement on useful immediate next governance steps. In this sense, we weren't seeking "ambitious" new policy tools; we were seeking areas of agreement across the diverse stakeholders currently driving change in the AI development space. I think this is a significantly different goal from that of the Model Evaluation for Extreme Risks paper that you mention, which I agree is another important entry in this space. Additionally, one of the big differences between our effort and the model evaluation paper is that we are more focused on what governments in particular should consider doing from their available toolkits, whereas it seems to me that the model evaluation paper is more about what companies and labs themselves should do.
A couple of other thoughts:
I don't think it's completely accurate that "It doesn't suggest government oversight of training runs or compute." As part of the suggestion around licensing we mention that the AI development process may require oversight by an agency. But, in fairness, it's not a point that we emphasize.
I think the following is a little unfair. You say: "This is overdeterminedly insufficient for safety. "Not complying with mandated standards and ignoring repeated explicit instructions from a regulator" should not be allowed to happen, because it might kill everyone. A single instance of noncompliance should not be allowed to happen, and requires something like oversight of training runs to prevent. Not to mention that denying market access or threatening prosecution are inadequate. Not to mention that naming-and-shaming and fining companies are totally inadequate. This passage totally fails to treat AI as a major risk. I know the authors are pretty worried about x-risk; I notice I'm confused." Let me explain below.
I'm not sure there's such a thing as "perfect compliance." I know of no way to ensure that "a single instance of noncompliance should not be allowed to happen." And I don't think that's necessary for current models, or even very near-term future models. The idea here is that we set up a standard regulatory process in advance of AI models that might be capable enough to kill everyone, and thereby shape the development of the next sets of frontier models. I do think there's certainly a criticism here that naming and shaming, for example, is not a sufficiently punitive tool, but it may have more impact on leading AI labs than one might assume.
I hope this helps clear up some of your confusion here. To recap: I think your criticism that the tools are not ambitious is fair, but I don't think that was our goal. I saw this project as a way of providing tools for which there is broad agreement and that, given the current state of AI models, we believe would help steer AI development and deployment in a better direction. I do think that another reading of this paper is that it's quite significant that this group agreed on the recommendations that are made. I consider it progress in the discussion of how to effectively govern increasingly powerful AI models, but it's not the last word either. :)
Thanks again for sharing and for providing your feedback on these very important questions of governance.
Thanks for your reply. In brief response to your more specific points:
Edit: also, I get that finding consensus is hard, but after reading the consensus-y-but-ambitious Towards Best Practices in AGI Safety and Governance and Model evaluation for extreme risks, I was expecting consensus on something stronger.
Thanks for the response! I appreciate the clarification on both point 1 and 2 above. I think they’re fair criticisms. Thanks for pointing them out.
This paper is about (1) "government intervention" to protect "against the risks from frontier AI models" and (2) some particular proposed safety standards. It's by Markus Anderljung, Joslyn Barnhart (Google DeepMind), Jade Leung (OpenAI governance lead), Anton Korinek, Cullen O'Keefe (OpenAI), Jess Whittlestone, and 18 others.
Abstract
Executive Summary
Commentary
It is good that this paper exists. It's mostly good because it's a step (alongside Model evaluation for extreme risks) toward making good actions for AI labs and government more mainstream/legible. It's slightly good because of its (few) novel ideas; e.g. Figure 3 helps me think slightly more clearly. I don't recommend reading beyond the executive summary.
Unfortunately, this paper's proposals are unambitious (in contrast, in my opinion, to Model evaluation for extreme risks, which I unreservedly praised), such that I'm on-net disappointed in the authors (and may ask some if they agree it's unambitious and why it is). Some quotes below, but in short: it halfheartedly suggests licensing. It doesn't suggest government oversight of training runs or compute. It doesn't discuss when training runs should be stopped/paused (e.g., when model evaluations for dangerous capabilities raise flags). (It also doesn't say anything specific about international action but it's very reasonable for that to be out of scope.)
On licensing, it correctly notes that
But then it says:
Worse, on after-the-fact enforcement, it says:
This is overdeterminedly insufficient for safety. "Not complying with mandated standards and ignoring repeated explicit instructions from a regulator" should not be allowed to happen, because it might kill everyone. A single instance of noncompliance should not be allowed to happen, and requires something like oversight of training runs to prevent. Not to mention that denying market access or threatening prosecution are inadequate. Not to mention that naming-and-shaming and fining companies are totally inadequate. This passage totally fails to treat AI as a major risk. I know the authors are pretty worried about x-risk; I notice I'm confused.
Next:
This is literally true, but I think it tends to misinform the reader about the urgency of strong safety standards and government oversight.
On open-sourcing, it's not terrible; it equivocates but says "proliferation via open-sourcing" can be dangerous and
The paper does say some good things. It suggests that safety standards exist, and that they include model evals, audits & red-teaming, and risk assessment. But it suggests nothing strong or new, I think.
The authors are clearly focused on x-risk, but they clearly tone that down. This is mostly demonstrated above, but also note that they phrase their target as mere "high severity and scale risks": "the possibility that continued development of increasingly capable foundation models could lead to dangerous capabilities sufficient to pose risks to public safety at even greater severity and scale than is possible with current computational systems." Their examples include AI "evading human control" but not killing everyone or disempowering humanity or any specific catastrophes.
I'd expect something stronger from these authors. Again, I notice I'm confused. Again, I might ask some of the authors, or maybe some will share their thoughts here or in some other public place.
Updates & addenda
Thanks to Justin, one of the authors, for replying. In short, he says:
We also have a couple disagreements about the text.
Thanks to Markus, one of the primary authors, for replying. His reply is worth quoting in full:
Note that Justin and Markus don't necessarily speak for the other authors.
GovAI has a blogpost summary.
Jess Whittlestone has a blogpost summary/commentary.
GovAI will host a webinar on the paper on July 20 at 8am PT.
Markus has a Twitter summary.
The paper is listed on the OpenAI research page and so is somewhat endorsed by OpenAI.