Listen to the AI Safety Newsletter for free on Spotify.
AI Labs Fail to Uphold Safety Commitments to UK AI Safety Institute
In November, leading AI labs committed to sharing their models before deployment to be tested by the UK AI Safety Institute. But reporting from Politico shows that these commitments have fallen through.
OpenAI, Anthropic, and Meta have all failed to share their models with the UK AISI before deployment. Only Google DeepMind, headquartered in London, has given pre-deployment access to UK AISI.
Anthropic released the most powerful publicly available language model, Claude 3, without any window for pre-release testing by the UK AISI. When asked for comment, Anthropic co-founder Jack Clark said, “Pre-deployment testing is a nice idea but very difficult to implement.”
When asked about their concerns with pre-deployment testing, Meta’s spokesperson argued that Meta is an American company and should only have to comply with American regulations, even though the US has signed agreements with the UK to collaborate on AI testing. Other lab sources mentioned the possibility of leaking intellectual property secrets, and the risk that safety testing could slow down model releases.
This is a strong signal that AI companies should not be trusted to follow through on safety commitments if those commitments conflict with their business interests. Because of the ongoing race among AI labs, all AI developers face pressure to keep up with their competitors at the expense of safety, even if they’re concerned that AI development poses catastrophic risks to humanity.
Fortunately, there are several ongoing efforts to turn voluntary commitments into legal requirements. The UK government said it plans to develop “targeted, binding requirements” to ensure safety in frontier AI development. In California, a bill is being considered which would require companies to self-certify that they’ve evaluated and mitigated catastrophic risks before releasing AI systems, and allow them to be sued for violations of this law. Given the fragility of voluntary commitments to AI safety, future policy work should aim to make those commitments legally binding.
For more on this story, check out the full Politico article here.
New Bipartisan AI Policy Proposals in the US Senate
US Senators introduced three new proposals for AI policy in recent weeks. One would establish a mandatory licensing system for frontier AI developers, another would set up processes for tracking AI vulnerabilities and incidents, and the third would encourage the development of AI evaluations and the adoption of voluntary safety standards.
Mandatory licensing of frontier AI developers focused on catastrophic risks. Senators Mitt Romney (R-UT), Jack Reed (D-RI), Jerry Moran (R-KS), and Angus King (I-ME) unveiled a new policy framework that focuses exclusively on extreme risks from AI development.
Their letter to the Senate AI working group leaders outlines the evidence that AI systems could soon meaningfully assist in the development of biological, chemical, cyber, and nuclear weapons. They note that research on this topic has been recognized by the Department of Defense, Department of State, U.S. Intelligence Community, and National Security Commission on AI as demonstrating the extreme risks that AI could soon pose to national security and public safety.
Their policy framework would:
Require licenses for training and deploying large-scale AI systems, as well as owning large stocks of high-performance computing hardware.
Apply to models trained with more than 10^26 operations which are either general-purpose or intended for use in bioengineering, chemical engineering, cybersecurity, or nuclear development.
Establish a new oversight entity. This could be within the Department of Commerce, within the Department of Energy, as a new interagency coordinating body, or as an entirely new agency.
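The applicability rule in the framework can be read as a simple predicate: a compute threshold combined with a scope test. The sketch below is only an illustration of that reading. The function name, the domain labels, and the strict "more than" comparison are assumptions made for the example, not language from the senators' framework.

```python
# Illustrative sketch only: names and domain labels are hypothetical.
COMPUTE_THRESHOLD_OPS = 1e26  # "more than 10^26 operations", as summarized above

COVERED_DOMAINS = {
    "bioengineering",
    "chemical engineering",
    "cybersecurity",
    "nuclear development",
}


def is_covered(training_ops: float, general_purpose: bool, intended_domains: set[str]) -> bool:
    """Return True if a model would fall under the licensing rule as summarized above."""
    # Assumption: the threshold is exclusive, so exactly 1e26 operations is not covered.
    above_threshold = training_ops > COMPUTE_THRESHOLD_OPS
    # In scope if general-purpose OR intended for at least one listed domain.
    in_scope = general_purpose or bool(COVERED_DOMAINS & intended_domains)
    return above_threshold and in_scope
```

Under this reading, a narrow model trained above the threshold would be covered only if its intended use falls in one of the listed domains, while a general-purpose model above the threshold would always be covered.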
On the whole, this framework would establish serious safeguards against catastrophic AI risks. This is a valuable contribution to ongoing discussions about US AI policy, and a reference point for a strong regulatory framework focused on extreme risks.
Establishing processes to track AI vulnerabilities, incidents, and supply chain risks. Senators Mark Warner (D-VA) and Thom Tillis (R-NC) have introduced the Secure Artificial Intelligence Act of 2024 to improve the tracking and management of security vulnerabilities and safety incidents associated with AI systems.
The bill's key provisions would:
Require NIST to incorporate AI systems into the National Vulnerability Database (NVD), and require CISA to update the Common Vulnerabilities and Exposures (CVE) program or create a new process for tracking voluntarily reported AI vulnerabilities.
Establish a public database for tracking voluntarily reported AI safety and security incidents.
Create a multi-stakeholder process to develop best practices for managing AI supply chain risks during model training and maintenance.
Task the NSA to establish an AI Security Center that provides an AI test-bed, develops counter-AI guidance, and promotes secure AI adoption.
This act lays important groundwork for improving the visibility, tracking, and collaborative management of AI safety risks as the capabilities and adoption of AI systems grow. It complements other legislative proposals focused on evaluations, standards, and oversight for higher-risk AI applications.
Establishing NIST AI Safety Institute to develop AI evaluations and voluntary safety standards. The Future of AI Innovation Act was introduced this week by Senators Maria Cantwell (D-WA), Todd Young (R-IN), John Hickenlooper (D-CO), and Marsha Blackburn (R-TN). It would formally establish the NIST AI Safety Institute with a mandate to develop AI evaluations and voluntary safety standards.
This bill would help advance the science of AI evaluations, paving the way for future policy that would require safety testing for frontier AI developers.
Military AI in Israel and the US
Militaries are increasingly interested in AI development. Here, we cover reports that Israel is using AI to identify targets for airstrikes in Gaza, and that the US has massively increased spending on military AI systems in this year’s budget.
Lavender, an AI system used by the Israeli military to identify airstrike targets. Earlier this month, Israeli news outlets +972 Magazine and Local Call reported on Israel’s use of military AI in the ongoing conflict in Gaza. They describe the system, known as “Lavender,” as follows:
The Lavender software analyzes information collected on most of the 2.3 million residents of the Gaza Strip through a system of mass surveillance, then assesses and ranks the likelihood that each particular person is active in the military wing of Hamas or PIJ. According to sources, the machine gives almost every single person in Gaza a rating from 1 to 100, expressing how likely it is that they are a militant.
Limited human oversight of military AI. Some argue that military AI systems should be monitored by a “human in the loop” who oversees their decisions and prevents errors. But this could put militaries at a competitive disadvantage, as fast-paced wartime environments might benefit from equally speedy AI decision-making, and human operators might make more errors than an AI system working alone.
Lavender appears to have limited human oversight. People set the high-level parameters, such as the tolerance for “collateral damage” of civilian deaths when targeting militants. Lavender then provides suggestions for airstrike targets, which a human operator reviews in a matter of seconds:
“A human being had to [verify the target] for just a few seconds,” B. said, explaining that this became the protocol after realizing the Lavender system was “getting it right” most of the time. “At first, we did checks to ensure that the machine didn’t get confused. But at some point we relied on the automatic system, and we only checked that [the target] was a man — that was enough. It doesn’t take a long time to tell if someone has a male or a female voice.”
Israel disputes the report. Responding to similar reports by The Guardian, the Israeli Defense Forces released a statement: “Contrary to claims, the IDF does not use an artificial intelligence system that identifies terrorist operatives or tries to predict whether a person is a terrorist.”
US investment in military AI skyrockets. There was a 1500% increase in the potential value of US Department of Defense contracts related to AI between August 2022 and August 2023, finds a new Brookings report. “In comparison, NASA and HHS increased their AI contract values by between 25% and 30% each,” the report notes. Overall, the report says that “DoD grew their AI investment to such a degree that all other agencies become a rounding error.”
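As a quick check on the arithmetic, a 1500% increase means the later figure is 16 times the earlier one, not 15 times. The sketch below shows the conversion. The helper name is hypothetical, and no dollar figures from the Brookings report are assumed.

```python
def growth_multiple(percent_increase: float) -> float:
    """Convert a percentage increase into the ratio of the new value to the old value."""
    return 1 + percent_increase / 100


# DoD: a 1500% increase corresponds to a 16x multiple.
dod_multiple = growth_multiple(1500)

# NASA and HHS: increases of 25% to 30% correspond to multiples of 1.25x to 1.3x.
other_low = growth_multiple(25)
other_high = growth_multiple(30)
```

Expressed as multiples, the gap is easier to see: roughly 16x for the DoD against about 1.25x to 1.3x for NASA and HHS.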
DARPA uses AI to operate a plane in a dogfight. The Defense Advanced Research Projects Agency (DARPA) recently disclosed that an AI-controlled jet successfully engaged a human pilot in a real-world dogfight test last year. Although simulations involving AI are common, this event marks the first known instance of AI piloting a US Air Force aircraft under actual combat conditions. Although a safety pilot was on board, the safety switch was not activated at any point throughout the flight. Unlike earlier systems that relied on hardcoded AI instructions known as expert systems, this test used a machine learning-based system.
Previous efforts to slow military AI adoption have had limited success. In the 2010s, the Campaign to Stop Killer Robots received widespread support from the public and leaders within the AI industry, as well as UN Secretary-General António Guterres. Yet the United States and other major powers resisted these calls, saying they would use AI responsibly in military contexts while declining to support bans on the technology.
More recently, countries have sought narrower commitments that military powers might agree to. In November, there were rumors that a meeting between President Biden and President Xi would produce an agreement to avoid automating nuclear command and control with AI, but no such commitment was made.
What policy commitments on military AI are both desirable and realistic? How can military leaders better understand AI risks, and develop plans for responsibly reducing those risks? So far, attempts to answer these questions have not yielded breakthroughs in military AI policy. More work will be needed to assess and respond to this growing risk.
New Online Course on AI Safety from CAIS
Applications are open for AI Safety, Ethics, and Society, an online course running July-October 2024. Apply to take part by May 31st.
The course is based on a new AI safety textbook by Dan Hendrycks, Director of the Center for AI Safety. It is aimed at students and early-career professionals who would like to explore the core challenges in ensuring that increasingly powerful AI systems are safe, ethical and beneficial to society.
The course is delivered via interactive small-group discussions supported by facilitators, along with accompanying readings and lecture videos. Participants will also complete a personal project to extend their knowledge. The course is designed to be accessible to a non-technical audience and can be taken alongside work or other studies.
Links
Model Updates
Meta releases the weights of Llama 3, claiming it beats Google’s Gemini 1.5 Pro and Anthropic’s Claude 3 Sonnet. Mark Zuckerberg discussed the release here, including saying that he would consider not open sourcing models that significantly aid in biological weapons development.
Anthropic’s Claude can now use tools such as browsing the internet and running code.
US AI Policy
NIST launches a new GenAI evaluations platform with a competition to see if AIs can distinguish between AI-written and human-written text.
The NSA releases guidance on AI security, with recommendations on information security for model weights and plans for securing dangerous AI capabilities.
The Center for AI Safety co-led a letter signed by more than 80 organizations urging the Senate Appropriations Committee to fully fund the Department of Commerce’s AI efforts next year.
International AI Policy
The UK is considering legislation on AI that would require leading developers to conduct safety tests and share information about their models with governments.
Research
“Future-Proofing AI Regulation,” a new report from CNAS, finds that the cost of training an AI system of a given level of capabilities in the current paradigm falls by a factor of ~1000 over 5 years.
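To put the CNAS figure in more familiar terms, the short calculation below converts a ~1000x cost decline over 5 years into an implied yearly factor and a cost-halving time. It assumes the decline is a smooth, constant exponential, which is a simplification made for this example and not a claim from the report.

```python
import math

TOTAL_FACTOR = 1000  # "~1000" cost reduction, as summarized above
YEARS = 5

# Assumption: a constant exponential decline across the whole period.
# Implied yearly factor: the fifth root of 1000, a bit under 4x per year.
annual_factor = TOTAL_FACTOR ** (1 / YEARS)

# Implied halving time in months: total months divided by the number of halvings.
halving_months = 12 * YEARS / math.log2(TOTAL_FACTOR)
```

Under that assumption, training costs for a fixed capability level would fall by a bit less than 4x each year, or halve about every six months.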
Welcome to the AI Safety Newsletter by the Center for AI Safety. We discuss developments in AI and AI safety. No technical background required.
Subscribe here to receive future versions.
See also: CAIS website, CAIS twitter, A technical safety research newsletter, An Overview of Catastrophic AI Risks, our new textbook, and our feedback form