I would suggest that the set of means available to nation-states to unilaterally surveil another nation-state is far more expansive than the list you have. For example, the good old-fashioned "Paying two hundred and eighty-two thousand dollars in a Grand Cayman banking account to a Chinese bureaucrat"* appears nowhere in your list.
*If you get that this is a reference to the movie Spy Game, you are cool. If you don't, go watch Spy Game. It has a worldview on power that is extremely relevant to rationalists.
TLDR: A new paper summarizes some verification methods for international AI agreements. See also summaries on LinkedIn and Twitter.
Several co-authors and I are currently planning follow-up projects on verification methods. At least two other groups are also planning to release reports on verification methods. If you have feedback or are interested in getting involved, please feel free to reach out.
Overview
There have been many calls for potential international agreements around the development or deployment of advanced AI. If governments become more concerned about AI risks, there might be a short window of time in which ambitious international proposals are seriously considered. If this happens, I expect many questions will be raised, such as:
Our paper attempts to get readers thinking about these questions and considering the kinds of verification methods that nations could deploy. The paper is not conclusive; its main goal is to provide framings, concepts, descriptions, and examples that can help readers orient to this space and inspire future research.
I'd be especially interested in feedback on the following questions:
Abstract
What techniques can be used to verify compliance with international agreements about advanced AI development? In this paper, we examine 10 verification methods that could detect two types of potential violations: unauthorized AI training (e.g., training runs above a certain FLOP threshold) and unauthorized data centers. We divide the verification methods into three categories: (a) national technical means (methods requiring minimal or no access from suspected non-compliant nations), (b) access-dependent methods (methods that require approval from the nation suspected of unauthorized activities), and (c) hardware-dependent methods (methods that require rules around advanced hardware). For each verification method, we provide a description, historical precedents, and possible evasion techniques. We conclude by offering recommendations for future work related to the verification and enforcement of international AI governance agreements.
Executive summary
Efforts to maximize the benefits and minimize the global security risks of advanced AI may lead to international agreements. This paper outlines methods that could be used to verify compliance with such agreements. The verification methods we cover are focused on detecting two potential violations:
Verification methods
We identify 10 verification methods and divide them into three categories:
National technical means
Access-dependent methods
Hardware-dependent methods
Limitations and considerations
The verification methods we propose have some limitations, and there are many complicated national and international considerations that would influence whether and how they are implemented. Some of these include:
Future directions
Our work provides a foundation for discussions on AI governance verification, but several key areas require further research: