Summary and Introduction
People who want to improve the trajectory of AI sometimes think their options for object-level work are (i) technical work on AI alignment or (ii) non-technical work on AI governance. But there is a whole other category of options: technical work in AI governance. This is technical work that mainly boosts AI governance interventions, such as norms, regulations, laws, and international agreements that promote positive outcomes from AI. This piece provides a brief overview of some ways to do this work—what they are, why they might be valuable, and what you can do if you’re interested. I discuss:
[Update] Additional categories which the original version of this piece (from 2022) under-emphasized or missed are:
I expect one or more resources providing more comprehensive introductions to many of these topics will appear in early 2024. For now, see the above links to learn more about the topics added in the update, and see below for more discussion of the originally listed topics.
Acknowledgements
Thanks to Lennart Heim, Jamie Bernardi, Gabriel Mukobi, Girish Sastry, and others for their feedback on this post. Mistakes are my own.
Context
What I mean by “technical work in AI governance”
I’m talking about work that:
Neglectedness
As of writing, there are (by one involved expert’s estimate) ~8-15 full-time equivalents doing this work with a focus on especially large-scale AI risks.[2]
Personal fit
For strong personal fit with this type of work, technical skills (in ML or otherwise) are of course useful, and an interest in the intersection of technical work and governance interventions presumably makes the work more engaging.
Also, whatever it takes to make progress on mostly uncharted problems in a tiny sub-field[3] is probably pretty important for this work now, since that’s the current nature of these fields. That might change in a few years. (But that doesn’t necessarily mean you should wait; time’s ticking, someone has to do this early-stage thinking, and maybe it could be you.)
What I’m not saying
I’m of course not saying this is the only or main type of work that’s needed. (Still, it does seem particularly promising for technically skilled people, especially under the debatable assumption that governance interventions tend to be more high-leverage than direct work on technical safety problems.)
Types of technical work in AI governance
Engineering technical levers to make AI coordination/regulation enforceable
To help ensure AI goes well, we may need good coordination and/or regulation.[4] To bring about good coordination/regulation on AI, we need politically acceptable methods of enforcing them (i.e. catching and penalizing/stopping violators).[5] And to design politically acceptable methods of enforcement, we need various kinds of engineers, as discussed in the next several sections.[6]
Hardware engineering for enabling AI coordination/regulation
To help enforce AI coordination/regulation, it might be possible to create certain on-chip devices for AI-specialized chips or other devices at data centers. As a non-exhaustive list of speculative examples:
Part of the engineering challenge here is that, ideally (e.g. for political acceptability), we may want such devices not only to work but also to be (potentially among other desired features):
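As a toy illustration of one such desired feature (tamper-evidence), here is a minimal sketch of a hash-chained metering log, where any retroactive edit to an earlier record invalidates the whole chain. The record format and the "chip_hours" field are invented for illustration; a real on-chip mechanism would rely on secure hardware and attestation, not application-level Python.

```python
import hashlib
import json

def append_entry(log, entry, prev_hash):
    """Append a metering record whose hash covers the previous record's
    hash, forming a tamper-evident chain (all fields are hypothetical)."""
    record = {"entry": entry, "prev_hash": prev_hash}
    digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    log.append({"record": record, "hash": digest})
    return digest

def verify_chain(log, genesis_hash):
    """Recompute every hash; any retroactive edit breaks verification."""
    prev = genesis_hash
    for item in log:
        if item["record"]["prev_hash"] != prev:
            return False
        expected = hashlib.sha256(
            json.dumps(item["record"], sort_keys=True).encode()
        ).hexdigest()
        if expected != item["hash"]:
            return False
        prev = item["hash"]
    return True

log = []
h = append_entry(log, {"chip_hours": 3.2}, "GENESIS")
h = append_entry(log, {"chip_hours": 4.7}, h)
assert verify_chain(log, "GENESIS")

log[0]["record"]["entry"]["chip_hours"] = 0.1  # retroactive tampering
assert not verify_chain(log, "GENESIS")
```

The point is only that "tamper-evident" is a concrete engineering property one can design and test for, not that this particular scheme is the right one.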
Software/ML engineering for enabling AI coordination/regulation
Software (especially ML) engineering could help enforce AI coordination/regulation in various ways[8], including the following:
Heat/electromagnetism-related engineering for enabling AI coordination/regulation
For enforcing AI coordination/regulation against particularly motivated violators, it could be helpful to be able to identify hidden chips or data centers using their heat and electromagnetic signatures. People who know a lot about heat and electromagnetism could presumably help design equipment or methods that do this (e.g. mobile equipment usable at data centers, equipment that could be installed at data centers, methods for analyzing satellite data, and methods for analyzing data collected about a facility from a nearby road).
Part of the challenge here is that these methods should be robust to efforts to conceal heat and electromagnetic signatures.
Information security
Information security could matter for AI in various ways, including the following:
See here, here, and here (Sections 3.3 and 4.1), and listen here [podcast] for more information. As these sources suggest, information security overlaps with—but extends beyond—the engineering work mentioned above.
Forecasting AI development
AI forecasters answer questions about what AI capabilities are likely to emerge when. This can be helpful in several ways, including:
Typically, this work isn’t engineering or classic technical research; it often involves measuring and extrapolating AI trends, and sometimes it is more conceptual/theoretical. Still, familiarity with relevant software or hardware often seems helpful for knowing what trends to look for and how to find relevant data (e.g. “How much compute was used to train recent state-of-the-art models?”), as well as for being able to assess and make arguments on relevant conceptual questions (e.g. “How analogous is gradient descent to natural selection?”).
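As a minimal sketch of the trend-extrapolation flavor of this work, the following fits a log-linear trend to training-compute data points and extrapolates forward. The (year, FLOP) values are invented for illustration, not real estimates; actual forecasting work uses curated datasets and much more careful modeling.

```python
import math

# Hypothetical (year, training FLOP) data points, for illustration only.
data = [(2012, 1e17), (2015, 1e19), (2018, 1e21), (2021, 1e23)]

xs = [year for year, _ in data]
ys = [math.log10(flop) for _, flop in data]

# Ordinary least squares on log10(compute) vs. year.
n = len(data)
xbar, ybar = sum(xs) / n, sum(ys) / n
slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
    / sum((x - xbar) ** 2 for x in xs)
intercept = ybar - slope * xbar

print(f"~{slope:.2f} orders of magnitude per year")            # → ~0.67
print(f"extrapolated 2024 compute: 1e{slope * 2024 + intercept:.1f} FLOP")
```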
See here (Section I) and here[9] for some collections of relevant research questions; see [1], [2], [3], [4], and [5] for some examples of AI forecasting work; and listen here [podcast] for more discussion.
Technical standards development
One AI risk scenario is that good AI safety methods will be discovered, but they won’t be implemented widely enough to prevent bad outcomes.[10] Translating AI safety work into technical standards (which regulations can then reference, as is often done) might help with this. Relatedly, standard-setting could be a way for AI companies to set guardrails on their AI competition without violating antitrust laws.
Technical expertise (specifically, in AI safety) could help standards developers (i) identify safety methods that it would be valuable to standardize, and (ii) translate safety methods into safety standards (e.g. by precisely specifying them in widely applicable ways, or designing testing and evaluation suites for use by standards[11]).
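As a toy sketch of point (ii), a standard’s requirements could in principle be encoded as machine-checkable thresholds that an evaluation suite tests against. The metric names and threshold values below are invented for illustration; real standards work involves far more careful specification.

```python
# Invented example of a standard's requirements in machine-readable form:
# each metric has a direction ("min" or "max") and a threshold.
STANDARD = {
    "refusal_rate_on_harmful_prompts": ("min", 0.99),
    "jailbreak_success_rate": ("max", 0.01),
}

def check_compliance(results, standard):
    """Return (passed, failures) for a dict of measured metric values."""
    failures = []
    for metric, (direction, threshold) in standard.items():
        value = results[metric]
        ok = value >= threshold if direction == "min" else value <= threshold
        if not ok:
            failures.append((metric, value, threshold))
    return (not failures, failures)

measured = {
    "refusal_rate_on_harmful_prompts": 0.995,
    "jailbreak_success_rate": 0.03,
}
passed, failures = check_compliance(measured, STANDARD)
print(passed)    # → False
print(failures)  # → [('jailbreak_success_rate', 0.03, 0.01)]
```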
Additionally, strengthened cybersecurity standards for AI companies, AI hardware companies, and other companies who process their data could help address some of the information security issues mentioned above.
See here for more information.
Grantmaking or management to get others to do the above well
Instead of doing the above kinds of work yourself, you might be able to use your technical expertise to organize others in doing it (as a grantmaker or manager). Some of the problems here appear to be standard, legible technical problems, so you may well be able to leverage contractors, grantees, employees, or prize challenge participants to solve them, even if those people aren’t very familiar with or interested in the bigger picture.
Couldn’t non-experts do this well? Not necessarily; it might be much easier to judge project proposals, candidates, or execution if you have subject-matter expertise. Expertise might also be very helpful for formulating shovel-ready technical problems.
Advising on the above
Some AI governance researchers and policymakers may want to bet on certain assumptions about the feasibility of certain engineering or infosec projects, on AI forecasts, or on relevant industries. By advising them with your relevant expertise, you could help allies make good bets on technical questions. A lot of this work could be done in a part-time or “on call” capacity (e.g. while spending most of your work time on what the above sections discussed, working at a relevant hardware company, or doing other work).
Others?
I’ve probably missed some kinds of technical work that can contribute to AI governance, and across the kinds of technical work I identified, I’ve probably missed many examples of specific ways they can help.
Potential next steps if you’re interested
Contributing in any of these areas will often require you to have significant initiative; there aren’t yet very streamlined career pipelines for doing most of this work with a focus on large-scale risks. Still, there is plenty you can do; you can:
Learn more about these kinds of work, e.g. by following the links in the above sections (as well as this link, which overlaps with several hardware-related areas).
Test your fit for these areas, e.g. by taking an introductory course in engineering or information security, or by trying a small, relevant project (say, on the side or in a research internship).
Build relevant expertise, e.g. by extensively studying or working in a relevant area.
Learn about and pursue specific opportunities to contribute, especially if you have a serious interest in some of this work or relevant experience, e.g.:
Feel free to reach out to the following email address if you have questions or want to coordinate with some folks who are doing closely related work[12]:
Notes
This includes creating knowledge that enables decision-makers to develop and pursue more promising AI governance interventions (i.e. not just boosting interventions that have already been decided on). ↩︎
Of course, there are significantly more people doing most of these kinds of work with other concerns, but such work might not be well-targeted at addressing the concerns of many on this forum. ↩︎
courage? self-motivation? entrepreneurship? judgment? analytical skill? creativity? ↩︎
To elaborate, a major (some would argue central) difficulty with AI is the potential need for coordination between countries or perhaps labs. In the absence of coordination, unilateral action and race-to-the-bottom dynamics could lead to highly capable AI systems being deployed in (sometimes unintentionally) harmful ways. By entering enforceable agreements to mutually refrain from unsafe training or deployments, relevant actors might be able to avoid these problems. Even if international agreements are infeasible, internal regulation could be a critical tool for addressing AI risks. One or a small group of like-minded countries might lead the world in AI, in which case internal regulation by these governments might be enough to ensure highly capable AI systems are developed safely and used well. ↩︎
To elaborate, international agreements and internal regulation both must be enforceable in order to work. The regulators involved must be able to catch and penalize (or stop) violators—as quickly, consistently, and harshly as is needed to prevent serious violations. But agreements and regulations don’t “just” need to be enforceable; they need to be enforceable in ways that are acceptable to relevant decision-makers. For example, decision-makers would likely be much more open to AI agreements or regulations if their enforcement (a) would not expose many commercial, military, or personal secrets, and (b) would not be extremely expensive. ↩︎
After all, we currently lack good enough enforcement methods, so some people (engineers) need to make them. (Do you know of currently existing and politically acceptable ways to tell whether AI developers are training unsafe AI systems in distant data centers? Me neither.) Of course, we also need others, e.g. diplomats and policy analysts, but that is outside the scope of this post. As a motivating (though limited) analogy, the International Atomic Energy Agency relies on a broad range of equipment to verify that countries follow the Treaty on the Non-Proliferation of Nuclear Weapons. ↩︎
Literally “tamper-proof” might be infeasible, but “prohibitively expensive to tamper with at scale” or “self-destructs if tampered with” might be good enough. ↩︎
This overlaps with cooperative AI. ↩︎
Note that the author of this piece now considers it a bit outdated. ↩︎
In contrast, some other interventions appear to be more motivated by the worry that there won’t be time to discover good safety methods before harmful deployments occur. ↩︎
This work might be similar to the design of testing and evaluation suites for use by regulators, mentioned in the software/ML engineering section. ↩︎
I’m not managing this email; a relevant researcher who kindly agreed to coordinate some of this work is. They have a plan that I consider credible for regularly checking what this email account receives. ↩︎