I collected my favorite public pieces of research on AI strategy, governance, and forecasting from 2023 so far.
If you're a researcher, I encourage you to make a quick list of your favorite pieces of research, then think about what makes them good and whether you're aiming at that with your own research.
To illustrate things you might notice as a result of this exercise:
I observe that my favorite pieces of research are mostly aimed at some of the most important questions[1] – they mostly identify a very important problem and try to answer it directly.
I observe that for my favorite pieces of research, I mostly would have been very enthusiastic about a proposal to do that research – it's not that I would have been skeptical about the topic and the research then surprised me with how good its results were.[2]
1. Model evaluation for extreme risks (DeepMind, Shevlane et al., May)
See also the corresponding blogpost An early warning system for novel AI risks (DeepMind 2023).
Among the most important questions in AI governance are: how can labs determine whether their training runs and deployment plans are safe, and how can they demonstrate that to external observers (or: how can authorities make rules about training-run and deployment safety)? If powerful AI is dangerous by default, developers' ability to identify and avoid unsafe training or deployment appears necessary for achieving powerful AI safely. Great model evaluations would also enable external oversight, and convincing demonstrations could help developers and other actors understand risks from powerful AI. This paper helps its readers think clearly about evals and lays the groundwork for an evaluations-based self-governance regime – and eventually a regulatory regime.
2. Towards best practices in AGI safety and governance: A survey of expert opinion (GovAI, Schuett et al., May)
See pp. 1–4 for background, 10–14 for discussion, and 18–22 for the list and brief descriptions of 100 ideas for labs. See also the corresponding blogpost.
Perhaps the most important question in AI strategy is what should AI labs do? This question is important because some plausible lab behaviors are much safer than others, so developing better affordances for labs or informing them about relevant considerations could help them act much better. This research aims directly at this question.
Before this paper, there wasn't really a list of actions that might be good for labs to take across multiple domains (there were a couple of domain-specific or short lists; see my Ideas for AI labs: Reading list). Now there is. It's incomplete and lacks descriptions of the actions and links to relevant resources, but it's high-quality overall and a big step forward for the what-should-labs-do conversation. Moreover, this research starts the process of not just identifying good actions but making them legible to everyone, or building common knowledge about what labs should do.
3. What does it take to catch a Chinchilla? Verifying Rules on Large-Scale Neural Network Training via Compute Monitoring (Shavit, March)
See pp. 1–6 and §7.
Perhaps the most important question in AI strategy is how can we verify labs' compliance with rules about training runs? This question is important because preventing dangerous training runs may be almost necessary and sufficient for AI safety, and such techniques could enable inspectors to verify runs' compliance with regulation or international agreements. This research aims directly at this question.
4. Survey on intermediate goals in AI governance (Rethink Priorities, Räuker and Aird, March)
Perhaps the most important question in AI strategy is what intermediate goals would it be good to pursue? Information on this question helps the AI safety community better identify and prioritize between interventions. This research aims directly at this question: the most important part of the research was a survey of respondents' attitudes toward directing funding to various possible intermediate goals, directly giving evidence about how funders could better promote AI safety.
5. Literature Review of Transformative AI Governance (LPP, Maas, forthcoming)
I personally like the section Levers of governance.
I'm not sure why I like this review so much. It has brought a few ideas to my attention, it has pointed me toward a few helpful sources, and maybe the Levers of governance section helped me think about AI governance from the perspective of levers.
6. “AI Risk Discussions” website: Exploring interviews from 97 AI Researchers (Gates et al., February)
See especially the Quantitative Analysis.
Disclaimer: I haven't deeply engaged with this research.
An important question in AI strategy is how do AI researchers think about AI risk, and how can they be educated about it? This question is important because AI researchers' attitudes may determine whether dangerous AI research occurs; if everyone believes certain projects are dangerous, by default they'll try to make them safer. This research aims directly at this question. Moreover, the authors' adjacent resources help advocates educate AI researchers about AI risk, and help AI researchers educate themselves.
7. What a compute-centric framework says about AI takeoff speeds - draft report (OpenPhil, Davidson, January)
See the short summary, model, blogpost on takeaways, presentation (slides), and/or long summary.
Disclaimer: I haven't deeply engaged with this research.
An important question in AI strategy is how fast will AI progress be when AI has roughly human-level capabilities? Information on this question informs alignment plans and other kinds of interventions, and is generally an important component of strategic clarity on AI. This research aims directly at a big part of this question.
Pieces 1, 2, and maybe 3 are mostly great not for their contribution to our knowledge but for laying the groundwork for good actions, largely by helping communicate their ideas to government and perhaps labs.
Another favorite AI governance piece is A Playbook for AI Risk Reduction (focused on misaligned AI) (Karnofsky, June). It doesn't feel like research, perhaps because very little of it is novel. It's worth reading.
[1] Pieces 1, 2, 3, and 4 are aimed directly at extremely important questions; 6 and 7 are aimed directly at very important questions.
[2] For pieces 1, 2, 3, 4, and 6, I would have been very enthusiastic about the proposal. For 5 and 7, I would have been cautiously excited, or excited if the project were executed by someone who's a good fit. Note that the phenomenon of my favorite research mostly being research I expected to like is presumably partially due to selection bias in what I read. Moreover, it is partially due to the fact that I haven't deeply engaged with 6 or the technical component of 3, and have only engaged with some parts of 7 – so calling them favorites partially reflects that they sound good before I know all of the details.