FWIW, I find it hard to make judgements on these kinds of aggregate statistics, and would be kind of surprised if other people know how to make judgements on these either. "Having worked at a scaling lab" or "being involved with an AI Safety grantmaking organization" or "being interested in AI control" just aren't very informative pieces of information, especially since I don't even have individual profiles.
My sense is that if you want people to be able to actually come to trust your criteria, you will either have to be more specific, or just list people's names (the latter of which would be ideal, and also would create concrete points of accountability).
Introduction
MATS currently has more people interested in being mentors than we are able to support—for example, for the Winter 2024-25 Program, we received applications from 87 prospective mentors who cumulatively asked for 223 scholars[1] (for a cohort where we expected to only accept 80 scholars). As a result, we need some process for how to choose which researchers to take on as mentors and how many scholars to allocate each. Our desiderata for the process are as follows:
In this post, we describe the process we used to select mentors for the Winter 2024-25 Program, which will be very close to the process we will use to select mentors for the Summer 2025 Program. In a nutshell, we select advisors, who select mentors, who select scholars, who often select specific research projects, in a “chain of trust,” with MATS input and oversight at every stage. This system is designed to ensure that we make reasonable decisions about the scholars, mentors, and, ultimately, the research we support, even if MATS staff are not subject matter experts for every branch of AI safety research. We want to make this "chain of trust" structure transparent so that potential funders and collaborators can trust in our process, even if we cannot share specific details of selection (e.g., what advisor X said about prospective mentor Y).
Mentor selection
First, we solicited applications from potential mentors. These applications covered basic information about the mentors, the field they work in, their experience in research and mentoring, what projects they might supervise, and how many scholars they might supervise.
These applications were then reviewed by a team of 12 advisors. Our advisors were chosen to be people with experience in the AI existential safety community, as well as to cover a range of perspectives and subfields, as discussed above. We selected advisors by first creating a long list of approximately 100 candidates, then narrowing it down to a short list of approximately 30 candidates, whom we invited to advise us. Of these 30, 12 candidates were available. The advisors include members of AI safety research non-profits, AI "scaling lab" safety teams, AI policy think-tanks, and AI safety grantmaking organizations. Breaking down advisors by field (and keeping in mind most advisors selected multiple fields):
Number of advisors who focus on various fields. Note that most advisors selected multiple fields.
Most advisors were not able to rate all applicants, but focused their energies on applicants whose research areas matched their own expertise. For each rated applicant, advisors were able to tell us:
Advisors also had a field to write free-form text notes. Not all advisors filled out all fields.
As shown in the figure below, all but one of the applications were reviewed by at least three advisors, and the median applicant was reviewed by four advisors. The exception was an applicant who applied late and could be reviewed by only a single advisor.
If someone was rated by n advisors, they go in the bin between n and n+1. So, for example, there were 15 applications that were reviewed by 5 advisors.
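As a rough illustration of the binning convention in this caption, here is a minimal Python sketch. The function name `review_count_histogram` and the sample data are hypothetical and are not MATS's actual tooling or figures; the sketch only shows that "the bin between n and n+1" is a count of applications reviewed by exactly n advisors.

```python
from collections import Counter
from statistics import median


def review_count_histogram(review_counts):
    """Count how many applications were reviewed by exactly n advisors.

    Each key n corresponds to the histogram bin between n and n+1.
    Bins with no applications are included with a count of zero.
    """
    counts = Counter(review_counts)
    lo, hi = min(review_counts), max(review_counts)
    return {n: counts.get(n, 0) for n in range(lo, hi + 1)}


# Hypothetical data: number of advisors who reviewed each application.
example_review_counts = [1, 3, 3, 4, 4, 4, 5, 5, 6]

histogram = review_count_histogram(example_review_counts)
median_reviews = median(example_review_counts)

for n, num_applications in histogram.items():
    print(f"reviewed by {n} advisors: {num_applications} applications")
print(f"median number of reviews per application: {median_reviews}")
```

With real data, `example_review_counts` would hold one entry per mentor application.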
We primarily considered the average ratings and scholar number recommendations, taking into account confidence levels and conflicts of interest. Our rule of thumb was that we accepted applicants rated 7/10 and higher, and chose some of the applicants rated between 5/10 and 7/10 to enhance research diversity (in part to counter what we believed to be potential lingering biases of our sample of advisors). To choose mentors for certain neglected research areas, we paid special attention to ratings by advisors who specialize in those research areas.
For accepted mentors, we chose scholar counts based on advisor recommendations and ratings, as well as ratings from MATS Research Managers and scholars for returning mentors. The cut-offs at 5/10 and 7/10 were chosen partly to ensure we chose highly-rated mentors, and partly in light of how many scholars we wanted to accept in total (80, in this program). For edge cases, we also considered the notes written by our advisors.
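The rule of thumb described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual decision procedure: it uses a plain unweighted mean of advisor ratings, whereas the real process also took into account confidence levels, conflicts of interest, specialist advisors' ratings, and free-form notes. The names `triage`, `ACCEPT_THRESHOLD`, and `CONSIDER_THRESHOLD`, the three labels, and the sample ratings are all hypothetical.

```python
from statistics import mean

# Cut-offs taken from the rule of thumb in the text (ratings out of 10).
ACCEPT_THRESHOLD = 7.0
CONSIDER_THRESHOLD = 5.0


def triage(ratings):
    """Sort an applicant into a coarse category from their advisor ratings.

    Assumption: ratings are combined with an unweighted mean. The real
    process also weighed confidence and conflicts of interest.
    """
    average = mean(ratings)
    if average >= ACCEPT_THRESHOLD:
        return "accept"
    if average >= CONSIDER_THRESHOLD:
        # Only some of these applicants were chosen, to enhance research diversity.
        return "consider"
    return "below_cutoff"


# Hypothetical applicants and ratings, for illustration only.
example_ratings = {
    "applicant_a": [8, 7, 9],
    "applicant_b": [5, 6, 7],
    "applicant_c": [3, 4],
}

decisions = {name: triage(r) for name, r in example_ratings.items()}
print(decisions)
```

The "consider" category is deliberately not resolved in code: per the text, choosing among those applicants was a judgment call informed by research diversity and advisor notes.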
We then made adjustments based on various contingencies:
Mentor demographics
What sort of results did the above process produce? One way to understand this is to aggregate mentors by "track"—MATS’s classification of the type of AI safety research mentors perform. For the Winter 2024-25 program, we have five tracks: oversight & control, evaluations, interpretability, governance & strategy, and agency[2]. Note that these are coarse-grained, and may not perfectly represent each mentor’s research.
This is how our applicants broke down by track:
Our accepted mentors broke down this way:
Proportionally, the biggest deviations between the applying and accepted mentors were that relatively few interpretability researchers were accepted as mentors, and relatively many evaluations and oversight & control researchers were accepted.
To give a better sense of our mentors’ research interests, we can also analyse the accepted mentors by whether they focused on:
These were somewhat subjective designations and, for the latter two distinctions, some mentors did not neatly fall either way.
The yellow portion of the bar is mentors who did not neatly fall into either category.
Scholar demographics
Another way to measure the cohort's research portfolio is to look at the breakdown of scholar count assigned to each mentor.[3] Firstly, we can distinguish scholars by their mentors' research track:
This is somewhat more weighted towards interpretability and away from governance than our mentor count.
Another relevant factor is how many scholars each mentor has, shown in the histogram below. The median scholar will be working with a three-scholar mentor, that is, with two other scholars under the same mentor. Note that for the purpose of these statistics, if two mentors are co-mentoring some scholars, they are counted as one "mentor."
Numbers are provisional and subject to change. If a mentor has n scholars, they go in the bin between n and n+1. So, for example, there are 15 mentors with 2 scholars. Total scholar count will be lower than in this histogram, since mentors who have not yet determined the division of scholars between them were assigned more scholars in aggregate than they accepted.
This can be compared to the distribution of scholars per mentor in the Summer 2024 Program. In that program, the distribution was more concentrated: more scholars were working in streams of one or two scholars (the median scholar was working with a 2-scholar mentor, i.e. with only one other scholar), and there were fewer mentors with 3-5 scholars.
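The "median scholar" statistic used above is a median taken over scholars rather than over mentors, so larger streams count more heavily. A minimal sketch of how one might compute it is below. The function name `median_scholar_stream_size` and the data are hypothetical and are not the actual cohort numbers, and using the lower median (so the result is always a real stream size) is our assumption.

```python
from statistics import median, median_low


def median_scholar_stream_size(scholars_per_mentor):
    """Return the stream size experienced by the median scholar.

    Each mentor with k scholars contributes k entries of value k, so the
    median is taken over scholars, not over mentors.
    """
    per_scholar = [k for k in scholars_per_mentor for _ in range(k)]
    return median_low(per_scholar)


# Hypothetical scholar counts, one entry per mentor (co-mentors count as one).
example_streams = [1, 1, 2, 2, 2, 4, 5]

per_mentor_median = median(example_streams)
per_scholar_median = median_scholar_stream_size(example_streams)

print(f"median mentor has {per_mentor_median} scholars")
print(f"median scholar works with a {per_scholar_median}-scholar mentor")
```

In this made-up example the median mentor has 2 scholars, but the median scholar works with a 4-scholar mentor, which shows why the two statistics should not be conflated.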
As with the mentors, we can also break down scholar assignments by their mentors’ research focus.
The yellow portion of the bar is scholars whose mentor did not neatly fall into either category.
Acknowledgements
This report was produced by the ML Alignment & Theory Scholars Program. Daniel Filan was the primary author of this report and Ryan Kidd scoped, managed, and edited the project. Huge thanks to the many people who volunteered to give their time to mentor scholars at MATS! We would also like to thank our 2024 donors, without whom MATS would not be possible: Open Philanthropy, Foresight Institute, the Survival and Flourishing Fund, the Long-Term Future Fund, Craig Falls, and several donors via Manifund.
[1] More precisely: when people applied to mentor, they answered the question “What is the average number of scholars you expect to accept?”. 223 (or more precisely, 222.6) is the sum of all applicants’ answers.
[2] By “agency”, we mean modeling optimal agents, how those agents interact with each other, and how some agents can be aligned with each other. In practice, this covers cooperative AI, agent foundations, value learning, and "shard theory" work.
[3] Note that scholar counts are not yet finalized: some co-mentoring researchers have not yet assigned scholars between themselves. This means that the per-track numbers will be correct, since those mentors are all in the same track, but the statistics about number of scholars per mentor will not be precisely accurate.