There should be more AI safety orgs

(Cross-posted to EA Forum.)

I’m a Senior Program Officer at Open Phil, focused on technical AI safety funding. I’m hearing a lot of discussion suggesting funding is very tight right now for AI safety, so I wanted to give my take on the situation.

At a high level: AI safety is a top priority for Open Phil, and we are aiming to grow how much we spend in that area. There are many potential projects we'd be excited to fund, including some potential new AI safety orgs as well as renewals to existing grantees, academic research projects, upskilling grants, and more.

At the same time, it is also not the case that someone who reads this post and tries to start an AI safety org would necessarily have an easy time raising funding from us. This is because:

All of our teams whose work touches on AI (Luke Muehlhauser’s team on AI governance, Claire Zabel’s team on capacity building, and me on technical AI safety) are quite understaffed at the moment. We’ve hired several people recently, but across the board we still don’t have the capacity to evaluate all the plausible AI-related grants, and hiring remains a top priority for us.
- And we are extra-understaffed for evaluating technical AI safety proposals in particular. I am the only person who is primarily focused on funding technical research projects (sometimes Claire’s team funds AI safety related grants, primarily upskilling, but a large technical AI safety grant like a new research org would fall to me). I currently have no team members; I expect to have one person joining in October and am aiming to launch a wider hiring round soon, but I think it’ll take me several months to build my team’s capacity up substantially.
- I began making grants in November 2022, and spent the first few months full-time evaluating applicants affected by FTX (largely academic PIs as opposed to independent organizations started by members of the EA community). Since then, a large chunk of my time has gone into maintaining and renewing existing grant commitments and evaluating grant opportunities referred to us by existing advisors. I am aiming to reserve remaining bandwidth for thinking through strategic priorities, articulating what research directions seem highest-priority and encouraging researchers to work on them (through conversations and hopefully soon through more public communication), and hiring for my team or otherwise helping Open Phil build evaluation capacity in AI safety (including separately from my team).
- As a result, I have deliberately held off on launching open calls for grant applications similar to the ones run by Claire’s team (e.g. this one); before onboarding more people (and developing or strengthening internal processes), I would not have the bandwidth to keep up with the applications.
On top of this, in our experience, providing seed funding to new organizations (particularly organizations started by younger and less experienced founders) often leads to complications that aren't present in funding academic research or career transition grants. We prefer to think carefully about seeding new organizations, and have a different and higher bar for funding someone to start an org than for funding that same person for other purposes (e.g. career development and transition funding, or PhD and postdoc funding).
- I’m very uncertain about how to think about seeding new research organizations and many related program strategy questions. I could certainly imagine developing a different picture upon further reflection — but having low capacity combines poorly with the fact that this is a complex type of grant we are uncertain about on a lot of dimensions. We haven’t had the senior staff bandwidth to develop a clear stance on the strategic or process level about this genre of grant, and that means that we are more hesitant to take on such grant investigations — and if / when we do, it takes up more scarce capacity to think through the considerations in a bespoke way rather than having a clear policy to fall back on.

[-]Evan McVail2y50

By the way, Open Philanthropy is actively hiring for roles on Ajeya’s team in order to build capacity to make more TAIS grants! You can learn more and apply here.

[-]gvst2y*31

I’m very uncertain about how to think about seeding new research organizations and many related program strategy questions.

Could you share these program strategy questions (assuming there are more that aren't described in your comment)? I think the community is likely to have helpful insights on at least some of them. I personally am interested in independently researching similar questions, and it would be great to know what answers would be useful for Open Phil.

[-]Linch2y209

crossposted from answering a question on the EA Forum.

(My own professional opinions, other LTFF fund managers etc might have other views)

Hmm I want to split the funding landscape into the following groups:

LTFF
OP
SFF
Other EA/longtermist funders
Earning-to-givers
Non-EA institutional funders
Everybody else

LTFF

At LTFF our two biggest constraints are funding and strategic vision. Historically it was some combination of grantmaking capacity and good applications but I think that's much less true these days. Right now we have enough new donations to fund what we currently view as our best applications for some months, so our biggest priority is finding a new LTFF chair to help (among other things) address our strategic vision bottlenecks.

Going forwards, I don't really want to speak for other fund managers (especially given that the future chair should feel extremely empowered to shepherd their own vision as they see fit). But I think we'll make a bid to try to fundraise a bunch more to help address the funding bottlenecks in x-safety. Still, even if we double our current fundraising numbers or so^[1], my guess is that we're likely to prioritize funding more independent researchers etc below our current bar^[2], as well as supporting our existing grantees, over funding most new organizations.

(Note that in $ terms LTFF isn't a particularly large fraction of the longtermist or AI x-safety funding landscape, I'm only talking about it most because it's the group I'm the most familiar with).

Open Phil

I'm not sure what the biggest constraints are at Open Phil. My two biggest guesses are grantmaking capacity and strategic vision. As evidence for the former, my impression is that they only have one person doing grantmaking in technical AI Safety (Ajeya Cotra). But it's not obvious that grantmaking capacity is their true bottleneck, as a) I'm not sure they're trying very hard to hire, and b) people at OP who presumably could do a good job at AI safety grantmaking (eg Holden) have moved on to other projects. It's possible OP would prefer conserving their AIS funds for other reasons, eg waiting on better strategic vision or to have a sudden influx of spending right before the end of history.

SFF

I know less about SFF. My impression is that their problems are a combination of a) structural difficulties preventing them from hiring great grantmakers, and b) funder uncertainty.

Other EA/Longtermist funders

My impression is that other institutional funders in longtermism either don't really have the technical capacity or don't have the gumption to fund projects that OP isn't funding, especially in technical AI safety (where the tradeoffs are arguably more subtle and technical than in eg climate change or preventing nuclear proliferation). So they do a combination of saving money, taking cues from OP, and funding "obviously safe" projects.

Exceptions include new groups like Lightspeed (which I think is more likely than not to be a one-off thing), and Manifund (which has a regranters model).

Earning-to-givers

I don't have a good sense of how much latent money there is in the hands of earning-to-givers who are at least in theory willing to give a bunch to x-safety projects if there's a sufficiently large need for funding. My current guess is that it's fairly substantial. I think there are roughly three reasonable routes for earning-to-givers who are interested in donating:

1. pooling the money in a (semi-)centralized source
2. choosing for themselves where to give to
3. saving the money for better projects later.

If they go with (1), LTFF is probably one of the most obvious choices. But LTFF does have a number of dysfunctions, so I wouldn't be surprised if either Manifund or some newer group ends up being the Schelling donation source instead.

Non-EA institutional funders

I think as AI Safety becomes mainstream, getting funding from government and non-EA philanthropic foundations becomes an increasingly viable option for AI Safety organizations. Note that direct work AI Safety organizations have a comparative advantage in seeking such funds. In comparison, it's much harder for both individuals and grantmakers like LTFF to seek institutional funding^[3].

I know FAR has attempted some of this already.

Everybody else

As worries about AI risk become increasingly mainstream, we might see people at all levels of wealth become more excited to donate to promising AI safety organizations and individuals. It's harder to predict what either non-Moskovitz billionaires or members of the general public will want to give to in the coming years, but plausibly the plurality of future funding for AI Safety will come from individuals who aren't culturally EA or longtermist or whatever.

[-]Nathan Helm-Burger2y*175

I am an experienced data scientist and machine learning engineer with a background in neuroscience. In my previous job I had a senior position and led a team of people, and spent ~10k weekly on hundreds of model training runs on AWS and GCP using pipelines I wrote from scratch which profitably guided the expenditure of hundreds of thousands of dollars daily in rapid real-time bidding. I've spent many years reading LessWrong and thinking about the alignment problem. I've been working full-time on independent AI alignment research for about a year and a half now. I got a small grant from the Long Term Future Fund to help support my transition to working on AI alignment. I took the AI safety fundamentals course, and found it enjoyable and helpful even though it turned out I had already read all the assigned readings. I read a lot of research papers, in the alignment field specifically and ML and neuroscience generally. I'm friends with and talk regularly with employed AI alignment researchers. I've gone to 3 EAGs and had many interesting one-on-ones with people to discuss AI alignment and safety evals and governance, etc.

Over the past year and a half I've applied to and been rejected by many different alignment orgs and safety teams within capabilities orgs. I'm sick of trying to work entirely on my own with no colleagues; I work much better as part of a team. I've applied for more grant funding for independent research but I'm not happy about it. I'm considering trying to find a part-time mainstream ML job just so that I can have a team and work on building stuff productively again.

I'd love to start an org to pursue my alignment agenda, and feel like I have plenty of ideas to pursue to keep a handful of employees busy, and sufficient leadership experience to manage a team.

Here's a video of a talk I gave in which I discuss some of my research ideas. This is only a small fraction of what I've been up to in the past couple years. Link to a recording of my recent talk at the Virtual AI Safety Unconference: https://youtu.be/UYqeI-UWKng

If you find my research ideas intriguing, and might be interested in forming an org with me or interviewing me as a possible fit to work at your existing org, please reach out. You can message me here on LessWrong and I'll share my resume and email.

(Edit: this post got me connected with a good position!)

[-]Iknownothing2y22

I think your ideas are some of the most promising I've seen. I'd love to see them pursued further, though I'm concerned about the air-gapping.

[-]Roman Leventov2y158

@Nathan Helm-Burger's comment made me think it's worthwhile to reiterate here the point that I periodically make:

Direct "technical AI safety" work is not the only way for technical people (who think that governance & politics, outreach, advocacy, and field-building work doesn't fit them well) to contribute to the larger "project" of "ensuring that the AI transition of the civilisation goes well".

Now, as powerful LLMs are available, is the golden age to build innovative systems and tools to improve^[1]:

Politics: see https://cip.org/, Audrey Tang's projects
Social systems: innovative LLM/AI-first social networks that solve the social dilemma? (I don't have good existing examples of such projects, though)
Psychotherapy, coaching: see Inflection
Economics: see Verses, One Project, the Gaia Consortium
Epistemic infrastructure: see Subconscious Network, Ought, the Cyborgism agenda, Quantum Leap (AI safety edtech)
Authenticity infrastructure: see Optic, proof-of-personhood projects
Cybersec/infosec: see various AI startups for cybersecurity, trustoverip.org
More?

I believe that if such projects are approached with integrity, thoughtful planning, and AI safety considerations at heart rather than with short-term thinking (specifically, not considering how the project will play out if or when AGI is developed and unleashed on the economy and the society) and profit-extraction motives, they could help to shape the trajectory of the AI transition in a positive way, and the impact may be comparable to some direct technical AI safety/alignment work.

In the context of this post, it's important that the verticals and projects mentioned above could either be conventionally VC-funded because they could promise direct financial returns to the investors, or could receive philanthropic or government funding that wouldn't otherwise go to technical AI safety projects. Also, there are a number of projects in these areas that are already well-funded and hiring.

Joining such projects might also be a good fit for software engineers and other IT and management professionals who don't feel they are smart enough or have the right intellectual predispositions to do good technical research, anyway, even if there were enough well-funded "technical AI safety research orgs". There should be some people who do science and some people who do engineering.

^[1] I didn't do serious due diligence and impact analysis on any of the projects mentioned. The mentioned projects are just meant to illustrate the respective verticals, and are not endorsements.

[-]Magdalena Wache2yΩ4144

Thanks for writing this! I agree.

I used to think that starting new AI safety orgs is not useful because scaling up existing orgs is better:

they already have all the management and operations structure set up, so there is less overhead than starting a new org
working together with more people allows for more collaboration

And yet, existing orgs do not just hire more people. After talking to a few people from AIS orgs, I think the main reason is that scaling is a lot harder than I would intuitively think.

larger orgs are harder to manage, and scaling up does not necessarily mean that much less operational overhead.
coordinating with many people is harder than with few people. Bigger orgs take longer to change direction.
reputational correlation between the different projects/teams

We also see the effects of coordination costs/"scaling being hard" in industry, where there is a pressure towards people working longer hours. (It's not common that companies encourage employees to work part-time and just hire more people.)

[-]Walter Laurito2y*107

and Kaarel’s work on DLK

@Kaarel is the research lead at Cadenza Labs (previously called NotodAI), our research group which started during the first part of SERI MATS 3.0 (There will be more information about Cadenza Labs hopefully soon!)

Our team members broadly agree with the post!

Currently, we are looking for further funding to continue to work on our research agenda. Interested funders (or potential collaborators) can reach out to us at info@cadenzalabs.org.

[-]Marius Hobbhahn2y64

Nice to see you're continuing!

[-]RGRGRG2y100

For any potential funders reading this: I'd be open to starting an interpretability lab and would love to chat. I've been full-time on MI for about 4 months - here is some of my work: https://www.lesswrong.com/posts/vGCWzxP8ccAfqsrS3/thoughts-about-the-mechanistic-interpretability-challenge-2

I have a few PhD friends who are working in software jobs they don't like and would be interested in joining me for a year or longer if there were funding in place (even for just the trial period Marius proposes).

My very quick take is that interpretability has yet to understand small language models and this is a valuable direction to focus on next. (more details here: https://www.lesswrong.com/posts/ESaTDKcvGdDPT57RW/seeking-feedback-on-my-mechanistic-interpretability-research )

For any potential cofounders reading this, I have applied to a few incubators and VC funds, without any success. I think some applications would be improved if I had a co-founder. If you are potentially interested in cofounding an interpretability startup and you live in the Bay Area, I'd love to meet for coffee and see if we have a shared vision and potentially apply to some of these incubators together.

[-]Magdalena Wache2y102

I would love to see something like the Charity Entrepreneurship incubation program for AI safety.

[-][anonymous]2y83

I think having more independent AI safety orgs introduces more liabilities and points of failure, such as infohazard leaks, the unilateralist's curse, accidental capabilities research, mental health spirals, and inter-org conflict. Rather, there should be sub-orgs that sit underneath main orgs both de jure and de facto, with full leadership subordination and limited access to infohazardous information.

[-]MiguelDev2y20

This captures my perspective well: not everyone is suited to run organizations. I believe that AI safety organizations would benefit from the integration of established "AI safety standards," similar to existing Engineering or Financial Reporting Standards. This would make maintenance easier. However, for the time being, the focus should be on independent researchers pursuing diverse projects to first identify those standards.

[-]Stephen McAleese2y*Ω171

Thanks for the post! I think it does a good job of describing key challenges in AI field-building and funding.

The talent gap section describes a lack of positions in industry organizations and independent research groups such as SERI MATS. However, there doesn't seem to be much content on the state of academic AI safety research groups. So I'd like to emphasize the current and potential importance of academia for doing AI safety research and absorbing talent. The 80,000 Hours AI risk page says that there are several academic groups working on AI safety including the Algorithmic Alignment Group at MIT, CHAI in Berkeley, the NYU Alignment Research Group, and David Krueger's group in Cambridge.

The AI field as a whole is already much larger than the AI safety field so I think analyzing the AI field is useful from a field-building perspective. For example, about 60,000 researchers attended AI conferences worldwide in 2022. There's an excellent report on the state of AI research called Measuring Trends in Artificial Intelligence. The report says that most AI publications come from the 'education' sector which is probably mostly universities. 75% of AI publications come from the education sector and the rest are published by non-profits, industry, and governments. Surprisingly, the top 9 institutions by annual AI publication count are all Chinese universities and MIT is in 10th place. Though the US and industry are still far ahead in 'significant' or state-of-the-art ML systems such as PaLM and GPT-4.

What about the demographics of AI conference attendees? At NeurIPS 2021, the top institutions by publication count were Google, Stanford, MIT, CMU, UC Berkeley, and Microsoft which shows that both industry and academia play a large role in publishing papers at AI conferences.

Another way to get an idea of where people work in the AI field is to find out where AI PhD students go after graduating in the US. The number of AI PhD students going to industry jobs has increased over the past several years and 65% of PhD students now go into industry but 28% still go into academic jobs.

Only a few academic groups seem to be working on AI safety, and many of the groups working on it are at highly selective universities, but AI safety could become more popular in academia in the near future. And if the breakdown of contributions and demographics in AI safety ends up resembling that of AI in general, then we should expect academia to play a major role in AI safety in the future. Long-term AI safety may actually be more academic than AI since universities are the largest contributor to basic research whereas industry is the largest contributor to applied research.

So in addition to founding an industry org or facilitating independent research, another path to field-building is to increase the representation of AI safety in academia by founding a new research group though this path may only be tractable for professors.

[-]ashesfall2y50

It would be great if there were more options. I would absolutely leave my current job, and bring my ML experience with me, to a role in AI safety. I would be okay to take a pay cut to do it. This doesn’t seem like an option to me though, after a brief bit of searching on and off over the last year.

[-]robm2y53

I have similar feelings, there's not a clear path for someone in an adjacent field. I chose my current role largely based on the expected QALYs, and I'd gladly move into AI Safety now for the same reason.

This post gives the impression that finding talent is not the current constraint, but I'm confused about why the listed salaries are so high for some of these roles if the pool is so large.

I've submitted applications to a few of these orgs, with cover letters that basically say "I'm here and willing if you need my skills". One frustration is recognizing Alignment as our greatest challenge, and not having a path to go work on it. Another is that the current labs look somewhat homogeneous and a lot like academia, which is not how I'd optimize for speed.

[-]Sheikh Abdur Raheem Ali2y50

The high rate of growth means that at any given moment, most people in the field are new. If you've been seriously investigating the alignment problem for 1-2 years, you meet the prerequisites for understanding.

The entrepreneurial mindset is not as common, but all it requires is cultivating a sense of urgency and embedded agency. And in my experience, the responsibility thrust upon your shoulders when you have people relying upon you for advice and care is deeply meaningful and sobering. Supporting and collaborating with others gives you a sense of focus and purpose that sharpens your thinking and accelerates your actions.

In the early days, you may have nothing to offer but guidance. But guidance is all that we can ever give. Even the most junior people I met at MATS were very capable... a small nudge is all that's needed to help them succeed.

[-]Chris Lakin2y40

I'm surprised that there aren't many organizations that hire people who are incubating new/weird/unusual agendas. I think FAR does this, but they seem pretty small.

Otherwise for independent research it's kinda just LTFF (?)

As for how well LTFF does this, well, I applied a few months ago and they finally got back to me recently (about 10 weeks later) and rejected my application: no feedback, no questions. I'm not even sure they understood what my agenda is.

[-]MiguelDev2y30

I've had a similar experience with LTFF; I waited a long time without receiving any feedback. When I inquired, they told me they were overwhelmed and couldn't provide specific feedback. I find the field to be highly competitive. Additionally, there's a concerning trend I've noticed, where there seems to be growing pessimism about newcomers introducing untested theories. My perspective on this is that now is not the time to be passive about exploration. This is a crucial moment in history when we should remain open-minded to any approach that might work, even if the chances are slim.

[-]Review Bot2y*20

The LessWrong Review runs every year to select the posts that have most stood the test of time. This post is not yet eligible for review, but will be at the end of 2024. The top fifty or so posts are featured prominently on the site throughout the year.

Hopefully, the review is better than karma at judging enduring value. If we have accurate prediction markets on the review results, maybe we can have better incentives on LessWrong today. Will this post make the top fifty?

[-]Prometheus2y10

I created a simple Google Doc for anyone interested in joining/creating a new org to put down their names, contact, what research they're interested in pursuing, and what skills they currently have. Over time, I think a network can be fostered, where relevant people start forming their own research, and then begin building their own orgs/get funding. https://docs.google.com/document/d/1MdECuhLLq5_lffC45uO17bhI3gqe3OzCqO_59BMMbKE/edit?usp=sharing

