(Cross-posted from the EA Forum.)

Introduction

This payout report is meant to cover the Long-Term Future Fund's grantmaking starting January 2022 (after our December 2021 payout report), going through April 2023 (1 January 2022 - 30 April 2023).

  • Total funding recommended: $13.0M
  • Total funding paid out: $12.16M
  • Number of grants paid out: 327
  • Acceptance rate (excluding desk rejections): 50.0%  
  • Acceptance rate (including desk rejections): 37.4%
  • Report authors: Asya Bergal (chair), Linchuan Zhang, Oliver Habryka, Caleb Parikh, Thomas Larsen, Matthew Graves

52 of our grantees, worth $1.41M, requested that we not include public reports for their grants. (You can read our policy on public reporting here.) We referred 2 grants to other funders for evaluation ($0.501M). Our median response time over this period was 29 days.

The rest of our grants are listed below (either in long or short form), as well as in our public grants database.

If you’re interested in receiving funding from the Long-Term Future Fund, apply here.

(Note: The initial sections of this post were written by me, Asya Bergal.)

Other updates

We've had a substantial increase in applications since 2021: we averaged 35 applications per month in the latter half of 2021, 69 applications per month in 2022, and 90 applications per month so far in 2023.

Our funding bar went up at the end of 2022, in response to a decrease in the overall funding available to long-term future-focused projects. If we assume our numerical ratings are consistent, then applying our new bar to our earlier 2022 funding would imply not having funded 28% of earlier grants.

We're looking for more funding. We've spent an average of ~$1M per month across March, April, and May 2023 to maintain our current bar, have $992,870.53 in reserves as of July 3, and are ideally looking to fundraise at least $10M for the coming year.

As described in this post, we're trying to increase our independence from Open Philanthropy, which provided ~45% of our funding in 2022. As a transitional measure, over the next 6 months, Open Philanthropy will be matching funding given to the Long-Term Future Fund by small donors 2:1, for up to $3.5M total, making now a particularly good time to donate. Donate here. (The Long-Term Future Fund is part of EA Funds, which is a fiscally sponsored project of Effective Ventures Foundation (UK) (EV UK) and Effective Ventures Foundation USA Inc. (EV US). Donations to the Long-Term Future Fund are donations to EV US or EV UK.)

As a temporary measure in response to uncertainty about our future funding levels, we’ve put the bottom ~40% of grants above our current funding bar on hold. I think we’ll make several of those grants after this round of fundraising is over, but I generally expect our funding bar to vary more over time and to depend more on individual donations than it has historically.

I will be stepping down as chair of the fund by the end of October (and potentially earlier). I've written some reflections on my time on the fund here. We're looking for additional fund managers (including potential chair candidates); express interest here.

The fund's current fund managers are me (Asya Bergal), Linchuan Zhang, Oliver Habryka, and Caleb Parikh as permanent fund managers, and Thomas Larsen, Daniel Eth, Matthew Gray, Lauro Langosco, and Clara Collier as guest managers.

Our legal team asked us to highlight the eligibility criteria for our grants, which you can find in the appendices.

Highlights

Our grants include:

Payout reports

Longer grant write-ups

Grants evaluated by Linchuan Zhang

Stephen Grugett, James Grugett, Austin Chen ($200,000): 4 month stipend for 3 FTE to build a forecasting platform made available to the public based on user-created play-money prediction markets

  • March 2022 Notes by Linch Zhang: This was my first substantive grant investigation. At the time, I felt shaky about it, but now I feel really good about it. The two main reasons I originally recommended this grant:
    • 1. It was an investment into the people who wanted to do EA work – getting 3 ~Google-quality engineers to do more EA/longtermist work (as opposed to counterfactuals that were earning to give or worse) seems well worth it at 200k. 
    • 2. It was an investment into the team specifically. Having cohesive software teams seems like an important component for EA becoming formidable in the future, and is somewhat (surprisingly to me) missing in EA, especially outside of AI safety and crypto trading. I heard really good things about Manifold from early users, and they appeared to be developing at a speed that blew other software projects in forecasting (Metaculus, Foretold, Cultivate, Hypermind, etc) out of the water.
    • At the time, it was not an investment into the prediction market itself/theory of change with regards to play-money prediction markets broadly, because the above two factors were sufficient to be decisive.
    • At the time, it was also unclear whether they planned to go the for-profit route or the nonprofit route.
      • They’ve since decided to go the for-profit route.
    • Looking back, it's still too soon to be sure, but it looks like Manifold is going quite well. They continue to develop features at phenomenal speed, lots of EAs and others in adjacent communities use the product, and the team is still producing fast and is excited for the future.
      • From an “investment into team” perspective, I think Manifold now plausibly has the strongest software team in EA outside of AI safety and earning-to-give (not that I’d necessarily have enough visibility to know of all the potentially better teams, especially stealth ones).
      • I have a number of disjunctive ToCs for how Manifold (and forecasting in general) can over time make the future better, some of which are implicitly covered here.
      • Though I am still uncertain about whether this particular project is the best use of the cofounders + team’s time... a lot of the evidence I've observed is more an update on the team’s overall skill + cohesiveness than an update about their comparative advantage for prediction markets specifically.
  • Addendum June 2023:
    • I’ve grown more confused about the total impact or value of this grant. On the one hand, I think Manifold is performing at or moderately above expectations in terms of having a cohesive team that’s executing quickly, and many people in the community appear to find their product useful or at least interesting. On the other hand, a) the zero-interest-rate environment and correspondingly high startup valuations that held when I recommended this grant ended in early 2022, and b) recent events have eliminated a substantial fraction of EA funding, which means $200K is arguably much more costly now than a year ago.
    • Still, I think I’m broadly glad to have Manifold in our ecosystem. I think they’re very helpful for people in our and adjacent communities in training epistemics, and I’m excited to see them branch out into experiments in regranting and retroactive funding projects; from a first-principles perspective, it’d be quite surprising if the current status of EA grantmaking is sufficiently close to optimal.

Solomon Sia ($71,000): 6-month stipend for providing consultation and recommendations on changes to the US regulatory environment for prediction markets.

  • Solomon Sia wants to talk to a range of advisers, including industry experts, users, and contacts at the CFTC, to see whether there are good ways to improve how prediction markets are regulated in the US, while simultaneously protecting users and reducing regulatory risk and friction.
  • This was an exploratory grant to see how the US regulatory environment for prediction markets could be improved, with a resulting written report provided to EA Funds.
  • I think this is a reasonable/great option to explore: 
    • I think my position on prediction markets is somewhat more cynical than that of most EAs in the forecasting space, but still, I’m broadly in favor of them and think they can be a critical epistemic intervention, both for uncovering new information and for legibility/common knowledge reasons.
    • It seemed quite plausible to me that the uncertain regulatory environment for prediction markets in the US is impeding the growth of large real-money prediction markets on questions that matter.
    • Solomon seemed unusually competent and knowledgeable about the tech regulations space, a skillset very few EAs have.
      • Cultivating this skillset and having him think about EA issues seemed valuable. 
      • A potential new caveat is that in 2023 as AI risk worries heat up, it seems increasingly likely that we might be able to draw from a diverse skillset of experienced and newly interested/worried people.
    • The for-profit motivations for this work are there but not very large, as unless a company is trying very hard to do specific regulatory capture for their company (which is bad and also practically very difficult), easing prediction market regulations has collective benefits and individual costs.
    • (weakly held) I thought trying to nail this during the Biden administration was good because it seemed plausible that the current CFTC would be more predisposed to liking prediction markets than average for the CFTC.
      • One interesting update is that EA connections were likely a mild plus in 2022, and are a moderate liability in 2023.
      • NB: Solomon and his collaborator think a) that the EA connection is still a mild to moderate positive b) it’s now unclear whether the Biden administration is better or worse than a counterfactual Republican administration. 
  • I’ve thought about this grant some afterwards, and I think even with the benefit of hindsight, I'm still a bit confused about how happy I should be about this grant ex-post.
    • One thing is that I’ve grown a bit more confused about the output and tractability of interventions in this domain.
      • The successes(?) Kalshi had confused me and I haven’t had enough time to integrate this into my worldview.
      • My current impression is that CFTC is fairly open to informed opinions from others on this matter.
    • I continue to believe it’s a good grant ex-ante.

Grants evaluated by Oliver Habryka

Alexander Turner ($220,000): Year-long stipend for shard theory and RL mechanistic interpretability research

This grant has been approved but has not been paid out at the time of writing.

We’ve made grants to Alex to pursue AI Alignment research before:

We also made another grant in 2023 to a team led by Alex Turner for their post on steering vectors for $115,411 (total includes payment to 5 team members, including, without limitation, travel expenses, office space, and stipends).

This grant is an additional grant to Alex, this time covering his full-time stipend for a year to do more research in AI Alignment.

Only the first one has a public grant write-up, and the reasoning and motivation behind all of these grants is pretty similar, so I will try to explain the reasoning behind all of them here. 

As is frequently the case with grants I evaluate in the space of AI Alignment, I disagree on an inside-view level pretty strongly with the direction of the research that Alex has been pursuing for most of his AI Alignment career. Historically I have been, on my inside-view, pretty unexcited about Alex’s work on formalizing power-seekingness, and also feel not that excited about his work on shard theory. Nevertheless, I think these are probably among the best grants the LTFF has made in recent years. 

The basic reasoning here is that despite me not feeling that excited about the research directions Alex keeps choosing, within the direction he has chosen, Alex has done quite high-quality work, and also seems to often have interesting and useful contributions in online discussions and private conversations. I also find his work particularly interesting, since I think that within a broad approach I often expected to be fruitless, Alex has produced more interesting insight than I expected. This in itself has made me more interested in further supporting Alex, since someone producing work that shows that I was at least partially wrong about a research direction being not very promising is more important to incentivize than work whose effects I am pretty certain of.

I would like to go into more detail on my models of how Alex’s research has updated me, and why I think it has been high quality, but I sadly don’t have the space or time here to go into that much depth. In short, the more recent steering vector work seems like the kind of “obvious thing to try that could maybe help” that I would really like to saturate with work happening in the field, and the work on formalizing power-seeking theorems is also the kind of stuff that seems worth having done, though I do pretty deeply regret the overly academic/formal presentation, which has somewhat continuously caused people to overinterpret the strength of its results (which Alex also seems to have regretted, and is also a pattern I have frequently observed in academic work that was substantially motivated by trying to “legitimize the field”).

Another aspect of this grant that I expect to have somewhat wide-ranging consequences is the stipend level we settled on. Some basic principles that have led me to suggest this stipend level:

  • I have been using the anchor of “industry stipend minus 30%” as a useful heuristic for setting stipend levels for LTFF grants. The goal in that heuristic was to find a relatively objective standard that would allow grantees to think about stipend expectations on their own without requiring a lot of back and forth, while hitting a middle ground in the incentive landscape between salaries being so low that lots of top talent would just go into industry instead of doing impactful work, and avoiding grifter problems with people asking for LTFF grants because they expect they will receive less supervision and can probably get away without a ton of legible progress.
  • In general I think self-employed salaries should be ~20-40% higher, to account for additional costs like health insurance, payroll taxes, administration overhead, and other things that an employer often takes care of.
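The two rules of thumb above can be combined into a quick calculation. This is a minimal sketch, not the fund's actual procedure: the function name, the choice to apply the self-employment uplift on top of the discounted anchor, and the $200,000 industry figure are all assumptions made purely for illustration.

```python
def stipend_range(industry_comp, discount_pct=30, uplift_low_pct=20, uplift_high_pct=40):
    """Return (anchor, low, high) stipend figures from an industry compensation figure.

    anchor: "industry minus 30%" heuristic.
    low/high: anchor raised by ~20-40% to cover costs a self-employed
    grantee bears (health insurance, payroll taxes, administration).
    ASSUMPTION: the uplift is applied to the discounted anchor; the text
    does not say how the two rules compose.
    """
    anchor = industry_comp * (100 - discount_pct) / 100
    low = anchor * (100 + uplift_low_pct) / 100
    high = anchor * (100 + uplift_high_pct) / 100
    return anchor, low, high


# Hypothetical industry compensation, chosen only to make the arithmetic easy to follow.
anchor, low, high = stipend_range(200_000)
print(f"anchor: ${anchor:,.0f}, self-employed range: ${low:,.0f} to ${high:,.0f}")
```

Under these assumptions, most of the 30% discount is offset by the self-employment adjustment, which is one reason a grant figure can look close to an industry salary while still following the heuristic.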

I have been rethinking stipend policies, as I am sure many people in the EA community have been since the collapse of FTX, and I haven’t made up my mind on the right principles here. It does seem like a pretty enormous number of good projects no longer have the funding to operate at their previous stipend levels, and it’s plausible to me that we should take the hit, lose out on a bunch of talent, and reduce stipend levels to a substantially lower level again to be more capable of handling funding shocks. But I am really uncertain on this, and at least in the space of AI Alignment, I can imagine the recent rise to prominence of AI Risk concerns could potentially alleviate funding shortfalls (or it could increase competition by having more talent flow into the space, which could reduce wages, which would also be great).

See the Stipend Appendix below, “How we set grant and stipend amounts”, for more information on EA Funds’ determination of grant and stipend amounts.

Vanessa Kosoy ($100,000): Working on the learning-theoretic AI alignment research agenda

This is a grant to cover half of Vanessa’s stipend for two years (the other half being paid by MIRI). We also made another grant to Vanessa in Q4 2020 for a similar amount. 

My model of the quality of Vanessa’s work is primarily indirect, having engaged relatively little with the central learning-theoretic agenda that Vanessa has worked on. The work is also quite technically dense, and I haven’t found anyone else who could explain the work to me in a relatively straightforward way (I have heard that Daniel Filan’s AXRP podcast with Vanessa is a better way to get started than previous material, though it hadn’t been published when I was evaluating this grant).

I did receive a decent number of positive references for Vanessa’s work, and I have seen her make contributions to other conversations online that struck me as indicative of a pretty deep understanding of the AI Alignment problem. 

If I had to guess at the effects of this kind of work, though I should clarify I am substantially deferring to other people here in a way that makes me not particularly trust my specific predictions, I expect that the primary effect would be that the kind of inquiry Vanessa is pursuing highlights important confusions and mistaken assumptions in how we expect machine intelligence to work, which when resolved, will make researchers better at navigating the very large space of potential alignment approaches. I would broadly put this in the category of “Deconfusion Research”.

Vanessa’s research resulted in various public blog posts, which can be found here. 

Skyler Crossman ($22,000): Support for Astral Codex Ten Everywhere meetups

Especially since the collapse of FTX, I am quite interested in further diversifying the set of communities that are working on things I think are important to the future. AstralCodexTen and SlateStarCodex meetups seem among the best candidates for creating additional thriving communities with overlapping, but still substantially different norms. 

I do feel currently quite confused about what a good relationship between adjacent communities like this and Effective Altruism-labeled funders like the Long-Term Future Fund should be. Many of these meetups do not aim to do as much good as possible, or have much of an ambitious aim to affect the long-term future of humanity, and I think pressures in that direction would likely be more harmful than helpful, by introducing various incentives for deception and potentially preventing healthy local communities from forming by creating a misaligned relationship between the organizers (who are paid by EA institutions to produce as much talent as possible for longtermist priorities) and the members (who are interested in learning cool things about rationality and the world and want to meet other people with similar interests).

Since this is a relatively small grant, I didn’t really resolve this confusion, and mostly decided to just go ahead with this. I also talked a bunch to Skyler about this, and currently think we can figure out a good relationship into the future on how it’s best to distribute funding like this, and I expect to think more about this in the coming weeks.

Grants evaluated by Asya Bergal

Any views expressed below are my personal views, and not the views of my employer, Open Philanthropy. (In particular, getting funding from the Long-Term Future Fund should not be read as an indication that the applicant has a greater chance of receiving funding from Open Philanthropy, and not receiving funding from the Long-Term Future Fund [or any risks and reservations noted in the public payout report] should not be read as an indication that the applicant has a smaller chance of receiving funding from Open Philanthropy.)

Alignment Research Center ($54,543): Support for a research & networking event for winners of the Eliciting Latent Knowledge contest

  • This was funding a research & networking event for the winners of the Eliciting Latent Knowledge contest run in early 2022; the plan for the event was mainly for it to be participant-led, with participants sharing what they were working on and connecting with others, along with professional alignment researchers visiting to share their own work with participants.
  • I think the case for this grant is pretty straightforward: the winners of this contest are (presumably) selected for being unusually likely to be able to contribute to problems in AI alignment, and retreats, especially those involving interactions with professionals in the space, have a strong track record of getting people more involved with this work.

Daniel Filan ($23,544): Funding to produce 12 more episodes of AXRP, the AI X-risk Research Podcast.

We recommended a grant of $23,544 to pay Daniel Filan for his time making 12 additional episodes of the AI X-risk Research Podcast (AXRP), as well as the costs of hosting, editing, and transcription.

The reasoning behind this grant was similar to the reasoning behind my last grant to AXRP:

  • I’ve listened to or read through several episodes of the podcast; I thought Daniel asked good questions and got researchers to talk about interesting parts of their work. I think having researchers talk about their work informally can provide value not provided by papers (and to a lesser extent, not provided by blog posts). In particular:
    • I’ve personally found that talks by researchers can help me understand their research better than reading their academic papers (e.g. Jared Kaplan’s talk about his scaling laws paper). This effect seems to have also held for at least one listener of Daniel’s podcast.
    • Informal conversations can expose motivations for the research and relative confidence level in conclusions better than published work.

Daniel also shared some survey data in his grant application about how people rated AXRP compared to other AI alignment resources, though I didn't look at this closely when making the grant decision, as I already had a reasonably strong prior towards funding.

Grants evaluated by Caleb Parikh

Conjecture ($72,827): Funding for a 2-day workshop to connect alignment researchers from the US and UK with AI researchers and entrepreneurs from Japan.

  • Conjecture applied for funding to host a two-day AI safety workshop in Japan in collaboration with Araya (a Japanese AI company). They planned to invite around 40 people, with half of the attendees being AI researchers, and half being alignment researchers from the US and UK. Japanese researchers were generally senior, leading labs, holding postdoc positions in academia, or holding senior technical positions at tech companies.
  • To my knowledge, there has been very little AI safety outreach conducted amongst strong academic communities in Asia (e.g. in Japan, Singapore, South Korea …). On the current margin, I am excited about more outreach being done in these countries within ultra-high talent groups. The theory of change for the grant seemed fairly straightforward: encourage talented researchers who are currently working in some area of AI to work on AI safety, and foster collaborations between them and the existing alignment community.
  • Conjecture shared the invite list with me ahead of the event, and I felt good about the set of alignment researchers invited from the UK and US. I looked into the Japanese researchers briefly, but I found it harder to gauge the quality of invites given my lack of familiarity with the Japanese AI scene. I also trust Conjecture to execute operationally competently on events of this type, having assisted other AI safety organisations (such as SERI MATS) in the past.
  • On the other hand, I have had some concerns about Conjecture, and I felt confused about whether this conference gave Conjecture more influence in ways that I would feel concerned about given the questionable integrity and judgement of their CEO (see this and this section of a critique of their organisation, though note that I don’t necessarily endorse the rest of the post). It was also unclear to me how counterfactual the grant was, and how this traded off against activities that I would be less excited to see Conjecture run. I think this is a general issue with funding projects at organisations with flexible funding, as organisations are incentivised to present their most fundable projects (which they are also the most excited about), and then in cases where the funding request is successful, move funding that they would have spent on these projects to other lower-impact projects. Overall, I modelled making this grant as being about a quarter as cost-effective as it might have been without these considerations (though I don’t claim this discount factor to be particularly reliable).
  • Overall, I thought this grant was pretty interesting, and I think that the ex-ante case for it was pretty solid. I haven’t reviewed the outcomes of this grant yet, but I look forward to reviewing and potentially making more grants in this area.
  • Update: Conjecture kindly directed me towards this retrospective and have informed me that some Japanese attendees of their conference are thinking of creating an alignment org.

SERI MATS program ($316,000): 8-week scholars program to pair promising alignment researchers with renowned mentors. (Originally evaluated by Asya Bergal)

  • SERI MATS is a program that helps established AI safety researchers find mentees. The program has grown substantially since we first provided funding, and now supports 15 mentors, but at the time, the mentors were Alex Gray, Beth Barnes, Evan Hubinger, John Wentworth, Leo Gao, Mark Xu, and Stuart Armstrong. Mentors took part in the program in Berkeley in a shared office space.
  • When SERI MATS was founded, there were very few opportunities for junior researchers to try out doing alignment research. Many opportunities were informal mentorship positions, sometimes set up through cold emails or after connecting at conferences. The program has generally received many more qualified applicants than they have places for, and the vast majority of fellows report a positive experience of the program. I also believe the program has substantially increased the number of alignment research mentorship positions available.
  • I think that SERI MATS is performing a vital role in building the talent pipeline for alignment research. I am a bit confused about why more organisations don’t offer larger internship programs so that the mentors can run their programs ‘in-house’. My best guess is that MATS is much better than most organisations running small internship programs for the first time, particularly in supporting their fellows holistically (often providing accommodation and putting significant effort into the MATS fellows community). One downside of the program relative to an internship at an organisation is that there are fewer natural routes to enter a managed position, though many fellows have gone on to receive LTFF grants for independent projects or continued their mentorship under the same mentor.

Robert Long ($10,840): travel funding for participants in a workshop on the science of consciousness and current and near-term AI systems

Please note this grant has been approved but at the time of writing it has not been paid out.

  • We funded Robert Long to run a workshop on the science of consciousness for current and near-term AI systems. Robert and his FHI colleague, Patrick Butlin, began the project on consciousness in near-term AI systems during their time at FHI, where they both worked in the digital minds research group. Since January of this year, Rob has been continuing the project while a philosophy fellow at CAIS. There are surprisingly few people investigating the consciousness of near-term AI systems, which I find pretty worrying given the rapid pace of progress in ML. I think that it’s plausible we end up creating many copies of AI systems and use them in ways that we’d consider immoral given enough reflection, in part due to ignorance about their preferences. The workshop aimed to produce a report applying current theories of consciousness (like integrated information theory and global workspace theory) to current ML systems.
  • I think that Rob is an excellent fit for this kind of work; he is one of the few people working in this area and has written quite a lot about AI consciousness on his blog. He has a PhD in philosophy from NYU, where he was advised by David Chalmers, and has experience running workshops (e.g. in 2020, he ran a workshop on philosophy and large language models with Amanda Askell).

Jeffrey Ladish ($98,000): 6-month stipend & operational expenses to start a cybersecurity & alignment risk assessment org

Please note this grant has been approved but at the time of writing it has not been paid out.

  • Jeffrey Ladish applied for funding to set up an organisation to do AI risk communications,  with a focus on cybersecurity and alignment risks. His organisation, Palisade Research Inc., plans to conduct risk assessments and communicate those risks to the public, labs and the government. The theory of change is that communicating catastrophic risks to the public and key decision makers could increase political support for slowing down AI and other measures that might reduce AI risk. I am particularly excited about Jeffrey’s organisation demonstrating offensive AI cyber capabilities and other demos that help to communicate current risks from advanced AI systems.
  • I am pretty excited about Jeffrey’s organisation. He has worked on information security in various organisations (including Anthropic), he seems well-networked amongst people working in think tanks and AI labs, and I like his public writing on AI risk. I am generally sceptical of people doing work related to policy without having first worked in lower-stakes positions in similar areas, but I thought that Jeffrey was orienting to the downsides very reasonably and doing the sensible things, like developing plans with more experienced policy professionals.

Grants evaluated by Matthew Gray

Leap Laboratories ($195,000): One year of seed funding for a new AI interpretability research organisation.

  • Jessica Rumbelow applied for seed funding to set up an interpretability research organisation, which hopes to develop a model-agnostic interpretability engine.
  • I’m excited about this grant primarily based on the strength of research work she did with Matthew Watkins during SERI MATS, discovering anomalous tokens like SolidGoldMagikarp.
  • I think trends in the AI development space suggest a need for model-agnostic methods.
  • More broadly, I think this showcases one of the primary benefits of interpretability research: it’s grounded in a way that makes it easy to verify and replicate.  

Daniel Kokotajlo ($10,000): Funding for a research retreat on a decision-theory/cause-prioritisation topic.

  • We funded a research retreat run by Daniel Kokotajlo on Evidential Cooperation in Large Worlds. I think research retreats like this are both quite productive and quite cheap; we only have to pay for travel and housing costs, and the attendees are filtered on intrinsic interest in the topic.
     

Grants evaluated by Thomas Larsen

Kaarel Hänni, Kay Kozaronek, Walter Laurito, and Georgios Kaklmanos ($167,480): Implementing and expanding on the research methods of the  "Discovering Latent Knowledge" paper. 

This is a team which started in SERI MATS applying for funding to continue their SERI MATS project on research checking for dishonesty in advanced AI systems. 

My cruxes for this type of grant are: 

(1) If done successfully, would this project help with alignment?

(2) How likely is this team to be successful? 

My thoughts on (1): 

This is meant to build upon Burns et al.'s Discovering Latent Knowledge paper (DLK), which finds a direction in activation space that is supposed to represent the 'truth' of a logical proposition.

I think that Eliciting Latent Knowledge (ELK) is an important subproblem of alignment, and I think it can be directly applied to combat deceptive alignment. My independent impression is that this specific direction towards solving ELK is not very useful towards a full alignment solution, but that it may lead to slightly better monitoring. (In particular, I think even in a good outcome, this will only lead to an average case solution to ELK, meaning that when we explicitly train against this detector, it will fail.) I expect that AGI projects will be in a position where it's obvious that the systems they are building are capable and dangerous, and it will be apparent that instrumental incentives kick in for e.g. powerseeking and deception. I think that this technique might help us detect this danger, but given that we can't train against it, it doesn't let us actually fix the underlying problem. Thus, the lab will be in the difficult position of continuing on, or having to train against their detection system. I still think that incremental progress on detecting deception is good, because it can help push for a stop in capabilities growth before prematurely continuing to AGI. 

My thoughts on (2): 

They produced reasonable output during SERI MATS, including the beginning of a replication of the DLK paper. They weren't that specific in their grant application, but they wrote a number of ideas for ways to extend the paper in the LW post. The two ideas that seem best to me are:

  1. Connecting DLK to mechanistic interpretability. This seems hard, but maybe tinkering around in activation space can be helpful. 
  2. Creating a better confidence loss. In the original paper, only one statement was considered, and so the loss was coming from the constraint that P(q) + P(not q) = 1. They propose evaluating two propositions p & q, and getting more constraints from that. 
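To make the second idea concrete, here is a minimal sketch of the single-statement objective as I understand it from the DLK paper: a consistency term that enforces P(q) + P(not q) = 1, plus a confidence term that discourages the degenerate answer of 0.5 for both. The function name `ccs_loss` and the plain-list inputs are illustrative assumptions, not the team's code, and the sketch leaves out the probe and the activation normalization entirely. It does not attempt the proposed two-proposition extension.

```python
from typing import Sequence


def ccs_loss(p_true: Sequence[float], p_false: Sequence[float]) -> float:
    """Average single-statement loss over a batch of contrast pairs.

    p_true[i]  : probe probability assigned to statement i ("q")
    p_false[i] : probe probability assigned to its negation ("not q")
    Both inputs are hypothetical probe outputs in [0, 1].
    """
    if len(p_true) != len(p_false) or len(p_true) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    total = 0.0
    for a, b in zip(p_true, p_false):
        # Consistency: P(q) and P(not q) should sum to 1.
        consistency = (a - (1.0 - b)) ** 2
        # Confidence: penalize the trivial solution P(q) = P(not q) = 0.5.
        confidence = min(a, b) ** 2
        total += consistency + confidence
    return total / len(p_true)


if __name__ == "__main__":
    # A maximally unsure probe satisfies consistency but pays the confidence penalty.
    print(ccs_loss([0.5], [0.5]))            # 0.25
    # A confident, consistent probe incurs no loss.
    print(ccs_loss([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Adding a second proposition would presumably add further terms of the same shape (one per logical constraint), but the exact constraints are the team's to specify.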

These ideas don't seem amazing, but they seem like reasonable things to try. I expect that the majority of the benefit will come from staring at the model internals and the results of the techniques and then iterating. I hope that this process will churn out more and better ideas. 

One reservation I have is that none of the applicants have an established research track record, though they have published several papers: 

- Kaarel's arXiv page

- Walter's Google Scholar Profile

- Georgios's ORCID

This team did get strong references from Colin Burns and John Wentworth, which makes me a lot more excited about the project. All things considered, I'm excited about giving this team a chance to work on this project and seeing how they do. I'm also generally enthusiastic about teams trying their hand at alignment research.

Joseph Bloom ($50,000): Funding AI alignment research into circuits in decision transformers. 

Joseph applied for independent research funding to continue his research into decision transformer interpretability. I'm happy about Joseph's initial result, which found circuits in a decision transformer in a simple RL environment. I thought his write-up was solid, and it gave me some updates on what cognitive machinery I expect to be induced by RL. In particular, I was excited about the preference directions in embedding space that he constructed. This seems like a useful initial step for retargeting the search, though more understanding of the circuits that are doing the optimization seems critical for this approach.

I think interpretability on RL models is pretty neglected and very relevant for safety.

According to a reference, the applicant was also in the top 3 ARENA participants, and was very motivated and agentic. 

The counterfactual is that Joseph tries to get funding elsewhere and, if that fails, gets a research engineer job at an AI safety org (e.g. Redwood, Conjecture, Ought, etc.). I encouraged Joseph to apply to the AI safety orgs, as I think that working at an org is generally more productive than independent research. These jobs are quite competitive, so it's likely that Joseph won't get hired by any of them, and in that case, it seems great to pay him to do independent alignment research.

Overall, I think that Joseph is a promising researcher, and is working on a useful direction, so I feel excited about supporting this.  

Since receiving this grant, Joseph has received some more funding (here), and was mentioned in the Anthropic May Update.

Other grants we made during this period

Applicant Name | Grant Summary | Awarded Amount | Decision Date
Thomas Woodside | Support to work on research projects relevant to AI alignment | $50,000 | January 2022
Anonymous | Support to study and gain a background in technical AI | $3,000 | January 2022
Charlie Steiner | Support for researching value learning | $50,000 | January 2022
Logan Smith | Support to create language model (LM) tools to aid alignment research through feedback and content generation | $40,000 | January 2022
Paul Colognese | Educational scholarship in AI safety | $13,000 | January 2022
Anonymous | AI governance PhD | $4,129 | January 2022
Ruth Grace Wong | Research paper about the history of philanthropy-driven national-scale movement-building strategy to inform how EA funders might go about building movements for good | $2,000 | February 2022
Stephen Grugett, James Grugett, Austin Chen | Support to build a forecasting platform based on user-created play-money prediction markets | $200,000 | February 2022
Marius Hobbhahn | Research on AI safety | $30,103 | February 2022
JJ Hepburn | Health coaching to optimise the health and wellbeing, and thus capacity/productivity, of those working on AI safety | $80,000 | February 2022
Vael Gates | Support for a study on AI researchers’ perceptions of safety | $9,900 | February 2022
William Bradshaw | Support to work on biosecurity | $11,400 | February 2022
Michael Parker | Catalogue the history of U.S. high-consequence pathogen regulations, evaluate their performance, and chart a way forward | $34,500 | February 2022
Stuart Armstrong | Support for setting up a research company in AI alignment | $33,762 | February 2022
Anonymous | AI safety field-building | $32,568 | February 2022
Anonymous | Travel funds to attend a conference and network with the community at an EA hub | $1,600 | February 2022
Timothy Underwood | Write a SF/F novel based on the EA community | $15,000 | February 2022
Simon Grimm | Financial support for work on a biosecurity research project and workshop, and travel expenses | $15,000 | February 2022
Anonymous | Scholarship/teaching buy-out to finish Master's thesis and commence AI safety research | $10,800 | February 2022
Oliver Zhang | Running an alignment theory mentorship program with Evan Hubinger | $3,600 | February 2022
Anonymous | A large conference hosting communities working on improving the long-term future | $250,000 | February 2022
Anonymous | Recording written materials that are useful for people working on AI governance | $5,100 | March 2022
Gavin Leech | Researching and documenting longtermist lessons from COVID | $5,625 | March 2022
Anonymous | Support to work on a safe exploration project with an AI research organization | $33,000 | March 2022
Anonymous | Support to work on a technical AI safety research project in an academic lab | $45,000 | March 2022
Jessica Cooper | Funding to trial a new London organisation aiming to 10x the number of AI safety researchers | $234,121 | March 2022
Aaron Bergman | Research on EA and longtermism | $70,000 | March 2022
Anonymous | Funding a visit to the Sculpting Evolution group for collaboration | $4,000 | March 2022
Jan-Willem van Putten | EU Tech Policy Fellowship with ~10 trainees | $68,750 | March 2022
Anonymous | 3-month funding to do an internship to develop career capital in policy advocacy | $12,600 | March 2022
Anonymous | Support for equipment for AI Safety and Metascience research | $1,905 | March 2022
Darryl Wright | 1-year research stipend (and travel and equipment expenses) for support for work on 2 AI safety projects: 1) Penalising neural networks for learning polysemantic neurons; and 2) Crowdsourcing from volunteers for alignment research. | $150,000 | March 2022
Anonymous | Support for travel and equipment expenses for EA work on AI alignment | $5,000 | March 2022
Tomáš Gavenčiak | Organise the third Human-Aligned AI Summer School, a 4-day summer school for 150 participants in Prague, summer 2022 | $110,000 | March 2022
Anonymous | Independent alignment research at the intersection of computational cognitive neuroscience and AGI safety | $55,000 | March 2022
Kai Sandbrink | Starting funds for a DPhil project in AI that addresses safety concerns in ML algorithms and positions | $3,950 | April 2022
Maximilian Kaufmann | Support to work on technical AI alignment research | $7,000 | April 2022
Anonymous | PhD in Safe and Trusted AI with a focus on inductive biases towards the interpretability of neural networks | $63,259 | April 2022
Chloe Lee | Support to study emerging policies in biosecurity for better understanding and global response coordination | $25,000 | April 2022
Jack Ryan | Support for alignment theory agenda evaluation | $25,000 | April 2022
Isabel Johnson | Support research, write, and publish a book: a survey on the unknown dangers of a contemporary nuclear strike | $5,000 | April 2022
Nicholas Greig | Neural network interpretability research | $12,990 | April 2022
Anne le Roux | GovAI salaries and overheads | $401,537 | April 2022
Daniel Skeffington | Research and a report/paper on the role of emergency powers in the governance of X-Risk | $26,000 | April 2022
Noga Aharony | Support for PhD developing computational techniques for novel pathogen detection | $20,000 | April 2022
Tim Farrelly | Equipment for AI Safety research | $3,900 | April 2022
Sasha Cooper | 6 months funding for supervised research on the probability of humanity becoming interstellar given non-existential catastrophe | $36,000 | April 2022
Kevin Wang | Support to work on Aisafety.camp project, impact of human dogmatism on training | $2,000 | April 2022
Philipp Bongartz | Enabling prosaic alignment research with a multi-modal model on natural language and chess | $25,000 | May 2022
Ross Graham | Stipend and research fees for completing dissertation research on public ethical attitudes towards x-risk | $60,000 | May 2022
Nikiforos Pittaras | Support and compute expenses for technical AI Safety research on penalising RL agent betrayal | $14,300 | May 2022
Josiah Lopez-Wild | Funding a new computer for AI alignment work, specifically a summer PIBBSS fellowship and ML coding | $2,500 | May 2022
Theo Knopfer | Support to explore biosecurity policy projects: BWC/ European early detection systems/Deep Vision risk mitigation | $27,800 | May 2022
Jan Kirchner | Support for working on "Language Models as Tools for Alignment" in the context of the AI Safety Camp. | $10,000 | May 2022
Lucius Bushnaq, Callum McDougall, Avery Griffin | Support to investigate the origins of modularity in neural networks | $125,000 | May 2022
Anonymous | Admissions fee for MPA in International Development at a top university | $800 | May 2022
Anonymous | Support for research on international standards for AI | $5,250 | May 2022
Rory Gillis | Research project designed to map and offer preliminary assessment of AI ideal governance research | $2,000 | May 2022
John Bridge | Research into the international viability of FHI's Windfall Clause | $3,000 | May 2022
CHERI / Naomi Nederlof | Stipends for students of 2022 CHERI’s summer residence | $134,532 | May 2022
Wyatt Tessari | Support to connect, expand and enable the AGI safety community in Canada | $87,000 | May 2022
Ondrej Bajgar | Support funding during 2 years of an AI safety PhD at Oxford | $11,579 | May 2022
Neil Crawford | Support gatherings during 12 months period for discussion of AI safety | $10,000 | May 2022
Anonymous | Support to do AI alignment research on Truthful/Honest AI | $120,000 | May 2022
Logan Strohl | Support to further develop a branch of rationality focused on patient and direct observation | $80,000 | May 2022
Anonymous | Support for courses and research on AI | $4,000 | May 2022
Anonymous | Support to explore the concept of normative risk and its potential practical consequences | $20,000 | May 2022
Philippe Rivet | Support for research into applied technical AI alignment work | $10,000 | May 2022
Anonymous | Support to extend Udacity Deep Reinforcement Learning Nanodegree | $1,400 | May 2022
Cindy Wu | ML security/safety summer research project: model backdooring through pre-processing | $5,000 | May 2022
Marius Hobbhahn | Support for Marius Hobbhahn for piloting a program that approaches and nudges promising people to get into AI safety faster | $50,000 | May 2022
Conor McGlynn | Up-skill for AI governance work before starting Science and Technology Policy PhD at Harvard | $17,220 | June 2022
Anonymous | Support to hire a shared PA for researchers working at two organisations contributing to AI safety and governance | $78,000 | June 2022
Nora Ammann | Support the PIBBSS fellowship with more fellows than originally anticipated and to realize a local residency | $180,200 | June 2022
Julia Karbing | Paid internships for promising Oxford students to try out supervised AI Safety research projects | $60,000 | June 2022
Anonymous | Support to take Sec401 course from SANS for cyber security professionals | $8,589 | June 2022
Anonymous | Funding for 1-year executive and research assistance to support 2 researchers working in the longtermist space | $84,000 | June 2022
Francis Rhys Ward | Funding to support PhD in AI Safety at Imperial College London, technical research and community building | $6,350 | June 2022
Peter Barnett | Equipment for technical AI safety research | $4,099 | June 2022
Jacques Thibodeau | 3-month research stipend to continue working on AISC project to build a dataset for alignment and a tool to accelerate alignment | $22,000 | June 2022
Chris Patrick | Stipend to produce a guide about AI safety researchers and their recent work, targeted to interested laypeople | $5,000 | June 2022
Anonymous | Software engineering to revise and resubmit a multi-objective reinforcement learning paper | $26,000 | June 2022
Anonymous | PhD/research stipend for work on key longtermist area | $30,000 | June 2022
Jay Bailey | Support for Jay Bailey for work in ML for AI Safety | $79,120 | June 2022
Thomas Kehrenberg | 6-month research stipend for AI alignment research | $15,000 | June 2022
Solomon Sia | Support to lobby the CFTC and legalise prediction markets | $138,000 | June 2022
Bálint Pataki | Support AI Policy studies in the ML Safety Scholars program and at Oxford | $3,640 | June 2022
Jade Zaslavsky | 12-month research stipend to work on ML models for detecting genetic engineering in pathogens | $85,000 | June 2022
Gergely Szucs | 6-month research stipend to develop an overview of the current state of AI alignment research, and begin contributing | $70,000 | June 2022
Anonymous | AI safety PhD funding | $7,875 | June 2022
Conor Barnes | Website visualising x-risk as a tree of branching futures per Metaculus predictions | $3,500 | June 2022
Jonas Hallgren | 3-month research stipend to set up a distillation course helping new AI safety theory researchers to distil papers | $14,600 | June 2022
Victor Warlop | SERI MATS aims at scaling the number of alignment theorists by pairing promising applicants with renowned mentors | $316,000 | June 2022
Patrick Gruban | Weekend organised as a part of the co-founder matching process of a group to found a human data collection org | $2,300 | June 2022
Victor Warlop Piers de Raveschoot | Retroactive grant for managing the MATS program, 1.0 and 2.0 | $27,000 | June 2022
Anonymous | A 1-year research stipend for up-skilling in technical and general AI alignment to prepare for an impactful job in the field | $110,000 | June 2022
Anonymous | 7-month research stipend to do independent AI Safety research on interpretability and upskill in ML engineering | $43,600 | June 2022
Mario Peng Lee | Stanford Artificial Intelligence Professional Program Tuition | $4,785 | July 2022
Conor Sullivan | Develop and market video game to explain the Stop Button Problem to the public & STEM people | $100,000 | July 2022
Quinn Dougherty | Short meatspace workshop to hone, criticize, and evaluate hazardousness of a new research programme in alignment | $9,000 | July 2022
Viktoria Malyasova | Stipend for up-skilling in infrabayesianism prior to start of SERI MATS program | $4,400 | July 2022
Samuel Nellessen | 6-month budget to self-study ML and the possible applications of a Neuro/CogScience perspective for AGI Safety | $4,524 | July 2022
Charles Whittaker | Support for academic research projects relating to pandemic preparedness and biosecurity | $8,150 | July 2022
Amrita A. Nair | 3-month funding for upskilling in technical AI Safety to test personal fit and potentially move to a career in alignment | $1,000 | July 2022
Jeffrey Ohl | Tuition to take one Harvard economics course in fall 2022 to be a more competitive econ graduate school applicant | $6,557 | July 2022
Anonymous | Funding to take an online course on public policy to help the applicant transition from Machine Learning to AI-Governance | $2,732 | July 2022
Samuel Brown | 6-month research stipend to research AI alignment, specifically the interaction between goal-inference and choice-maximisation | $47,074 | July 2022
Anonymous | Support for multiple ML projects to build up skills for AI safety PhD | $1,100 | July 2022
Anonymous | 25-month grant funding EA-relevant dissertation that contributes to improved research on rate-limiting steps and constraints in AI research. | $139,000 | July 2022
Kyle Scott | A research and networking event for winners of the Eliciting Latent Knowledge contest to encourage collaboration on aligning future machine learning systems with human interests | $72,000 | July 2022
Derek Shiller | Support for an academic project evaluating factors relevant to digital consciousness with the aim of better understanding how and how not to create conscious artificial intelligences. | $11,000 | July 2022
Anonymous | Funds to attend cybersecurity conferences - defcon.org and blackhat.com | $5,550 | July 2022
Max Clarke | Financial support for career exploration and related project in AI alignment | $26,077 | August 2022
Anonymous | 2-month research stipend to build skills, and broaden action space for EA related projects to undertake in gap year | $15,320 | August 2022
Jonathan Ng | Funding support for MLSS scholar to up-skill in ML for alignment, documenting key learnings, and visit Berkeley in pursuit of a career in technical AI safety. | $16,000 | August 2022
Hamza Tariq Chaudhry | Equipment expenses for summer research fellowship at CERI and organising the virtual Future of Humanity Summit | $2,500 | August 2022
Anonymous | Research project on strategies to mitigate x-risk in Party Politics | $3,000 | August 2022
Anonymous | Funding for administrative support to the CEO for a large team working on research of interest to the longtermist community | $50,847 | August 2022
Simon Skade | Funding for 3 months’ independent study to gain a deeper understanding of the alignment problem, publishing key learnings and progress towards finding new insights. | $35,625 | August 2022
Ardysatrio Haroen | Support participation in MLSS program working on AI alignment. | $745 | August 2022
Antonio Franca | Equipment stipend for MLSS scholar to do research in AI technical research | $2,000 | August 2022
Darren McKee | Support for a non-fiction book on threat of AGI for a general audience | $50,000 | August 2022
Steve Petersen | Research stipend to work on the foundational issue of *agency* for AI safety | $20,815.20 | August 2022
Ross Nordby | Support for AI safety research and concrete research projects | $62,500 | August 2022
Leah Pierson | 300-hour research stipend for a research assistant to help implement a survey of 2,250 American bioethicists to lead to more informed discussions about bioethics. | $4,500 | August 2022
Luca De Leo | 12-month research stipend to study and get into AI Safety Research and work on related EA projects | $14,000 | August 2022
Anonymous | Two months of independent study in alignment to start my career as an alignment researcher | $8,333 | August 2022
Robi Rahman | Support for part-time rationality community building | $4,000 | August 2022
Lennart Justen | Funding to increase my impact as an early-career biosecurity researcher | $6,000 | September 2022
Fabienne Sandkühler | Funding for research on the effect of creatine on cognition | $4,000 | September 2022
Chris Leong | Funding for the AI Safety Nudge Competition | $5,200 | September 2022
Brian Porter | Independent research and upskilling for one year, to transition from academic philosophy to AI alignment research | $60,000 | September 2022
John Wentworth | 1-year research stipend for research in applications of natural abstraction | $180,000 | September 2022
Anonymous | 6 month research stipend for SERI MATS scholar to continue working on Alignment and ML Interpretability | $48,000 | September 2022
Nicky Pochinkov | 6-month research stipend for SERI MATS scholar to continue working on theoretical AI alignment research, trying to better understand how ML models work to reduce X-risk from future AGI | $50,000 | September 2022
David Hahnemann, Luan Ademi | 6-month research stipend for 2 people working on modularity, a subproblem of Selection Theorems and budget for computation | $26,342 | September 2022
Dan Valentine | 12-month research stipend to transition career into technical alignment research | $25,000 | September 2022
Anonymous | 3-month funding to explore GCBR-focused biosecurity projects after having finished my virology PhD | $25,000 | September 2022
Logan Smith | 6-month research stipend for continued work on shard theory: studying how inner values are formed by outer reward schedules | $40,000 | September 2022
Gunnar Zarncke | One year grant for a project to reverse-engineer human social instincts by implementing Steven Byrnes' brain-like AGI | $16,600 | September 2022
Zach Peck | Supporting participation at the Center for the Advancement of Rationality (CFAR) workshop | $1,800 | September 2022
Anonymous | AI master's thesis and research in longtermism | $30,000 | September 2022
Anonymous | Upskilling in technical AI Safety Research to contribute to the field through an engineering or research role | $33,000 | September 2022
Adam Rutkowski | Piloting an EA hardware lab for prototyping hardware relevant to longtermist priorities | $44,000 | September 2022
Anonymous | Setting up experiments with LLM to examine Strategic Instrumental Behavior in real-life setting | $50,000 | September 2022
Egor Zverev | PhD program support | $6,500 | September 2022
Anonymous | 1-year research stipend to work on alignment research full time | $80,000 | September 2022
Shavindra Jayasekera | Research in machine learning and computational statistics | $38,101 | October 2022
Hoagy Cunningham | 6-month stipend for research into preventing steganography in interpretable representations using multiple agents | $20,000 | October 2022
Joel Becker | 5-month research stipend to support civilizational resilience projects arising from SHELTER Weekend | $27,248 | October 2022
Jonas Hallgren | 4 month research stipend to set up AI safety groups at 2 groups covering 3 universities in Sweden with eventual retreat | $10,000 | October 2022
Anonymous | 4 month research stipend in technical safety, ML, and AI chip supply chains before participating in an AI governance program | $11,500 | October 2022
Anonymous | 8-month research stipend to do research in AI safety | $35,000 | October 2022
Anonymous | 3-month research stipend in technical AI safety | $9,750 | October 2022
David Udell | One-year full-time research stipend to work on alignment distillation and conceptual research with Team Shard after SERI MATS | $100,000 | October 2022
John Burden | Funding 2 years of technical AI safety research to understand and mitigate risk from large foundation models | $209,501 | October 2022
Anonymous | AI safety research | $1,500 | October 2022
Garrett Baker | 12-month research stipend to work on alignment research | $96,000 | October 2022
Magdalena Wache | 9-month part-time research stipend for AI safety, test fit for theoretical research | $62,040 | October 2022
Anshuman Radhakrishnan | 6-month stipend to continue upskilling in Machine Learning in order to contribute to Prosaic AI Alignment Research | $55,000 | October 2022
Theo Knopfer | Travel Support to BWC RevCon & Side Events | $3,500 | October 2022
Daniel Herrmann | Support for PhD on embedded agency, to free up my time from teaching | $64,000 | October 2022
Jeremy Gillen | 6-month research stipend to work on the research I started during SERI MATS, solving alignment problems in model based RL | $40,000 | October 2022
Anonymous | 3.5 months’ support for ML engineering skill-up | $8,720 | October 2022
Edward Saperia | One year of funding to improve an established community hub for EA in London | $50,000 | November 2022
Chu Chen | 1-year research stipend for upskilling in technical AI alignment research | $96,000 | November 2022
Anonymous | 12-month stipend to research assumptions underlying most existing work on AI alignment and AI forecasting | $7,645 | November 2022
Kajetan Janiak | Support for AI safety research. | $4,000 | November 2022
Felix Hofstätter | 6-month research stipend for an AI alignment research project on the manipulation of humans by AI | $25,383 | November 2022
Maximilian Kaufmann | 4 month research stipend to support an early-career alignment researcher, who is taking a year to pursue research and test fit | $20,000 | November 2022
Will Aldred | 6-month research stipend to: 1) Carry out independent research into risks from nuclear weapons, 2) Upskill in AI strategy | $40,250 | November 2022
Benjamin Anderson | Support to conduct work in AI safety | $5,000 | November 2022
Arun Jose | 4-month funding for Arun Jose's independent alignment research and study | $15,478 | November 2022
Anonymous | Professional development grant for independent upskilling in AGI Safety | $3,600 | November 2022
Matthias Georg Mayer | 6-months research stipend for upskilling and researching “Framing computational systems such that we can find meaningful concepts." | $24,000 | November 2022
Johannes C. Mayer | 6 months research stipend. Turn intuitions, like goals, wanting, abilities, into concepts applicable to computational systems | $24,000 | November 2022
Anonymous | Funding for MSc Thesis on Language Models Safety | $28,160 | November 2022
Paul Bricman | 1-year stipend and compute for conducting a research project focused on AI safety via debate in the context of LLMs | $50,182 | November 2022
Simon Möller | 6-months research stipend to transition into technical AI Safety work by working through Jacob Hilton’s curriculum and a project | $65,000 | November 2022
Anonymous | Fall semester stipend to work on AI Safety research, in particular adversarial robustness, monitoring, and trojaning | $7,500 | November 2022
Alan Chan | 4-month research stipend for a research visit with David Krueger on evaluating non-myopia in language models and RLHF systems | $12,321 | November 2022
Tomislav Kurtovic | 3-month research stipend to skill up in ML and Alignment with goal of developing a streamlined course in Math/AI | $5,500 | November 2022
Kadri Reis | Support to participate in Biological Weapons Convention in Geneva | $1,500 | November 2022
Skyler Crossman | Twelve month funding for global rationality organization development | $130,000 | December 2022
Daniel O'Connell | Investigate AI alignment options | $54,250 | December 2022
Remmelt Ellen | Cover participant stipends for AI Safety Camp Virtual 2023 | $72,500 | December 2022
Josiah Lopez-Wild | Scholarship for PhD student working on research related to AI Safety | $8,000 | January 2023
Zhengbo Xiang (Alana) | Support for 18 months of independent alignment research and upskilling, focusing on developing a research agenda on corrigibility | $30,000 | January 2023
Daniel Filan | Funding to make 12 more AXRP episodes, the AI X-risk Research Podcast. | $23,544 | January 2023
Sam Marks | 3-week research stipend for three people to review AI alignment agendas | $26,000 | January 2023
Robert Kirk | Funding to perform human evaluations for evaluating different machine learning methods for aligning language models | $10,000 | January 2023
Jérémy Perret | Support for AI alignment outreach in France (video/audio/text/events) & field-building | $24,800 | January 2023
Peter Ruschhaupt | 3 months support for exploring career options in AI governance - upskilling, networking and writing articles summarising present AI governance work and ideas. | $20,000 | January 2023
Charlie Griffin | 8 months research stipend for alignment work: assisting academics, skilling up and personal research. | $35,000 | January 2023
Alexander Lintz | 6 months research stipend for independent work centred on distillation and coordination in the AI governance & strategy space | $69,940 | January 2023
Anonymous | Living cost stipend top up while working on long-term future relevant research at a think tank | $15,000 | January 2023
Francis Rhys Ward | Support for PhD in AI safety - technical research and community building work | $2,305 | January 2023
Lucius Bushnaq | 6-month research stipend for two people to find formalisms for modularity in neural networks | $72,560 | January 2023
David Quarel | Support for a project with the Cambridge AI Safety group. The group will be working on projects related to AI alignment, in particular, setting up experimental demonstrations of deceptive alignment. | $5,613 | January 2023
Tim Farkas | Funding to run a 20-30 people 2-3 day retreat & bring together key EA thinkers/actors of the mind enhancement cause area | $2,540 | February 2023
Wyatt Tessari | 3-month stipend to connect, expand and enable the AGI gov/safety community in Canada | $17,000 | February 2023
Anonymous | 14-month research stipend and research costs for 3 research reports on best risk communication practices for longtermist orgs | $96,000 | February 2023
Daniel Kokotajlo | Funding for research retreat on a decision-theory / cause-prioritisation topic. | $10,000 | February 2023
Alex Altair | Funding for research stipend to develop a framework of optimisation. | $8,000 | February 2023
Max Lamparth | Funding for technical AI safety research - using interpretability methods on large language models for AI safety. | $2,500 | February 2023
Liam Carroll | 6-week research stipend to publish a series of blogposts synthesising Singular Learning Theory for a computer science audience | $8,000 | February 2023
Amrita A. Nair | 3-month scholarship to support Amrita Nair's upskilling in AI Safety working on Evan Hubinger's Reward Side-Channels experiment proposal. | $5,000 | February 2023
Gerold Csendes | Funding for project transitioning from AI capabilities to AI Safety research. | $8,200 | February 2023
Anonymous | Career transition including but not limited to exploring helping set up an x-risk research institute and working on a research project on AI ethics boards | $30,000 | February 2023
Tamsin Leake | 6 months research stipend to do independent AI alignment research focused on formal alignment and agent foundations | $30,000 | February 2023
Chris Scammell, Andrea Miotti, Katrina Joslin | A 2-day workshop to connect alignment researchers from the US, UK, and AI researchers and entrepreneurs from Japan | $72,827 | February 2023
Joseph Bloom | 6-month research stipend to conduct AI alignment research into circuits in decision transformers | $50,000 | February 2023
Carson Jones | 1 year research stipend (or less) to help alignment researchers improve their research ability via 1-on-1 conversations | $10,000 | February 2023
Andrei Alexandru | Fine-tuning large language models for an interpretability challenge (compute costs) | $11,300 | February 2023
Anonymous | Two month research stipend and bridge funding to complete an AI governance report and produce a related article | $11,560 | February 2023
Jacob Mendel | General support to spend 1 month working with Will Bradshaw's team at the Nucleic Acid Observatory producing reports on the merits of alternative sample choices to wastewater for metagenomic sequencing. | $4,910 | February 2023
Max Räuker | Funding for Max Räuker's part-time research stipend for a trial and developer costs to maintain and improve the AI governance document sharing hub | $15,000 | March 2023
Anonymous | A twelve month research stipend to pursue independent writing on the sociology and philosophy of longtermist effective altruism | $75,346 | March 2023
Anonymous | 3-4 month stipend for AI safety upskilling and research | $7,000 | March 2023
Fabian Schimpf | 6-month research stipend for AI alignment research and conduct independent research on limits of predictability | $28,875 | March 2023
Anonymous | Support for PhD student pursuing research areas that intersect economics and EA | $4,528 | March 2023
Kane Nicholson | 6-months research stipend for AI safety upskilling and research projects | $26,150 | March 2023
David Lindner | Support for David Lindner and Jeremy Scheurer to participate in Redwood Research's REMIX program on mechanistic interpretability using their new causal scrubbing methodology | $4,300 | March 2023
Jessica Rumbelow | One year of seed funding for a new AI interpretability research organisation | $195,000 | March 2023
Alexander Large | 1 month general support for projects for small EA-aligned charities. | $3,618 | March 2023
Kaarel Hänni, Kay Kozaronek, Walter Laurito, and Georgios Kaklamanos | 6-month research stipend for Georgios Kaklamanos, Walter Laurito, Kaarel Hänni and Kay Kozaronek to continue their SERI-MATS project on expanding the "Discovering Latent Knowledge" paper | $167,480 | March 2023
Matt MacDermott | 3 month research stipend for SERI MATS extension on agent foundations research | $24,000 | March 2023
Max Kaufmann | 9 months of funding for an early-career alignment researcher to work with Owain Evans and others | $45,000 | March 2023
Anonymous | 40 hours of research stipend for researchers to finish a paper on governing AI via compute | $1,200 | March 2023
Robert Miles | Funding for additional fellows for the AISafety.info Distillation Fellowship, improving our single-point-of-access to AI safety | $54,962 | March 2023
Alexander Turner | Funding Alexander Turner and team research project - Writing new motivations into a policy network by understanding and controlling its internal decision-influences | $115,411 | March 2023
Anonymous | 3-months stipend for upskilling in ML to transition from mathematics (at PhD level) to AI safety work. During the grant period, project goals include replicating an interpretability paper with longer term goals of publishing project write-ups. | $5,300 | March 2023
Anonymous | 3-4 month salary to help setup a new division at a US think tank doing AI governance research | $26,800 | March 2023
Anonymous | 2-month living expenses while waiting to join a US think tank | $12,000 | March 2023
Andrey Tumas | 4-month research stipend for conceptual/theoretical research towards perfect world-model interpretability. | $30,000 | March 2023
Nora Ammann | Funding for PIBBSS research fellowship to host 6 additional fellows | $100,000 | March 2023
David Staley | Support to maintain a copy of the alignment research dataset etc in the Arctic World Archive for 5 years | $3,000 | March 2023
Wesley Fenza | One-year funding of Astral Codex Ten meetup in Philadelphia | $5,000 | March 2023
Matthew MacInnes | 8 months support to test fit for social scientific research related to AI governance, preparing for MPhil proposal. | $9,000 | March 2023
Anonymous | 3-months funding for upskilling in AI Safety and research on hardware-enabled mechanisms for AI Governance. | $48,000 | March 2023
Anonymous | Support for PhD Track in Health and Security at a top US university | $9,800 | March 2023
Nicholas Kees Dupuis | 12-month research stipend to continue developing research agenda on new ways to make LLMs directly useful for alignment research without advancing capabilities | $120,000 | March 2023
Anonymous | Scholarship for taking the Offsec Certified Professional (OSCP) certification - the industry leading Penetration Testing with Kali Linux course and online lab before taking the OSCP certification exam. | $2,000 | March 2023
Jingyi Wang | Organising OPTIC: in-person, intercollegiate forecasting tournament. Boston, Apr 22. Funding is for prizes, venue, etc. | $2,100 | March 2023
Rusheb Shah | 6 months research stipend to upskill on technical AI safety through collaboration with researchers and self-study. | $50,000 | March 2023
Alfred Harwood | 6-month research stipend to research geometric rationality, ergodicity economics and their applications to decision theory and AI | $11,000 | April 2023
Alexander Turner | Year-long research stipend for shard theory and RL mech int research | $220,000 | April 2023
Said Achmiz1 year support for developing and maintaining projects/resources used by the EA and rationality communities$60,000April 2023
Skyler CrossmanSupport Astral Codex Ten Everywhere meetups$22,000April 2023
Vanessa Kosoy2-year research stipend for work on the learning-theoretic AI alignment research agenda$100,000April 2023
Robert LongSupport participants in a workshop on the science of consciousness and current and near-term AI systems$10,840April 2023
Mateusz Bagiński6-month research stipend to up skill in maths, ML and AI alignment as well as working on non-profit projects beneficial for AI safety in pursuit of a research career.$14,136April 2023
Quentin Feuillade--MontixiFunding for Quentin Feuillade-Montixi's 4 month SERI MATS extension in London, mentored by Janus and Nicholas Kees Dupuis to work on cyborgism$32,000April 2023
Anonymous3 month research stipend for independent research into and articles on large language models, agent foundations, and AI alignment$14,019April 2023
Smitha MilliSupport to participate in  the Symposium on AGI Safety at Oxford$1,500April 2023
Anonymous6-month research stipend and course funding to upskill in AI safety before entering the Civil Service Fast Stream in September 2023 (Data & Tech)$14,488April 2023
AnonymousSupport for independent projects & upskilling for AI safety work$18,000April 2023
Sage Bergerson5-month part time research stipend for collaborating on a research paper analysing the implications of compute access$2,500April 2023
Iván Godoy6-month research stipend to dedicate full-time to upskilling/AI alignment research tentatively focused on agent foundations and start a MIRIx group in Buenos Aires.$6,000April 2023
Naoya OkamotoSupport for  Mathematics of Machine Learning course offered by the University of Illinois at Urbana-Champaign.$7,500April 2023
Joshua Reiners4-month research stipend to work on a project finding the most interpretable directions in gpt2-small's early residual stream to better understand contemporary AI systems$16,300April 2023


 

Appendix: How we set grant and stipend amounts

(Our legal team requested that we include this section; it was written by Caleb Parikh.)

Over the last year, we have directed a significant portion of our grants toward supporting individuals in the field of AI safety research. When compared to much of the non-profit sector, some of our grants may seem large. However, I believe there are strong justifications for this approach.

Our grantees often have excellent earning potential

Our grantees often exhibit extraordinary earning potential due to their skills and qualifications. Many of them are excellent researchers (or have the potential to become one in a few years) and could easily take jobs in big tech or finance, and some could command high salaries (over $400k/year) while conducting similar research at AI labs. I expect that offering lower grants would push some grantees to take higher-earning options in private industry, creating less altruistic value. My impression is that our grants are not larger than comparable grants or salaries offered by many established AI safety organizations. In fact, I anticipate our grants are likely lower.

Grants have substantive downsides relative to working in an organisation

Grants, while helpful, do have some drawbacks compared to conventional employment. We do not provide additional benefits often found in organizations, such as health insurance, office spaces, or operations support, and our stipends often offer less financial security than full-time employment. Often, a portion of a grant is designed to support grantees’ operational and living expenses while they pursue their research projects.

Generally, we expect our grantees to work full-time on their projects, with similar intensity to the work they’d do at other organizations within EA and AI safety, and we structure our grants to account for this amount of work. There are, of course, benefits too, such as our grantees having more flexibility than they would in many organizations.

How we decide on personal stipend size

The fund operates as a collection of fund managers who sometimes have differing views on how much funding a grantee should receive.

Our general process is:

  1. The fund manager assigned to a grant reviews the budget provided by the grantee and makes adjustments based on their understanding of the grant, the market rate for similar work and other factors. 
  2. The grant size is then reviewed by the fund chair (Asya Bergal) and the director of EA Funds (Caleb Parikh).

One heuristic we commonly use (especially for new, unproven grantees) is to offer roughly 70% of what we anticipate the grantee would earn in an industry role. We want to compensate people fairly and allow them to transition to impactful work without making huge sacrifices, while conserving our funding and discouraging grifters. A relatively common procedure that fund managers use to decide how much to fund a grantee (assuming the fund manager has already decided the grantee is worth funding overall) is to:

  1. Calculate what we expect the grantee would earn for similar work in an industry role (in the location they’re planning on performing the grant activity).
  2. Look at the amount of funding the applicant has requested, and see if that amount differs significantly from 70% of their industry salary.
  3. If it doesn't differ significantly, make the grant with the requested number.
  4. If it does differ significantly, consider adjusting the grant upwards or downwards, taking into account other factors that would affect what an appropriate funding ask would be, e.g. their pre-existing track record. (We’re more likely to adjust a grant downwards if we think the requested amount is too high than upwards if we think the requested amount is too low.)
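
As a rough illustration of the four steps above, here is a minimal Python sketch. The 70% figure comes from the heuristic described in this section; everything else is an assumption made for illustration, including the function and field names, the example salary figures, and especially the 20% tolerance standing in for "differs significantly", which this report does not quantify. Step 4 is a judgment call, so the sketch only flags a request for review rather than computing an adjusted amount.

```python
from dataclasses import dataclass

INDUSTRY_FRACTION = 0.70  # from the text: roughly 70% of expected industry pay
TOLERANCE = 0.20          # ASSUMPTION: "significant" means more than 20% off target


@dataclass
class StipendCheck:
    target: float       # 70% of the estimated industry salary for the grant period
    requested: float    # amount the applicant asked for
    needs_review: bool  # True if the request differs significantly from the target
    direction: str      # "none", "above_target", or "below_target"


def check_stipend_request(industry_salary: float, requested: float,
                          tolerance: float = TOLERANCE) -> StipendCheck:
    """Compare a requested stipend with 70% of an estimated industry salary.

    Both amounts must cover the same period and location (steps 1 and 2).
    The function never proposes a new amount; it only reports whether a
    fund manager would need to consider an adjustment (steps 3 and 4).
    """
    if industry_salary <= 0:
        raise ValueError("industry_salary must be positive")
    target = INDUSTRY_FRACTION * industry_salary
    relative_gap = (requested - target) / target
    if abs(relative_gap) <= tolerance:
        # Step 3: close enough, so the grant is made at the requested number.
        return StipendCheck(target, requested, False, "none")
    # Step 4: flag for a manual adjustment that weighs track record etc.
    direction = "above_target" if relative_gap > 0 else "below_target"
    return StipendCheck(target, requested, True, direction)


if __name__ == "__main__":
    # Hypothetical figures, not taken from any actual grant.
    print(check_stipend_request(industry_salary=100_000, requested=75_000))
    print(check_stipend_request(industry_salary=100_000, requested=120_000))
```

Under these assumed numbers, a request of $75,000 against an estimated $100,000 industry salary falls within the tolerance and would be funded as requested, while a request of $120,000 would be flagged for a possible downward adjustment.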

Appendix: Eligibility criteria for LTFF grants

(Our legal team requested that we include this section; it was written by Caleb Parikh.)

Career Stage: Our interest lies in assisting grantees who are at the beginning of their careers, are contemplating a career shift towards an area of higher impact, or have accumulated several years of experience in their respective fields.

Demonstrated Skills: We require that prospective grantees exhibit evidence of possessing the skills necessary for the type of work or study they plan to undertake. This evidence could come from previous experiences, credentials, or a particularly remarkable application.

Generally, our grants fulfil one of the following additional criteria:

High-Impact Projects: The central aim of the Long-Term Future Fund is to improve humanity’s odds of a long and flourishing future. We assess proposed projects based on their potential to contribute to this goal. However, it is not mandatory for grantees to share this specific objective or to be entirely focused on improving the long-term future.

Empowering people pursuing impactful work: Grants related to career support (e.g. travel grants for conferences, scholarships for online courses, or funding to allow time for skill development) can enable grantees to increase their positive impact over the course of their careers. Grantees should demonstrate a strong interest in a priority area for the long-term future, such as biosecurity or mitigating risks from advanced AI. This could be evidenced by past experiences, credentials, or an application that shows familiarity with the field they intend to study.

Appendix: Special note on upskilling grants

(Our legal team requested that we include this section.)

One of LTFF’s overall charitable purposes is to encourage qualified and thoughtful individuals to think about and find solutions for global catastrophic risks, such as those from advanced artificial intelligence. We do this by funding such individuals to research issues like AI alignment so that they become more knowledgeable about these issues and/or potentially change their career path to fully invest in them.

3 comments

One downside of [MATS] relative to an internship at an organisation is that there are fewer natural routes to enter a managed position...

I think you misspelled "upside".

(Also useful post, thank you for publishing it.)

This is very helpful, thanks! Actually, the post includes several sections, including in the appendix, that might be more interesting to many readers than the grant recommendations themselves. Maybe it would be good to change the title a bit so that people also expect other updates.

I also found parts of this post surprisingly interesting, given the ultra-dry title and intimidating reading time.

To present this kind of content in a way more readers could benefit from, another option would be to post it as a small sequence, so people could vote and comment on separate sections.