This is convincing!
If there is a shortage of staff time, then AI safety funders need to hire more staff. If they don’t have time to hire more staff, then they need to hire headhunters to do so for them. If a grantee is running up against a budget crisis before the new grantmaking staff can be onboarded, then funders can maintain the grantee’s program at present funding levels until the new staff are available.
+1 - and this has been a problem for many years.
I find it slightly concerning that this post is not receiving more attention.
By the time we observe whether AI governance grants have been successful, it will be too late to change course.
I don't understand this part. I think it is possible to assess the progress of an advocacy effort at a much more granular level.
Strong upvote. A few complementary remarks:
P(doom|Anthropic builds AGI) is 15% and P(doom|some other company builds AGI) is 30% --> You also need to weight this by the probability that Anthropic is first, and by the probability that the other companies won't go on to build AGI once Anthropic already has; by default, that is not the case.
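To make the arithmetic concrete (P(Anthropic first) = 0.3 is a purely illustrative assumption, not a number from the post):

P(doom) ≈ 0.3 × 15% + 0.7 × 30% ≈ 25.5%,

versus roughly 30% in the counterfactual where some other company builds AGI anyway. So the implied reduction is about 4.5 percentage points, not the 15-point gap suggested by comparing the two conditionals directly.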
I'm going to collect here new papers that might be relevant:
I was thinking about this:
OpenAI already did the hide-and-seek project a while ago: https://openai.com/index/emergent-tool-use/
While this isn't an example of computer use, I think it fits the bill for presenting multi-agent capabilities in a visual way.
I'm happy to see that you are creating recaps for journalists and social media.
Regarding the comment on advocacy, "I think it also has some important epistemic challenges": I won't deny that in a highly optimized slide deck you don't have time to balance every argument. But does that matter so much? Rationality is winning, and to win we need to be persuasive in a limited amount of time. I don't have the time to also fix civilizational inadequacy around epistemics, so I play the game, just as the other side does.
Also, I'm not criticizing the work itself, but rather the justification or goal. I think that if you did the goal factoring, you could optimize for this more directly.
Let's chat in person!
I'm skeptical that this is the best way to achieve this goal, as many existing works already demonstrate these capabilities. Also, I think policymakers may struggle to connect these types of seemingly non-dangerous capabilities to AI risks. If I only had three minutes to pitch the case for AI safety, I wouldn't use this work; I would primarily present some examples of scary demos.
Also, what you are doing is essentially capability research, which is not very neglected. There are already plenty of impressive capability papers that I could use for a presentation.
For reference, here is the slide deck I generally use in different contexts.
I have considerable experience pitching to policymakers, and I'm very confident that my bottleneck in making my case isn't a need for more experiments or papers, but rather more opportunities, more cold emails, and generally more advocacy.
I'm happy to jump on a call if you'd like to hear more about my perspective on what resonates with policymakers.
See also: We're Not Advertising Enough.
Thanks a lot!
We think a relatively inexpensive method for day-to-day usage would be using Sonnet to monitor Opus, or Gemini 2.5 Flash to monitor Pro. This would probably add only around 10% overhead. But we have not run this exact experiment; it would be follow-up work.
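A minimal sketch of what that setup could look like, assuming the Anthropic Python SDK; the model IDs, prompt wording, and `answer_with_monitoring` helper are placeholders for illustration, not the experiment described above:

```python
# Sketch: a cheaper model monitors a stronger model's answers.
# Model IDs below are placeholders; substitute current ones.
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

MONITORED_MODEL = "claude-opus-placeholder"    # strong, expensive model
MONITOR_MODEL = "claude-sonnet-placeholder"    # cheaper monitoring model


def answer_with_monitoring(user_prompt: str) -> dict:
    # 1) Get the main answer from the strong (expensive) model.
    answer = client.messages.create(
        model=MONITORED_MODEL,
        max_tokens=1024,
        messages=[{"role": "user", "content": user_prompt}],
    ).content[0].text

    # 2) Ask the cheaper model to review the transcript and flag anything
    #    suspicious. Because the monitor is much cheaper per token, the added
    #    cost is a small fraction of the main call, which is where a rough
    #    "+10%" overhead estimate would come from.
    verdict = client.messages.create(
        model=MONITOR_MODEL,
        max_tokens=256,
        messages=[{
            "role": "user",
            "content": (
                "You are a safety monitor. Reply with FLAG or OK, then one "
                f"sentence of justification.\n\nUser prompt:\n{user_prompt}\n\n"
                f"Model answer:\n{answer}"
            ),
        }],
    ).content[0].text

    return {"answer": answer, "monitor_verdict": verdict}
```

The actual overhead would depend on the monitor's prompt length and how often it is invoked, so the 10% figure should be read as a rough estimate rather than a measured result.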