I'll post the obvious resources:
- 80,000 Hours' US AI Policy article
- Future of Life Institute's summaries of AI policy resources
- AI Governance: A Research Agenda (Allan Dafoe, FHI)
- Allan Dafoe's research compilation: probably just the AI section is relevant; some overlap with FLI's list.
- The Malicious Use of Artificial Intelligence: Forecasting, Prevention, and Mitigation (2018), Brundage, Avin, et al.: one of the earlier "large collaboration" papers I can recall; probably only the AI Politics and AI Ideal Governance sections are relevant for you.
- Policy Desiderata for Superintelligent AI: A Vector Field Approach: far from object-level, in Bostrom's style, but tries to be thorough about what AI policy should accomplish at a high level.
- CSET's reports: CSET is a very new AI policy org, but a pretty exciting one, as it's led by the former head of IARPA, so its recommendations probably have a higher chance of being implemented than those of the academic-think-tank reference class. Its work so far focuses on documenting China's developments and on US policy recommendations, e.g. making US immigration more favorable for AI talent.
Published documents can trail the thinking of leaders at these orgs by quite a lot. You might be better off emailing someone at the relevant orgs (CSET, GovAI, etc.) with your goals and what you plan to read, and asking what they would recommend, so you can catch up more quickly.
Idk, if humanity as a whole could have justified 90% confidence that AI above a certain compute threshold would kill us all, I think we could ban it entirely. Why on earth not? It's in everybody's interest to do so. (Note that this is not the case with climate change, where each actor benefits from continuing to emit while everyone else stops emitting.)
This seems probably true even if we only had 90% confidence that there is some threshold above which AI would kill us all, without yet knowing what that threshold is. In that case I imagine something more like a direct ban on most people doing it, plus some research that very carefully explores where the threshold lies.
A common way to do this is to have experts help allocate funding, which seems like a reasonable approach, and probably better than the current mechanisms (Open Phil excepted), where the current mechanism amounts to how well you can convince random donors to give you money.
In a world where the aligned version is not competitive, a government can unilaterally pay the price of being uncompetitive, because it has many more resources.
There are also other problems you might care about, like how the AI system might be used. You may not be too happy if anyone can "buy" a superintelligent AI from the company that built it: this makes arbitrary humans much more able to impact the world, and if you have a group of not-very-aligned agents making big changes to the world and possibly fighting with each other, things will plausibly go badly at some point.
Telling what is and isn't safe seems decidedly easier than making an arbitrary agent safe; it feels like we will be able to be conservative about this. But this is mostly an intuition.
I think a general response to your intuition is that I don't see technical solutions as the only option; there are other ways we could be safe (1, 2).
Cruxes: