Props for proposing a new and potentially fruitful framing.
I would like to propose training Wise AI Advisors as something that could potentially meet your two criteria:
• Even if AI turns out to be broadly positive, wise AI advisors would allow us to get closer to maximising its benefits
• We can likely save the world if we make sufficiently wise decisions[1]
There's also a chance that we're past the point of no return, but if that's the case, we're screwed no matter what we do. Okay, it's slightly more complicated than that: there's a chance that we aren't yet past the point of no return, but that if we pursue wise AI advisors instead of redirecting those resources to another project, we will be past the point of no return by the time we produce such advisors. This is possible, but my intuition is that it's worth pursuing anyway.
Securing AI labs against powerful adversaries seems like something that almost everyone can get on board with. Framing it as a national security issue also seems promising.
AI alignment is probably the most pressing issue of our time. Unfortunately, it's also become one of the most controversial, with AI accelerationists accusing AI doomers/AI-not-kill-everyoneism-ers of being Luddites who would rather keep humanity shackled to the horse and plow than risk any progress, whilst the doomers in turn accuse the accels of rushing humanity as fast as possible straight off a cliff.
As Robin Hanson likes to point out, trying to change policy on a polarised issue is backbreaking work. But if you can find a way to pull sideways, you can make easy progress with no one pulling the other way.
So can we think of a research program that:
a) will produce critically useful results even if AI isn't dangerous or its benefits far outweigh its costs, and
b) would likely be sufficient to prevent doom if the project is successful and AI does turn out to be dangerous?
I think we can. Here are some open questions that I think could form strong research areas for such a program.
You would be hard-pressed to object to any of these research areas on accelerationist grounds. You might deprioritise them, but you would agree that spending money on them is more useful than setting it on fire.
Yes, this has a huge amount of overlap with existing AI alignment research directions, but that's exactly the point. Take the non-controversial bits of AI alignment, rebrand them, and try to get broader buy-in for them.
It might be worth creating a new movement, AI Lawfulness, that focuses on these questions without taking an explicit accel/decel stance. Given its focus on law, it should be possible to push for this to be a research priority for governments and hopefully secure significant government funding. And if it is successful in part or in whole, it would be in a good position to push for legislation requiring these innovations to be implemented in all AI models.