Our epistemic rationality has probably gotten way ahead of our instrumental rationality
-Scott Alexander
This is a question post:
Why was the AI Alignment community so unprepared for engaging with the wider world when the moment finally came?
EDIT, based on comment feedback: This is a genuine question about why something that seems so obvious now, with the benefit of hindsight, was not clear back then, and an attempt to understand why not. It is not an attempt to cast blame on any person or group.
I have been a LW reader for at least 10 years, but I confess that until the last ~1.5 years I mostly watched the AI alignment conversation float by. I knew of the work, but I did not engage with the work. Top people were on it, and I had nothing valuable to add.
All that to say: Maybe this has been covered before and I have missed it in the archives.
Lately (throughout this year), there has been a flurry of posts essentially asking: How do we get better at communicating to, and convincing, the rest of the world about the dangers of misaligned AI?
All three of which were posted in April 2023.
The subtext being: If it is possible to not-kill-everyone, this is how we are going to have to do it. Why are we failing so badly at doing this?
At the risk of looking dumb or ignorant, I feel compelled to ask: Why did this work not start 10 or 15 years ago?
To be clear: I do not mean the true nuts-and-bolts ML-researcher Alignment work, of which this community and MIRI were clearly the beginning and end for nearly two decades.
I do not even mean outreach work to adjacent experts who might conceivably help the cause. Again, here I think great effort was clearly made.
I also do not mean that we should have been actively doing these things before it was culturally relevant.
I am asking: Why did the Alignment community not prepare tools and plans years in advance for convincing the wider infosphere about AI safety? Prior to the Spring 2023 inflection point.
Why were there no battle plans in the basement of the Pentagon, written for this exact moment?
It seems clear to me, based on the posts linked above and the resulting discussion generated, that this did not happen.
I can imagine an alternate timeline where there was a parallel track of development within the community circa 2010-2020(?), in which much discussion and planning covered media outreach and engagement, media training, materials for public discourse, and accessible content[1] for every level of education and medium: for every common "normie" argument and every easy-to-see-coming news headline. Building and funding policy advocates, contacts, and resources in the political arena. Catchy slogans, buttons, bumper stickers, art pieces, slam-dunk tweets.
Heck, 20+ years is enough time to educate, train, hire, and surgically insert an entire generation of people into key positions in the policy arena, specifically to accomplish this one goal, like sleeper-cell agents.[2] Likely much, much easier than training highly qualified alignment researchers.
It seems so obvious in retrospect that this is where the battle would be won or lost.
Didn't we pretty much always know it was going to come from one or a few giant companies or research labs? Didn't we understand how those systems function in the real world? Capitalist incentives, moats, regulatory capture, mundane utility, and international coordination problems are not new.
Why was it not obvious back then? Why did we not do this? Was this done and I missed it?
(First-time poster: I apologize if this violates the guidelines against overly meta discussion.)
[1] Which it seems we still cannot manage to do.
[2] Programs like this have been done before, with inauspicious beginnings and to great effect: https://en.wikipedia.org/wiki/Federalist_Society#Methods_and_influence
I reject the premise. Actually, I think public communication has gone pretty dang well since ChatGPT. Not only has AI existential risk become a mainstream, semi-respectable concern (especially among top AI researchers and labs, which count the most!), but this is obviously because of the 20 years of groundwork the rationality and EA communities have laid down.
We had well-funded organizations like CAIS able to get credible mainstream signatories. We've had lots and lots of favorable or at least sympathetic articles in basically every mainstream Western newspaper. Public polling shows that average people are broadly responsive. The UK is funding real AI safety to the tune of millions of dollars. And all this is despite the immediately-preceding public relations catastrophe of FTX!
The only perspective from which you can say there's been utter failure is the Yudkowskian one, where the lack of momentum toward strict international treaties spells doom. I grant that this is a reasonable position, but it's not the majority one in the community, so it's hardly a community-wide failure for that not to happen. (And I believe it is a victory of sorts that it's gotten into the Overton window at all.)
I don't think FTX and AI x-risk are closely tied in the public mind, but I do think the connection is known to the organs of media and government that interact with AI alignment. It comes up often enough in the background: details like FTX having a large stake in Anthropic, for example. And the opponents of AI x-risk and EA certainly try to bring it up as often as possible.
Basically, my model is that FTX seriously undermined the insider credibility of AINotKillEveryoneIsm's most institutionally powerful proponents, but the remaining credibility was enough to work with.