I’ve written a draft report evaluating a version of the overall case for existential risk from misaligned AI, and taking an initial stab at quantifying the risk from that threat. I’ve made the draft viewable as a public Google Doc here (Edit: arXiv version here, video presentation here, human-narrated audio version here). Feedback would be welcome.
This work is part of Open Philanthropy’s “Worldview Investigations” project. However, the draft reflects my personal (rough, unstable) views, not the “institutional views” of Open Philanthropy.
Thanks for reading, and for your comments on the doc. I’ve replied to specific comments there, but at a high level: the formal work you’ve been doing on this does seem helpful and relevant (thanks for doing it!), and other convergent phenomena seem like helpful analogs to have in mind.