Cheat sheet of AI X-risk
This document was made as part of my internship at EffiSciences. Thanks to everyone who helped review it, in particular Charbel-Raphaël Segerie, Léo Dana, Jonathan Claybrough and Florent Berthet!

Introduction

Clarifying AI X-risk is a summary of a literature review of AI risk models by DeepMind that (among other things) categorizes the threat models it studies along two dimensions: the technical cause of misalignment and the path leading to X-risk. I think this classification is inadequate: the technical causes of misalignment attributed to some of the presented risk models deserve more nuance. So I made a document that goes more in-depth and represents more accurately what (I believe) the authors had in mind. Hopefully, it can help researchers quickly remind themselves of the characteristics of the risk model they are studying.

Here it is, for anyone who wants to use it! Anyone can suggest edits to the cheat sheet. I hope that this can foster healthy discussion about controversial or less detailed risk models.

If you just want to take a look, go right ahead! If you would like to contribute, the penultimate section explains the intended use of the document, the conventions used, and how to suggest edits.

There is also a more detailed version if you are interested, but that one is not nearly as well formatted (i.e. ugly as duck), nor is it up to date. It is where I try to express more nuanced analysis than a dropdown list allows.

Content

Disclaimer

Some information in this cheat sheet is controversial; for example, the inner/outer framework is still debated. In addition, some information might be incorrect. I did my best, but I do not agree with all the sources I have read (notably, DeepMind classifies Cohen's scenario as specification gaming, whereas I think it is closer to goal misgeneralization), so remain critical. My hope is that after a while the controversial cell comments will contain links and conversations to better understand them.