Where I agree and disagree with Eliezer
(Partially in response to AGI Ruin: A List of Lethalities. Written in the same rambling style. Not exhaustive.)

Agreements

1. Powerful AI systems have a good chance of deliberately and irreversibly disempowering humanity. This is a much more likely failure mode than humanity killing ourselves with destructive physical technologies.
2. Catastrophically risky AI systems could plausibly exist soon, and there likely won't be a strong consensus about this fact until such systems pose a meaningful existential risk per year. There is not necessarily any "fire alarm."
3. Even if there were consensus about a risk from powerful AI systems, there is a good chance that the world would respond in a totally unproductive way. It's wishful thinking to look at possible stories of doom and say "we wouldn't let that happen"; humanity is fully capable of messing up even very basic challenges, especially if they are novel.
4. Many of the projects intended to help with AI alignment don't make progress on key difficulties and won't significantly reduce the risk of catastrophic outcomes. This reflects people gravitating to whatever research is most tractable without being too picky about which problems it helps with, as well as a low level of concern with the long-term future in particular. Overall, there are relatively few researchers effectively focused on the technical problems most relevant to existential risk from alignment failures.
5. There are strong social and political pressures to spend much more of our time talking about how AI shapes existing conflicts and shifts power. This pressure is already playing out, and it doesn't seem likely to get better.
6. Even when thinking about accident risk, people's minds seem to go to what they think of as "more realistic and less sci-fi" risks that are much less likely to be existential (and sometimes, I think, less plausible). It's very possible this dynamic won't change until after actually existing AI