Toby Ord recently published a nice piece, "On the Value of Advancing Progress," about mathematical projections of far-future outcomes under different rates of progress and levels of risk. The problem with that piece, and with many arguments for caution, is that most people barely care about possibilities even twenty years out.
We could talk about sharp discounting curves in decision-making studies, and how that makes sense given evolutionary pressures in tribal environments. But I think this is pretty obvious from talking to people and watching our political and economic practices.
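For concreteness, the standard toy model of that steep discounting is the hyperbolic form (the equation itself is textbook; the specific discount rate below is just an illustrative assumption of mine, not a number from any particular study):

$$V = \frac{A}{1 + kD}$$

where V is the felt present value of a reward of size A delayed by D years, and k is a fitted discount rate. Even at a moderate k = 0.5 per year, an outcome twenty years out gets weighted at 1/(1 + 0.5·20) ≈ 9% of its face value, which is roughly the "barely care" regime I'm pointing at.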
Utilitarianism is a nicely self-consistent value system, and it pretty clearly implies longtermism. Most people don't care that much about logical consistency,[1] so they are happily non-utilitarian and non-longtermist in a variety of ways. Many arguments for AGI safety are longtermist, or at least long-term, so they're not going to work well for most of humanity.
This point is fairly obvious, but worth keeping in mind.
One non-obvious corollary of this observation is that much skepticism about AGI x-risk is probably rooted in skepticism that AGI will happen soon. This doesn't explain all skepticism, but it's a significant factor worth addressing. When people dig into their reasoning, that timing assumption is often the central point. They start out saying "AGI wouldn't kill humans," and over the course of a conversation it turns out that they feel that way primarily because they don't think real AGI will happen in their lifetimes. From there, any discussion of AGI x-risk isn't productive, because they just don't care about it.
The obvious counterpoint is "You're pretty sure it won't happen soon? I didn't know you were an expert in AI or cognition!" Please don't say this; nothing convinces your opponents to cling to their positions beyond all logic like calling them stupid.[2] Try something like "Well, a lot of people with the most relevant expertise think it will happen pretty soon. A bunch more think it will take longer. So I just assume I don't know which is right, and it might very well happen pretty soon."
It looks to me like discussing whether AGI might threaten humans is pretty pointless if the person is still assuming it's not going to happen for a long time. Once you're past that, it might make sense to actually talk about why you think AGI would be risky for humans.[3]
1. ^
This is an aside, but you'll probably find that utilitarianism isn't that much more logical than other value systems anyway. Preferring what your brain wants you to prefer, while avoiding drastic inconsistency, has practical advantages over values that are more consistent but clash with your felt emotions. So let's not assume humanity is non-utilitarian merely out of stupidity.
2. ^
Making sure any discussions you have about x-risk are pleasant for all involved is probably the most important strategy. I strongly suspect that personal affinity weighs more heavily than logic on average, even for fairly intellectual people. (Rationalists are a special case; I think we're resistant but not immune to motivated reasoning.)
So making a few points in a pleasant way, then moving on to other topics they like, is probably way better than making the perfect logical argument while even slightly irritating them.
3. ^
From there, you might get to have the actual discussion of why AGI might threaten humans. Here are some things I've seen be convincing.
People often seem to think "okay, fine, it might happen soon, but surely AI smarter than us still won't have free will and make its own goals." From there you could point out that it needs goals to be useful, and that if it misunderstands those goals even slightly, the results could be very bad. Russell's "you can't fetch the coffee if you're dead" is my favorite intuitive explanation of how instrumental convergence creates unexpected consequences. This requires acknowledging that we wouldn't screw it up in quite such an obvious way, but the metaphor extends pretty deep into the subtler complexities of goals and logic.
The other big points, in my observation, are "people screw up complex projects a lot, especially on the first try" and "you'd probably think it was dangerous if advanced aliens were landing, right?" One final intuitive point is that even if AGIs do always correctly follow human instructions, some human will eventually give them very bad instructions, whether accidentally or deliberately.
All good points.
I agree that people will care more if their decisions clearly matter in producing that future.
This isn't easy to apply to the AGI situation, because which actions will help which outcomes is quite unclear and vigorously argued. Serious thinkers argue both for trying to slow down (PauseAI) and for defensive acceleration (Buterin, Aschenbrenner, etc.). And it's further complicated by the fact that many of us think accelerating will probably produce a better world within a few years, and then, shortly after that, leave humanity dead or sadly obsolete. This pits short-term concerns directly against long-term ones.
I very much agree that helping people imagine either a very good or a very bad future will cause them to care more about it. I think that's been established pretty thoroughly in the empirical decision-making literature.
Here I'm reluctant to say more than "futures so good they're difficult to imagine," since my actual predictions sound like batshit-crazy sci-fi to most people right now. Sometimes I say things like: people won't have to work, and global warming will be easy to solve; then people fret about what they'd do with their time if they didn't have to work. I've also tried talking about dramatic health extension, to which people question how much longer they'd want to live anyway (except old people, who never do; ironically, they're exactly the ones who probably won't benefit from AGI-designed life extension).
Those are all specific points in agreement with your take that really good outcomes are hard for modern humans to conceive of.
I agree that describing good futures is worth some more careful thinking.
One thought is that it might be easier for most folks to imagine a possible dystopian outcome in which humans aren't wiped out but are made obsolete, and simply starve when they can't compete with AI for any job at any wage. I don't think that's the likeliest catastrophe, but it seems possible and might be a good point of focus.