Just up front: I have no qualifications on this so adjust accordingly. I teach AP Calc/AP Stats if you want to know my deal. Putting this down because half the time I'm reading a post I'm thinking to myself "I wish I knew what this person's deal is" :)
People who believe there is a >50% probability of doom in the next 50 years or so strike me as overconfident. To be clear, the general public is obviously way underestimating the risk (0% is too low lol), but I believe many people on this site are overestimating it.
The complexity of the system is just so high. How can we predict how a superintelligence that hasn't even been created yet will behave? I understand that it's reasonable to assume it will want to accumulate resources, eliminate threats (us lol), etc., but how can anyone be, say, 90%+ sure that it will end with us dead?
There are so many links in the chain of reasoning that each have at least a small chance of going awry. Just spit-balling off the top of my head -- I'm sure all of these can be refuted, but are you 100% sure? To be clear, these specifics are not the point of my argument: I'm just arguing that there are a million things that could in theory go wrong, and even if each is unlikely, together they form a kind of swiss-cheese defense against doom (I sketch the arithmetic after the list):
- Maybe when a certain level of intelligence is reached, consciousness comes online, and that affects the behavior of the system in an unpredictable way.
- Maybe alignment works!
- Maybe the system is unbelievably super-intelligent, but for some reason true social reasoning is basically impossible for LLMs and the like, and we need to go down a different, distant path before that becomes possible. We can still easily trick it!
- Maybe superintelligence is subject to rot. Many complex systems just kind of decay, and maybe for some reason this intelligence is unable to constantly update its code and maintain itself perfectly.
- Maybe it's actually worse than it seems, but fledgling AIs go crazy in 2035 and kill 50 people in a sufficiently brutal way that the world wakes up and we destroy every computer we can see with a sledgehammer before scaled-up evil AIs are possible.
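To put a rough shape on that swiss-cheese intuition (a toy sketch only, with made-up numbers and an independence assumption that is surely too strong): if there are $n$ independent "outs," each with some probability $p_i$ of derailing the path to doom, then doom requires every one of them to fail, so

$$P(\text{doom}) \le \prod_{i=1}^{n}(1 - p_i).$$

With, say, ten outs at 10% each, the product is $0.9^{10} \approx 0.35$, so a stack of individually unlikely escape hatches can still knock a lot off the headline number. The specific values here are mine and not anything rigorous, and if the outs are correlated the bound gets much weaker.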
It just seems like there are a million things that could potentially go wrong. If you asked me to predict what would happen if a hungry tiger came into a room with me and three of my friends, I couldn't do it. There are just too many unknowns. Maybe it eats us? Maybe it's scared of us? Maybe I start crying but Richard saves the day? Who knows! This is that, to the infinity power.
Obviously, I understand the basic argument and I totally accept it. I believe that misaligned AI is the largest existential threat we face, that there is a serious risk I will die because of it, and an even more serious risk that my daughter will die from it before the end of her natural life. I'm frightened, but I still believe the people who put p(doom) in the next 100 years above 50% are overconfident.
I agree with you that we shouldn't be too confident. But capabilities research is accelerating sharply (timelines to TAI keep getting updated down, not up), and there's no obvious gating factor (e.g. the current cost of training LMs) that seems likely to slow things down much, if at all. So the changeover from a world in which AI can't doom us to one in which it can might happen faster than seems intuitively possible. Here's a quote from Richard Ngo on the 80,000 Hours podcast that I think makes this point (episode link: https://80000hours.org/podcast/episodes/richard-ngo-large-language-models/#transcript):
"I think that a lot of other problems that we’ve faced as a species have been on human timeframes, so you just have a relatively long time to react and a relatively long time to build consensus. And even if you have a few smaller incidents, then things don’t accelerate out of control.
"I think the closest thing we’ve seen to real exponential progress that people have needed to wrap their heads around on a societal level has been COVID, where people just had a lot of difficulty grasping how rapidly the virus could ramp up and how rapidly people needed to respond in order to have meaningful precautions.
"And in AI, it feels like it’s not just one system that’s developing exponentially: you’ve got this whole underlying trend of things getting more and more powerful. So we should expect that people are just going to underestimate what’s happening, and the scale and scope of what’s happening, consistently — just because our brains are not built for visualising the actual effects of fast technological progress or anything near exponential growth in terms of the effects on the world."
I'm not saying Richard is an "AI doomer", but hopefully this helps explain why some researchers think there's a good chance we'll make AI that can ruin the future within the next 50 years.