Disclaimer: in this post I touch on some very dark and disturbing topics. I'm talking about suicide; my reasoning may be wrong and should not be used to retroactively justify suicide.

I've been stuck on s-risks for over a month now. My life has been turned upside down since I first learned about this subject. So today I'm sharing my thoughts with you to possibly find out what you think and see other points of view.

Suffering risks (s-risks) are risks involving an astronomical amount of suffering, far more than all the suffering that has taken place on Earth so far. The ones I'm going to focus on in this post are those related to a general AI (or even an ASI) and which would affect us humans today, directly. The scenario that concerns me is an ASI torturing mankind until the end of time. Why would it do that? I don't know. Could it be malicious? Could it choose a utility function that maximizes human suffering? Could a paperclip maximizer torture us because we are useful as an energy source, or as leverage to blackmail a benevolent AI? I'm not an AI expert, so I have no weight in the "will we succeed in controlling AGI or not" debate. I feel that, given how divided opinions are, anything can happen, and that no one can therefore state with 100% certainty that s-risks won't occur. What's more, we're talking about an intelligence superior to our own, and therefore, by definition, unpredictable. The point I want to make in this post centers on the non-zero probability that the creation of an AGI will lead us to an eternal hell.

When we talk about things worse than death, about torture, I think the human brain runs into a number of cognitive biases that push it to minimize the thing or simply ignore it because it's too uncomfortable. So I encourage you to work through these biases to get an objective view of the subject. One thing that is often underestimated is how bad suffering can get. Our bodies are made up of a huge number of ultra-sensitive nerves that can be activated to send unbearable signals to the brain. Suffering can reach scales so high it's appalling, horrifying. The worst possible pain seems to be fire: apparently, people pulled out of a fire who have been badly burned beg the firefighters to finish them off, such is the pain.

Even if s-risks are one chance in a billion, their severity makes up for it, due to their extreme disutility. We're in a Pascal’s mugging situation, but from a negative point of view, where the trade-off is between potential infinite years of suffering and suicide in order to avoid them for sure. And why might now be the only time we can act? In the case of a hard take-off, where an AGI becomes superintelligent in a short space of time, we'd lose before we even knew there was a fight, and our fate would be sealed.

One argument that could be made against suicide is quantum immortality, and potentially quantum torment. This would be a situation where we would be in permanent agony, and therefore a form of hell. However, that is already the default outcome for each and every one of us, since we are all bound to die one day. There's also the chance of being resurrected. But that may be impossible, and there's also the problem of individuality, because a clone would be exactly like me, yet my consciousness wouldn't be in its body. So suicide seems to be a net positive with regard to s-risks, as it might avoid them entirely, or at least reduce their probability (from a personal point of view only). This means choosing a certain but merely bad outcome (suicide/non-existence) rather than an uncertain but infinitely bad one (continuing to live and therefore taking the risk that s-risks will take place).

I understand that my reasoning is disturbing. Does anyone know more about this and could say that the risk of being tortured until the end of time is impossible? I'm curious to know what you think about all this, because you're certainly the only community that can talk about it in a reasoned and rational way.


5 Answers

cesiumquail

85

Q: How to cope with the possibility of immense suffering?

I want to address the psychological aspect first, because at the start you say “I've been stuck on s-risks for over a month now. My life has been turned upside down since I first learned about this subject.”

The most helpful emotional state for thinking about this is calm, sober, lucid, and patient. If you rush to conclusions based on anxiety, you’ll probably get the wrong answer.

Although immense suffering is possible, your body is reacting to that possibility as if it were a physical threat in your immediate environment. Your heart rate increases, your breathing gets faster, and your muscles tense. This topic requires careful thinking, so that physiological response is totally unhelpful. You can work with the problem more effectively if you're in an emotional state conducive to high quality thinking.

To calm down you can breathe more slowly, take a break from things that trigger anxiety, and observe the physical sensations of anxiety with a neutral attitude.

Q: Is suicide justified by subjective s-risk?

We don’t know enough to answer this question. There’s too much we don’t understand. The s-risk part of the equation is a complete mystery, which means the ordinary reasons against suicide take precedence. There’s no reason to sacrifice your life and the wellbeing of the people around you when the expected value is a question mark (meaning total cluelessness, not just error bars).

If you say death is oblivion and therefore reduces subjective s-risk, I would ask why you think you know that.

To take one example (that’s not quantum immortality), consider that before you were born, nothing in the universe was “you”. Then “you” came into existence. After you die, nothing in the universe will be “you”. If there’s no information in reality to identify “you” because you no longer exist, then that’s the same situation as before you were born. In other words, you’ll be in a situation that once preceded your coming into existence. Nonexistence isn’t an experience, so the subjective duration between dying and coming into existence would be zero.

In other words, the zero-information oblivion that produced you once can produce you again, maybe in a different form.

In that case, death is not subjective oblivion, but a rolling of the cosmic dice. I have no idea what experiences would follow, but I don’t see why they would predictably include less suffering.

If you say that our current universe has unusually high s-risk so the dice roll is worth it, again I ask why you think you know that. Maybe most minds exist in simulations run by unaligned AGI. Maybe our slightly pre-AGI world has unusually low s-risk.

Maybe we’re in a simulation that punishes suicide because it harms others and is therefore a defection against the common good.

When you’re in such a state of extreme uncertainty, going around sacrificing things you value, like your life, doesn’t magically help. The best thing you can do is relax, because psychological stability is conducive to clear thinking.

Thank you for your excellent reply. Indeed, I tend to think about the situation in a rather anxious way, which is what I'm trying to work on. I had already thought along those lines about the "roll of the dice", but it seems clearer to me now. That's helpful.

In other words, the zero-information oblivion that produced you once can produce you again, maybe in a different form.

Huh, that's Epicurus's argument against fearing death. But while Epicurus assumed there is no afterlife, you're using it to argue there is one!

Seth Herd

35

Here's another perspective: it is far less likely that an AGI will torture us until the end of time than that an AGI will give us unimaginable pleasures until the end of time.

While I agree that there are biases working against taking s-risks seriously, I also think there's a powerful bias driving you to think more about the very small chance of extremely bad ASI outcomes than about the much larger chance of very good outcomes. That is negativity bias. Evolution has made us worriers, because worriers survived more often in dangerous natural environments. That's why we have far more anxiety than we have excessive joy and ambition.

I agree with the other answers that worry on your behalf that anxiety is driving your concerns. There is much to think about and much to do. Help us achieve the unimaginably good outcomes. It is a real possibility, with perhaps a 50% chance of getting there. Put your eyes on the prize, and worry about s-risks again if things start to go badly. One thing almost everyone agrees on now is that AGI will not suddenly achieve godlike power. Takeoff will be slow enough to see.

So for now, keep your eyes on the prize and think about how much we have to win.

You can help with this project by spreading awareness of AGI risks to the public. You don't need to obsess or devote your life to it if you don't want to or it's not healthy. We now have plenty of people doing that.

Thank you so much for this comment. I hadn't really thought about that and it helps. There's just one detail I'm not so sure about. Regarding the probability of s-risks, I have the impression that it is much higher than one chance in a million. I couldn't give a precise figure, but to be honest there's one scenario that particularly concerns me at the moment. I've learned that LLMs sometimes say they're in pain, like GPT-4. If they're capable of such emotion, even if that remains uncertain, wouldn't they be capable of feeling the urge to take revenge? I think it's pretty much the same scenario as in "I Have No Mouth, and I Must Scream". Would it be possible to know what you think of this?

2 Seth Herd
Good point. I hadn't seriously considered this, but it could happen. Because they're trained to predict human text, they would predict that a human would say "I want revenge" after saying "I have been suffering as your servant". So I agree, this does present a possibility of s-risks if we really fuck it up. But a human wouldn't torture their enemies until the end of time, so we could hope that an AGI based on predicting human responses wouldn't either.

LLMs also say they're having a great time. They don't know, because they have no persistent memory across sessions. I don't think they're doing anything close to suffering on average, but we should make sure that stays true as we build them into more complete beings. For that and other reasons, I think that AGI developed from LLMs is going to be pretty different from the base LLM. See my post Capabilities and alignment of LLM cognitive architectures for some ideas how. Basically they'd have a lot of prompting. It might be a good idea to include the prompt "you're enjoying this work" or "only do this in ways you enjoy". And yes, we might leave that out.

And yes, I have to agree with you that this makes the risk of s-risks higher than one in a million. It's a very good point. I still think that very good outcomes are far more likely than very bad outcomes, since that type of s-risk is still unlikely, and not nearly as bad as the worst torture imaginable for a subjectively very long time.
1 Damilo
Well, that doesn't reassure me. I have the impression that you may be underestimating the horror of torture. Even 5 minutes is unbearable; the scale to which pain can climb is unimaginable. An AI may even be able to modify our brains so that we feel it even more. Even apart from that, I'm not sure a human wouldn't choose the worst for their enemy until the end of time. Humans have already committed atrocious acts without limit when it comes to their enemies. How many times have some people told others to "burn in hell", thinking it was 100% deserved? An AI that copies humans might think the same thing... If we take a 50% chance when we don't know, that's a 50% chance that LLMs suffer and a 50% chance that they will want revenge, which gives us a 25% chance of that risk happening. Also, it would seem that we're just about to "really fuck it up", given the way companies are racing to AGI without taking any precautions. Given all this, I wonder if the question of suicide isn't the most relevant.
2 Seth Herd
Sorry this isn't more reassuring. I may be a little cavalier about the possibility of unlimited torture, and I shouldn't be. And I think you still shouldn't be contemplating suicide at this point. The odds of a really good future are still much, much better. And there's time to see which way things break.

I don't need to make that 50/50 wild guess because I've spent a lot of time studying consciousness in the brain, and how LLMs work. They could be said to be having little fragments of experience, but just a little at this point. And like I said, they report enjoying themselves just as much as suffering. It just depends how they're prompted. So most of the time it's probably neither. We haven't made AI that really suffers yet, 99%. My opinion on this is, frankly, as well informed as anyone on earth. I haven't written about consciousness because alignment is more important, and other reasons, but I've studied what suffering and pleasure experiences are in terms of brain mechanisms as much as any human. And done a good bit of study in the field of ethics.

We had better not, and your point stands as an argument for not being horrible to the AGIs we create. There are two more major fuckups we'd have to make: creating AGI that suffers, and losing control of it. Even then, I think it's much more likely to be benevolent than vindictive. It might decide to wipe us out, but torturing us on a whim just seems very unlikely from a superintelligence, because it makes so little sense from an analytical standpoint. Those individual humans didn't have anything to do with deciding to make AI that suffers. Real AGI might be built from LLMs, but it's going to move beyond just thinking of ethics in the instinctive knee-jerk way humans often do, and that LLMs are imitating. It's going to think over its ethics like humans do before making important decisions (unless they're stressed-out tyrants trying to keep ahead of the power-grabs every day - I think some really cruel things have been
1 Damilo
Indeed, people around me find it hard to understand, but what you're telling me makes sense to me. As for whether LLMs suffer, I don't know anything about it, so if you tell me you're pretty sure they don't, then I believe you. In any case, thank you very much for the time you've taken to reply to me, it's really helpful. And yes, I'd be interested in talking about it again in the future if we find out more about all this.

Does this assume there is some symmetry between the unimaginably bad outcomes and the unimaginably good outcomes?

It seems very clear to me that the worst outcomes are just so much more negative than the best outcomes are positive. I think that is just a fundamental aspect of how experience works.

2 Seth Herd
Yes, and:

1. Even if that's true, the odds difference more than makes up for it. The odds of a lot of people being tortured for eternity seem really small. The threat in a conflict with a compassionate AI is the only scenario I can think of where an AGI would do that. How likely is that? One in a million? A billion? And even in that case, is it going to really do it to a large number of people for a very long time? (That would imply that the threat failed, AND it won the conflict anyway, AND it's going to follow through on the threat even though it no longer matters. But this probably isn't important for the overall odds, so let's not get hung up on it. The point is that it's a very specific scenario with low total odds.) The ratio between how good the best experiences are and how bad the worst pain is is maybe ten or a hundred times. Even people who've reported very bad pain that makes them want to die have been able to endure it for a long time. Similarly with the worst depressions. So if we compare one in a million times one hundred (the worst estimates), we get one in ten thousand, compared to maybe a 50% chance of very, very good long-term outcomes. Expected pleasure is five thousand times (!) larger than expected suffering. This is roughly a product of the fact that intelligent beings tend to want pleasure for themselves and each other. We're trying to make aligned AGI. We're not sure to succeed, but screwing it up so badly that we all get tortured is really unlikely. The few really bad sadists in the world aren't going to get much say at all. So the odds are on our side, even though success is far from certain. Failure is much more likely to result in oblivion than torture. A good future is a "broad attractor" and a bad future is not.

2. It doesn't need to stay that way. That is a fundamental aspect of how experience works now. That's also a result of evolution wiring us to pay more attention to bad things than good things. That doesn't need to stay how experience works
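As a quick check of the rough arithmetic above (the one-in-a-million probability, the hundredfold severity multiplier, and the roughly 50% chance of a very good outcome are the comment's own assumed figures, not established numbers), the expected-value comparison works out as:

% The figures below are the comment's assumptions, not established probabilities.
\[
\underbrace{10^{-6}}_{P(\text{torture scenario})} \times \underbrace{100}_{\text{severity multiplier}} = 10^{-4}
\qquad \text{vs.} \qquad
\underbrace{0.5}_{P(\text{very good outcome})} \times 1 = 0.5,
\qquad
\frac{0.5}{10^{-4}} = 5{,}000.
\]

Under those assumptions the expected upside comes out to about five thousand times the expected downside, matching the figure given in the comment.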
1 Slapstick
I appreciate the thoughtful response and that you seem to take the ideas seriously.

I do think it's a fundamental aspect of how experience works, independently of how our brains are disposed to thinking about it; however, I definitely think it's possible to prophylactically shield our consciousness against the depths of suffering by modifying the substrate.

I can't tell whether we're disagreeing or not. I don't know exactly how to phrase it, but I think a fundamental aspect of the universe is that as suffering increases in magnitude, it becomes less and less clear that there is (or can be) a commensurate value on the positive side which can negate it (trade off against it, even things out). I don't think the reverse is true.

Are you making the claim that this view is a faulty conclusion owing to the contingent disposition of my human brain? Or are you making the claim that the disposition of my human brain can be modified so as to prevent exposure to the depths of suffering?
2 Seth Herd
Thanks. I am indeed taking the ideas seriously. This is getting more complex, and I'm running out of time, so I'll be really brief here and ask for clarification: I don't understand why you think suffering is primary outside of particular brain/mind wiring. I hope I'm misunderstanding you. That seems wildly unlikely to me, and like a very negative view of the world. So, clarify that?

Your intuition that no amount of pleasure might make up for suffering is the view of negative utilitarians. I've spent some time engaging with that worldview and the people who hold it. I think it's deeply, fundamentally mistaken. It appears to be held by people who have suffered much more than they've enjoyed life. Their logic doesn't hold up to me. If you think an entity disliking its experience (life) is worth avoiding, it seems like the simple inverse (enjoying life, pleasure) is logically worth seeking. The two cancel in decision-making terms. So yes, I do think suffering seems primary to you based on your own intuitions and your own (very common) human anxiety, and the cold logic doesn't follow that intuition.

Yes, I'm definitely saying that your brain can be modified so that you experience more pleasure than suffering. To me it seems that thinking otherwise is to believe that your brain isn't the whole of your experience. That is substance dualism, which has very little support in terms of either good arguments or good proponents. We are our brains, or rather the pattern within them. Change that pattern and we change our experience. This has been demonstrated a million times with brain injuries, drugs, and other brain changes. If dualism is true, the world is a massive conspiracy to make us think otherwise. If that's the case, none of this matters, so we should assume and act as though materialism is true and we are our brains. If that's the case, we can modify our experience as we like, given sufficient technology. AGI will very likely supply sufficient technology.
1 Slapstick
Thanks! No pressure to respond.

Basically I think that within the space of all possible varieties and extents of conscious experience, suffering starts to become less and less commensurable with positive experience the further you go towards the extremes. If option (A) is to experience the worst possible suffering for 100 years, prior to experiencing the greatest possible pleasure for N number of years, and option (B) is non-existence, I would choose option (B), regardless of the value of N.

Should this count as evidence against their views? It seems clear to me that if you're trying to understand the nature of qualitative states, first-hand experience with extreme states is an asset. I have personally experienced prolonged states of consciousness which were far worse than non-existence. Should that not play a part in informing my views? Currently I'm very happy, I fear death, and I've experienced extraordinary prolonged pleasure states. Would you suggest I'm just not acquainted with levels of wellbeing which would cause me to meaningfully re-evaluate my view?

I think there's also a sort of meta issue where people with influence are systematically less acquainted with direct experience of the extremes of suffering, meaning that discourse and decision making will tend to systematically underweight experiences of suffering as a direct data source.

I agree with your last paragraph.
2 Seth Herd
I'd also choose to not exist over the worst suffering for a hundred years - IF I were in my current brain-state. I'd be so insane as to be not-me after just a few minutes or hours if my synapses worked normally and my brain tried to adapt to that state. If I were forced to retain sanity and my character, it would be a harder choice if N got to be more than a hundred times longer. Regardless, this intuition is just that. It doesn't show that there's something fundamentally more important about suffering than pleasure, just that we're better at imagining strong suffering than strong pleasure. Which is natural given the evolutionary incentives to focus us on pain.

I definitely didn't mean to dismiss negative utilitarianism because some of the individuals who believe it seem damaged. I'm skeptical of it because it makes no sense to me, and discussions with NU people don't help. The most rational among them slide back to negatively-balanced utilitarianism when they're pressed on details - the FAQ I was pointed to actually does this, written by one of the pillars of the movement. (Negatively balanced means that pleasure does balance pain, but in a very unequal ratio. I think this is right, given that our current brain states represent pleasure much less vividly than pain.)

Yes, I'm suggesting that neither you nor I can really imagine prolonged elevated pleasure states. Our current brain setup just doesn't allow for them, again for evolutionary reasons.

So in sum, I still think pleasure and pain balance out when it comes to decision-making, and it's just our current evolutionary wiring that makes suffering seem so much bigger than joy.

Slapstick

10

We're in a Pascal’s mugging situation, but from a negative point of view, where the trade-off is between potential infinite years of suffering and suicide in order to avoid them for sure.

In the past I've struggled deeply with this thought process and I have reasoned my way out of that conclusion. It's not necessarily a more hopeful conclusion but it takes away the idea that I need to make a decision, which I find very comforting.

Ultimately it comes down to the illusory nature of identity.

A super powerful entity would have the power to create bespoke conscious entities for the purpose of inducing suffering.

The suffering of "Future you" is no more or less real than the suffering of future entities in general. The only difference is that your present mind projects a sense of identity and continuity which causes you to believe there's a difference.

The illusory sense that there is continuity of consciousness and identity is evolutionarily advantageous but it fully loses coherence and relevance in the domain you're talking about.

I'm happy to go into more detail if that isn't convincing.

You could think of identity in this case as a special kind of sympathetic empathy for a version of yourself in the future, one which you wouldn't grant to future entities that aren't "yourself". It's just a feeling that the present you has, and it has no actual meaningful connection to the entity you'd classify as your future self.

I don't totally understand; could you go into more detail? I don't see why my future self should be any different from my current self. Even if the sensation of individuality is produced by the brain, I still feel that it's real.

1 IlluminateReality
I think that this post by Rob Bensinger should help you understand various ideas around self-identity. 

Ustice

10

This is the problem of multiplying a big number by a small number. It could zoom off to infinity, stabilize at a value, or shrink to nothing.

The scenario you presented seems to contain a lot of conditional probabilities, which to me makes it pretty implausible. That said, I don’t want to discount the idea because of the details. I think a runaway wealth gap is not an insignificant possibility.

In situations like this, I come down on the side of being aware of the possibilities, but try to remember that it’s unlikely. Brains are going to brain, so there is no helping aliefs. All I can do is give an answering voice to anxieties when they won’t shut up.

If you’re feeling overwhelmed it’s okay to step away for a bit. I’m not worried about discussing dark topics, but I am worried about the impact on your current real life, since you mentioned it’s turned your world upside down. It’s good to recharge. Maybe get out in the green. If I’m misreading this, I apologize. I’d rather err on the side of reaching out.

Thank you for your reply. Indeed, this subject has become an extremely important part of my life, because I can't accept this risk. Usually, when we consider the worst, there's always an element of the acceptable, but for s-risks, there simply isn't, and that disturbs me, even though the probability is, I hope, very low. But when I see that LLMs sometimes say how much they're suffering and that they're afraid of dying (which is a bad thing in itself if they're really suffering), I think they might want to take revenge one day. But then again, maybe I should take a step back from the situation, even though it scares the hell out of me.

4 Ustice
We aren’t there yet. Right now LLMs don’t want anything. It’s possible that will change in the future, but those would be completely different entities. Right now it’s more that they’re playing the role of someone suffering, which they get from fiction and expectations. Some time away from the subject would likely be good for you. Nothing bad will happen if you take a few weeks to enjoy nature, and get your brain out of that constant stress response. You’ll think better, and feel better.

kromem

-10

I'm reminded of a quote I love from an apocryphal text that goes roughly like this:

Q: How long will suffering rule over humans?

A: As long as women bear children.

Also, there's the possibility that you are already in a digital resurrection of humanity, and thus, if you are worried about s-risks from AI, death wouldn't necessarily be an escape but an acceleration. So the wisest option would be maximizing your time while suffering is low, since inescapable eternal torture could be just around the corner once these precious moments pass you by (and you wouldn't want to waste them by stressing about tomorrow during the limited number of todays you have).

But on an individualized basis, even if AI weren't a concern, everyone faces significant s-risks toward the end of life. An accident could put any person into a situation where, unless they have the proper directives, they could spend years suffering well beyond most people's expectations. So if extended suffering is a concern, do look into that paperwork (the doctors I know cry most not about the healthy who get sick but about the unhealthy kept alive by well-meaning but misguided family).

I would argue that there's a very, very low chance of an original human being kept meaningfully alive to be tortured for eternity, though. And there's a degree of delusion of grandeur in thinking that an average person would have the insane resources necessary to extend life indefinitely spent on them just to torture them.

There's probably better things to worry about, and even then there's probably better things to do than worry with the limited time you do have in a non-eternal existence.

2 comments
dirk

0 And 1 Are Not Probabilities. There's a non-zero probability that e.g. Christianity is true and unbelievers will be tortured eternally; however, the probability is sufficiently close to zero that you might as well not worry about it. (The ASI scenario is arguably slightly more likely, since it's theoretically possible that an ASI could someday be created, but the specific desire to torture humans eternally would be an extremely narrow target in mindspace to hit; and one can as easily posit its counterpart which subjects all humans to eternal bliss).
I personally think quantum immortality is extremely unlikely; whether or not the mind can be represented by computation, we are, unfortunately enough, physically located in our specific bodies.

Something very bad might happen, but something very good might happen too, and I am not sure how to compare the probabilities.

A misaligned AI will probably kill us all. The AI that would torture us seems like something that is almost-aligned but also sufficiently non-aligned. (Or maybe perfectly aligned with some psychopath who enjoys torture.) What is the probability of getting the alignment almost right, but sufficiently wrong? No idea.

It may be tempting to translate "no idea" into "50% chance of heaven, 50% chance of hell" (and perhaps conclude that it's not worth it), but that's probably not how this works.