S-risks: Why they are the worst existential risks, and how to prevent them

It seems like an s-risk outcome (even one that keeps some people happy) could be more than a million times worse than an x-risk outcome, while not being a million times more improbable, so focusing on s-risks is correct. The argument wasn't as clear to me before. Does anyone have good counterarguments? Why shouldn't we all focus on s-risk from now on?

(Unsong had a plot point where Peter Singer declared that the most important task for effective altruists was to destroy Hell. Big props to Scott for seeing it before the rest of us.)

[-]paulfchristiano9y450

I don't buy the "million times worse," at least not if we talk about the relevant E(s-risk moral value) / E(x-risk moral value) rather than the irrelevant E(s-risk moral value / x-risk moral value). See this post by Carl and this post by Brian. I think that responsible use of moral uncertainty will tend to push you away from this kind of fanatical view.

I agree that if you are million-to-1 then you should be predominantly concerned with s-risk; I think s-risks are somewhat improbable/intractable, but not that improbable+intractable. I'd guess the probability is ~100x lower, and the available object-level interventions are perhaps 10x less effective. The particular scenarios discussed here seem unlikely to lead to optimized suffering; only "conflict" and "???" really make any sense to me. Even on the negative utilitarian view, it seems like you shouldn't care about anything other than optimized suffering.

The best object-level intervention I can think of is reducing our civilization's expected vulnerability to extortion, which seems poorly-leveraged relative to alignment because it is much less time-sensitive (unless we fail at alignment and so end up committi...

9cousin_it9y

Paul, thank you for the substantive comment! Carl's post sounded weird to me, because large amounts of human utility (more than just pleasure) seem harder to achieve than large amounts of human disutility (for which pain is enough). You could say that some possible minds are easier to please, but human utility doesn't necessarily value such minds enough to counterbalance s-risk. Brian's post focuses more on possible suffering of insects or quarks. I don't feel quite as morally uncertain about large amounts of human suffering, do you? As to possible interventions, you have clearly thought about this for longer than me, so I'll need time to sort things out. This is quite a shock.

[-]paulfchristiano9y100

large amounts of human utility (more than just pleasure) seem harder to achieve than large amounts of human disutility (for which pain is enough).

Carl gave a reason that future creatures, including potentially very human-like minds, might diverge from current humans in a way that makes hedonium much more efficient. If you assigned significant probability to that kind of scenario, it would quickly undermine your million-to-one ratio. Brian's post briefly explains why you shouldn't argue "If there is a 50% chance that x-risks are 2 million times worse, then they are a million times worse in expectation." (I'd guess that there is a good chance, say > 25%, that good stuff can be as efficient as bad stuff.)
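To make the distinction between E(s-risk) / E(x-risk) and E(s-risk / x-risk) concrete, here is a minimal numeric sketch. The two scenarios and their magnitudes are purely illustrative assumptions (they do not come from Carl's or Brian's posts): in one, bad outcomes are 2 million times larger than good ones; in the other, good stuff is as efficient as bad stuff.

```python
from fractions import Fraction

# Hypothetical scenarios: (probability, s-risk magnitude, x-risk magnitude).
# All numbers are illustrative assumptions, in arbitrary common units.
scenarios = [
    (Fraction(1, 2), 2_000_000, 1),          # bad stuff is far more efficient
    (Fraction(1, 2), 2_000_000, 2_000_000),  # good stuff is equally efficient
]

def expectation_of_ratio(scenarios):
    """E(s / x): averages the per-scenario ratios."""
    return sum(p * Fraction(s, x) for p, s, x in scenarios)

def ratio_of_expectations(scenarios):
    """E(s) / E(x): compares the expected magnitudes directly."""
    expected_s = sum(p * s for p, s, _ in scenarios)
    expected_x = sum(p * x for p, _, x in scenarios)
    return expected_s / expected_x

print(float(expectation_of_ratio(scenarios)))   # roughly one million
print(float(ratio_of_expectations(scenarios)))  # just under two
```

Under these assumed numbers, averaging the ratios suggests "a million times worse," while the ratio of expected magnitudes is below 2, because the scenario where good outcomes are large dominates E(x-risk).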

I would further say: existing creatures often prefer to keep living even given the possibility of extreme pain. This can be easily explained by an evolutionary story, which suffering-focused utilitarians tend to view as a debunking explanation: given that animals would prefer to keep living regardless of the actual balance of pleasure and pain, we shouldn't infer anything from that preference. But our strong dispreference for intense suffering has a similar evolutionary origin, and is no more reflective of underlying moral facts than is our strong preference for survival.

5Kaj_Sotala9y

Can you elaborate on what you mean by this? People like Brian or others at FRI don't seem particularly averse to philosophical deliberation to me... [...] I support this compromise and agree not to destroy the world. :-)

[-]Lukas_Gloor9y100

Those of us who sympathize with suffering-focused ethics have an incentive to encourage others to think about their values now, at least in crudely enough terms to take a stance on prioritizing preventing s-risks vs. making sure we get to a position where everyone can safely deliberate their values further and then everything gets fulfilled. Conversely, if one (normatively!) thinks the downsides of bad futures are unlikely to be much worse than the upsides of good futures, then one is incentivized to promote caution about taking confident stances on anything population-ethics-related, and instead value deeper philosophical reflection. The latter also has the upside of being good from a cooperation point of view: Everyone can work on the same priority (building safe AI that helps with philosophical reflection) regardless of one's inklings about how personal value extrapolation is likely to turn out.

(The situation becomes more interesting/complicated for suffering-focused altruists once we add considerations of multiverse-wide compromise via coordinated decision-making, which, in extreme versions at least, would call for being "updateless" about the direction of one's own values.)

4paulfchristiano9y

People vary in what kinds of values change they would consider drift vs. endorsed deliberation. Brian has in the past publicly come down unusually far on the side of "change = drift," I've encountered similar views on one other occasion from this crowd, and I had heard second hand that this was relatively common. Brian or someone more familiar with his views could speak more authoritatively to that aspect of the question, and I might be mistaken about the views of the suffering-focused utilitarians more broadly.

4RomeoStevens9y

In support of this, my system 1 reports that if it sees more intelligent people taking S-risk seriously it is less likely to nuke the planet if it gets the chance. (I'm not sure I endorse nuking the planet, just reporting emotional reaction).

2ESRogs9y

Did you mean to say, "if the latter" (such that x-risk and s-risk reduction are aligned when suffering-hating civilizations decrease s-risk), rather than "if the former"?

[-]komponisto9y190

I feel a weird disconnect on reading comments like this. I thought s-risks were a part of conventional wisdom on here all along. (We even had an infamous scandal that concerned one class of such risks!) Scott didn't "see it before the rest of us" -- he was drawing on an existing, and by now classical, memeplex.

It's like when some people spoke as if nobody had ever thought of AI risk until Bostrom wrote Superintelligence -- even though that book just summarized what people (not least of whom Bostrom himself) had already been saying for years.

[-]cousin_it9y140

I guess I didn't think about it carefully before. I assumed that s-risks were much less likely than x-risks (true) so it's okay not to worry about them (false). The mistake was that logical leap.

In terms of utility, the landscape of possible human-built superintelligences might look like a big flat plain (paperclippers and other things that kill everyone without fuss), with a tall sharp peak (FAI) surrounded by a pit that's astronomically deeper (many almost-FAIs and other designs that sound natural to humans). The pit needs to be compared to the peak, not the plain. If the pit is more likely, I'd rather have the plain.

Was it obvious to you all along?

4Wei Dai9y

Didn't you realize this yourself back in 2012?

1cousin_it9y

I didn't realize then that disutility of human-built AI can be much larger than utility of FAI, because pain is easier to achieve than human utility (which doesn't reduce to pleasure). That makes the argument much stronger.

1Wei Dai9y

This argument doesn't actually seem to be in the article that Kaj linked to. Did you see it somewhere else, or come up with it yourself? I'm not sure it makes sense, but I'd like to read more if it's written up somewhere. (My objection is that "easier to achieve" doesn't necessarily mean the maximum value achievable is higher. It could be that it would take longer or more effort to achieve the maximum value, but the actual maximums aren't that different. For example, maybe the extra stuff needed for human utility (aside from pleasure) is complex but doesn't actually cost much in terms of mass/energy.)

3cousin_it9y

The argument somehow came to my mind yesterday, and I'm not sure it's true either. But do you really think human value might be as easy to maximize as pleasure or pain? Pain is only about internal states, and human value seems to be partly about external states, so it should be way more expensive.

8David Althaus9y

One of the more crucial points, I think, is that positive utility is – for most humans – complex and its creation is conjunctive. Disutility, in contrast, is disjunctive. Consequently, the probability of creating the former is smaller than the latter – all else being equal (of course, all else is not equal). In other words, the scenarios leading towards the creation of (large amounts of) positive human value are conjunctive: to create a highly positive future, we have to eliminate (or at least substantially reduce) physical pain and boredom and injustice and loneliness and inequality (at least certain forms of it) and death, etc. etc. etc. (You might argue that getting "FAI" and "CEV" right would accomplish all those things at once (true) but getting FAI and CEV right is, of course, a highly conjunctive task in itself.) In contrast, disutility is much more easily created and essentially disjunctive. Many roads lead towards dystopia: sadistic programmers or failing AI safety wholesale (or "only" value-loading or extrapolating, or stable self-modification), or some totalitarian regime takes over, etc. etc. It's also not a coincidence that even the most untalented writer with the most limited imagination can conjure up a convincing dystopian society. Envisioning a true utopia in concrete detail, on the other hand, is nigh impossible for most human minds. Footnote 10 of the above mentioned s-risk-static makes a related point (emphasis mine): "[...] human intuitions about what is valuable are often complex and fragile (Yudkowsky, 2011), taking up only a small area in the space of all possible values. In other words, the number of possible configurations of matter constituting anything we would value highly (under reflection) is arguably smaller than the number of possible configurations that constitute some sort of strong suffering or disvalue, making the incidental creation of the latter ceteris paribus more likely." Consequently, UFAIs such as paperclippers are m

2cousin_it9y

Yeah, I also had the idea about utility being conjunctive and mentioned it in a deleted reply to Wei, but then realized that Eliezer's version (fragility of value) already exists and is better argued. On the other hand, maybe the worst hellscapes can be prevented in one go, if we "just" solve the problem of consciousness and tell the AI what suffering means. We don't need all of human value for that. Hellscapes without suffering can also be pretty bad in terms of human value, but not quite as bad, I think. Of course solving consciousness is still a very tall order, but it might be easier than solving all philosophy that's required for FAI, and it can lead to other shortcuts like in my recent post (not that I'd propose them seriously).

4Lukas_Gloor9y

Some people at MIRI might be thinking about this under nonperson predicate. (Eliezer's view on which computations matter morally is different from the one endorsed by Brian, though.) And maybe it's important to not limit FAI options too much by preventing mindcrime at all costs – if there are benefits against other very bad failure modes (or – cooperatively – just increased controllability for the people who care a lot about utopia-type outcomes), maybe some mindcrime in the early stages to ensure goal-alignment would be the lesser evil.

-1dogiv9y

Human disutility includes more than just pain too. Destruction of humanity (the flat plain you describe) carries a great deal of negative utility for me, even if I disappear without feeling any pain at all. There's more disutility if all life is destroyed, and more if the universe as a whole is destroyed... I don't think there's any fundamental asymmetry. Pain and pleasure are the most immediate ways of affecting value, and probably the ones that can be achieved most efficiently in computronium, so external states probably don't come into play much at all if you take a purely utilitarian view.

0[anonymous]9y

Our values might say, for example, that a universe filled with suffering insects is very undesirable, but a universe filled with happy insects isn't very desirable. More generally, if our values are a conjunction of many different values, then it's probably easier to create a universe where one is strongly negative and the rest are zero, than a universe where all are strongly positive. I haven't seen the argument written up, I'm trying to figure it out now.
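The conjunction point can be sketched with a toy probability model. The assumptions here are mine and deliberately crude: each of n values is treated as an independent dimension that turns out well with probability p, which the thread does not claim.

```python
def prob_all_positive(p: float, n: int) -> float:
    """Probability that all n independent value dimensions turn out well."""
    return p ** n

def prob_at_least_one_negative(p: float, n: int) -> float:
    """Probability that at least one of the n dimensions goes badly."""
    return 1.0 - prob_all_positive(p, n)

# Illustrative numbers only: 10 values, each 90% likely to turn out well.
p, n = 0.9, 10
print(round(prob_all_positive(p, n), 2))           # about 0.35
print(round(prob_at_least_one_negative(p, n), 2))  # about 0.65
```

Even with each dimension individually likely to go well, the conjunctive outcome is the less likely one in this toy setup; whether real values behave anything like independent dimensions is exactly what is in question.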

4Kaj_Sotala9y

Huh, I feel very differently. For AI risk specifically, I thought the conventional wisdom was always "if AI goes wrong, the most likely outcome is that we'll all just die, and the next most likely outcome is that we get a future which somehow goes against our values even if it makes us very happy." And besides AI risk, other x-risks haven't really been discussed at all on LW. I don't recall seeing any argument for s-risks being a particularly plausible category of risks, let alone one of the most important ones. It's true that there was That One Scandal, but the reaction to that was quite literally Let's Never Talk About This Again - or alternatively Let's Keep Bringing This Up To Complain About How It Was Handled, depending on the person in question - but then people always only seemed to be talking about that specific incident and argument. I never saw anyone draw the conclusion that "hey, this looks like an important subcategory of x-risks that warrants separate investigation and dedicated work to avoid".

8Wei Dai9y

There was some discussion back in 2012 and sporadically since then. (ETA: You can also do a search for "hell simulations" and get a bunch more results.) [...] I've always thought that in order to prevent astronomical suffering, we will probably want to eventually (i.e., after a lot of careful thought) build an FAI that will colonize the universe and stop any potential astronomical suffering arising from alien origins and/or try to reduce suffering in other universes via acausal trade etc., so the work isn't very different from other x-risk work. But now that the x-risk community is larger, maybe it does make sense to split out some of the more s-risk specific work?

2paulfchristiano9y

It seems like the most likely reasons to create suffering come from the existence of suffering-hating civilizations. Do you think that it's clear/very likely that it is net helpful for there to be more mature suffering-hating civilizations? (On the suffering-focused perspective.)

8Wei Dai9y

My intuition is that there is no point in trying to answer questions like these before we know a lot more about decision theory, metaethics, metaphilosophy, and normative ethics, so pushing for a future where these kinds of questions eventually get answered correctly (and the answers make a difference in what happens) seems like the most important thing to do. It doesn't seem to make sense to try to lock in some answers (i.e., make our civilization suffering-hating or not suffering-hating) on the off chance that when we figure out what the answers actually are, it will be too late. Someone with much less moral/philosophical uncertainty than I do would perhaps prioritize things differently, but I find it difficult to motivate myself to think really hard from their perspective.

2paulfchristiano9y

This question seems like a major input into whether x-risk reduction is useful.

1Wei Dai9y

If we try to answer the question now, it seems very likely we'll get the answer wrong (given my state of uncertainty about the inputs that go into the question). I want to keep civilization going until we know better how to answer these types of questions. For example if we succeed in building a correctly designed/implemented Singleton FAI, it ought to be able to consider this question at leisure, and if it becomes clear that the existence of mature suffering-hating civilizations actually causes more suffering to be created, then it can decide to not make us into a mature suffering-hating civilization, or take whatever other action is appropriate. Are you worried that by the time such an FAI (or whatever will control our civilization) figures out the answer, it will be too late? (Why? If we can decide that x-risk reduction is bad, then so can it. If it's too late to alter or end civilization at that point, why isn't it already too late for us?) Or are you worried more that the question won't be answered correctly by whatever will control our civilization?

1paulfchristiano9y

If you are concerned exclusively with suffering, then increasing the number of mature civilizations is obviously bad and you'd prefer that the average civilization not exist. You might think that our descendants are particularly good to keep around, since we hate suffering so much. But in fact almost all s-risks occur precisely because of civilizations that hate suffering, so it's not at all clear that creating "the civilization that we will become on reflection" is better than creating "a random civilization" (which is bad). To be clear, even if we have modest amounts of moral uncertainty I think it could easily justify a "wait and see" style approach. But if we were committed to a suffering-focused view then I don't think your argument works.

2Wei Dai9y

It seems just as plausible to me that suffering-hating civilizations reduce the overall amount of suffering in the multiverse, so I think I'd wait until it becomes clear which is the case, even if I was concerned exclusively with suffering. But I haven't thought about this question much, since I haven't had a reason to assume an exclusive concern with suffering, until you started asking me to. [...] Earlier in this thread I'd been speaking from the perspective of my own moral uncertainty, not from a purely suffering-focused view, since we were discussing the linked article, and Kaj had written: [...] What's your reason for considering a purely suffering-focused view? Intellectual curiosity? Being nice to or cooperating with people like Brian Tomasik by helping to analyze one of their problems?

1paulfchristiano9y

Understanding the recommendations of each plausible theory seems like a useful first step in decision-making under moral uncertainty.

0Lukas_Gloor9y

Perhaps this, in case it turns out to be highly important but difficult to get certain ingredients – e.g. priors or decision theory – exactly right. (But I have no idea, it's also plausible that suboptimal designs could patch themselves well, get rescued somehow, or just have their goals changed without much fuss.)

0komponisto9y

That sort of subject is inherently implicit in the kind of decision-theoretic questions that MIRI-style AI research involves. More generally, when one is thinking about astronomical-scale questions, and aggregating utilities, and so on, it is a matter of course that cosmically bad outcomes are as much of a theoretical possibility as cosmically good outcomes. Now, the idea that one might need to specifically think about the bad outcomes, in the sense that preventing them might require strategies separate from those required for achieving good outcomes, may depend on additional assumptions that haven't been conventional wisdom here.

0Kaj_Sotala9y

Right, I took this idea to be one of the main contributions of the article, and assumed that this was one of the reasons why cousin_it felt it was important and novel.

3[anonymous]9y

Thanks for voicing this sentiment I had upon reading the original comment. My impression was that negative utilitarian viewpoints / things of this sort had been trending for far longer than cousin_it's comment might suggest.

2Kaj_Sotala9y

The article isn't specifically negative utilitarian, though - even classical utilitarians would agree that having astronomical amounts of suffering is a bad thing. Nor do you have to be a utilitarian in the first place to think it would be bad: as the article itself notes, pretty much all major value systems probably agree on s-risks being a major Bad Thing: [...]

3AlexMennen9y

Yes, but the claim that that risk needs to be taken seriously is certainly not conventional wisdom around here.

0komponisto9y

Decision theory (which includes the study of risks of that sort) has long been a core component of AI-alignment research.

0AlexMennen9y

No, it doesn't. Decision theory deals with abstract utility functions. It can talk about outcomes A, B, and C where A is preferred to B and B is preferred to C, but doesn't care whether A represents the status quo, B represents death, and C represents extreme suffering, or whether A represents gaining lots of wealth and status, B represents the status quo, and C represents death, so long as the ratios of utility differences are the same in each case. Decision theory has nothing to do with the study of s-risks.

0komponisto9y

The first and last sentences of the parent comment do not follow from the statements in between.

0Kaj_Sotala9y

That doesn't seem to refute or change what Alex said?

0komponisto9y

What Alex said doesn't seem to refute or change what I said. But also: I disagree with the parent. I take conventional wisdom here to include support for MIRI's agent foundations agenda, which includes decision theory, which includes the study of such risks (even if only indirectly or implicitly).

0[anonymous]9y

Fair enough. I guess I didn't think carefully about it before. I assumed that s-risks were much less likely than x-risks (true) and so they could be discounted (false). It seems like the right way to imagine the landscape of superintelligences is a vast flat plain (paperclippers and other things that kill everyone without fuss) with a tall thin peak (FAIs) surrounded by a pit that's astronomically deeper (FAI-adjacent and other designs). The right comparison is between the peak and the pit, because if the pit is more likely, I'd rather have the plain.

4casebash9y

I think the reason why cousin_it's comment is upvoted so much is that a lot of people (including me) weren't really aware of S-risks or how bad they could be. It's one thing to just make a throwaway line that S-risks could be worse, but it's another thing entirely to put together a convincing argument. Similar ideas have been in other articles, but they've framed it in terms of energy-efficiency while defining weird words such as computronium or the two-envelopes problem, which make it much less clear. I don't think I saw the links for either of those articles before, but if I had, I probably wouldn't have read them. I also think that the title helps. S-risks is a catchy name, especially if you already know x-risks. I know that this term has been used before, but it wasn't used in the title. Further, while being quite a good article, you can read the summary, introduction and conclusion without encountering the idea that the author believes that s-risks are much greater than x-risks, as opposed to being just yet another risk to worry about. I think there's definitely an important lesson to be drawn here. I wonder how many other articles have gotten close to an important truth, but just failed to hit it out of the park for some reason or another.

2Lukas_Gloor9y

Interesting! [...] I'm only confident about endorsing this conclusion conditional on having values where reducing suffering matters a great deal more than promoting happiness. So we wrote the "Reducing risks of astronomical suffering" article in a deliberately 'balanced' way, pointing out the different perspectives. This is why it didn't come away making any very strong claims. I don't find the energy-efficiency point convincing at all, but for those who do, x-risks are likely (though not with very high confidence) still more important, mainly because more futures will be optimized for good outcomes rather than bad outcomes, and this is where most of the value is likely to come from. The "pit" around the FAI-peak is in expectation extremely bad compared to anything that exists currently, but most of it is just accidental suffering that is still comparatively unoptimized. So in the end, whether s-risks or x-risks are more important to work on on the margin depends on how suffering-focused or not someone's values are. Having said that, I totally agree that more people should be concerned about s-risks and it's concerning that the article (and the one on suffering-focused AI safety) didn't manage to convey this point well.

2Jiro9y

That sounds like a recipe for Pascal's Mugging.

3cousin_it9y

Only if you think one in a million events are as rare as meeting god in person.

2David Althaus9y

The article that introduced the term "s-risk" was shared on LessWrong in October 2016. The content of the article and the talk seem similar. Did you simply not come across it or did the article just (catastrophically) fail to explain the concept of s-risks and its relevance?

3cousin_it9y

I've seen similar articles before, but somehow this was the first one that shook me. Thank you for doing this work!

1ignoranceprior9y

And the concept is much older than that. The 2011 Felicifia post "A few dystopic future scenarios" by Brian Tomasik outlined many of the same considerations that FRI works on today (suffering simulations, etc.), and of course Brian has been blogging about risks of astronomical suffering since then. FRI itself was founded in 2013.

0Lumifer9y

Iain Banks' Surface Detail published in 2010 featured a war over the existence of virtual hells (simulations constructed explicitly to punish the ems of sinners).

2tristanm9y

The only counterarguments I can think of would be:

* The claim that the likelihood of s-risks is close to that of x-risks seems not well argued to me. In particular, conflict seems to be the most plausible scenario (and one which has a high prior placed on it, as we can observe that much suffering today is caused by conflict), but it seems to be a less and less likely scenario once you factor in superintelligence, as multi-polar scenarios seem to be either very short-lived or unlikely to happen at all.
* We should be wary of applying anthropomorphic traits to hypothetical artificial agents in the future. Pain in biological organisms may very well have evolved as a proxy for negative utility, and might not be necessary in "pure" agent intelligences which can calculate utility functions directly. It's not obvious to me that implementing suffering in the sense that humans understand it would be cheaper or more efficient for a superintelligence than simply creating utility-maximizers when it needs to produce a large number of sub-agents.
* There is high overlap between approaches to mitigating x-risks and approaches to mitigating s-risks. If the best chance of mitigating future suffering is trying to bring about a friendly artificial intelligence explosion, then it seems that the approaches we are currently taking should still be the correct ones.
* More speculatively: If we focus heavily on s-risks, does this open us up to issues regarding utility-monsters? Can I extort people by creating a simulation of trillions of agents and then threaten to minimize their utility? (If we simply value the sum of utility, and not necessarily the complexity of the agent having the utility, then this should be relatively cheap to implement).

2cousin_it9y

I think the most general response to your first three points would look something like this: Any superintelligence that achieves human values will be adjacent in design space to many superintelligences that cause massive suffering, so it's quite likely that the wrong superintelligence will win, due to human error, malice, or arms races. As to your last point, it looks more like a research problem than a counterargument, and I'd be very interested in any progress on that front :-)

0Lumifer9y

Why so? Flipping the sign doesn't get you "adjacent", it gets you "diametrically opposed". If you really want chocolate ice cream, "adjacent" would be getting strawberry ice cream, not having ghost pepper extract poured into your mouth.

3Good_Burning_Plastic9y

They said "adjacent in design space". The Levenshtein distance between return val; and return -val; is 1.
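The edit-distance claim can be checked with a short sketch. The function below is a standard dynamic-programming Levenshtein distance written for illustration; the helper name `levenshtein` is my own, not something referenced in the thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]

print(levenshtein("return val;", "return -val;"))  # prints 1
```

A single inserted minus sign is the whole difference between the two statements, which is the sense of "adjacent in design space" being used here.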

-1Lumifer9y

So being served a cup of coffee and being served a cup of pure capsaicin are "adjacent in design space"? Maybe, but funny how that problem doesn't arise or even worry anyone...

7dogiv9y

More like driving to the store and driving into the brick wall of the store are adjacent in design space.

0cousin_it9y

That's a twist on a standard LW argument, see e.g. here: [...] It seems to me that fragility of value can lead to massive suffering in many ways.

0Lumifer9y

You're basically dialing that argument up to eleven. From "losing a small part could lead to unacceptable results" you are jumping to "losing any small part will lead to unimaginable hellscapes": [...]

0cousin_it9y

Yeah, not all parts. But even if it's a 1% chance, one hellscape might balance out a hundred universes where FAI wins. Pain is just too effective at creating disutility. I understand why people want to be optimistic, but I think being pessimistic in this case is more responsible.

0Lumifer9y

So basically you are saying that the situation is asymmetric: the impact/magnitude of possible bad things is much much greater than the impact/magnitude of possible good things. Is this correct?

3cousin_it9y

Yeah. One sign of asymmetry is that creating two universes, one filled with pleasure and the other filled with pain, feels strongly negative rather than symmetric to us. Another sign is that pain is an internal experience, while our values might refer to the external world (though it's very murky), so the former might be much easier to achieve. Another sign is that in our world it's much easier to create a life filled with pain than a life that fulfills human values.

3dogiv9y

Yes, many people intuitively feel that a universe of pleasure and a universe of pain add to a net negative. But I suspect that's just a result of experiencing (and avoiding) lots of sources of extreme pain in our lives, while sources of pleasure tend to be diffuse and relatively rare. The human experience of pleasure is conjunctive because in order to survive and reproduce you must fairly reliably avoid all types of extreme pain. But in a pleasure-maximizing environment, removing pain will be a given. It's also true that our brains tend to adapt to pleasure over time, but that seems simple to modify once physiological constraints are removed.

2CarlShulman9y

"one filled with pleasure and the other filled with pain, feels strongly negative rather than symmetric to us" Comparing pains and pleasures of similar magnitude? People have a tendency not to do this, see the linked thread. "Another sign is that pain is an internal experience, while our values might refer to the external world (though it's very murky" You accept pain and risk of pain all the time to pursue various pleasures, desires and goals. Mice will cross electrified surfaces for tastier treats. If you're going to care about hedonic states as such, why treat the external case differently? Alternatively, if you're going to dismiss pleasure as just an indicator of true goals (e.g. that pursuit of pleasure as such is 'wireheading') then why not dismiss pain in the same way, as just a signal and not itself a goal?

0cousin_it9y

My point was comparing pains and pleasures that could be generated with a similar amount of resources. Do you think they balance out for human decision making? For example, I'd strongly object to creating a box of pleasure and a box of pain. Do you think my preference would go away after extrapolation?

1CarlShulman9y

"My point was comparing pains and pleasures that could be generated with similar amount of resources. Do you think they balance out for human decision making?"

I think with current tech it's cheaper and easier to wirehead to increase pain (i.e. torture) than to increase pleasure or reduce pain. This makes sense biologically: since organisms won't go looking for ways to wirehead to maximize their own pain, evolution doesn't need to 'hide the keys' as much as with pleasure or pain relief (where the organism would actively seek out easy means of subverting the behavioral functions of the hedonic system). Thus when powerful addictive drugs are available, such as alcohol, human populations evolve increased resistance over time. The sex systems evolve to make masturbation less rewarding than reproductive sex under ancestral conditions, desire for play/curiosity is limited by boredom, delicious foods become less pleasant when full or the foods are not later associated with nutritional sensors in the stomach, etc.

I don't think this is true with fine control over the nervous system (or a digital version) to adjust felt intensity and behavioral reinforcement. I think with that sort of full access one could easily increase the intensity (and ease of activation) of pleasures/mood such that one would trade them off against the most intense pains at ~parity per second, and attempts at subjective comparison when or after experiencing both would put them at ~parity.

People will willingly undergo very painful jobs and undertakings for money, physical pleasures, love, status, childbirth, altruism, meaning, etc. Unless you have a different standard for the 'boxes' than the one used in subjective comparison with rich experience of the things to be compared, I think we're just haggling over the price re intensity.

We know the felt caliber and behavioral influence of such things can vary greatly. It would be possible to alter nociception and pain receptors to amp up or damp down any particular

0cousin_it9y

We could certainly make agents for whom pleasure and pain would use equal resources per util. The question is if human preferences today (or extrapolated) would sympathize with such agents to the point of giving them the universe. Their decision-making could look very inhuman to us. If we value such agents with a discount factor, we're back at square one.

2CarlShulman9y

That's what the congenital deafness discussion was about. You have preferences over pain and pleasure intensities that you haven't experienced, or new durations of experiences you know. Otherwise you wouldn't have anything to worry about re torture, since you haven't experienced it. Consider people with pain asymbolia: [...] Suppose you currently had pain asymbolia. Would that mean you wouldn't object to pain and suffering in non-asymbolics? What if you personally had only happened to experience extremely mild discomfort while having lots of great positive experiences? What about for yourself? If you knew you were going to get a cure for your pain asymbolia tomorrow would you object to subsequent torture as intrinsically bad? We can go through similar stories for major depression and positive mood. Seems it's the character of the experience that matters. Likewise, if you've never experienced skiing, chocolate, favorite films, sex, victory in sports, and similar things that doesn't mean you should act as though they have no moral value. This also holds true for enhanced experiences and experiences your brain currently is unable to have, like the case of congenital deafness followed by a procedure to grant hearing and listening to music.

0cousin_it9y

Music and chocolate are known to be mostly safe. I guess I'm more cautious about new self-modifications that can change my decisions massively, including decisions about more self-modifications. It seems like if I'm not careful, you can devise a sequence that will turn me into a paperclipper. That's why I discount such agents for now, until I understand better what CEV means.

0Kaj_Sotala9y

This seems plausible but not obvious to me. Humans are superintelligent as compared to chimpanzees (let alone, say, Venus flytraps), but humans have still formed a multipolar civilization.

0tristanm9y

When thinking about whether s-risk scenarios are tied to or come about by similar means as x-risk scenarios (such as a malign intelligence explosion), the relevant issue to me seems to be whether or not such a scenario could result in a multi-polar conflict of cosmic proportions. I think the chance of that happening is quite low, since intelligence explosions seem to be most likely to result in a singleton.

0[anonymous]9y

Due to complexity and fragility of human values, any superintelligence that fulfills them will probably be adjacent in design space to many other superintelligences that cause lots of suffering (which is also much cheaper), so a wrong superintelligence might take over due to human error or malice or arms races. That's where most s-risk is coming from, I think. The one in a million number seems optimistic, actually.

2turchin9y

I agree that preventing s-risks is important, but I will try to look at possible counterarguments:

1. A benevolent AI will be able to fight an acausal war against an evil AI in another branch of the multiverse by creating more happy copies of me, or more paths from a suffering observer-moment to a happy observer-moment. So creating a benevolent superintelligence will help against suffering everywhere in the multiverse.

2. Non-existence is the worst form of suffering if we define suffering as action against our most important value. Thus x-risks are s-risks. Pain is not always suffering, as masochists exist.

3. If we give too much attention to animal suffering, we give ground to projects like the Voluntary Human Extinction Movement. So we increase the chances of human extinction, as humans created animal farms. Moreover, if we agree that non-existence is not suffering, we could kill all life on Earth and stop all suffering, which is not right.

4. A benevolent AI will be able to resurrect all possible sentient beings and animals and provide them with an infinite paradise, thus compensating for any current suffering of animals.

5. Only infinite and unbearable suffering is bad. We should distinguish unbearable suffering, like agony, from ordinary suffering, which is just a reinforcement learning signal for the wetware of our brain and informs us about past wrong decisions or the need to call a doctor.

5cousin_it9y

I think all of these are quite unconvincing and the argument stays intact, but thanks for coming up with them.

2turchin9y

1. I think a longer explanation is needed to show how a benevolent AI will save observers from an evil AI. It is not just compensation for suffering. It is based on the idea of the indexical uncertainty of equal observers. If two equal observer-moments exist, the observer doesn't know which one of them he is. So a benevolent AI creates 1000 copies of an observer-moment which is in the jail of an evil AI, and constructs a pleasant next moment for each copy. From the point of view of the jailed observer-moment, there will be 1001 expected future moments for him, and only 1 of them will consist of continued suffering. So the expected duration of his suffering will be less than a second. However, to win such a game the benevolent AI needs to have an overwhelming advantage in computing power, and some other assumptions about the nature of personal identity need to be resolved.

2. I agree that some outcomes, like eternal very strong suffering, are worse, but it is important to think about non-existence as a form of suffering, as it will help us in utilitarian calculations and will help to show that x-risks are a type of s-risk.

3. There are more people in the world who care about animal suffering than about x-risks, and giving them a new argument increases the probability of x-risks.

4. What do you mean by "Also it's about animals for some reason, let's talk about them when hell freezes over."? We could provide happiness to all animals and provide infinite survival to their species, which otherwise will go completely extinct in millions of years.

5. Do you mean finite but unbearable suffering, like intense pain for one year?

EDITED: It looks like you changed your long reply while I was writing the long answer on all your counterarguments.

1RomeoStevens9y

X-risk is still plausibly worse in that we need to survive to reach as much of the universe as possible and eliminate suffering in other places. Edit: Brian talks about this here: https://foundational-research.org/risks-of-astronomical-future-suffering/#Spread_of_wild_animals-2

[-][anonymous]9y40

Interesting to see another future philosophy.

I think my own rough future philosophy is making sure that the future has an increase in autonomy for humanity. I think it transforms into s-risk reduction, assuming that autonomous people will choose to reduce their suffering and their potential future suffering if they can. It also transforms the tricky philosophical question of defining suffering into the tricky philosophical question of defining autonomy, which might be a preferable trade.

I think I prefer the autonomy increase because I do not have to try...

[-]Lumifer9y30

As usual, xkcd is relevant.

[-][anonymous]9y20

So I don't have much experience with philosophy; this is mainly a collection of my thoughts as I read through.

1) S-risks seem to basically describe hellscapes, situations of unimaginable suffering. Is that about right?

2) Two assumptions here seem to be valuing future sentience and the additive nature of utility/suffering. Are these typical stances to be taking? Should there be some sort of discounting happening here?

3) I'm pretty sure I'm strawmanning here, but I can't help but feel like there's some sort of argument by definition here where we first defined s-...

6Viliam9y

To maximize human suffering per unit of space-time, you need a good model of human values, just like a Friendly AI. But to create an astronomical amount of human suffering (without really maximizing it), you only need to fill an astronomical amount of space-time with humans living in bad conditions, and prevent them from escaping those conditions. That is relatively easier. Instead of Thamiel, imagine an immortal Pol Pot with space travel.

6[anonymous]9y

Ah, okay. Thanks for the clarification here.

[-]turchin9y20

We could also add a-risks: risks that human civilisation will destroy alien life and alien civilisations. For example, an LHC false-vacuum catastrophe or a UFAI could dangerously affect the entire visible universe and kill an unknown number of alien civilisations or prevent their existence.

Preventing risks to alien life is one of the main motivations for the sterilisation of Mars rovers and for sinking Galileo and Cassini into Jupiter and Saturn after the end of their missions.

7ignoranceprior9y

The flip side of this idea is "cosmic rescue missions" (term coined by David Pearce), which refers to the hypothetical scenario in which human civilization helps to reduce the suffering of sentient extraterrestrials (in the original context, it referred to the use of technology to abolish suffering). Of course, this is more relevant for simple animal-like aliens and less so for advanced civilizations, which would presumably have already either implemented a similar technology or decided to reject such technology. Brian Tomasik argues that cosmic rescue missions are unlikely. Also, there's an argument that humanity conquering alien civs would only be considered bad if you assume that either (1) we have non-universalist-consequentialist reasons to believe that preventing alien civilizations from existing is bad, or (2) the alien civilization would produce greater universalist-consequentialist value than human civilizations with the same resources. If (2) is the case, then humanity should actually be willing to sacrifice itself to let the aliens take over (like in the "utility monster" thought experiment), assuming that universalist consequentialism is true. If neither (1) nor (2) holds, then human civilization would have greater value than ET civilization. Seth Baum's paper on universalist ethics and alien encounters goes into greater detail.

1turchin9y

Thanks for the links. My thought was that we may give higher negative utility to those x-risks which are able to become a-risks too, that is, LHC and AI. If you know the Russian science fiction of the Strugatskys, there is an idea in it of "Progressors": people who are implanted into other civilisations to help them develop quickly. At the end, the main character concluded that such actions violate the right of any civilization to determine its own way, and he returned to Earth to search for and stop possible alien progressors here.

2ignoranceprior9y

Oh, in those cases, the considerations I mentioned don't apply. But I still thought they were worth mentioning. In Star Trek, the Federation has a "Prime Directive" against interfering with the development of alien civilizations.

3Lumifer9y

The main role of which is to figure in this recurring dialogue: -- Captain, but the Prime Directive! -- Screw it, we're going in.

0Lumifer9y

Iain Banks has similar themes in his books -- e.g. Inversions. And generally speaking, in the Culture universe, the Special Circumstances are a meddlesome bunch.

[-]ignoranceprior9y20

Want to improve the wiki page on s-risk? I started it a few months ago but it could use some work.

[-]username29y10

Feedback: I had to scroll a very long way until I found out what "s-risk" even was. By then I had lost interest, mainly because generalizing from fiction is not useful.

5ignoranceprior9y

You might like this better: https://foundational-research.org/reducing-risks-of-astronomical-suffering-a-neglected-priority/

2Max_Daniel9y

Thank you for your feedback. I've added a paragraph at the top of the post that includes the definition of s-risk and refers readers already familiar with the concept to another article.

2Brian_Tomasik9y

Thanks for the feedback! The first sentence below the title slide says: "I’ll talk about risks of severe suffering in the far future, or s-risks." Was this an insufficient definition for you? Would you recommend a different definition?

[-]Lumifer9y10

Direct quote: "So, s-risks are roughly as severe as factory farming"

/facepalm

5Brian_Tomasik9y

Is it still a facepalm given the rest of the sentence? "So, s-risks are roughly as severe as factory farming, but with an even larger scope." The word "severe" is being used in a technical sense (discussed a few paragraphs earlier) to mean something like "per individual badness" without considering scope.

1[anonymous]9y

I think the claim that s-risks are roughly as severe as factory farming "per individual badness" is unsubstantiated. But it is reasonable to claim that experiencing either would be worse than death, "hellish". Remember, Hell has circles.

1fubarobfusco9y

The section presumes that the audience agrees wrt veganism. To an audience who isn't on board with EA veganism, that line comes across as the "arson, murder, and jaywalking" trope.

2Lukas_Gloor9y

A lot of people who disagree with veganism agree that factory farming is terrible. Like, more than 50% of the population I'd say.

-1Lumifer9y

Notably, the great majority of them don't have the slightest clue about farming in general or factory farming in particular. Don't mistake social signaling for actual positions.

5komponisto9y

As the expression about knowing "how the sausage is made" attests, generally the more people learn about it, the less they like it. Of course, veganism is very far from being an immediate consequence of disliking factory farming. (Similarly, refusing to pay taxes is very far from being an immediate consequence of disliking government policy.)

1Lumifer9y

That's not obvious to me. I agree that the more people are exposed to anti-factory-farming propaganda, the more they are influenced by it, but that's not quite the same thing, is it?

-3Lumifer9y

Facepalm was a severe understatement, this quote is a direct ticket to the loony bin. I recommend poking your head out of the bubble once in a while -- it's a whole world out there. For example, some horrible terrible no-good people -- like me -- consider factory farming to be an efficient way of producing a lot of food at reasonable cost. This sentence reads approximately as "Literal genocide (e.g. Rwanda) is roughly as severe as using a masculine pronoun with respect to a nonspecific person, but with an even larger scope". The steeliest steelman that I can come up with is that you're utterly out of touch with the Normies.

6Lukas_Gloor9y

I sympathize with your feeling of alienation at the comment, and thanks for offering this perspective that seems outlandish to me. I don't think I agree with you re who the 'normies' are, but I suspect that this may not be a fruitful thing to even argue about. Side note: I'm reminded of the discussion here. (It seems tricky to find a good way to point out that other people are presenting their normative views in a way that signals an unfair consensus, without getting into/accused of identity politics or having to throw around words like "loony bin" or fighting over who the 'normies' are.)

2Lumifer9y

Yes, we clearly have very different worldviews. I don't think alienation is the right word here, it's just that different people think about the world differently and IMHO that's perfectly fine (to clarify, I mean values and normative statements, not facts). And, of course, you have no obligation at all to do something about it.

2[anonymous]9y

Yeah, that part is phrased poorly :-/

[-]turchin9y-20

If it makes sense to continue adding letters to different risks, l-risks could be identified, that is, the risks that kill all life on Earth. The main difference for us humans is that there would be zero chance of a new civilisation arising on Earth in that case.

But the term y-risks is free. What could it be?

1[anonymous]9y

The risk that we think about risks too much and never do anything interesting?

2turchin9y

Or maybe risks of accidental p-zombification... but that is z-risk.

[-]Regex9y-30

The S is for "Skitter"
