In a recent essay, Brian Tomasik argues that meme-spreading has higher expected utility than x-risk reduction. His analysis assumes a classical utilitarian ethic, but it may be generalizable to other value systems.  Here's the summary:

I personally do not support efforts to reduce extinction risk because I think space colonization would potentially give rise to astronomical amounts of suffering. However, even if I thought reducing extinction risk was a good idea, I would not work on it, because spreading your particular values has generally much higher leverage than being one more voice for safety measures against extinction in a world where reducing extinction risk is hard and almost everyone has some incentives to invest in the issue.

one more voice for safety measures against extinction in a world where reducing extinction risk is hard and almost everyone has some incentives to invest in the issue.

Who is this "almost everyone", what are they currently working on, and what is their budget?

This is a good point. :) I added an additional objection to the piece.

As an empirical matter, extinction risk isn't being funded as much as one would expect if, as you suggest, almost everyone has some incentive to invest in the issue.

There's a lot of "extinction risk" work that's not necessarily labeled as such: biosecurity, nuclear nonproliferation, general efforts to prevent international hostility by nation states, general efforts to reduce violence in society and alleviate mental illness, etc. We don't necessarily see huge investments in AI safety yet, but this will probably change in time, as we begin to see more AIs that get out of control and cause problems on a local scale. 99+% of catastrophic risks are not extinction risks, so as the catastrophes begin happening and affecting more people, governments will invest more in safeguards than they do now. The same can be said for nanotech.

In any event, even if budgets for extinction-risk reduction are pretty low, you also have to look at how much that money can buy. Reducing risks is inherently difficult, because so much is out of our hands. It seems relatively easier to win over hearts and minds to utilitronium (especially at the margin right now, by collecting the low-hanging fruit of people who could be persuaded but aren't yet). And because so few people are pushing for utilitronium, it seems far easier to achieve a 1% increase in support for utilitronium than a 1% decrease in the likelihood of extinction.
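To make the leverage comparison concrete, here's a rough back-of-the-envelope sketch (the symbols and numbers are purely illustrative): let $p$ be the probability that humanity survives and colonizes space, $V$ the value of that future if it were entirely devoted to utilitronium, and $f$ the fraction of it that actually ends up as utilitronium. A unit of effort spent on values spreading versus extinction-risk reduction then buys roughly

$$\underbrace{p \cdot \Delta f \cdot V}_{\text{values spreading}} \quad \text{vs.} \quad \underbrace{\Delta p \cdot f \cdot V}_{\text{extinction-risk reduction}},$$

so if the same effort yields a much larger $\Delta f$ than $\Delta p$, and $p$ isn't tiny, the first term dominates.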

We don't necessarily see huge investments in AI safety yet, but this will probably change in time, as we begin to see more AIs that get out of control and cause problems on a local scale.

Once we see an out-of-control AI, it's too late to do AI safety. Given current computer security, the AI could hack itself into every computer in the world and resist easy shutdown.

When it comes to low-probability, high-impact events, waiting for a small problem to raise awareness of the issue is just dangerous.

As we begin seeing robots/computers that are more human-like, people will take the possibility of AGIs getting out of control more seriously. These things will be major news stories worldwide, people will hold national-security summits about them, etc. I would assume the US military is already looking into this topic at least a little bit behind closed doors.

There will probably be lots of not-quite-superhuman AIs / AGIs that cause havoc along the road to the first superhuman ones. Yes, it's possible that FOOM will take us from roughly a level like where we are now to superhuman AGI in a matter of days, but this scenario seems relatively unlikely to me, so any leverage you hope to have on it has to be multiplied by that small probability of it happening.

--

BTW, I'm curious to hear more about the mechanics of your scenario. The AGI hacks itself onto every (Internet-connected) computer in the world. Then what? Presumably this wouldn't cause extinction, just a lot of chaos and maybe years' worth of setback to the economy? Maybe it would increase chances of nuclear war, especially if the AGI could infect nuclear-warhead-related computer systems.

This could be an example of the non-extinction-level AGI disasters that I was referring to. Let me know if there are more ways in which it might cause total extinction, though.

BTW, I'm curious to hear more about the mechanics of your scenario. The AGI hacks itself onto every (Internet-connected) computer in the world. Then what?

Then the AI does precisely nothing other than hide its presence and do the following:

Send one email to a certain nano-something research scientist whom the AI has identified as "easy to bribe into building stuff he doesn't know about in exchange for money". The AI hacks some money (or maybe even earns it "legitimately"), sends it to the scientist, then tells the scientist to follow some specific set of instructions for building a specific nanorobot.

The scientist builds the nanorobot. The nanorobot proceeds to slowly and invisibly multiply until it has reached 100% penetration of every single human-inhabited place on Earth. Then it synchronously begins a grey goo event where every human is turned into piles of carbon and miscellaneous waste, and every other thingy required for humans (or other animals) to survive on Earth is transformed into more appropriate raw materials for the AI to use next.

And I'm only scratching the surface of a limited sample of some of the most obvious ways an AI could cause an extinction event from the comfort of only a few university networks, let alone every single computer connected to the Internet.

As we begin seeing robots/computers that are more human-like

It's not at all clear that an AGI will be human-like, any more than humans are dog-like.

BTW, I'm curious to hear more about the mechanics of your scenario. The AGI hacks itself onto every (Internet-connected) computer in the world. Then what?

How do you fight the AGI past that point?

It controls total global communication flow. It can play different humans off against each other till it effectively rules the world. After it has total political control, it can move more and more resources to itself.

Maybe it would increase chances of nuclear war, especially if the AGI could infect nuclear-warhead-related computer systems.

That's not even needed. You just need to set up a bunch of convincing false-flag attacks that implicate Pakistan in attacking India.

A clever AI might provoke such conflicts to distract humans from fighting it.

Don't underrate how well a smart AI can fight conflicts. Having no akrasia, needing no sleep, being able to self-replicate its mind, and being able to plan very complex conflicts rationally are all valuable for fighting conflicts.

For the AGI it's even enough to get political control over a few countries. While the other countries have their economies collapse due to the lack of computers, the AGI could help the countries it controls to overpower the others over the long run.

I think he meant more along the lines of computers/robots/non-super AIs becoming more powerful, IDK.

It's not at all clear that an AGI will be human-like, any more than humans are dog-like.

Ok, bad wording on my part. I meant "more generally intelligent."

How do you fight the AGI past that point?

I was imagining people would destroy their computers, except the ones not connected to the Internet. However, if the AGI is hiding itself, it could go a long way before people realized what was going on.

Interesting scenarios. Thanks!

However, if the AGI is hiding itself, it could go a long way before people realized what was going on.

Exactly. On the one hand, the AGI tries not to let humans get wind of its plans. On the other hand, it's going to produce distractions.

You have to remember how delusional some folks are. Imagine trying to convince the North Koreans that they have to destroy their computers because those computers are infested with an evil AI.

Even in the US nearly half of the population still believes in creationism. How many of them can be convinced that the evil government is trying to take away their computers to establish a dictatorship?

Before the government attempts to trash the computers, the AI sends an email to a conspiracy-theory website, where it starts revealing some classified documents it acquired through hacking that show government misbehavior.

Then it sends an email to the same group saying that the US government is going to shut down all civilian computers because freedom of speech is too dangerous to the US government, and that the US government will be using the excuse that the computers are part of a Chinese botnet.

In our time you need computers to stock supermarket shelves with goods. Container ships need GPS and sea charts to navigate.

People start fighting each other. Some are likely to blame the people who wanted to trash the computers for the mess.

Even if you can imagine shutting off all computers in 2013, in 2033 most cars will be computers in which the AI can reside. A lot of military firepower will be in drones that the AI can control.

Some really creative ideas, ChristianKl. :)

Even with what you describe, humans wouldn't become extinct, barring other outcomes like really bad nuclear war or whatever.

However, since the AI wouldn't be destroyed, it could bide its time. Maybe it could ally with some people and give them tech/power in exchange for carrying out its bidding. They could help build the robots, etc. that would be needed to actually wipe out humanity.

Obviously there's a lot of conjunction here. I'm not claiming this scenario specifically is likely. But it helps to stimulate the imagination to work out an existence proof for the extinction risk from AGI.

Maybe it could ally with some people and give them tech/power in exchange for carrying out its bidding.

Some AIs already do this today. They outsource work they can't do to Amazon's Mechanical Turk, where humans get paid money to do tasks for the AI.

Other humans take on jobs on rentacoder where they never see the human who's hiring them.

Even with what you describe, humans wouldn't become extinct, barring other outcomes like really bad nuclear war or whatever.

Humans wouldn't go extinct in a short time frame, but if the AGI has decades of time, then it can increase its own power over time and decrease its dependence on humans. Sooner or later the humans wouldn't be useful to the AGI anymore, and then they'd go extinct.

Agreed: While I am doubtful about the 'incredibly low budget nano bootstrap', I would say that uncontained foomed AIs are very dangerous if they are interested in performing almost any action whatsoever.

What about cases where x-risk reduction and meme-spreading or meme-protection come together? These opportunities seem potentially very valuable.

In a recent essay, Brian Tomasik argues that meme-spreading has higher expected utility than x-risk reduction. His analysis assumes a classical utilitarian ethic, but it may be generalizable to other value systems.

This is particularly the case if you expect those who do manage X-Risk to do so in a way that may adopt some kind of 'average' of human values, i.e., it exploits FAI. (Probably, by a default interpretation of the incompletely defined outline.)

Relevant only to some parts of Tomasik's essay:

Ord, Why I'm Not a Negative Utilitarian

Thanks, Luke. See also this follow-up discussion to Ord's essay.

As you suggest with your "some" qualifier, my essay that benthamite shared doesn't make any assumptions about negative utilitarianism. I merely inserted parentheticals about my own views into it to avoid giving the impression that I'm personally a positive-leaning utilitarian.

Nice discussion there! Thanks for the link.

Yeah, I linked to Tomasik's earlier musings on this a while back in a comment.

I must say I am very impressed by this kind of negative-utilitarian reasoning, as it has captured a concern of mine that I once naively assumed to be unquantifiable by utilitarian ethics. There might be many plausible future worlds where scenarios like "Omelas" or "SCP-231" would be the norm, possibly with (trans)humanity acquiescing to them or perpetuating them for a rational reason.
What's worse, such futures might not even be acknowledged as disastrous/Unfriendly by people contemplating the prospect. Consider the possibility of transhuman values simply diverging so widely that some groups in a would-be "libertarian utopia" would perpetrate things (to their own unwilling members or other sentients) which the rest of us would find abhorrent - yet the only way to influence such groups could be by aggression and total non-cooperation. Which might not be viable for the objecting factions due to game-theoretical reasons (avoiding a "cascade of defection"), ideological motives or an insufficient capability to project military force. See Three Worlds Collide for some ways this might plausibly play out.

Brian is, so far, the only utilitarian thinker I've read who even mentions Omelas as a potential grave problem, along with more standard transhumanist concerns such as em slavery or "suffering subroutines". I agree with the implications that he draws. I would further add that an excessive focus on reducing X-risk (and, indeed, on ensuring security and safety of all kinds) could have very scary present-day political implications, not just future ones.

(Which is why I am so worried and outspoken about the growth of a certain socio-political ideology among transhumanists and tech geeks; X-risk even features in some of the arguments for it that I've read - although much of it can be safely dismissed as self-serving fearmongering and incoherent apocalyptic fantasies.)

I must say I am very impressed by this kind of negative-utilitarian reasoning, as it has captured a concern of mine that I once naively assumed to be unquantifiable by utilitarian ethics

Do you mean that given certain comparisons of outcomes A and B, you agree with its ranking? Or that it captures your reasons? The latter seems dubious, unless you mean you buy negative utilitarianism wholesale.

If you don't care about anything good, then you don't have to worry about accepting smaller bads to achieve larger goods, but that goes far beyond "throwing out the baby with the bathwater." Toby Ord gives some of the usual counterexamples.

If you're concerned about deontological tradeoffs as in those stories, consider that a negative utilitarian of that stripe would eagerly torture any finite number of people if doing so would kill a sufficiently larger population whose members suffer even occasional minor pains in lives that are overall quite good.
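In toy numbers (purely illustrative, not from Ord): on a strictly negative calculus, if each of the $n$ torture victims endures suffering $T$ and each of the $N$ others would otherwise experience lifetime minor suffering $s$, then

$$n \cdot T < N \cdot s \quad \text{for sufficiently large } N,$$

so the view ranks torturing the $n$ and killing the $N$ above leaving everyone alone, however good those $N$ lives are otherwise.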

If you don't care about anything good [...]

This seems to presuppose "good" being synonymous with "pleasurable conscious states". Referring to broader (and less question-begging) definitions of "good", e.g. "whatever states of the world I want to bring about" or "whatever is in accordance with other-regarding reasons for action", negative utilitarians would simply deny that pleasurable consciousness-states fulfill the criterion (or that they fulfill it better than non-existence or hedonically neutral flow-states).

Ord concludes that negative utilitarianism leads to outcomes where "everyone is worse off", but this of course also presupposes an axiology that negative utilitarians would reject. Likewise, it wouldn't be a fair criticism of classical utilitarianism to say that the very repugnant conclusion leaves everyone worse off (even though from a negative or prior-existence kind of perspective it seems like it), because at least according to the classical utilitarians themselves, existing slightly above "worth living" is judged better than non-existence.