A new popular science book on existential risks and mass extinctions from Annalee Newitz, the founding editor of io9.com
It probably won't display the same rigour as Global Catastrophic Risks (Bostrom, Cirkovic et al.), but that was published five years ago and is a bit academic. A new book written in a popular, journalistic way seems pretty appealing - it might even be a good introduction for family/friends. Anyway I'm looking forward to reading it, and I expect enough other LWers will be interested in this news to warrant the post.
If anyone has any other existential risk book recommendations, please comment.
In a recent essay, Brian Tomasik argues that meme-spreading has higher expected utility than x-risk reduction. His analysis assumes a classical utilitarian ethic, but it may be generalizable to other value systems. Here's the summary:
I personally do not support efforts to reduce extinction risk because I think space colonization would potentially give rise to astronomical amounts of suffering. However, even if I thought reducing extinction risk was a good idea, I would not work on it, because spreading your particular values has generally much higher leverage than being one more voice for safety measures against extinction in a world where reducing extinction risk is hard and almost everyone has some incentives to invest in the issue.
A few days ago I was rereading one of my favourite graphic novels. In it the supervillain commits mass murder to prevent nuclear war - he kills millions to save billions. This got me thinking about how a lot of LessWrong/Effective Altruism people approach existential risks (xrisks). An existential risk is one that threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development (Bostrom 2002). I'm going to point out an implication of this approach, show how this conflicts with a number of intuitions, and then try to clarify the conflict.
If murder would reduce xrisk, one should commit the murder. The argument for this is that compared to billions or even trillions of future people, and/or the amount of valuable things they could instantiate (by experiencing happiness or pleasure, performing acts of kindness, creating great artworks, etc) the importance of one present person, and/or the badness of committing (mass) murder is quite small. The large number on the 'future' side outweighs or cancels the far smaller number on the 'present' side.
I can think of a number of scenarios in which the murder of one or more people could quite clearly reduce existential risk, such as killing the people who know the location of some secret refuge.
Indeed at the extreme it would seem that reducing xrisk would justify some truly terrible things, like a preemptive nuclear strike on a rogue country.
This implication does not just hold for simplistic act-utilitarians, or consequentialists more broadly - it affects any moral theory that accords moral weight to future people and doesn't forbid murder.
This implication is implicitly endorsed in a common choice many of us make between focusing our resources on xrisk reduction as opposed to extreme poverty reduction. This is sometimes phrased as being about choosing to save one life now or far more future lives. While bearing in mind some complications (such as the debate over doing vs allowing and the Doctrine of Double Effect), it seems that 'letting several people die from extreme poverty to try to reduce xrisk' is in an important way similar to 'killing several people to try to reduce xrisk'.
II. Simple Objection:
A natural reaction to this implication is that it is wrong: one shouldn't commit murder to reduce xrisk. To evade some simple objections, let us assume that we can be highly sure that the (mass) murder will indeed reduce xrisk: maybe no-one will find out about the murder, or it won't open a position for someone even worse.
Let us try and explain this reaction, and offer an objection: The idea that we should commit (mass) murder conflicts with some deeply held intuitions, such as the intuition that one shouldn't kill, and the intuition that one shouldn't punish a wrong-doer before she/he commits a crime.
One response - the most prominent advocate of which is probably Peter Singer - is to cast doubt on our intuitions. We may have these intuitions, but they may have been induced by various means, e.g. by evolution or society. Racist views were common in past societies. Moreover there is some evidence that humans may have an evolutionary predisposition to be racist. Nevertheless we reject racism, and therefore (so the argument goes) we should reject a number of other intuitions. So perhaps we should reject the intuitions we have, shrug off the squeamishness and agree that (mass) murder to reduce xrisk is justified.
[NB: I'm unsure about how convincing this response is. Two articles in Philosophy and Public Affairs dispute Singer's argument (Berker 2009) (Kamm 2009). One must also take into account the problem of applying our everyday intuitions to very unusual situations - see 'How Outlandish Can Imaginary Cases Be?' (Elster 2011)]
The trope of the supervillain justifying his or her crimes by claiming it had to be done for 'the greater good' (or similar) is well established. TV Tropes calls it Utopia Justifies The Means. I find myself slightly troubled when my moral beliefs lead me to agree with fictional supervillains. Nevertheless, is the best option to bite the bullet and side with the supervillains?
III. Complex Objection:
Let us return to the fictional example with which we started. Part of the reason his act seems wrong is that, in real life, the supervillain's mass murder was not necessary to prevent nuclear war - the Cold War ended without large-scale direct conflict between the USA and USSR. This seems to point the way to (some) clarification.
I find my intuitions change when the risk seems higher. While I'm unsure that murder is the right answer in the examples given above, it seems clearer in a situation where the disaster is in the midst of occurring, and murder or mass murder is the only way to prevent an existential disaster. The hypothetical that works for me is imagining some incredibly virulent disease or 'grey-goo' nano-replicator that has swept over Australia and is about to spread, and the only way to stop it is a nuclear strike.
One possibility is that my having a different intuition is simply because the situation is similar to hypotheticals that seem more familiar, such as shooting a hostage-taker or terrorist if that was the only way to prevent loss of innocent life.
But I'd like to suggest that it perhaps reflects a problem with xrisks, that it is the idea of doing something awful for a very uncertain benefit. The problem is the uncertainty. If a (mass) murder would prevent an existential disaster, then one should do it, but when it merely reduces xrisk it is less clear. Perhaps there should be some sort of probability threshold - if one has good reason to think the probability is over certain limits (10%, 50%, etc) then one is justified in committing gradually more heinous acts.
In this post I've been trying to explain a troubling worry - to lay out my thinking - more than I have been trying to argue for or against an explicit claim. I have a problem with the claim that xrisk reduction is the most important task for humanity and/or me. On the one hand it seems convincing, yet on the other it seems to lead to some troubling implications - like justifying not focusing on extreme poverty reduction, or justifying (mass) murder.
Comments and criticism of the argument are welcomed. Also, I would be very interested in hearing people's opinions on this topic. Do you think that 'reducing xrisk' can justify murder? At what scale? Perhaps more importantly, does that bother you?
DISCLAIMER: I am in no way encouraging murder. Please do not commit murder.
[Summary: The fact we do not observe (and have not been wiped out by) a UFAI suggests the main component of the 'great filter' cannot be civilizations like ours being wiped out by UFAI. Gentle introduction (assuming no knowledge) and links to much better discussion below.]
The Great Filter is the idea that although there is lots of matter, we observe no "expanding, lasting life", like space-faring intelligences. So there is some filter in which almost all matter gets stuck before becoming expanding, lasting life. One question for those interested in the future of humankind is whether we have already 'passed' the bulk of the filter, or whether it still lies ahead. For example, is it very unlikely that matter will be able to form self-replicating units, but once it clears that hurdle, becoming intelligent and going across the stars is highly likely? Or is getting to a humankind level of development not that unlikely, but very few of those civilizations progress to expanding across the stars? If the latter, that motivates a concern for working out what the forthcoming filter(s) are, and trying to get past them.
One concern is that advancing technology gives civilizations the possibility of wiping themselves out, and that this is the main component of the Great Filter - one we are going to be approaching soon. There are several candidates for which technology will be an existential threat (nanotechnology/'grey goo', nuclear holocaust, runaway climate change), but one that looms large is artificial intelligence (AI). Trying to understand and mitigate the existential threat from AI is the main role of the Singularity Institute, and I guess Luke, Eliezer (and lots of folks on LW) consider AI the main existential threat.
The concern with AI is something like this:
- AI will soon greatly surpass us in intelligence in all domains.
- If this happens, AI will rapidly supplant humans as the dominant force on planet earth.
- Almost all AIs, even ones we create with the intent to be benevolent, will probably be unfriendly to human flourishing.
Or, as summarized by Luke:
... AI leads to intelligence explosion, and, because we don’t know how to give an AI benevolent goals, by default an intelligence explosion will optimize the world for accidentally disastrous ends. A controlled intelligence explosion, on the other hand, could optimize the world for good. (More on this option in the next post.)
So, the aim of the game needs to be trying to work out how to control the future intelligence explosion so the vastly smarter-than-human AIs are 'friendly' (FAI) and make the world better for us, rather than unfriendly AIs (UFAI) which end up optimizing the world for something that sucks.
'Where is everybody?'
So, topic. I read this post by Robin Hanson which had a really good parenthetical remark (emphasis mine):
Yes, it is possible that the extremely difficultly was life’s origin, or some early step, so that, other than here on Earth, all life in the universe is stuck before this early extremely hard step. But even if you find this the most likely outcome, surely given our ignorance you must also place a non-trivial probability on other possibilities. You must see a great filter as lying between initial planets and expanding civilizations, and wonder how far along that filter we are. In particular, you must estimate a substantial chance of “disaster”, i.e., something destroying our ability or inclination to make a visible use of the vast resources we see. (And this disaster can’t be an unfriendly super-AI, because that should be visible.)
This made me realize a UFAI should also be counted as an 'expanding lasting life', and should be deemed unlikely by the Great Filter.
Another way of looking at it: if the Great Filter still lies ahead of us, and a major component of this forthcoming filter is the threat from UFAI, we should expect to see the UFAIs of other civilizations spreading across the universe (or not see anything at all, because they would wipe us out to optimize for their unfriendly ends). That we do not observe it disconfirms this conjunction.
[Edit/Elaboration: It also gives a stronger argument - as the UFAI is the 'expanding life' we do not see, the beliefs, 'the Great Filter lies ahead' and 'UFAI is a major existential risk' lie opposed to one another: the higher your credence in the filter being ahead, the lower your credence should be in UFAI being a major existential risk (as the many civilizations like ours that go on to get caught in the filter do not produce expanding UFAIs, so expanding UFAI cannot be the main x-risk); conversely, if you are confident that UFAI is the main existential risk, then you should think the bulk of the filter is behind us (as we don't see any UFAIs, there cannot be many civilizations like ours in the first place, as we are quite likely to realize an expanding UFAI).]
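To make the shape of this argument concrete, here is a toy calculation. Everything in it is an assumption made up for illustration: the number of candidate planets, the three hypotheses and their probabilities, and the equal priors. It also ignores anthropic corrections entirely, so treat it as a sketch of the qualitative point rather than an estimate of anything.

```python
import math

def p_silence(n_planets: float, p_reach_our_stage: float, p_visible_expander: float) -> float:
    """Probability that none of n_planets produces a visible expanding intelligence.

    Each planet independently reaches our stage with p_reach_our_stage and then
    produces a visible expander (e.g. an expanding UFAI) with p_visible_expander.
    """
    per_planet = p_reach_our_stage * p_visible_expander
    # (1 - per_planet) ** n_planets, computed via log1p for numerical accuracy
    return math.exp(n_planets * math.log1p(-per_planet))

def posterior(priors: dict, likelihoods: dict) -> dict:
    """Plain Bayes update over a finite set of hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(joint.values())
    return {h: value / total for h, value in joint.items()}

# Hypothetical numbers, chosen only to illustrate the argument.
N_PLANETS = 1e9
HYPOTHESES = {
    # (P(planet reaches our stage), P(civilization at our stage makes a visible expander))
    "filter_behind_ufai_likely": (1e-12, 0.5),  # civilizations like ours are very rare
    "filter_ahead_is_ufai": (1e-6, 0.5),        # many civilizations, most end in expanding UFAI
    "filter_ahead_is_quiet": (1e-6, 1e-9),      # many civilizations, ended by something invisible
}

priors = {h: 1 / len(HYPOTHESES) for h in HYPOTHESES}
likelihoods = {h: p_silence(N_PLANETS, *params) for h, params in HYPOTHESES.items()}
post = posterior(priors, likelihoods)

for h in HYPOTHESES:
    print(f"{h}: likelihood={likelihoods[h]:.3g}, posterior={post[h]:.3g}")
```

Under these made-up numbers, the hypothesis that combines 'many civilizations like ours' with 'most of them produce an expanding UFAI' predicts a sky full of expanders, so observing silence drives its posterior to essentially zero. The two surviving hypotheses are the ones described above: either the filter is behind us, or the filter ahead is something that does not expand.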
A much more in-depth article and comments (both highly recommended) was made by Katja Grace a couple of years ago. I can't seem to find a similar discussion on here (feel free to downvote and link in the comments if I missed it), which surprises me: I'm not bright enough to figure out the anthropics, and obviously one may hold AI to be a big deal for other-than-Great-Filter reasons (maybe a given planet has a 1 in a googol chance of getting to intelligent life, but intelligent life 'merely' has a 1 in 10 chance of successfully navigating an intelligence explosion), but this would seem to be substantial evidence driving down the proportion of x-risk we should attribute to AI.
What do you guys think?
His forthcoming book may be of interest to LWers: A Crisis of Faith: Atheism, Emerging Technologies, and the Future of Humanity. Mostly it's a beginner's book about atheism, but chapter 20 discusses cognitive enhancement and mind uploading, and chapter 21 discusses existential risks as one of the most important things for humans to address once they've stopped fooling around with religion. There's also an appendix on the simulation argument.
"All that is necessary for evil to triumph is that good men do nothing."
155,000 people are dying, on average, every day. For those of us who are preference utilitarians, and also believe that a Friendly singularity is possible, and capable of ending this state of affairs, it also puts a great deal of pressure on us. It doesn't give us leave to be sloppy (because human extinction, even multiplied by a low probability, is a massive negative utility). But, if we see a way to achieve similar results in a shorter time frame, the cost to human life of not taking it is simply unacceptable.
I have some concerns about CEV on a conceptual level, but I'm leaving those aside for the time being. My concern is that most of the organizations concerned with a first-mover X-risk are not in a position to be that first mover -- and, furthermore, they're not moving in that direction. That includes the Singularity Institute. Trying to operationalize CEV seems like a good way to get an awful lot of smart people bashing their heads against a wall while clever idiots trundle ahead with their own experiments. I'm not saying that we should be hasty, but I am suggesting that we need to be careful of getting stuck in dark intellectual forests with lots of things that are fun to talk about until an idiot with the tinderbox burns it down.
My point, in short, is that we need to be looking for better ways to do things, and to do them extremely quickly. We are working on a very, very, existentially tight schedule.
So, if we're looking for quicker paths to a Friendly, first-mover singularity, I'd like to talk about one that seems attractive to me. Maybe it's a useful idea. If not, then at least I won't waste any more time thinking about it. Either way, I'm going to lay it out and you guys can see what you think.
So, Friendliness is a hard problem. Exactly how hard, we don't know, but a lot of smart people have radically different ideas of how to attack it, and they've all put a lot of thought into it, and that's not a good sign. However, designing a strongly superhuman AI is also a hard problem. Probably much harder than a human can solve. The good news is, we don't expect that we'll have to. If we can build something just a little bit smarter than we are, we expect that bootstrapping process to take off without obvious limit.
So let's apply the same methodology to Friendliness. General goal optimizers are tools, after all. Probably the most powerful tools that have ever existed, for that matter. Let's say we build something that's not Friendly. Not something we want running the universe -- but, Friendly enough. Friendly enough that it's not going to kill us all. Friendly enough not to succumb to the pedantic genie problem. Friendly enough we can use it to build what we really want, be it CEV or something else.
I'm going to sketch out an architecture of what such a system might look like. Do bear in mind this is just a sketch, and in no way a formal, safe, foolproof design spec.
So, let's say we have an agent with the ability to convert unstructured data into symbolic relationships that represent the world, with explicitly demarcated levels of abstraction. Let's say the system has the ability to build Bayesian causal relationships out of its data points over time, and construct efficient, predictive models of the behavior of the concepts in the world. Let's also say that the system has the ability to take a symbolic representation of a desired future distribution of universes, a symbolic representation of the current universe, and map between them, finding valid chains of causality leading from now to then, probably using a solid decision theory background. These are all hard problems to solve, but they're the same problems everyone else is solving too.
This system, if you just specify parameters about the future and turn it loose, is not even a little bit Friendly. But let's say you do this: first, provide it with a tremendous amount of data, up to and including the entire available internet, if necessary. Everything it needs to build extremely effective models of human beings, with strongly generalized predictive power. Then you incorporate one or more of those models (say, a group of trusted people) as functional components: the system uses them to generalize natural language instructions first into a symbolic graph, and then into something actionable, working out the details of what was meant, rather than what was said. Then, when the system is finding valid paths of causality, it takes its model of the state of the universe at the end of each course of action, feeds them into its human-models, and gives them a veto vote. Think of it as the emergency regret button, iterated computationally for each possibility considered by the genie. Any of them that any of the person-models find unacceptable are disregarded.
(small side note: as described here, the models would probably eventually be indistinguishable from uploaded minds, and would be created, simulated for a short time, and destroyed uncountable trillions of times -- you'd either need to drastically limit the simulation depth of the models, or ensure that everyone you signed up to be one of the models knew the sacrifice they were making)
So, what you've got, plus or minus some spit and polish, is a very powerful optimization engine that understands what you mean, and disregards obviously unacceptable possibilities. If you ask it for a truly Friendly AI, it will help you first figure out what you mean by that, then help you build it, then help you formally prove that it's safe. It would turn itself off if you asked it to, and meant it. It would also exterminate the human species if you asked it to and meant it. Not Friendly, but Friendly enough to build something better.
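The veto step can be sketched in a few lines, purely to pin down the control flow. All names here (`acceptable_plans`, `predict_outcome`, the toy plans and the two toy person-models) are hypothetical stand-ins; the real difficulty lives inside the world model and the person-models, which this sketch reduces to lookup tables and one-line predicates.

```python
from typing import Callable, Dict, Iterable, List

Outcome = Dict[str, float]               # toy stand-in for a predicted end state of the world
PersonModel = Callable[[Outcome], bool]  # returns True if this person-model accepts the outcome

def acceptable_plans(
    plans: Iterable[str],
    predict_outcome: Callable[[str], Outcome],
    person_models: List[PersonModel],
) -> List[str]:
    """Keep only the plans whose predicted outcome no person-model vetoes."""
    kept = []
    for plan in plans:
        outcome = predict_outcome(plan)
        # A single veto from any person-model is enough to discard the plan.
        if all(model(outcome) for model in person_models):
            kept.append(plan)
    return kept

# Toy world model: a lookup table from plan to predicted outcome (made up for illustration).
toy_outcomes: Dict[str, Outcome] = {
    "cure_disease": {"humans_alive": 1.0, "autonomy": 0.9},
    "paperclip_everything": {"humans_alive": 0.0, "autonomy": 0.0},
    "benevolent_lockdown": {"humans_alive": 1.0, "autonomy": 0.1},
}

def wants_everyone_alive(outcome: Outcome) -> bool:
    return outcome["humans_alive"] >= 1.0

def wants_autonomy(outcome: Outcome) -> bool:
    return outcome["autonomy"] >= 0.5

kept = acceptable_plans(
    toy_outcomes,                 # iterating a dict yields its keys, i.e. the plan names
    toy_outcomes.__getitem__,     # "prediction" is just a table lookup in this toy
    [wants_everyone_alive, wants_autonomy],
)
print(kept)
```

In this toy only `cure_disease` survives: one model vetoes the outcome with no humans left, the other vetoes the outcome with almost no autonomy. Note that the filter is only as good as the predicted outcomes and the person-models it is handed.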
With this approach, the position of the Friendly AI researcher changes. Instead of being in an arms race with the rest of the AI field with a massive handicap (having to solve two incredibly hard problems against opponents who only have to solve one), we only have to solve a relatively simpler problem (building a Friendly-enough AI), which we can then instruct to sabotage unFriendly AI projects and buy some time to develop the real deal. It turns it into a fair fight, one that we might actually win.
Anyone have any thoughts on this idea?
The panel surfaced a number of issues that contribute to our inability to date to make serious strides on global challenges, including income inequality, failure of governance and lack of leadership. It also explored some deeper issues around psyche and society – people’s inability to convert information to wisdom, the loss of sense of self, the challenges of hyperconnectivity, and questions about economic models and motivations that have long underpinned concepts of growth and wellbeing. The session was filmed, and we’ll make public that link once the file is available. In the meantime, here are some of the more memorable quotes (which may not be verbatim, but this is how I wrote them down):
“When people say something is impossible, that just means it’s hard.”
“Inequality is becoming an existential threat.”
“We’re at a crossroads. We can make progress against these big issues or we can kill ourselves.”
“We need inclusive globalization, to give everyone a stake in the future.”
“Fatalism is our most deadly adversary.”
“What we’re lacking is not IQ, but wisdom.”
“We need to tap into the timeless to solve the urgent.”
What we mean by global threats
Global threats have the potential to kill or debilitate very large numbers of people or cause significant economic or social dislocation or paralysis throughout the world. Global threats cannot be solved by any one country; they require some sort of a collective response. Global threats are often non-linear, and are likely to become exponentially more difficult to manage if we don’t begin making serious strides in the right direction in the next 5-10 years.
More on existential risks: wiki.lesswrong.com/wiki/Existential_risk
A list of organisations and charities concerned with existential risk research.
- Singularity Institute
- The Future of Humanity Institute
- The Oxford Martin Programme on the Impacts of Future Technology
- Global Catastrophic Risk Institute
- Saving Humanity from Homo Sapiens
- Skoll Global Threats Fund (To Safeguard Humanity from Global Threats)
- Foresight Institute
- Defusing the Nuclear Threat
- Leverage Research
- The Lifeboat Foundation
New: the Global Catastrophic Risk Institute (Seth Baum & Tony Barrett).
I've also heard that the following people are working to set up x-risk departments/organizations:
Huw Price at Cambridge
Newton Howard at MIT
I'm woefully underinformed on this topic, but this doesn't seem good at all:
ROTTERDAM, THE NETHERLANDS—Locked up in the bowels of the medical faculty building here and accessible to only a handful of scientists lies a man-made flu virus that could change world history if it were ever set free.
The virus is an H5N1 avian influenza strain that has been genetically altered and is now easily transmissible between ferrets, the animals that most closely mimic the human response to flu. Scientists believe it's likely that the pathogen, if it emerged in nature or were released, would trigger an influenza pandemic, quite possibly with many millions of deaths.
In a 17th floor office in the same building, virologist Ron Fouchier of Erasmus Medical Center calmly explains why his team created what he says is "probably one of the most dangerous viruses you can make"—and why he wants to publish a paper describing how they did it. Fouchier is also bracing for a media storm. After he talked to ScienceInsider yesterday, he had an appointment with an institutional press officer to chart a communication strategy.
Fouchier's paper is one of two studies that have triggered an intense debate about the limits of scientific freedom and that could portend changes in the way U.S. researchers handle so-called dual-use research: studies that have a potential public health benefit but could also be useful for nefarious purposes like biowarfare or bioterrorism.
The other study—also on H5N1, and with comparable results—was done by a team led by virologist Yoshihiro Kawaoka at the University of Wisconsin, Madison, and the University of Tokyo, several scientists told ScienceInsider. (Kawaoka did not respond to interview requests.) Both studies have been submitted for publication, and both are currently under review by the U.S. National Science Advisory Board for Biosecurity (NSABB), which on a few previous occasions has been asked by scientists or journals to review papers that caused worries.
NSABB chair Paul Keim, a microbial geneticist, says he cannot discuss specific studies but confirms that the board has "worked very hard and very intensely for several weeks on studies about H5N1 transmissibility in mammals." The group plans to issue a public statement soon, says Keim, and is likely to issue additional recommendations about this type of research. "We'll have a lot to say," he says.
I feel as though I ought to provide more commentary instead of just an article dump, but I feel more strongly than that that what I have to say would be obvious or stupid or both, so.
The Oxford Martin Programme on the Impacts of Future Technology (aka FutureTech) is a new research department at Oxford University, roughly a spin-off of FHI, but focusing on AI and nanotech risks and differential technological development. Like FHI, this department is directed by Nick Bostrom. They'll be hiring more researchers soon. Basically, this means more people and money being devoted to existential risk reduction.
Okay, now back to work.