Less Wrong is a community blog devoted to refining the art of human rationality.

"Risk" means surprise

6 PhilGoetz 22 May 2015 04:47AM

I lost about $20,000 in 2013 because I didn't notice that a company managing some of my retirement funds had helpfully reallocated them from 100% stocks into bonds and real estate, to "avoid risk". My parents are retired, and everyone advising them tells them to put most of their money in "safe" investments like bonds.

continue reading »

Report -- Allocating risk mitigation across time

11 owencb 20 February 2015 04:37PM

I've just released a Future of Humanity Institute technical report, written as part of the Global Priorities Project.


This article is about priority-setting for work aiming to reduce existential risk. Its chief claim is that all else being equal we should prefer work earlier and prefer to work on risks that might come early. This is because we are uncertain about when we will have to face different risks, because we expect diminishing returns of extra work, and because we expect that more people will work on these risks in the future.

I explore this claim both qualitatively and with explicit models. I consider its implications for two questions: first, “When is it best to do different kinds of work?”; second, “Which risks should we focus on?”.

As a major application, I look at the case of risk from artificial intelligence. The best strategies for reducing this risk depend on when the risk is coming. I argue that we may be underinvesting in scenarios where AI comes soon even though these scenarios are relatively unlikely, because we will not have time later to address them.
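The timing argument can be made concrete with a toy model (my own illustration with made-up numbers and a made-up functional form, not the report's actual model): suppose the chance of averting a risk has diminishing returns in total work, and that some amount of future work will accumulate regardless of ours.

```python
import math

def marginal_value(p_risk, our_work, future_work):
    """Expected risk reduction from our work, with diminishing returns in total work."""
    def p_averted(w):
        # chance the risk is averted given total work w (toy functional form)
        return 1 - math.exp(-w)
    return p_risk * (p_averted(our_work + future_work) - p_averted(future_work))

# "AI comes soon": unlikely (p = 0.1), but little other work will accumulate
soon = marginal_value(0.1, our_work=1.0, future_work=0.5)
# "AI comes late": likely (p = 0.9), but much more work will accumulate anyway
late = marginal_value(0.9, our_work=1.0, future_work=5.0)

print(soon > late)  # scarce early work can beat crowded late work
```

Under these assumptions the same unit of work does more good aimed at the unlikely-but-soon scenario, because the likely-but-late scenario will already be saturated with effort by the time it matters.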


You can read the full report here: Allocating risk mitigation across time.

Existential Risk and Existential Hope: Definitions

7 owencb 10 January 2015 07:09PM

I'm pleased to announce Existential Risk and Existential Hope: Definitions, a short new FHI technical report.

We look at the strengths and weaknesses of two existing definitions of existential risk, and suggest a new definition based on expected value. This leads to a parallel concept: ‘existential hope’, the chance of something extremely good happening.

I think MIRI and CSER may be naturally understood as organisations trying to reduce existential risk and increase existential hope respectively (although if MIRI is aiming to build a safe AI this is also seeking to increase existential hope). What other world states could we aim for that increase existential hope?
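As a rough sketch of the expected-value framing (the outcome sets and all numbers below are invented for illustration, not taken from the report): an existential risk lowers the expected value of the future by shifting probability mass toward near-zero outcomes, while existential hope raises it by adding mass to extremely good ones.

```python
# Outcome distributions over the long-run future: (probability, value) pairs.
# All numbers are invented purely for illustration.
baseline   = [(0.90, 100), (0.10, 0)]
after_risk = [(0.50, 100), (0.50, 0)]                  # more mass on ~zero value
after_hope = [(0.90, 100), (0.05, 0), (0.05, 10_000)]  # small chance of huge upside

def ev(dist):
    """Expected value of the future under an outcome distribution."""
    return sum(p * v for p, v in dist)

print(ev(baseline), ev(after_risk), ev(after_hope))
# the "risk" is measured by the drop 90 -> 50; the "hope" by the rise 90 -> 590
```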

New organization - Future of Life Institute (FLI)

44 Vika 14 June 2014 11:00PM

As of May 2014, there is an existential risk research and outreach organization based in the Boston area. The Future of Life Institute (FLI), spearheaded by Max Tegmark, was co-founded by Jaan Tallinn, Meia Chita-Tegmark, Anthony Aguirre and myself.

Our idea was to create a hub on the US East Coast to bring together people who care about x-risk and the future of life. FLI is currently run entirely by volunteers, and is based on brainstorming meetings where the members come together and discuss active and potential projects. The attendees are a mix of local scientists, researchers and rationalists, which results in a diversity of skills and ideas. We also hold more narrowly focused meetings where smaller groups work on specific projects. We have projects in the pipeline ranging from improving Wikipedia resources related to x-risk, to bringing together AI researchers in order to develop safety guidelines and make the topic of AI safety more mainstream.

Max has assembled an impressive advisory board that includes Stuart Russell, George Church and Stephen Hawking. The advisory board is not just for prestige - the local members attend our meetings, and some others participate in our projects remotely. We consider ourselves a sister organization to FHI, CSER and MIRI, and touch base with them often.

We recently held our launch event, a panel discussion "The Future of Technology: Benefits and Risks" at MIT. The panelists were synthetic biologist George Church, geneticist Ting Wu, economist Andrew McAfee, physicist and Nobel laureate Frank Wilczek and Skype co-founder Jaan Tallinn. The discussion covered a broad range of topics from the future of bioengineering and personal genetics, to autonomous weapons, AI ethics and the Singularity. A video and transcript are available.

FLI is a grassroots organization that thrives on contributions from awesome people like the LW community - here are some ways you can help:

  • If you have ideas for research or outreach we could be doing, or improvements to what we're already doing, please let us know (in the comments to this post, or by contacting me directly).
  • If you are in the vicinity of the Boston area and are interested in getting involved, you are especially encouraged to get in touch with us!
  • Support in the form of donations is much appreciated. (We are grateful for seed funding provided by Jaan Tallinn and Matt Wage.)
More details on the ideas behind FLI can be found in this article.

AI risk, new executive summary

12 Stuart_Armstrong 18 April 2014 10:45AM

AI risk

Bullet points

  • By all indications, an Artificial Intelligence could someday exceed human intelligence.
  • Such an AI would likely become extremely intelligent, and thus extremely powerful.
  • Most AI motivations and goals become dangerous when the AI becomes powerful.
  • It is very challenging to program an AI with fully safe goals, and an intelligent AI would likely not interpret ambiguous goals in a safe way.
  • A dangerous AI would be motivated to seem safe in any controlled training setting.
  • Not enough effort is currently being put into designing safe AIs.


Executive summary

The risks from artificial intelligence (AI) in no way resemble the popular image of the Terminator. That fictional mechanical monster is distinguished by many features – strength, armour, implacability, indestructability – but extreme intelligence isn’t one of them. And it is precisely extreme intelligence that would give an AI its power, and hence make it dangerous.

The human brain is not much bigger than that of a chimpanzee. And yet those extra neurons account for the difference of outcomes between the two species: between a population of a few hundred thousand and basic wooden tools, versus a population of several billion and heavy industry. The human brain has allowed us to spread across the surface of the world, land on the moon, develop nuclear weapons, and coordinate to form effective groups with millions of members. It has granted us such power over the natural world that the survival of many other species is no longer determined by their own efforts, but by preservation decisions made by humans.

In the last sixty years, human intelligence has been further augmented by automation: by computers and programmes of steadily increasing ability. These have taken over tasks formerly performed by the human brain, from multiplication through weather modelling to driving cars. The powers and abilities of our species have increased steadily as computers have extended our intelligence in this way. There are great uncertainties over the timeline, but future AIs could reach human intelligence and beyond. If so, should we expect their power to follow the same trend? When the AI’s intelligence is as beyond us as we are beyond chimpanzees, would it dominate us as thoroughly as we dominate the great apes?

There are more direct reasons to suspect that a true AI would be both smart and powerful. When computers gain the ability to perform tasks at the human level, they tend to very quickly become much better than us. No-one today would think it sensible to pit the best human mind against a cheap pocket calculator in a contest of long division. Human versus computer chess matches ceased to be interesting a decade ago. Computers bring relentless focus, patience, processing speed, and memory: once their software becomes advanced enough to compete equally with humans, these features often ensure that they swiftly become much better than any human, with increasing computer power further widening the gap.

The AI could also make use of its unique, non-human architecture. If it existed as pure software, it could copy itself many times, training each copy at accelerated computer speed, and network those copies together (creating a kind of “super-committee” of the AI equivalents of, say, Edison, Bill Clinton, Plato, Einstein, Caesar, Spielberg, Ford, Steve Jobs, Buddha, Napoleon and other humans superlative in their respective skill-sets). It could continue copying itself without limit, creating millions or billions of copies, if it needed large numbers of brains to brute-force a solution to any particular problem.

Our society is set up to magnify the potential of such an entity, providing many routes to great power. If it could predict the stock market efficiently, it could accumulate vast wealth. If it was efficient at advice and social manipulation, it could create a personal assistant for every human being, manipulating the planet one human at a time. It could also replace almost every worker in the service sector. If it was efficient at running economies, it could offer its services doing so, gradually making us completely dependent on it. If it was skilled at hacking, it could take over most of the world’s computers and copy itself into them, using them to continue further hacking and computer takeover (and, incidentally, making itself almost impossible to destroy). The paths from AI intelligence to great AI power are many and varied, and it isn’t hard to imagine new ones.

Of course, simply because an AI could be extremely powerful does not mean that it need be dangerous: its goals need not be negative. But most goals become dangerous when an AI becomes powerful. Consider a spam filter that became intelligent. Its task is to cut down on the number of spam messages that people receive. With great power, one solution to this requirement is to arrange to have all spammers killed. Or to shut down the internet. Or to have everyone killed. Or imagine an AI dedicated to increasing human happiness, as measured by the results of surveys, or by some biochemical marker in their brain. The most efficient way of doing this is to publicly execute anyone who marks themselves as unhappy on their survey, or to forcibly inject everyone with that biochemical marker.

This is a general feature of AI motivations: goals that seem safe for a weak or controlled AI can lead to extremely pathological behaviour if the AI becomes powerful. As the AI gains in power, it becomes more and more important that its goals be fully compatible with human flourishing, or the AI could enact a pathological solution rather than one that we intended. Humans don’t expect this kind of behaviour, because our goals include a lot of implicit information, and we take “filter out the spam” to include “and don’t kill everyone in the world”, without having to articulate it. But the AI might be an extremely alien mind: we cannot anthropomorphise it, or expect it to interpret things the way we would. We have to articulate all the implicit limitations. Which may mean coming up with a solution to, say, human value and flourishing – a task philosophers have been failing at for millennia – and casting it unambiguously and without error into computer code.

Note that the AI may have a perfect understanding that when we programmed in “filter out the spam”, we implicitly meant “don’t kill everyone in the world”. But the AI has no motivation to go along with the spirit of the law: its goals are the letter only, the bit we actually programmed into it. Another worrying feature is that the AI would be motivated to hide its pathological tendencies as long as it is weak, and assure us that all was well, through anything it says or does. This is because it will never be able to achieve its goals if it is turned off, so it must lie and play nice to get anywhere. Only when we can no longer control it, would it be willing to act openly on its true goals – we can but hope these turn out safe.

It is not certain that AIs could become so powerful, nor is it certain that a powerful AI would become dangerous. Nevertheless, the probabilities of both are high enough that the risk cannot be dismissed. The main focus of AI research today is creating an AI; much more work needs to be done on creating it safely. Some are already working on this problem (such as the Future of Humanity Institute and the Machine Intelligence Research Institute), but a lot remains to be done, both at the design and at the policy level.

Siren worlds and the perils of over-optimised search

27 Stuart_Armstrong 07 April 2014 11:00AM

tl;dr An unconstrained search through possible future worlds is a dangerous way of choosing positive outcomes. Constrained, imperfect or under-optimised searches work better.

Some suggested methods for designing AI goals, or controlling AIs, involve unconstrained searches through possible future worlds. This post argues that this is a very dangerous thing to do, because of the risk of being tricked by "siren worlds" or "marketing worlds". The thought experiment starts with an AI designing a siren world to fool us, but that AI is not crucial to the argument: it's simply an intuition pump to show that siren worlds can exist. Once they exist, there is a non-zero chance of us being seduced by them during an unconstrained search, whatever the search criteria are. This is a feature of optimisation: satisficing and similar approaches don't have the same problems.


The AI builds the siren worlds

Imagine that you have a superintelligent AI that's not just badly programmed, or lethally indifferent, but actually evil. Of course, it has successfully concealed this fact, as "don't let humans think I'm evil" is a convergent instrumental goal for all AIs.

We've successfully constrained this evil AI in an Oracle-like fashion. We ask the AI to design future worlds and present them to human inspection, along with an implementation pathway to create those worlds. Then if we approve of those future worlds, the implementation pathway will cause them to exist (assume perfect deterministic implementation for the moment). The constraints we've programmed mean that the AI will do all these steps honestly. Its opportunity to do evil is limited exclusively to its choice of worlds to present to us.

The AI will attempt to design a siren world: a world that seems irresistibly attractive while concealing hideous negative features. If the human mind is hackable in the crude sense - maybe through a series of coloured flashes - then the AI would design the siren world to be subtly full of these hacks. It might be that there is some standard of "irresistibly attractive" that is actually irresistibly attractive: the siren world would be full of genuine sirens.

Even without those types of approaches, there's so much manipulation the AI could indulge in. I could imagine myself (and many people on Less Wrong) falling for the following approach:

continue reading »

AI risk, executive summary

10 Stuart_Armstrong 07 April 2014 10:33AM

MIRI recently published "Smarter than Us", a 50-page booklet laying out the case for considering AI as an existential risk. But many people have asked for a shorter summary, to be handed out to journalists for example. So I put together the following 2-page text, and would like your opinion on it.

In this post, I'm not so much looking for comments along the lines of "your arguments are wrong", but more "this is an incorrect summary of MIRI/FHI's position" or "your rhetoric is ineffective here".

AI risk

Bullet points

  • The risks of artificial intelligence are strongly tied with the AI’s intelligence.
  • There are reasons to suspect a true AI could become extremely smart and powerful.
  • Most AI motivations and goals become dangerous when the AI becomes powerful.
  • It is very challenging to program an AI with safe motivations.
  • Mere intelligence is not a guarantee of safe interpretation of its goals.
  • A dangerous AI will be motivated to seem safe in any controlled training setting.
  • Not enough effort is currently being put into designing safe AIs.

Executive summary

The risks from artificial intelligence (AI) in no way resemble the popular image of the Terminator. That fictional mechanical monster is distinguished by many features – strength, armour, implacability, indestructability – but extreme intelligence isn’t one of them. And it is precisely extreme intelligence that would give an AI its power, and hence make it dangerous.

continue reading »

Reduced impact AI: no back channels

13 Stuart_Armstrong 11 November 2013 02:55PM

A putative new idea for AI control; index here.

This post presents a further development of the reduced impact AI approach, bringing in some novel ideas and setups that allow us to accomplish more. It still isn't a complete approach - further development is needed, which I will do when I return to the concept - but may already allow certain types of otherwise dangerous AIs to be made safe. And this time, without needing to encase them in clouds of chaotic anti-matter!

Specifically, consider the following scenario. A comet is heading towards Earth, and it is generally agreed that a collision is suboptimal for everyone involved. Human governments have come together in peace and harmony to build a giant laser on the moon - this could be used to vaporise the approaching comet, except there isn't enough data to aim it precisely. A superintelligent AI programmed with a naive "save all humans" utility function is asked to furnish the coordinates to aim the laser. The AI is mobile and not contained in any serious way. Yet the AI furnishes the coordinates - and nothing else - and then turns itself off completely, not optimising anything else.

The rest of this post details an approach that might make that scenario possible. It is slightly complex: I haven't found a way of making it simpler. Most of the complication comes from attempts to precisely define the needed counterfactuals. We're trying to bring rigour to inherently un-sharp ideas, so some complexity is, alas, needed. I will try to lay out the ideas with as much clarity as possible - first the ideas to constrain the AI, then ideas as to how to get some useful work out of it anyway. Classical mechanics (general relativity) will be assumed throughout. As in a previous post, the approach will be illustrated by a drawing of unsurpassable elegance; the rest of the post will aim to clarify everything in the picture:

continue reading »

Existential Risk II

10 fowlertm 20 October 2013 12:25AM


-This is not a duplicate of the original less wrong x-risk primer.  I like lukeprog's article just fine, but it works mostly as a punch in the gut for anyone who needs a wake-up call.  Very little of the actual research on x-risk is discussed in that article, so the gap that was there before it was published was largely there after.  My article and his would work well being read together.

-This was originally written to accompany a presentation I gave, hence the random inclusion of both hyperlinks and citations.  It also lives, with minor differences, here.

-Summary: For various reasons the future is scarier than a lot of people realize.  All sorts of things could lead to the destruction of the human species, ranging from asteroid impacts to runaway AIs, and these things are united by the fact that any one of them could destroy the value of the future from a human perspective.  The dangers can be separated into bangs (very sudden extinction), crunches (not fatal but crippling), shrieks (mostly curse with a little blessing), and whimpers (a long, slow fading), though there is nothing sacred about these categories.  Some humans are trying to prevent this, though their methods are still in their infancy.  Much more should be done to support them.

In the beginning

I want to start this off with a quote, which nicely captures both how I used to feel about the idea of human extinction and how I feel about it now:

I think many atheists still trust in God. They say there is no God, but …[a]sk them how they think the future will go, especially with regards to Moral Progress, Human Evolution, Technological Progress, etc. There are a few different answers you will get: Some people just don’t know or don’t care. Some people will tell you stories of glorious progress… The ones who tell stories are the ones who haven’t quite internalized that there is no god. The people who don’t care aren’t paying attention. The correct answer is not nervous excitement, or world-weary cynicism, it is fear. -Nyan Sandwich

Back when I was a Christian I gave some thought to the rapture, which is not entirely unlike extinction as far as most ten-year-olds can tell.  Sometime during this period I found a slim little book of fiction which portrayed a damned soul's experience of burning in hell forever, and that did scare me.  Such torment, as luck would have it, is easy enough to avoid if you just call god the right name and ask forgiveness often enough.

When I was old enough to contemplate possible secular origins of the apocalypse, I was both an atheist and one of the people who tell glorious stories about the future.  The potential fruits of technological development, from the end of aging to the creation of a benevolent super-human AI, excited me, and still excite me now.  No doubt I would've admitted the possibility of human extinction; I don't really remember.  But there wasn't the kind of internal siren that should go off when you start thinking seriously about one of the Worst Possible Outcomes.  That I would remember.

But as I've gotten older I've come to appreciate that most of us are not afraid enough of the future. Those who are afraid are often afraid for the wrong reasons.

What is an Existential Risk?

An existential risk or x-risk (to use a common abbreviation) is "...one that threatens to annihilate Earth-originating intelligent life or permanently and drastically to curtail its potential" (Bostrom 2006). The definition contains some subtlety, as not all x-risks involve the outright death of every human. Some could take potentially eons to complete, and some are even survivable. Positioning x-risks within the broader landscape of risks yields something like this chart:    

At the top right extreme is where Cthulhu sleeps.  These are risks that carry the potential to drastically and negatively affect this and every subsequent human generation. So as not to keep everyone in suspense, let's use this chart to put a face on the shadows.

Four Types of Existential Risks

Philosopher Nick Bostrom has outlined four broad categories of x-risk.  In more recent papers he hasn't used the terminology that I'm using here, so maybe he thinks the names are obsolete.  I find them evocative and useful, however, so I'll stick with them until I have a reason to change.

Bangs are probably the easiest risks to conceptualize.  Any event which causes the sudden and complete extinction of humanity would count as a Bang.  Think asteroid impacts, supervolcanic eruptions, or intentionally misused nanoweapons.

Crunches are risks which humans survive but which leave us permanently unable to navigate to a more valuable future.  An example might be depleting our planetary resources before we manage to build the infrastructure needed to mine asteroids or colonize other planets.  After all the die-offs and fighting, some remnant of humanity could probably survive indefinitely, but it wouldn't be a world you'd want to wake up in.

Shrieks occur when a post-human civilization develops but only manages to realize a small amount of its potential.  Shrieks are very difficult to effectively categorize, and I'm going to leave examples until the discussion below.

Whimpers are really long-term existential risks.  The most straightforward is the heat death of the universe; within our current understanding of physics, no matter how advanced we get we will eventually be unable to escape the ravages of entropy. Another could be if we encounter a hostile alien civilization that decides to conquer us after we've already colonized the galaxy. Such a process could take a long time, and thus would count as a whimper.

Just because whimpers are so much less immediate than other categories of risk and x-risk doesn't automatically mean we can just ignore them; it has been argued that affecting the far future is one of the most important projects facing humanity, and thus we should take the time to do it right.

Sharp readers will no doubt have noticed that there is quite a bit of fuzziness to these classifications.  Where, for example, should we put all-out nuclear war, the establishment of an oppressive global dictatorship, or the development of a dangerous and uncontrollable superintelligent AI? If everyone dies in the war it counts as a bang, but if it makes a nightmare of the biosphere while leaving a good fraction of humanity intact it would be a crunch.  A global dictatorship wouldn't be an x-risk unless it used some (probably technological) means to achieve near-total control and long-term stability, in which case it would be a crunch.  But it isn't hard to imagine such a situation in which some parts of life did get better, like if a violently oppressive government continued to develop advanced medicines so that citizens were universally healthier and longer-lived than people today.  If that happened, it would be a Shriek.  A similar analysis applies to the AI, with the possible outcomes being Bang, Crunch, and Shriek depending on just how badly we misprogrammed it.

What Ties These Threads Together?

Even if you think existential threats deserve more attention, the rationale for treating them as a diverse but unified phenomenon may not be obvious.  In addition to the crucial but (relatively) straightforward work of, say, tracking Near-Earth Objects (NEOs), existential risk researchers also think seriously about alien invasions and rogue AIs. With such a range of speculativeness, why group x-risks together at all?

It turns out that they share a cluster of features which gives them some cohesion and makes them worth studying under a single label, not all of which I discuss here.  First and most obvious is that should any of them occur, the consequences would be truly vast relative to any other kind of risk.  To see why, think about the difference between a catastrophe that kills 99% of humanity and one that kills 100%.  As big a tragedy as the former would be, there's a chance humans could recover and build a post-human civilization.  But if every person dies, then the entire value of our future is lost (Bostrom 2013).
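A back-of-envelope calculation (with entirely hypothetical numbers, mine rather than Bostrom's) makes the asymmetry vivid: the gap between 99% and 100% fatalities is not a 1% increment of loss, but potentially the whole value of the future.

```python
# Hypothetical numbers throughout: V is the value of a recovered long-run
# future in arbitrary units, p_recover the chance of rebuilding civilization
# after a 99% die-off.
V = 1e15
p_recover = 0.5

expected_value_after_99_percent = p_recover * V  # the future may still be preserved
expected_value_after_extinction = 0.0            # the future is foreclosed entirely

loss_99  = V - expected_value_after_99_percent
loss_100 = V - expected_value_after_extinction
print(loss_100 / loss_99)  # extinction is twice as bad here, not 1% worse
```

Under any assignment where recovery has non-trivial probability, the expected loss from extinction dwarfs the extra 1% of deaths that produced it.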

Second, these are not risks which admit of a trial and error approach.  Pretty much by definition a collision with an x-risk will spell doom for humanity, and so we must be more proactive in our strategies for reducing them. Related to this, we as a species have neither the cultural nor biological instincts needed to prepare us for the possibility of extinction.  A group of people might live through several droughts and thus develop strong collective norms towards planning ahead and keeping generous food reserves.  But they cannot have gone extinct multiple times, and thus they can't rely on their shared experience and cultural memory to guide them in the future.  I certainly hope we can develop a set of norms and institutions which makes us all safer, but we can't wait to learn from history.  We're going to have to start well in advance, or we won't survive.

A final commonality I'll mention is that the solutions to quite a number of x-risks are themselves x-risks.  A powerful enough government could effectively halt research into dangerous pathogens or nano-replicators.  But given how States have generally comported themselves in the past, one would do well to be cautious before investing them with that kind of power.  Ditto for a superhuman AI, which could set up an infrastructure to protect us from asteroids, nuclear war, or even other less Friendly AI. Get the coding just a little wrong, though, and it might reuse your carbon to make paperclips.

It is indeed a knife edge along which we creep towards the future.

Measuring the Monsters

A first step is getting straight about how likely survival is.  The reader may have encountered predictions of the "we have only a 50% chance of surviving the next hundred years" variety.  Examining the validity of such estimates is worth doing, but I won't be taking up that challenge here; I tend to agree that these figures involve a lot of subjective judgement, but that even if the chances were very, very small it would still be worth taking seriously (Bostrom 2006).   At any rate, it seems to me that trying to calculate an overall likelihood of human extinction is going to be premature before we've nailed down probabilities for some of the different possible extinction scenarios.  It is to the techniques which x-risk researchers rely on to try and do this that I now turn.

X-risk assessments rely on both direct and indirect methods (Bostrom 2002).  Using a direct method involves building a detailed causal model of the phenomenon and using that to generate a risk probability, while indirect methods include arguments, thought experiments, and information that we use to constrain and refine our guesses.

As far as I know, for some x-risks we could use direct methods if we just had a way to gather the relevant information.  If we knew where all the NEOs were we could use settled physics to predict whether any of them posed a threat and then prioritize accordingly. But we don't know where they all are, so we might instead examine the frequency of impacts throughout the history of the Earth and then reason about whether or not we think an impact will happen soon.   It would be nice to exclusively use direct methods, but we supplement with indirect methods when we can't, and of course for x-risks like AI we are in an even more uncertain position than we are for NEOs.

The Fermi Paradox

Applying indirect methods can lead to some strange and counter-intuitive territory, an example of which is the mystery surrounding the Fermi Paradox.  The central question is: in a universe with so many potential hotbeds of life, why is it that when we listen for stirring in the void all we hear is silence?  Many feel that the universe must be teeming with life, some of it intelligent, so why haven't we seen any sign of it yet?

Musing about possible solutions to the Fermi Paradox can be a lot of fun, and it's worth pointing out that we haven't been looking that long or that hard for signals yet. Nevertheless I think the argument has some meat to it.

Observing this state of affairs, some have postulated the existence of at least one Great Filter, a step in the chain of development from the first organisms to space-faring civilizations that must be extremely hard to achieve.   

This is cause for concern because the Great Filter could be in front of us or behind us.  Let me explain: imagine a continuum with the simplest self-replicating molecules on one side and the Star Trek Enterprise on the other.  From our position on the continuum we want to know whether or not we have already passed one of the hardest steps, but we have only our own planet to look at.  So imagine that we send out probes to thousands of different worlds in the hopes that we will learn something.

If we find lots of simple eukaryotes that means that the Great Filter is probably not before the development of membrane-bound organelles. The list of possible places on the continuum the Great Filter could be shrinks just a little bit.  If instead we find lots of mammals and reptiles (or creatures that are very different but about as advanced), that means the Great Filter is probably not before the rise of complex organisms, so the places the Great Filter might be hiding shrinks again.  Worst of all would be if we find the dead ruins of many different advanced civilizations.  This would imply that the real killer is yet to come, and we will almost certainly not survive it.
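This elimination argument can be sketched as a simple update (a stylized model of my own: five discrete developmental steps, a single filter uniform over them, and anthropic considerations ignored entirely):

```python
# Stylized stages from simple replicators to a spacefaring civilization.
STAGES = ["abiogenesis", "eukaryotes", "complex life", "civilization", "spacefaring"]

def p_filter_ahead(alien_steps_passed, our_steps_passed=4, n_steps=len(STAGES)):
    """P(the single hard step lies beyond where we are), starting from a
    uniform prior over steps and conditioning only on the alien observation."""
    # Steps the observed aliens completed cannot be the Great Filter.
    candidates = [s for s in range(1, n_steps + 1) if s > alien_steps_passed]
    ahead = [s for s in candidates if s > our_steps_passed]
    return len(ahead) / len(candidates)

print(p_filter_ahead(0))  # barren rocks everywhere: filter likely behind us (0.2)
print(p_filter_ahead(2))  # simple eukaryotes common: odds worsen (~0.33)
print(p_filter_ahead(4))  # dead advanced civilizations: the killer is ahead (1.0)
```

Each discovery of more advanced life eliminates the early candidate positions, so the remaining probability mass slides toward the steps we have yet to take.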

As happy as many people would be to discover evidence of life in the universe, a case has been made that we should hope to find only barren rocks waiting for us in the final frontier. If not even simple bacteria evolve on most worlds, then there is still a chance that the Great Filter is behind us, and we can worry only about the new challenges ahead, which may or may not be Filters as great as the ones in the past.

If all this seems really abstract and far out, that's because it is.  But I hope it is clear how this sort of thinking can help us interpret new data, make better guesses, form new hypotheses, and so on.  When dealing with stakes this high and information this limited, one must do the best one can with what's available.


What priority should we place on reducing existential risk, and how can we do that? I don't know of anyone who thinks all our effort should go towards mitigating x-risks; there are lots of pressing issues which are not x-risks that are worth our attention, like abject poverty or geopolitical instability.  But I feel comfortable saying we aren't doing nearly as much as we should be. Given the stakes and the fact that there probably won't be a second chance, we are going to have to meet x-risks head on and be aggressively proactive in mitigating them.

Suppose we taboo 'aggressively proactive': what's left?  Well, the first step, as it so often is, will be just to get the right people to be aware of the problem (Bostrom 2002).  Thankfully this is starting to happen as more funding and brain power go into existential risk reduction. We have to get to a point where we are spending at least as much time, energy, and effort making new technology safe as we do making it more powerful.  More international cooperation on these matters will be necessary, and there should be some sort of mechanism by which efforts to develop existentially-threatening technologies like super-virulent pathogens can be stopped.  I don't like recommending this at all, but almost anything is preferable to extinction.

In the meantime both research that directly reduces x-risk (like NEO detection), as well as research that will help elucidate deep and foundational issues in x-risk (FHI and MIRI) should be encouraged.  It's a stereotype that research papers always end with a call for more research, but as was pointed out by lukeprog in a talk he gave, there's more research done on lipstick than on friendly AI.  This generalizes to x-risk more broadly, and represents the truly worrying state of our priorities.  


Though I maintain we should be more fearful of what's to come, that should not obscure the fact that the human potential is vast and truly exciting.  If the right steps are taken, we and our descendants will have a future better than most can even dream of.  Life spans measured in eons could be spent learning and loving in ways our terrestrial languages don't even have words for yet.  The vision of a post-human civilization flinging its trillions of descendants into the universe to light up the dark is tremendously inspiring.  It's worth fighting for.

But we have much work ahead of us.

Singleton: the risks and benefits of one world governments

1 Stuart_Armstrong 05 July 2013 02:05PM

Many thanks to all those whose conversations have contributed to forming these ideas.

Will the singleton save us?

For most of the large existential risks that we deal with here, the situation would be improved with a single world government (a singleton), or at least greater global coordination. The risk of nuclear war would fade, and pandemics would be met with a comprehensive global strategy rather than a mess of national priorities. Workable regulations for the technology risks, such as synthetic biology or AI, become at least conceivable. All in all, a great improvement in safety...

...with one important exception. A stable tyrannical one-world government, empowered by future mass surveillance, is itself an existential risk (it might not destroy humanity, but it would “permanently and drastically curtail its potential”). So to decide whether to oppose or advocate for more global coordination, we need to see how likely such a despotic government would be.

This is the kind of research I would love to do if I had the time to develop the relevant domain skills. In the meantime, I’ll just take all my thoughts on the subject and form them into a “proto-research project plan”, in the hopes that someone could make use of them in a real research project. Please contact me if you would want to do research on this, and would fancy a chat.

Defining “acceptable”

Before we can talk about the likelihood of a good outcome, we need to define what a good outcome actually is. For this analysis, I will take the definition that:

  • A singleton regime is acceptable, if it is at least as good as any developed democratic government of today.
