Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.

[Link] Slashdot: Study Finds Little Lies Lead To Bigger Ones

0 Gunnar_Zarncke 26 October 2016 06:53AM

[Link] Scientists Create AI Program That Can Predict Human Rights Trials With 79 Percent Accuracy

0 Gunnar_Zarncke 26 October 2016 06:47AM

Trying to find a short story

0 mgin 25 October 2016 02:27AM

It's a story about a boy who is into science and transhumanism, and a girl he told about all these crazy things that were going to happen. He dies, and all of the things he talked about start to happen. She ends up floating around Saturn remembering him.

Either he or she was in a wheelchair. He was dying and he was disappointed he was dying because of all the cool stuff that was going to happen that she was going to be around for, and some of it had to do with whatever problem she had that was going to get fixed.

Please help me find this story if you can.

[Link] How Feasible Is the Rapid Development of Artificial Superintelligence?

7 Kaj_Sotala 24 October 2016 08:43AM

Open thread, Oct. 24 - Oct. 30, 2016

1 MrMind 24 October 2016 06:54AM

If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top level comments on this article" and "

Internal Race Conditions

1 SquirrelInHell 23 October 2016 01:23PM

Time start: 14:40:36


You might be familiar with the concept of a 'bug', as introduced by CFAR. By using the computer programming analogy, it frames any problem you might have in your life as something fixable... even more - as something to be fixed, something such that fixing it or thinking about how to fix it is the first thing that comes to mind when you see such a problem, or 'bug'.

Let's try another analogy in the same style, with something called 'race conditions' in programming. A race condition is a particular type of bug that is typically very hard to find and fix ('debug'). It occurs when two or more parts of the same program 'race' to access some data, resource, decision point etc., in a way that is not controlled by any organised principle.

For example, imagine that you have a document open in an editor program. You make some changes and give a command to save the file. While this operation is in progress, you drag and drop the same file in a file manager, moving it to another hard drive. In this case, depending on timing, on the details of the programs, and on the operating system that you are using, you might get different results. The old version of the file might be moved to the new location, while the new one is saved in the old location. Or the file might get saved first, and then moved. Or the saving operation might end in an error, or in a truncated or otherwise malformed file on the disk.

If you knew enough details about the situation, you could in fact work out exactly what would happen. But the margin of error in your own handling of the software is so big that you cannot do this in practice (e.g. you'd need to know the exact millisecond when you press buttons etc.). So in practice, the outcome is random, depending on how the events play out on a scale smaller than you can directly control (e.g. minute differences in timing, strength of reactions etc.).
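To make the save-versus-move example concrete, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions: the "file system" is a plain dictionary, the paths and the helper names `save`, `move` and `run` are made up for this example, and the two interleavings are played out explicitly instead of being left to real timing.

```python
def save(fs, path, content):
    """The editor writes the new content to the path it originally opened."""
    fs[path] = content


def move(fs, src, dst):
    """The file manager moves whatever currently sits at src over to dst."""
    if src in fs:
        fs[dst] = fs.pop(src)


def run(order):
    """Apply the two operations in the given order to a fresh toy file system."""
    fs = {"/disk1/doc.txt": "old"}
    ops = {
        "save": lambda: save(fs, "/disk1/doc.txt", "new"),
        "move": lambda: move(fs, "/disk1/doc.txt", "/disk2/doc.txt"),
    }
    for name in order:
        ops[name]()
    return fs


# Same two operations, two different orders, two different end states.
save_first = run(["save", "move"])   # new version ends up on the second disk
move_first = run(["move", "save"])   # old version moved, new version left behind

print(save_first)
print(move_first)
```

Nothing in the code decides which order happens. In a real system that choice is made by timing, which is exactly what makes the bug hard to reproduce.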


What is the analogy in humans? If you look hard, one place where you'll see this pattern a lot is the relationship between emotions and conscious decision making.

E.g., a classic failure mode is a "commitment to emotions", which goes like this:

  • I promise to love you forever
  • however if I commit to this, I will have doubts and less freedom, which will generate negative emotions
  • so I'll attempt to fall in love faster than my doubts grow
  • let's do this anyway, why won't we?

The problem here is a typical emotional "race condition": there is a lot of variability in the outcome, depending on how events play out. There could be a "butterfly effect", in which e.g. a single weekend trip together could determine the fate of the relationship, by creating a swing up or down, which would give one side of emotions a head start in the race.


Another typical example is making a decision about continuing a relationship:

  • when I spend time with you, I like you more
  • when I like you more, I want to continue our relationship
  • when we have a relationship, I spend more time with you

As you can see, there is a loop in the decision process. This cannot possibly end well.

A wild emotional rollercoaster is probably around the least bad outcome of this setup.


So how do you fix race conditions?

By creating structure.

By following principles which compute the result explicitly, without unwanted chaotic behaviour.

By removing loops from decision graphs.

First and foremost, by recognizing that leaving a decision to a race condition is strictly worse than any decision process that we consciously design, even if this process is flipping the coin (at least you know the odds!).
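On the programming side of the analogy, "creating structure" usually means putting the contested step behind an explicit rule. The sketch below is one common way to do that, using a lock from Python's standard `threading` module. The counter, the thread count and the function name `add_many` are illustrative assumptions, not something from the original discussion.

```python
import threading

counter = 0
lock = threading.Lock()


def add_many(n):
    """Increment the shared counter n times, one thread at a time."""
    global counter
    for _ in range(n):
        # The lock is the "organising principle": only one thread may
        # read-modify-write the counter at any given moment.
        with lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)
```

With the lock in place the final value no longer depends on how the threads happen to interleave. The design choice mirrors the point above: the rule itself can be simple, as long as the outcome is computed by the rule and not by the race.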

Example: deciding to continue the relationship.

Proposed solution (arrows represent influence):

(1) controlled, long-distance emotional evaluation -> (2) systemic decision -> (3) day-to-day emotions

The idea is to remove the loop by organising emotions into two groups: those that are directly influenced by the decision or its consequences (3), and more distant "evaluation" emotions (1). The opportunity to feel emotions as in (1) can be created by pre-deciding a time to be alone and judge the situation from more distance, e.g. "after 6 months of this relationship I will go for a 2 week vacation to my aunt in France, and think about it in a clear-headed way, making sure I consider emotions about the general picture, not day-to-day things like physical affection etc.".
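The difference between the looping setup and the proposed one can be checked mechanically by treating each as an influence graph. This is a rough sketch, assuming Python 3.9+ for the standard `graphlib` module. The node labels are my own shorthand for the bullet points and for steps (1) to (3), and `has_loop` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter, CycleError


def has_loop(influences):
    """influences maps each node to the set of nodes that influence it."""
    try:
        # A full ordering exists only if the graph has no cycle.
        tuple(TopologicalSorter(influences).static_order())
        return False
    except CycleError:
        return True


# The original setup: time -> liking -> wanting to continue -> time.
looping = {
    "liking": {"time together"},
    "wanting to continue": {"liking"},
    "time together": {"wanting to continue"},
}

# The proposed setup: (1) distant evaluation -> (2) decision -> (3) daily emotions.
structured = {
    "systemic decision": {"distant evaluation"},
    "day-to-day emotions": {"systemic decision"},
}

print(has_loop(looping))
print(has_loop(structured))
```

The second graph can be evaluated in a fixed order from (1) to (3), which is what "computing the result explicitly" amounts to here.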


There is much to write on this topic, so please excuse my brevity (esp. in the last part, giving some examples of systemic thinking about emotions) - there is easily enough content about this to fill a book (or two). But I hope I gave you some idea.

Time end: 15:15:42

Writing stats: 31 minutes, 23 wpm, 133 cpm

What's the most annoying part of your life/job?

11 Liron 23 October 2016 03:37AM

Hi, I'm an entrepreneur looking for a startup idea.

In my experience, the reason most startups fail is that they never actually solve anyone's problem. So I'm cheating and starting out by identifying a specific person with a specific problem.

So I'm asking you, what's the most annoying part of your life/job? Also, how much would you pay for a solution?

[Link] Conscious Exotica - structure of the space of possible minds

1 morganism 21 October 2016 11:45PM

New LW Meetup: Zurich

1 FrankAdamek 21 October 2016 10:47AM

[Link] The Leverhulme Centre for the Future of Intelligence officially launches.

1 ignoranceprior 21 October 2016 01:22AM

*How* people shut down thought because of high-status respectable halos

8 NancyLebovitz 20 October 2016 02:09PM


A detailed look at the belief that high status social structures can be so much better than anything one can think of that there's no point in even trying to think about the details of what to do, and how debilitating this is.

Discussion of the essay

A problem in anthropics with implications for the soundness of the simulation argument.

5 philosophytorres 19 October 2016 09:07PM

What are your intuitions about this? It has direct implications for whether the Simulation Argument is sound.


Imagine two rooms, A and B. Between times t1 and t2, 100 trillion people sojourn in room A while 100 billion sojourn in room B. At any given moment, though, exactly 1 person occupies room A while 1,000 people occupy room B. At t2, you find yourself in a room, but you don't know which one. If you have to place a bet on which room it is (at t2), what do you say? Do you consider the time-slice or the history of room occupants? How do you place your bet?


If you bet that you're in room B, then the Simulation Argument may be flawed: there could be a fourth disjunct that Bostrom misses, namely that we become a posthuman civilization that runs a huge number of simulations yet we don't have reason for believing that we're simulants.
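For anyone who wants the two candidate answers side by side, here is a small sketch of the arithmetic. It assumes the simplest possible rule in each case: your credence in a room is proportional to the number of people counted in it, either over the whole history or at a single moment. The names `history`, `time_slice` and `credence` are just labels for this example.

```python
from fractions import Fraction

# Counts taken from the thought experiment.
history = {"A": 100 * 10**12, "B": 100 * 10**9}   # everyone between t1 and t2
time_slice = {"A": 1, "B": 1000}                  # occupants at any one moment


def credence(counts):
    """Credence in each room, proportional to the number of people counted there."""
    total = sum(counts.values())
    return {room: Fraction(n, total) for room, n in counts.items()}


by_history = credence(history)
by_slice = credence(time_slice)

print("history:   ", by_history)
print("time-slice:", by_slice)
```

The two ways of counting give mirror-image answers: 1000/1001 for room A by history, and 1000/1001 for room B by time-slice. The sketch does not say which reference class is the right one, which is the actual question.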



[Link] Program good ethics into artificial intelligence

2 XFrequentist 19 October 2016 04:28PM

[Link] AI-ON is an open community dedicated to advancing Artificial Intelligence

3 morganism 18 October 2016 10:17PM

[Link] The Non-identity Problem - Another argument in favour of classical utilitarianism

2 casebash 18 October 2016 01:41PM

Open thread, Oct. 17 - Oct. 23, 2016

3 MrMind 17 October 2016 07:02AM

If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top level comments on this article" and "

Astrobiology IV: Photosynthesis and energy

8 CellBioGuy 17 October 2016 12:30AM

Originally I sat down to write about the large-scale history of Earth, and line up the big developments that our biosphere has undergone in the last 4 billion years.  But after writing about the reason that Earth is unique in our solar system (that is, photosynthesis being an option here), I guess I needed to explore photosynthesis and other forms of metabolism on Earth in a little more detail and before I knew it I’d written more than 3000 words about it.  So, here we are, taking a deep dive into photosynthesis and energy metabolism, and trying to determine if the origin of photosynthesis is a rare event or likely anywhere you get a biosphere with light falling on it.  Warning:  gets a little technical.


In short, I think it’s clear from the fact that there are multiple origins of it that phototrophy, using light for energy, is likely to show up anywhere there is light and life.  I suspect, but cannot rigorously prove, that even though photosynthesis of biomass only emerged once, it was an early development in life on Earth, emerging very near the root of the Bacterial tree, and simply produced a very strong first-mover advantage that crowded out secondary origins; it would probably also show up wherever there is life and light.  As for oxygen-producing photosynthesis, its origin from other, more mundane forms of photosynthesis is still being studied.  It required a strange chaining together of multiple modes of photosynthesis to make it work, and only ever happened once as well.  Its time of emergence, early or late, is pretty unconstrained and I don’t think there’s sufficient evidence to say one way or another if it is likely to happen anywhere there is photosynthesis.  It could be subject to the same ‘first mover advantage’ situation that other photosynthesis may have encountered as well.  But once it got going, it would naturally take over biomass production and crowd out other forms of photosynthesis due to the inherent chemical advantages it has on any wet planet (that have nothing to do with making oxygen) and its effects on other forms of photosynthesis.

Oxygen in the atmosphere had some important side effects. The one most people care about is that it allows big, complicated, energy-gobbling organisms like animals: all the energy that organisms can get by burning biomass in oxygen lets them do a lot of interesting stuff.  Looking for oxygen in the atmospheres of other terrestrial planets would be an extremely informative experiment, as the presence of this substance would suggest that a process very similar to the one that created our huge, diverse and active biosphere was underway.

[Link] There are 125 sheep and 5 dogs in a flock. How old is the shepherd? / Math Education

6 James_Miller 17 October 2016 12:12AM

[Link] H+Pedia opens projects and editorial portal

1 Deku-shrub 16 October 2016 09:41PM

[Link] Video to induce hallucinations, meme implanter?

2 morganism 16 October 2016 08:33PM

Agential Risks: A Topic that Almost No One is Talking About

6 philosophytorres 15 October 2016 06:41PM

(Happy to get feedback on this! It draws from and expounds ideas in this article: http://jetpress.org/v26.2/torres.htm)

Consider a seemingly simple question: if the means were available, who exactly would destroy the world? There is surprisingly little discussion of this question within the nascent field of existential risk studies. But it’s an absolutely crucial issue: what sort of agent would either intentionally or accidentally cause an existential catastrophe?

The first step forward is to distinguish between two senses of an existential risk. Nick Bostrom originally defined the term as: “One where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.” It follows that there are two distinct scenarios, one terminal and the other endurable, that could realize an existential risk. We can call the former an extinction risk and the latter a stagnation risk. The importance of this distinction with respect to both advanced technologies and destructive agents has been previously underappreciated.

So, the question asked above is actually two questions in disguise. Let’s consider each in turn.

Terror: Extinction Risks

First, the categories of agents who might intentionally cause an extinction catastrophe are fewer and smaller than one might think. They include:

(1) Idiosyncratic actors. These are malicious agents who are motivated by idiosyncratic beliefs and/or desires. There are instances of deranged individuals who have simply wanted to kill as many people as possible and then die, such as some school shooters. Idiosyncratic actors are especially worrisome because this category could have a large number of members (token agents). Indeed, the psychologist Martha Stout estimates that about 4 percent of the human population suffers from sociopathy, resulting in about 296 million sociopaths. While not all sociopaths are violent, a disproportionate number of criminals and dictators have (or very likely have) had the condition.

(2) Future ecoterrorists. As the effects of climate change and biodiversity loss (resulting in the sixth mass extinction) become increasingly conspicuous, and as destructive technologies become more powerful, some terrorism scholars have speculated that ecoterrorists could become a major agential risk in the future. The fact is that the climate is changing and the biosphere is wilting, and human activity is almost entirely responsible. It follows that some radical environmentalists in the future could attempt to use technology to cause human extinction, thereby “solving” the environmental crisis. So, we have some reason to believe that this category could become populated with a growing number of token agents in the coming decades.

(3) Negative utilitarians. Those who hold this view believe that the ultimate aim of moral conduct is to minimize misery, or “disutility.” Although some negative utilitarians like David Pearce see existential risks as highly undesirable, others would welcome annihilation because it would entail the elimination of suffering. It follows that if a “strong” negative utilitarian had a button in front of her that, if pressed, would cause human extinction (say, without causing pain), she would very likely press it. Indeed, on her view, doing this would be the morally right action. Fortunately, this version of negative utilitarianism is not a position that many non-academics tend to hold, and even among academic philosophers it is not especially widespread.

(4) Extraterrestrials. Perhaps we are not alone in the universe. Even if the probability of life arising on an Earth-analog is low, the vast number of exoplanets suggests that the probability of life arising somewhere may be quite high. If an alien species were advanced enough to traverse the cosmos and reach Earth, it would very likely have the technological means to destroy humanity. As Stephen Hawking once remarked, “If aliens visit us, the outcome would be much as when Columbus landed in America, which didn’t turn out well for the Native Americans.”

(5) Superintelligence. The reason Homo sapiens is the dominant species on our planet is due almost entirely to our intelligence. It follows that if something were to exceed our intelligence, our fate would become inextricably bound up with its will. This is worrisome because recent research shows that even slight misalignments between our values and those motivating a superintelligence could have existentially catastrophic consequences. But figuring out how to upload human values into a machine poses formidable problems — not to mention the issue of figuring out what our values are in the first place.

Making matters worse, a superintelligence could process information about 1 million times faster than our brains, meaning that a minute of time for us would equal approximately 2 years for the superintelligence. This would immediately give the superintelligence a profound strategic advantage over us. And if it were able to modify its own code, it could potentially bring about an exponential intelligence explosion, resulting in a mind that’s many orders of magnitude smarter than any human. Thus, we may have only one chance to get everything just right: there’s no turning back once an intelligence explosion is ignited.

A superintelligence could cause human extinction for a number of reasons. For example, we might simply be in its way. Few humans worry much if an ant genocide results from building a new house or road. Or the superintelligence could destroy humanity because we happen to be made out of something it could use for other purposes: atoms. Since a superintelligence need not resemble human intelligence in any way — thus, scholars tell us to resist the dual urges of anthropomorphizing and anthropopathizing — it could be motivated by goals that appear to us as utterly irrational, bizarre, or completely inexplicable.

Terror: Stagnation Risks

Now consider the agents who might intentionally try to bring about a scenario that would result in a stagnation catastrophe. This list subsumes most of the list above in that it includes idiosyncratic actors, future ecoterrorists, and superintelligence, but it probably excludes negative utilitarians, since stagnation (as understood above) would likely induce more suffering than the status quo today. The case of extraterrestrials is unclear, given that we can infer almost nothing about an interstellar civilization except that it would be technologically sophisticated.

For example, an idiosyncratic actor could harbor not a death wish for humanity, but a “destruction wish” for civilization. Thus, she or he could strive to destroy civilization without necessarily causing the annihilation of Homo sapiens. Similarly, a future ecoterrorist could hope for humanity to return to the hunter-gatherer lifestyle. This is precisely what motivated Ted Kaczynski: he didn’t want everyone to die, but he did want our technological civilization to crumble. And finally, a superintelligence whose values are misaligned with ours could modify Earth in such a way that our lineage persists, but our prospects for future development are permanently compromised. Other stagnation scenarios could involve the following categories:

(6) Apocalyptic terrorists. History is overflowing with groups that not only believed the world was about to end, but saw themselves as active participants in an apocalyptic narrative that’s unfolding in realtime. Many of these groups have been driven by the conviction that “the world must be destroyed to be saved,” although some have turned their activism inward and advocated mass suicide.

Interestingly, no notable historical group has combined both the genocidal and suicidal urges. This is why apocalypticists pose a greater stagnation terror risk than extinction risk: indeed, many see their group’s survival beyond Armageddon as integral to the end-times, or eschatological, beliefs they accept. There are almost certainly fewer than about 2 million active apocalyptic believers in the world today, although emerging environmental, demographic, and societal conditions could cause this number to significantly increase in the future, as I’ve outlined in detail elsewhere (see Section 5 of this paper).

(7) States. Like terrorists motivated by political rather than transcendent goals, states tend to place a high value on their continued survival. It follows that states are unlikely to intentionally cause a human extinction event. But rogue states could induce a stagnation catastrophe. For example, if North Korea were to overcome the world’s superpowers through a sudden preemptive attack and implement a one-world government, the result could be an irreversible decline in our quality of life.

So, there are numerous categories of agents that could attempt to bring about an existential catastrophe. And there appear to be fewer agent types who would specifically try to cause human extinction than to merely dismantle civilization.

Error: Extinction and Stagnation Risks

There are some reasons, though, for thinking that error (rather than terror) could constitute the most significant threat in the future. First, almost every agent capable of causing intentional harm would also be capable of causing accidental harm, whether this results in extinction or stagnation. For example, an apocalyptic cult that wants to bring about Armageddon by releasing a deadly biological agent in a major city could, while preparing for this terrorist act, inadvertently contaminate its environment, leading to a global pandemic.

The same goes for idiosyncratic agents, ecoterrorists, negative utilitarians, states, and perhaps even extraterrestrials. (Indeed, the large disease burden of Europeans was a primary reason Native American populations were decimated. By analogy, perhaps an extraterrestrial destroys humanity by introducing a new type of pathogen that quickly wipes us out.) The case of superintelligence is unclear, since the relationship between intelligence and error-proneness has not been adequately studied.

Second, if powerful future technologies become widely accessible, then virtually everyone could become a potential cause of existential catastrophe, even those with absolutely no inclination toward violence. To illustrate the point, imagine a perfectly peaceful world in which not a single individual has malicious intentions. Further imagine that everyone has access to a doomsday button on her or his phone; if pushed, this button would cause an existential catastrophe. Even under ideal societal conditions (everyone is perfectly “moral”), how long could we expect to survive before someone’s finger slips and the doomsday button gets pressed?

Statistically speaking, a world populated by only 1 billion people would almost certainly self-destruct within a 10-year period if the probability of any individual accidentally pressing a doomsday button were a mere 0.00001 percent per decade. Or, alternatively: if only 500 people in the world were to gain access to a doomsday button, and if each of these individuals had a 1 percent chance of accidentally pushing the button per decade, humanity would have a meager 0.6 percent chance of surviving beyond 10 years. Thus, even if the likelihood of mistakes is infinitesimally small, planetary doom will be virtually guaranteed for sufficiently large populations.
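The back-of-the-envelope figures above can be reproduced with a few lines of Python. This is a sketch under one explicit assumption that the text leaves implicit: each person's chance of an accidental press is independent of everyone else's, so the probability that nobody presses within the decade is (1 - p)^n. The function name `survival_probability` is illustrative.

```python
import math


def survival_probability(people, p_press):
    """P(nobody presses the button), assuming independent presses: (1 - p)^n."""
    # log1p keeps the result accurate when p_press is extremely small.
    return math.exp(people * math.log1p(-p_press))


# 1 billion people, each with a 0.00001 percent chance per decade.
large_world = survival_probability(1_000_000_000, 0.00001 / 100)

# 500 people, each with a 1 percent chance per decade.
small_group = survival_probability(500, 0.01)

print(f"1 billion people: {large_world:.3g}")
print(f"500 people:       {small_group:.2%}")
```

Under that independence assumption the first case comes out at roughly e^-100, which is effectively zero, and the second at roughly 0.66 percent, in line with the figure quoted above.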

The Two Worlds Thought Experiment

The good news is that a focus on agential risks, as I’ve called them, and not just the technological tools that agents might use to cause a catastrophe, suggests additional ways to mitigate existential risk. Consider the following thought-experiment: a possible world A contains thousands of advanced weapons that, if in the wrong hands, could cause the population of A to go extinct. In contrast, a possible world B contains only a single advanced “weapon of total destruction” (WTD). Which world is more dangerous? The answer is obviously world A.

But it would be foolishly premature to end the analysis here. Imagine further that A is populated by compassionate, peace-loving individuals, whereas B is overrun by war-mongering psychopaths. Now which world appears more likely to experience an existential catastrophe? The correct answer is, I would argue, world B.

In other words: agents matter as much as, or perhaps even more than, WTDs. One simply can’t evaluate the degree of risk in a situation without taking into account the various agents who could become coupled to potentially destructive artifacts. And this leads to the crucial point: as soon as agents enter the picture, we have another variable that could be manipulated through targeted interventions to reduce the overall probability of an existential catastrophe.

The options here are numerous and growing. One possibility would involve using “moral bioenhancement” techniques to reduce the threat of terror, given that acts of terror are immoral. But a morally enhanced individual might not be less likely to make a mistake. Thus, we could attempt to use cognitive enhancements to lower the probability of catastrophic errors, on the (tentative) assumption that greater intelligence correlates with fewer blunders.

Furthermore, implementing stricter regulations on CO2 emissions could decrease the probability of extreme ecoterrorism and/or apocalyptic terrorism, since environmental degradation is a “trigger” for both.

Another possibility, most relevant to idiosyncratic agents, is to reduce the prevalence of bullying (including cyberbullying). This is motivated by studies showing that many school shooters have been bullied, and that without this stimulus such individuals would have been less likely to carry out violent rampages. Advanced mind-reading or surveillance technologies could also enable law enforcement to identify perpetrators before mass casualty crimes are committed.

As for superintelligence, efforts to solve the “control problem” and create a friendly AI are of primary concern among many researchers today. If successful, a friendly AI could itself constitute a powerful mitigation strategy for virtually all the categories listed above.

(Note: these strategies should be explicitly distinguished from proposals that target the relevant tools rather than agents. For example, Bostrom’s idea of “differential technological development” aims to neutralize the bad uses of technology by strategically ordering the development of different kinds of technology. Similarly, the idea of police “blue goo” to counter “grey goo” is a technology-based strategy. Space colonization is also a tool intervention because it would effectively reduce the power (or capacity) of technologies to affect the entire human or posthuman population.)

Agent-Tool Couplings

Devising novel interventions and understanding how to maximize the efficacy of known strategies requires a careful look at the unique properties of the agents mentioned above. Without an understanding of such properties, this important task will be otiose. We should also prioritize different agential risks based on the likely membership (token agents) of each category. For example, the number of idiosyncratic agents might exceed the number of ecoterrorists in the future, since ecoterrorism is focused on a single issue, whereas idiosyncratic agents could be motivated by a wide range of potential grievances.[1] We should also take seriously the formidable threat posed by error, which could be nontrivially greater than that posed by terror, as the back-of-the-envelope calculations above show.

Such considerations, in combination with technology-based risk mitigation strategies, could lead to a comprehensive, systematic framework for strategically intervening on both sides of the agent-tool coupling. But this will require the field of existential risk studies to become less technocentric than it currently is.

[1] Although, on the other hand, the stimulus of environmental degradation would be experienced by virtually everyone in society, whereas the stimuli that motivate idiosyncratic agents might be situationally unique. It’s precisely issues like these that deserve further scholarly research.

Map and Territory: a new rationalist group blog

8 gworley 15 October 2016 05:55PM

If you want to engage with the rationalist community, LessWrong is mostly no longer the place to do it. Discussions aside, most of the activity has moved into the diaspora. There are a few big voices like Robin and Scott, but most of the online discussion happens on individual blogs, Tumblr, semi-private Facebook walls, and Reddit. And while these serve us well enough, I find that they leave me wanting for something like what LessWrong was: a vibrant group blog exploring our perspectives on cognition and building insights towards a deeper understanding of the world.

Maybe I'm yearning for a golden age of LessWrong that never was, but the fact remains that there is a gap in the rationalist community that LessWrong once filled. A space for multiple voices to come together in a dialectic that weaves together our individual threads of thought into a broader narrative. A home for discourse we are proud to call our own.

So with a lot of help from fellow rationalist bloggers, we've put together Map and Territory, a new group blog to bring our voices together. Each week you'll find new writing from the likes of Ben Hoffman, Mike Plotz, Malcolm Ocean, Duncan Sabien, Anders Huitfeldt, and myself working to build a more complete view of reality within the context of rationality.

And we're only just getting started, so if you're a rationalist blogger please consider joining us. We're doing this on Medium, so if you write something other folks in the rationalist community would like to read, we'd love to consider sharing it through Map and Territory (cross-posting encouraged). Reach out to me on Facebook or email and we'll get the process rolling.


[Link] Reducing Risks of Astronomical Suffering (S-Risks): A Neglected Global Priority

6 ignoranceprior 14 October 2016 07:58PM

Weekly LW Meetups

0 FrankAdamek 14 October 2016 03:56PM

This summary was posted to LW Main on October 14th. The following week's summary is here.

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berlin, Boston, Brussels, Buffalo, Canberra, Columbus, Denver, Kraków, London, Madison WI, Melbourne, Moscow, New Hampshire, New York, Philadelphia, Research Triangle NC, San Francisco Bay Area, Seattle, St. Petersburg, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers and a Slack channel for daily discussion and online meetups on Sunday night US time.


[Link] GiveWell: A case study in effective altruism, part 1

0 philh 14 October 2016 10:46AM

[Link] Wikipedia book based on betterhumans' article on cognitive biases

1 MathieuRoy 14 October 2016 01:03AM

The map of agents which may create x-risks

2 turchin 13 October 2016 11:17AM

Recently Phil Torres wrote an article in which he raises a new topic in existential risk research: the question of who the possible agents in the creation of a global catastrophe could be. In it he identifies five main types of agents, and two main reasons why they would create a catastrophe (error and terror).

He discusses the following types of agents: 


(1) Superintelligence. 

(2) Idiosyncratic actors.  

(3) Ecoterrorists.  

(4) Religious terrorists.  

(5) Rogue states.  


Inspired by his work I decided to create a map of all possible agents as well as their possible reasons for creating x-risks. During this work some new ideas appeared.  

I think that significant additions to the list of agents should be superpowers, as they are known to have created most global risks in the 20th century; corporations, as they are now on the front line of AGI creation; and pseudo-rational agents who could create a Doomsday weapon in the future to use for global blackmail (maybe with positive values), or who could risk civilization’s fate for their own benefit (dangerous experiments).

The X-risks prevention community could also be an agent of risks if it fails to prevent obvious risks, or if it uses smaller catastrophes to prevent large risks, or if it creates new dangerous ideas of possible risks which could inspire potential terrorists.  

The more technology progresses, the more types of agents will have access to dangerous technologies, even including teenagers (for example: "Why This 14-Year-Old Kid Built a Nuclear Reactor").

In this situation only the number of agents with risky tech will matter, not the exact motivations of each one. But if we are unable to control tech, we could try to control potential agents, or at least their "medium" mood.

The map shows various types of agents, starting from non-agents, and ending with types of agential behaviors which could result in catastrophic consequences (error, terror, risk, etc.). It also shows the types of risks that are more probable for each type of agent. I think that my explanation in each case should be self-evident.

We could also show that x-risk agents will change with the pace of technological progress. In the beginning there are no agents, later there are superpowers, and then smaller and smaller agents, until there are millions of people with biotech labs at home. In the end there will be only one agent: SuperAI.

So lessening the number of agents and increasing their "morality" and intelligence seem to be the most plausible directions for lowering risks. Special organizations or social networks may be created to control the most risky types of agents. Differing agents probably need differing types of control. Some ideas for this agent-specific control are listed in the map, but a real control system should be much more complex and specific.

The map shows many agents; some of them are real and exist now (but don’t have dangerous capabilities), and some are possible only in a moral or a technical sense.


So there are 4 types of agents, and I show them in the map in different colours:


1) Existing and dangerous, that is, already having the technology to destroy humanity: superpowers, arrogant scientists. - Red

2) Existing and willing to end the world, but lacking the needed technologies (ISIS, VHEMT). - Yellow

3) Morally possible, but not existing. We could imagine logically consistent value systems which may result in human extinction, for example Doomsday blackmail. - Green

4) Agents which will pose a risk only after supertechnologies appear, like AI hackers and child biohackers. - Blue


Many agent types do not fit this classification, so I left them white in the map.


The pdf of the map is here: http://immortality-roadmap.com/agentrisk11.pdf





(The jpg of the map is below; because the sidebar covers part of it, I put it higher.)










[Link] Barack Obama's opinions on near-future AI [Fixed]

3 scarcegreengrass 12 October 2016 03:46PM

[Link] An attempt in layman's language to explain the metaethics sequence in a single post.

1 Bound_up 12 October 2016 01:57PM

MIRI AMA plus updates

11 RobbBB 11 October 2016 11:52PM

MIRI is running an AMA on the Effective Altruism Forum tomorrow (Wednesday, Oct. 12): Ask MIRI Anything. Questions are welcome in the interim!

Nate also recently posted a more detailed version of our 2016 fundraising pitch to the EA Forum. One of the additions is about our first funding target:

We feel reasonably good about our chance of hitting target 1, but it isn't a sure thing; we'll probably need to see support from new donors in order to hit our target, to offset the fact that a few of our regular donors are giving less than usual this year.

The Why MIRI's Approach? section also touches on new topics that we haven't talked about in much detail in the past, but plan to write up some blog posts about in the future. In particular:

Loosely speaking, we can imagine the space of all smarter-than-human AI systems as an extremely wide and heterogeneous space, in which "alignable AI designs" is a small and narrow target (and "aligned AI designs" smaller and narrower still). I think that the most important thing a marginal alignment researcher can do today is help ensure that the first generally intelligent systems humans design are in the “alignable” region. I think that this is unlikely to happen unless researchers have a fairly principled understanding of how the systems they're developing reason, and how that reasoning connects to the intended objectives.

Most of our work is therefore aimed at seeding the field with ideas that may inspire more AI research in the vicinity of (what we expect to be) alignable AI designs. When the first general reasoning machines are developed, we want the developers to be sampling from a space of designs and techniques that are more understandable and reliable than what’s possible in AI today.

In other news, we've uploaded a new intro talk on our most recent result, "Logical Induction," that goes into more of the technical details than our previous talk.

See also Shtetl-Optimized and n-Category Café for recent discussions of the paper.

[Recommendation] Steven Universe & cryonics

8 tadrinth 11 October 2016 04:21PM

I've been watching Steven Universe (a children's cartoon on Cartoon Network by Rebecca Sugar) with my fiancee, and it wasn't until I got to Season 3 that I realized there's been a cryonics metaphor running in the background since the very first episode. If you want to introduce your kids to the idea of cryonics, this series seems like a spectacularly good way to do it.

If you don't want any spoilers, just go watch it, then come back.

Otherwise, here's the metaphor I'm seeing, and why it's great:

  • In the very first episode, we find out that the main characters are a group called the Crystal Gems, who fight 'gem monsters'. When they defeat a monster, a gem is left behind, which they lock in a bubble-forcefield and store in their headquarters.

  • One of the Crystal Gems is injured in a training accident, and we find out that their bodies are just projections; each Crystal Gem has a gem located somewhere on their body, which contains their mind. So long as their gem isn't damaged, they can project a new body after some time to recover. So we already have the insight that minds and bodies are separate.

  • This is driven home by a second episode where one of the Crystal Gems has their crystal cracked; this is actually dangerous to their mind, not just body, and is treated as a dire emergency instead of merely an inconvenience.

  • Then we eventually find out that the gem monsters are actually corrupted members of the same species as the Crystal Gems. They are 'bubbled' and stored in the temple in hopes of eventually restoring them to sanity and their previous forms.

  • An attempt is made to cure one of the monsters, which doesn't fully succeed, but at least restores them to sanity. This allows them to remain unbubbled and to be reunited with their old comrades (who are also corrupted). This was the episode where I finally made the connection to cryonics.

  • The Crystal Gems are also revealed to be over 5000 years old, and effectively immortal. They don't make a big deal out of this; for them, this is totally normal.

  • This also implies that they've made no progress in curing the gem monsters in 5000 years, but that doesn't stop them from preserving them anyway.

  • Finally, a secret weapon is revealed which is capable of directly shattering gems (thus killing the target permanently), but the use of it is rejected as unethical.

So, all in all, you have a series where when someone is hurt or sick in a way that you can't help, you preserve their mind in a safe way until you can figure out a way to help them. Even your worst enemy deserves no less.


Also, Steven Universe has an entire episode devoted to mindfulness meditation.  

[Link] Reasonable Requirements of any Moral Theory

-1 TheSurvivalMachine 10 October 2016 08:48PM

Open thread, Oct. 10 - Oct. 16, 2016

3 MrMind 10 October 2016 07:00AM

If it's worth saying, but not worth its own post, then it goes here.

Notes for future OT posters:

1. Please add the 'open_thread' tag.

2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)

3. Open Threads should start on Monday, and end on Sunday.

4. Unflag the two options "Notify me of new top level comments on this article" and "

[Link] Biofuels a climate mistake

4 morganism 09 October 2016 09:16PM

[Link] Quantum Bayesianism

0 morganism 08 October 2016 11:27PM

[Link] Viruses and DRACOs in the Valley of Death in medical research.

-1 morganism 08 October 2016 08:36PM

[Link] Six principles of a truth-friendly discourse

4 philh 08 October 2016 04:56PM

The map of organizations, sites and people involved in x-risks prevention

6 turchin 07 October 2016 12:04PM

Three known attempts to make a map of x-risks prevention in the field of science exist:

1. The first is the list from the Global Catastrophic Risk Institute in 2012-2013; many links there no longer work.

2. The second was done by S. Armstrong in 2014.

3. And the most beautiful and useful map was created by Andrew Critch. But its ecosystem ignores organizations which have a different view of the nature of global risks (that is, they share the value of x-risks prevention, but have another world view).

In my map I have tried to add all currently active organizations which share the value of global risks prevention.

It also regards some active independent people as organizations, if they have an important blog or field of research, but not all people are mentioned in the map. If you think that you (or someone) should be in it, please write to me at alexei.turchin@gmail.com

I used only open sources and public statements to learn about people and organizations, so I can’t provide information on the underlying net of relations.

I tried to give each organization a short description based on its public statement and also my opinion about its activity.

In general it seems that all small organizations are focused on their collaboration with larger ones, that is, MIRI and FHI, and small organizations tend to ignore each other; this is easily explainable by social signaling theory. Another explanation is that larger organizations have a greater ability to make contacts.

It also appears that there are several organizations with similar goal statements. 

It looks like the most cooperation exists in the field of AI safety, but most of the structure of this cooperation is not visible to the external viewer, in contrast to Wikipedia, where contributions of all individuals are visible. 

It seems that the community in general lacks three things: a united internet forum for public discussion, an x-risks wikipedia and an x-risks related scientific journal.

Ideally, a forum should be used to brainstorm ideas, a scientific journal to publish the best ideas, peer review them and present them to the outer scientific community, and a wiki to collect results.

Currently it seems more like each organization is interested in creating its own research and hoping that someone will read it. Each small organization seems to want to be the only one to present the solutions to global problems and gain full attention from the UN and governments. It raises the problem of noise and rivalry; and also raises the problem of possible incompatible solutions, especially in AI safety.

The pdf is here: http://immortality-roadmap.com/riskorg5.pdf

Weekly LW Meetups

0 FrankAdamek 07 October 2016 03:58AM

This summary was posted to LW Main on October 7th. The following week's summary is here.

Irregularly scheduled Less Wrong meetups are taking place in:

The remaining meetups take place in cities with regular scheduling, but involve a change in time or location, special meeting content, or simply a helpful reminder about the meetup:

Locations with regularly scheduled meetups: Austin, Berlin, Boston, Brussels, Buffalo, Canberra, Columbus, Denver, Kraków, London, Madison WI, Melbourne, Moscow, New Hampshire, New York, Philadelphia, Research Triangle NC, San Francisco Bay Area, Seattle, Sydney, Tel Aviv, Toronto, Vienna, Washington DC, and West Los Angeles. There's also a 24/7 online study hall for coworking LWers and a Slack channel for daily discussion and online meetups on Sunday night US time.


The University of Cambridge Centre for the Study of Existential Risk (CSER) is hiring!

6 crmflynn 06 October 2016 04:53PM

The University of Cambridge Centre for the Study of Existential Risk (CSER) is recruiting for an Academic Project Manager. This is an opportunity to play a shaping role as CSER builds on its first year's momentum towards becoming a permanent world-class research centre. We seek an ambitious candidate with initiative and a broad intellectual range for a postdoctoral role combining academic and project management responsibilities.

The Academic Project Manager will work with CSER's Executive Director and research team to co-ordinate and develop CSER's projects and overall profile, and to develop new research directions. The post-holder will also build and maintain collaborations with academic centres, industry leaders and policy makers in the UK and worldwide, and will act as an ambassador for the Centre’s research externally. Research topics will include AI safety, bio risk, extreme environmental risk, future technological advances, and cross-cutting work on governance, philosophy and foresight. Candidates will have a PhD in a relevant subject, or have equivalent experience in a relevant setting (e.g. policy, industry, think tank, NGO).

Application deadline: November 11th. http://www.jobs.cam.ac.uk/job/11684/
