Less Wrong is a community blog devoted to refining the art of human rationality.

A Problem for the Simulation Hypothesis

1 philosophytorres 16 March 2017 04:02PM

I've posted about this once before, but here's a more developed version of the idea. Does this pose a serious problem for the simulation hypothesis, or does it merely complicate the idea?

1. Which room am I in?

Imagine two rooms, A and B. At a timeslice t2, there are exactly 1000 people in room B and only 1 person in room A. Neither room contains any clues as to which it is; i.e., no one can see anyone else in room B. If you were placed in one of these rooms with only the information above, which would you guess that you were in? The correct answer appears to be room B. After all, if everyone were to bet that they are in room B, almost everyone would win, whereas if everyone were to bet that they are in room A, almost everyone would lose.

Now imagine that you are told that during a time segment t1 to t2, a total of 100 trillion people had sojourned in room A and only 1 billion in room B. How does this extra information influence your response? The question posed above is not which room you are likely to have been in, all things considered, but which room you are currently in at t2. Insofar as betting odds guide rational belief, it still follows that if everyone at t2 were to bet that they are in room A, almost everyone would lose. This differs from what appears to be the correct conclusion if one reasons across time, from t1 to t2. Thus, we can imagine that at some future moment t3 everyone who ever sojourned in either room A or B is herded into another room C and then asked whether their journeys from t1 to t3 took them through room A or B. In this case, most people would win the bet if they were to point at room A rather than room B.
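The two betting perspectives can be checked with a few lines of arithmetic. This is only a sketch; the variable names are mine, and all figures are taken from the text above:

```python
# Compare the two betting perspectives in the rooms thought experiment.

occupants_at_t2 = {"A": 1, "B": 1000}        # headcount at the timeslice t2
total_visitors = {"A": 100e12, "B": 1e9}     # everyone who sojourned, t1 to t2

# Timeslice bet: the fraction of people betting at t2 who win by saying "B".
p_win_B_now = occupants_at_t2["B"] / sum(occupants_at_t2.values())

# Across-time bet: the fraction of all visitors who win by saying "A".
p_win_A_history = total_visitors["A"] / sum(total_visitors.values())

print(f"Betting 'B' at t2: {p_win_B_now:.4f} of bettors win")
print(f"Betting 'A' across history: {p_win_A_history:.6f} of bettors win")
```

Both bets are overwhelmingly favorable, which is exactly the tension: the timeslice bettor should say "B" while the across-time bettor should say "A".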

Let’s complicate this situation. Since more people in total pass through room A than room B, imagine that people are swapped in and out of room A faster than room B. Once in either room, a blindfold is removed and the occupant is asked which room they are in. After they answer, the blindfold is put back on. Thus, there are more total instances of removing blindfolds in room A than room B between t1 and t2. Should this fact change your mind about where you are at exactly t2? Surely one could argue that the directly relevant information is that pertaining to each individual timeslice, rather than the historical details of occupants being swapped in and out of rooms. After all, the bet is being made at a particular timeslice about a particular timeslice, and the fact is that most people who bet at t2 that they are in room B at t2 will win some cash, whereas those who bet that they are in room A will lose.

2. The simulation argument

Nick Bostrom (2003) argues that at least one of the following disjuncts is true: (1) civilizations like ours tend to self-destruct before reaching technological maturity, (2) civilizations like ours tend to reach technological maturity but refrain from running a large number of ancestral simulations, or (3) we are almost certainly in a simulation. The third disjunct corresponds to the “simulation hypothesis.” It is based on the following premises: first, assume the truth of functionalism, i.e., that physical systems that exhibit the right functional organization will give rise to conscious mental states like ours. Second, consider the computational power that could be available to future humans. Bostrom provides a convincing analysis that future humans will have at least the capacity to run a large number of ancestral simulations—or, more generally, simulations in which minds sufficiently “like ours” exist.

The final step of the argument proceeds as follows: if (1) and (2) are false, then we do not self-destruct before reaching a state of technological maturity and do not refrain from running a large number of ancestral simulations. It follows that we run a large number of ancestral simulations. If so, we have no independent knowledge of whether we exist in vivo or in machina. A “bland” version of the principle of indifference thus tells us to distribute our probabilities equally among all the possibilities. Since the number of sims would far exceed the number of non-sims in this scenario, we should infer that we are almost certainly simulated. As Bostrom writes, “it may also be worth to ponder that if everybody were to place a bet on whether they are in a simulation or not, then if people use the bland principle of indifference, and consequently place their money on being in a simulation if they know that that’s where almost all people are, then almost everyone will win their bets. If they bet on not being in a simulation, then almost everyone will lose. It seems better that the bland indifference principle be heeded” (Bostrom 2003).

Now, let us superimpose the scenario of Section 1 onto the simulation argument. Imagine that our posthuman descendants colonize the galaxy and their population grows to 100 billion individuals in total. Imagine further that at t2 they are running 100 trillion simulations, each of which contains 100 billion individuals. Thus, the total number of sims equals 10^25. If one of our posthuman descendants were asked whether she is a sim or non-sim, she should therefore answer that she is almost certainly a sim. Alternatively, imagine that at t2 our posthuman descendants decide to run only a single simulation in the universe that contains a mere 1 billion sims, ceteris paribus. In that case, if one of our posthuman descendants were asked whether she is a sim, she should quite clearly answer that she is most likely a non-sim.
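For concreteness, here is what the bland principle of indifference yields in each scenario. This is a quick sketch using the figures above; the scenario labels are mine:

```python
# Bland-indifference probabilities: P(sim) = sims / (sims + non-sims).

non_sims = 100e9                    # 100 billion posthuman non-sims

# Scenario 1: 100 trillion simulations of 100 billion individuals each.
sims_many = 100e12 * 100e9          # = 1e25
p_sim_many = sims_many / (sims_many + non_sims)

# Scenario 2: one simulation containing 1 billion sims.
sims_one = 1e9
p_sim_one = sims_one / (sims_one + non_sims)

print(f"Scenario 1: P(sim) = {p_sim_many}")      # indistinguishable from 1
print(f"Scenario 2: P(sim) = {p_sim_one:.4f}")
```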

3. Complications

With this in mind, consider a final possible scenario: our posthuman descendants decide to run simulations with relatively small populations in a serial fashion, that is, one at a time. These simulations could be sped up a million times to enable complete recapitulations of our evolutionary history (as per Bostrom). The result is that at any given timeslice the total number of non-sims will far exceed the total number of sims—yet across time the total number of sims will accumulate and eventually far exceed the total number of non-sims. The result is that if one takes a bird’s-eye view of our posthuman civilization from its inception to its decline (say, because of the entropy death of the cosmos), and if one were asked whether she is more likely to have existed in vivo or in machina, it appears that she should answer “I was a sim.”
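A toy model makes the tension explicit. The population sizes and the one-simulation-per-tick schedule below are illustrative assumptions of mine, not figures from the text:

```python
# Serial simulations: at every timeslice non-sims outnumber sims, yet the
# cumulative count of sims eventually dwarfs the non-sim population.

non_sim_pop = 100_000     # non-sims alive at every timeslice (held fixed)
sim_pop = 1_000           # sims alive at any single timeslice (one run at a time)

for runs in (1, 100, 1_000_000):
    sims_ever = sim_pop * runs                           # sims accumulate across runs
    p_timeslice = sim_pop / (sim_pop + non_sim_pop)      # constant at every moment
    p_birdseye = sims_ever / (sims_ever + non_sim_pop)   # grows toward 1
    print(f"runs={runs:>9}: timeslice P(sim)={p_timeslice:.4f}, "
          f"bird's-eye P(sim)={p_birdseye:.4f}")
```

The timeslice probability never moves, while the bird's-eye probability climbs toward certainty as the serial runs accumulate.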

But this might not be the right way to reason about the situation. Consider that history is nothing more than a series of timeslices, one after the other. Since the ratio of non-sims to sims favors the former at every possible timeslice, one might argue that one should always answer the question, “Are you right now more likely to exist in vivo or in machina?” with “I probably exist in vivo.” Again, the difficulty that skeptics of this answer must overcome is the ostensible fact that if everyone were to bet on being simulated at any given timeslice—even billions of years after the first serial simulation is run—then nearly everyone would lose, whereas if everyone were to bet that they are a non-sim, then almost everyone would win.

The tension here emerges from the difference between timeslice reasoning and the sort of “atemporal” reasoning that Bostrom employs. If the former is epistemically robust, then Bostrom’s tripartite argument fails because none of the disjuncts are true. This is because the scenario above entails (a) we survive to reach technological maturity, and (b) we run a large number of ancestor simulations, yet (c) we do not have reason to believe that we are in a simulation at any particular moment. The latter proposition depends, of course, upon how we run the simulations (serially versus in parallel) and, relatedly, how we decide to reason about our metaphysical status at each moment in time.

In conclusion, I am unsure about whether this constitutes a refutation of Bostrom or merely complicates the picture. At the very least, I believe it does the latter, requiring more work on the topic.


Bostrom, Nick. 2003. "Are You Living in a Computer Simulation?" Philosophical Quarterly 53(211): 243–255.


Error and Terror: Are We Worrying about the Wrong Risks?

5 philosophytorres 30 January 2017 01:07AM

I would be happy to get feedback on this article, originally posted by the IEET:


When people worry about the dark side of emerging technologies, most think of terrorists or lone psychopaths with a death wish for humanity. Some future Ted Kaczynski might acquire a master's degree in microbiology, purchase some laboratory equipment intended for biohackers, and synthesize a pathogen that spreads quickly, is incurable, and kills 90 percent of those it infects.

Alternatively, Benjamin Wittes and Gabriella Blum imagine a scenario in which a business competitor releases “a drone attack spider, purchased from a bankrupt military contractor, to take you out. … Upon spotting you with its sensors, before you have time to weigh your options, the spider—if it is, indeed, an attack spider—shoots an infinitesimally thin needle … containing a lethal dose of a synthetically produced poison.” Once this occurs, the spider exits the house and promptly self-destructs, leaving no trace behind it.

This is a rather terrifying picture of the future that, however fantastical it may sound, is not implausible given current techno-developmental trends. The fact is that emerging technologies like synthetic biology and nanotechnology are becoming exponentially more powerful as well as more accessible to small groups and even single individuals. At the extreme, we could be headed toward a world in which a large portion of society, or perhaps everyone, has access to a “doomsday button” that could annihilate our species if pressed.

This is an unsettling thought given that there are hundreds of thousands of terrorists—according to one estimate—and roughly 4 percent of the population are sociopaths—meaning that there are approximately 296 million sociopaths in our midst today. The danger posed by such agents could become existential in the foreseeable future.

But what if deranged nutcases with nefarious intentions aren’t the most significant threat to humanity? An issue that rarely comes up in such conversations is the potentially greater danger posed by well-intentioned people with access to advanced technologies. In his erudite and alarming book Our Final Hour, Sir Martin Rees distinguishes between two types of agent-related risks: terror and error. The difference between these has nothing to do with the consequences—a catastrophe caused by error could be no less devastating than one caused by terror. Rather, what matters are the intentions behind the finger that pushes a doomsday button, causing spaceship Earth to explode.

There are reasons for thinking that error could actually constitute a greater threat than terror. First, let’s assume that science and technology become democratized such that most people on the planet have access to a doomsday button of some sort. Let’s say that the global population at this time is 10 billion people.

Second, note that the number of individuals who could pose an error threat will vastly exceed the number of individuals who would pose a terror threat. (In other words, the former is a superset of the latter.) On the one hand, every terrorist hell-bent on destroying the world could end up pushing the doomsday button by accident. Perhaps while attempting to create a designer pathogen that kills everyone not vaccinated against it, a terrorist inadvertently creates a virus that escapes the laboratory and is 100 percent lethal. The result is a global pandemic that snuffs out the human species.

On the other hand, any well-intentioned hobbyist with a biohacking laboratory could also accidentally create a new kind of lethal germ. History reveals numerous leaks from highly regulated laboratories—the H1N1 strain behind the 2009 swine flu epidemic, which killed some 12,000 people between 2009 and 2010, has been traced back to a laboratory mistake in the late 1970s—so it's not implausible to imagine someone in a largely unregulated environment mistakenly releasing a pathogenic bug.

In a world where nearly everyone has access to a doomsday button, exactly how long could it last? We can, in fact, quantify the danger here. Let’s begin by imagining a world in which all 10 billion people have (for the sake of argument) a doomsday button on their smartphone. This button could be pushed at any moment if one opens up the Doomsday App. Further imagine that of the 10 billion people who live in this world, not a single one has any desire to destroy it. Everyone wants the world to continue and humanity to flourish.

Now, how likely is this world to survive the century if each individual has a tiny chance of pressing the button? Crunching a few numbers, it turns out that doom would be all but guaranteed even if each person had a negligible 0.000001 percent chance of error. The reason is that even though the likelihood of any one person causing total annihilation by accident is incredibly small, this probability adds up across the population. With 10 billion people, one should expect an existential catastrophe even if everyone is very, very, very careful not to press the button.
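The arithmetic behind this claim can be sketched as follows, computing the survival probability in log space to avoid floating-point underflow:

```python
import math

# First Doomsday App scenario: 10 billion holders, each with a
# 0.000001 percent chance of a fatal error over the century.

population = 10_000_000_000
p_error = 0.000001 / 100            # 0.000001 percent = 1e-8

# P(no one ever errs) = (1 - p)^N, evaluated as exp(N * ln(1 - p)).
p_survival = math.exp(population * math.log1p(-p_error))
print(f"P(survival) = {p_survival:.3e}")    # about e**-100, i.e. ~4e-44
```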

Consider an alternative scenario: imagine a world of 10 billion morally good people in which only 500 have the Doomsday App on their smartphone. This constitutes a mere 0.000005 percent of the total population. Imagine further that each of these individuals has an incredibly small 1 percent chance of pushing the button each decade. How long should civilization as a whole, with its 10 billion denizens, expect to survive? Crunching a few numbers again reveals that the probability of annihilation in the next 10 years would be a whopping 99 percent—that is, more or less certain.
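Again, a couple of lines suffice to check this figure:

```python
# Second scenario: 500 app holders, each with a 1 percent chance of
# pressing the button per decade.

holders = 500
p_push_per_decade = 0.01

p_doom = 1 - (1 - p_push_per_decade) ** holders
print(f"P(annihilation within a decade) = {p_doom:.4f}")   # ≈ 0.9934
```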

The staggering danger of this situation stems from the two trends mentioned above: the growing power and accessibility of technology. A world in which fanatics want to blow everything up would be extremely dangerous if "weapons of total destruction" were to become widespread. But even if future people are perfectly compassionate—perhaps because of moral bioenhancements or what Steven Pinker calls the "moral Flynn effect"—the fact of human fallibility will make survival for decades, let alone centuries, highly uncertain. As Rees puts this point:

If there were millions of independent fingers on the button of a Doomsday machine, then one person’s act of irrationality, or even one person’s error, could do us all in. … Disastrous accidents (for instance, the unintended creation or release of a noxious fast-spreading pathogen, or a devastating software error) are possible even in well-regulated institutions. As the threats become graver, and the possible perpetrators more numerous, disruption may become so pervasive that society corrodes and regresses. There is a longer-term risk even to humanity itself.

As scholars have noted, “an elementary consequence of probability theory [is] that even very improbable outcomes are very likely to happen, if we wait long enough.” The exact same goes for improbable events that could be caused by a sufficiently large number of individuals—not across time, but across space.

Could this situation be avoided? Maybe. For example, perhaps engineers could design future technologies with safety mechanisms that prevent accidents from causing widespread harm—although this may turn out to be more difficult than it seems. Or, as Ray Kurzweil suggests, we could build a high-tech nano-immune system to detect and destroy self-replicating nanobots released into the biosphere (a doomsday scenario known as “grey goo”).

Another possibility advocated by Ingmar Persson and Julian Savulescu entails making society just a little less "liberal" by trading personal privacy for global security. While many people may, at first glance, be resistant to this proposal—after all, privacy seems like a moral right of all humans—if the alternative is annihilation then this trade-off might be worth the sacrifice. Or perhaps we could adopt the notion of sousveillance, whereby citizens themselves monitor society through the use of wearable cameras and other apparatuses. In other words, the surveillees (those being watched) could use advanced technologies to surveil the surveillers (those doing the watching)—a kind of "inverse panopticon" to protect people from the misuse and abuse of state power.

While terror gets the majority of attention from scholars and the media, we should all be thinking more about the existential dangers inherent in the society-wide distribution of offensive capabilities involving advanced technologies. There’s a frighteningly good chance that future civilization will be more susceptible to error than terror.

(Parts of this are excerpted from my forthcoming book Morality, Foresight, and Human Flourishing: An Introduction to Existential Risks.)

Comment author: SithLord13 15 October 2016 11:25:12PM 5 points

Furthermore, implementing stricter regulations on CO2 emissions could decrease the probability of extreme ecoterrorism and/or apocalyptic terrorism, since environmental degradation is a “trigger” for both.

Disregarding any discussion of legitimate climate concerns, isn't this a really bad decision? Isn't it better to be unblackmailable, so as to disincentivize blackmail?

Comment author: philosophytorres 20 October 2016 08:08:23PM -1 points

What do you mean? How is mitigating climate change related to blackmail?

Comment author: WhySpace 16 October 2016 04:52:51AM 0 points

Awesome article! I do have a small piece of feedback to offer, though.

Interestingly, no notable historical group has combined both the genocidal and suicidal urges.

No historical group has combined both genocidal and suicidal actions, but that may be because of technological constraints. If we had had nukes widely available for millennia, how many groups do you think would have blown up their own cities?

Without sufficiently destructive technology, it takes a lot more time and effort to completely wipe out large groups of people. Usually some of them survive, and there's a bloody feud for the next 10 generations. It's rare to win sufficiently thoroughly that the group can then commit mass suicide without the culture they attempted genocide against coming back in a generation or two.

There have, of course, been plenty of groups willing to fight to the death. How many of them would have pressed a doomsday button if they could?

Comment author: philosophytorres 20 October 2016 08:07:56PM -1 points

I actually think most historical groups wanted to vanquish the enemy, but not destroy either themselves or the environment to the point at which it's no longer livable. This is one of the interesting things that shifts to the foreground when thinking about agents in the context of existential risks. As for people fighting to the death, often this was done for the sake of group survival, where the group is the relevant unit here. (Thoughts?)

Comment author: turchin 16 October 2016 10:05:22AM 3 points

In the 20th century, most risks were created by superpowers. Should we include them in the list of potential agents?

Also, it seems that some risks are non-agential, as they result from the collective behavior of groups of agents: arms races, capitalism, resource depletion, overpopulation, etc.

Comment author: philosophytorres 20 October 2016 08:06:02PM -1 points

Totally agree that some x-risks are non-agential, such as (a) risks from nature, and (b) risks produced by coordination problems, resulting in e.g. climate change and biodiversity loss. As for superpowers, I would classify them as (7). Thoughts? Any further suggestions? :-)

Comment author: chron 16 October 2016 07:57:22PM 1 point

Your example (2) (and arguably (3)) seems like a special case of (6), and it's not at all clear why you're singling out those two particular apocalyptic philosophies, as opposed to, say, ISIS-style Islamic apocalypticism.

Comment author: philosophytorres 20 October 2016 08:04:53PM -1 points

(2) is quite different in that it isn't motivated by supernatural eschatologies. Thus, the ideological and psychological profiles of ecoterrorists are quite different from those of apocalyptic terrorists, who are bound together by certain common worldview-related threads.

Comment author: chron 16 October 2016 08:10:29PM 4 points

Interestingly, no notable historical group has combined both the genocidal and suicidal urges.

Actually, such groups have existed: for example, the Khmer Rouge turned in on themselves after killing their enemies. Something similar happened with the movement led by Zhang Xianzhong, only to a much greater extent; i.e., they more or less depopulated the province of Sichuan, including killing themselves.

Comment author: philosophytorres 20 October 2016 08:02:50PM -1 points

I think my language could have been more precise: it's not merely genocidal, but humanicidal or omnicidal groups that we're talking about in the context of x-risks. Also, the Khmer Rouge wasn't suicidal to my knowledge. Am I less right?

Comment author: Manfred 20 October 2016 01:04:15AM 3 points

Depends on information. If people retain memories, so that each person-moment follows from a previous one, then knowing only that I suddenly find myself in a room means I'm probably in room A. If people are memory-wiped at some interval, then this increases the probability I should assign to being in room B - probability of being in a specific room, given that your state of information is that you suddenly find yourself in a room, is proportional to the number of times "I have suddenly found myself in a room" is somebody's state of information.

The above is in fact true. So here's a fun puzzler for you: why is the following false?

"If you tell me the exact time, then my room must more likely be B, because there are 1000 times more people in room B at that time. Since this holds for all times you could tell me, it is always true that my room is probably B, so I'm probably in room B."

Hint: Assuming that room B residents "live" 1,000,000 times longer than room A residents, how does their probability of being in room B look throughout their life, assuming they retain their memories?

Comment author: philosophytorres 20 October 2016 07:58:05PM 1 point

As for your first comment, imagine that everyone "wakes up" in a room with only the information provided and no prior memories. After 5 minutes, they're put back to sleep -- but before this occurs they're asked about which room they're in. (Does that make sense?)

Comment author: gbear605 19 October 2016 10:35:19PM 2 points

Not OP, but each single person could be in room A for 1/1,000,000 the time that they're in room B. The time doesn't run slower, but they're there less time, producing the same effect.

Comment author: philosophytorres 20 October 2016 03:23:44PM 0 points

Yes to both possibilities. But gbear605 is closer to what I was thinking.

A problem in anthropics with implications for the soundness of the simulation argument.

5 philosophytorres 19 October 2016 09:07PM

What are your intuitions about this? It has direct implications for whether the Simulation Argument is sound.


Imagine two rooms, A and B. Between times t1 and t2, 100 trillion people sojourn in room A while 100 billion sojourn in room B. At any given moment, though, exactly 1 person occupies room A while 1,000 people occupy room B. At t2, you find yourself in a room, but you don't know which one. If you have to place a bet on which room it is (at t2), what do you say? Do you consider the time-slice or the history of room occupants? How do you place your bet?


If you bet that you're in room B, then the Simulation Argument may be flawed: there could be a fourth disjunct that Bostrom misses, namely that we become a posthuman civilization that runs a huge number of simulations, yet we don't have reason to believe that we're simulants.


