It is often claimed that the Self-Sampling Assumption (SSA) is problematic, in particular because we don’t know which reference class is the correct one, and that the notorious Doomsday argument is therefore weak. Here I will try to show that this is less of a problem than it seems.

Central claim: Each reference class has its own end.

Example: I am put randomly in a 2-dimensional grid of rooms. Vertically they are designated by letters from A to Z, and horizontally they are numbered from 1 to 100. So, each room has an address like E021. The Doomsday-like claim is that I am more likely to be somewhere in the middle of the grid than on its borders, and thus I am unlikely to be in the row of 1-rooms or in the A-column. Therefore, the DA claims, if I learn my room number, I can estimate the total number of rooms in my row, and the same is true for letters. (UPDATE: by "middle" I mean some region not on the borders: not exactly cell N50, but, say, between 20 and 80.)
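The arithmetic of the grid example can be sketched as follows, assuming that "random" means a uniform choice over all 26 × 100 rooms. The function names and the 20 to 80 band are illustrative choices, not part of the original argument.

```python
from fractions import Fraction
import string

# Assumed layout from the example: letters A..Z on one axis, numbers 1..100 on the other.
LETTERS = string.ascii_uppercase   # 26 letters
NUMBERS = range(1, 101)            # room numbers 1..100


def prob_number_between(lo, hi):
    """P(lo <= room number <= hi) when the room is chosen uniformly."""
    favourable = sum(1 for n in NUMBERS if lo <= n <= hi)
    return Fraction(favourable, len(NUMBERS))


def prob_not_on_border():
    """P(room is neither in the first/last letter line nor the first/last number line)."""
    inner_letters = len(LETTERS) - 2
    inner_numbers = len(NUMBERS) - 2
    return Fraction(inner_letters * inner_numbers, len(LETTERS) * len(NUMBERS))


print(prob_number_between(20, 80))   # 61/100
print(prob_not_on_border())          # 294/325
```

Under the uniform assumption, "middle" in the broad sense (anything off the border) covers about 90 per cent of rooms, while any single line of rooms, central or not, is equally likely.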

A real-world example: I assume that my height and my date of birth are two independent variables. Thus, both should be in the middle of the distribution: I am unlikely to have been born on January 1, and unlikely to have a very low height. It is actually true for me: my date of birth and my height are in the middle of the distribution.

Here we use two important assumptions:

  1. The independence of the variables, like height and date of birth.
  2. Each variable is external to my cognitive process, so it doesn’t affect the probability that I think about anthropics.

Now we could try to apply this to real-world Doomsday-like problems, that is, claims like “I am in the middle of my reference class existence in time”.

Let’s discuss three of my reference classes and the corresponding predictions about the end of the world:

  1. I am a member of the class of mammals, which has existed for the last 200 million years, and thus mammals may exist for a few hundred million years more.
  2. I am Russian, and Russians have existed for around the last 1000 years, so they can exist for another millennium.
  3. LessWrong has existed for 13 years and thus, with 50 per cent probability, will continue to exist for at least 13 more years.
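One way to make such estimates concrete is a Gott-style "delta t" calculation. The sketch below assumes that the present moment is a uniformly random point in the total lifetime of the class; that assumption and the function names are illustrative, not something the list above states explicitly.

```python
def prob_ends_within(age, horizon):
    """P(remaining lifetime <= horizon), assuming age / total lifetime is uniform on (0, 1)."""
    return horizon / (age + horizon)


def future_duration_bounds(age, confidence):
    """Central interval for the remaining lifetime under the same uniform assumption.

    Returns (low, high) such that the remaining lifetime falls between them
    with probability `confidence`.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    tail = (1 - confidence) / 2   # probability mass cut from each end
    f_late = 1 - tail             # observed late in the lifetime: short future
    f_early = tail                # observed early in the lifetime: long future
    low = age * (1 - f_late) / f_late
    high = age * (1 - f_early) / f_early
    return low, high


print(prob_ends_within(13, 13))          # 0.5
print(future_duration_bounds(13, 0.5))   # roughly (4.33, 39.0)
```

Under this assumption, a 13-year-old LessWrong has a 50 per cent chance of ending within the next 13 years, and a 50 per cent central interval for its remaining lifetime runs from about 4.3 to 39 years. The same functions can be applied to the other two reference classes by changing `age`.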

These predictions give different timings for the “end of the world”. This type of inconsistency is often used to show that there is something wrong with the Doomsday argument, as it depends on the choice of the reference class.

However, if we ignore the idea that there should be just one “end of the world” in the Bang style, these all start to look plausible: a few hundred million years from now is a reasonable prediction of how long complex life on Earth will exist, because of the Sun’s increasing luminosity. A millennium is a typical lifetime of a nation. And a couple of decades is also a plausible lifetime for an organization.

However, there is a caveat. Assumption (2) does not work for mammals: most previous mammals were not able to make predictions about the end of the world. Assumption (2) is obviously true only for lesswrongers. In that case, we come to the idea of some kind of “reference class of qualified observers”, which consists of the minds who do think about anthropics or at least can do it. This is bad news, as such a class is new and will likely end soon if we apply DA-logic to it, and the most plausible way it could end is a bang.

One way to save the reference class is to say that I am not a mammal in any sense relevant to my cognition, but that I just observe mammals at a random moment of their existence. However, assumption (1) is also violated here, as being a mammal and being a lesswronger are not independent variables. Lesswrongers likely exist at the end of the existence of mammals, at least according to their own world-model with x-risks.


In that case, we come to the idea of some kind of “reference class of qualified observers”, which consists of the minds who do think about anthropics or at least can do it.

Or it specifically consists of the minds who think about anthropics in the same confused way that we do

If most intelligent species continue for a billion years but their anthropic questions are resolved early using something other than SSA, the conditional probability of using SSA to get an incorrect short doomsday timeline is high, because those species that use SSA at all discard it early in their development.

You can take this as anthropic evidence that using SSA as a model is doomed soon.

Agreed. The doom is the end of the reference class, not a bang. And if SSA-based DA is universally refuted so that no one ever even tries to think in that direction, then it is the end of this type of thinking. I looked at Google Scholar and found that the number of articles about DA peaked around the 2000s and is now declining. This suggests that interest in the problem is declining.

However, if we exist for a very long time, there will be a few observers every millennium who still like SSA, and over billions of years there should be many of them, more than the SSA-believers living now. In that case, I am still more likely to find myself in the remote future, not now, and as I am not there, I am surprised. Thus DA still predicts a bang even if we assume that it will be refuted.

If they keep generating new generations, I should not be in the first generation.

They can generate different dates, but they still use the same mental model which doesn't depend on the date.

It looks like I am in the second generation of anthropic reasoning (I started reading about it in 2006), but the interesting thing is that the second generation is much more numerous than the first one, thanks to the Internet and LW. So it is less surprising to be in the second generation than in the first. But why am I not in the third generation?

Your example of the grid of rooms does not quite work. Unlike height (but somewhat like birthday), which column you are in follows a uniform distribution, with no mode near the middle. You are in fact more likely to find yourself in column {A or Z} than to find yourself in column M, for instance. Same for the rows.

Generally speaking, we expect a priori to be part of the typical set of any given distribution, not near the middle per se. In fact, even for Gaussian distributions, as the dimensionality of the space increases, the typical set actually recedes away from the mode/center and towards a hyperellipsoidal shell around it (https://arxiv.org/abs/1701.02434).
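A quick numerical sketch of this effect, assuming independent standard Gaussian coordinates (the function name, sample count, and seed are arbitrary choices for illustration): the average distance from the mode grows roughly like the square root of the dimension, so typical samples sit in a shell rather than near the center.

```python
import math
import random


def mean_radius(dim, samples=1000, seed=0):
    """Average distance from the origin of standard Gaussian draws in `dim` dimensions."""
    rng = random.Random(seed)   # fixed seed so the sketch is reproducible
    total = 0.0
    for _ in range(samples):
        squared_norm = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        total += math.sqrt(squared_norm)
    return total / samples


# Compare the sampled mean radius with sqrt(dim) for a few dimensions.
for dim in (1, 10, 100):
    print(dim, round(mean_radius(dim), 2), round(math.sqrt(dim), 2))
```

In one dimension most draws land near the mode, but by 100 dimensions the typical draw is about 10 standard deviations from it, even though the density is still highest at the center.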

This is just something to keep in mind. I haven't yet thought through how this caveat may apply to doomsday timelines, but it's probably important.

I probably should have clarified that in the case of rows, I count as "middle" everything which is not on the border, that is, not the first or last row.

This caveat actually comes into play in the anthropic fine-tuning of the universe across many parameters. The set of not-perfectly-fine-tuned universes is much larger than the set of fine-tuned ones. This means a lower concentration of civilizations in space compared with an "optimal universe". This seems to be a solution to the Fermi paradox.

Thanks for the link.

Each reference class has its own end.

I initially read this as "you infer or choose reference classes based on what you want to predict", taking "end" to mean the purpose of the reference class. But you're taking "end" more literally, for reference classes that have a duration. I think that's a little more suspect.

The reference class problem seems to apply equally to SIA and SSA, and in fact to non-anthropic probability as well.  Categories are simply not natural things, and there is no "correct" reference class.

I actually meant "you infer or choose reference classes based on what you want to predict", but my point of interest was a specific application of the problem, that is, the Doomsday argument.

If I am randomly put into a 2d grid of rooms, assuming that "random" means that I have an equal probability of ending up in any room, then shouldn't I be equally likely to end up in the border rooms as in the middle rooms?

I mean by "middle" a large region of rooms which are not on the borders, like between 20 and 80, not exactly room 50. I should clarify this in the post.

More likely actually*. The trick is that people are using "middle" to mean** 'middle rooms, or close to middle rooms', and are trying to riff off the central limit theorem or something.

*The actual probability depends on whether the grid has an even or odd number of columns/rows (and if it's big enough that the middle and the border are different).

1 2 3

4 5 6

7 8 9

Odd both ways, probability of having a border room is actually a lot larger.

1234

5678

9012

3456

The 'middle' is larger for an even number of columns and rows. But the border is still bigger than the center. The bigger it gets, the more this is the case.


**Above this is clarified to something very different: middle as any 'non-border room'.
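The two senses of "middle" can be compared by direct counting. This is a minimal sketch with hypothetical helper names: it counts border rooms, all non-border rooms, and the exactly-central rooms of a rectangular grid.

```python
def border_and_interior(rows, cols):
    """Return (border rooms, non-border rooms) for a rows x cols grid."""
    total = rows * cols
    interior = max(rows - 2, 0) * max(cols - 2, 0)
    return total - interior, interior


def central_cells(rows, cols):
    """Number of exactly-central rooms: 1 along an odd dimension, 2 along an even one."""
    return (1 if rows % 2 else 2) * (1 if cols % 2 else 2)


for rows, cols in ((3, 3), (4, 4), (26, 100)):
    border, interior = border_and_interior(rows, cols)
    print(rows, cols, "border:", border, "interior:", interior,
          "central:", central_cells(rows, cols))
```

For the 3 × 3 and 4 × 4 grids the border (8 and 12 rooms) outnumbers both the interior and the exact center. For the 26 × 100 grid the border has 248 rooms against 2352 non-border rooms, so the border beats the exact center but loses to "middle" in the broad, non-border sense.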