I've seen the Doomsday Argument discussed here, and I wanted to address some aspects that I have difficulty accepting.
Overview of Doomsday Argument
Nick Bostrom introduces the Doomsday Argument (DA) by asking you to imagine a universe of 100 numbered cubicles. You are told that a coin was flipped and one of the following happened:
A) 100 people were created and placed individually in cubicles 1-100 (if the coin came up heads)
B) 10 people were created and placed individually in cubicles 1-10 (if the coin came up tails)
Now suppose you see that your own cubicle number (n) is 7. You can then deduce the relative likelihood of the two scenarios above by Bayesian reasoning:
Suppose there were 200 such universes, each starting with the coin flip, and that in the ideal case 100 come up heads and 100 come up tails:
p(n<=10 | heads) = 10% (or 10 out of 100 such trials)
p(n<=10 | tails) = 100% (or 100 out of 100 such trials)
The posterior probability that tails came up, given that your cubicle number is no greater than 10, is therefore 100/(10+100) ≈ 0.91, or 91%, even though the prior probability was equal for each outcome.
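The same calculation can be sketched in a few lines of Python. This is only an illustration of the arithmetic above; the function name posterior_tails and its parameter names are assumptions made for this example, not anything taken from Bostrom.

```python
def posterior_tails(prior_tails=0.5, p_obs_given_heads=0.10, p_obs_given_tails=1.0):
    """Posterior probability of tails after observing a cubicle number <= 10.

    Defaults follow the cubicle example: a fair coin, a 10% chance of the
    observation under heads, and a 100% chance under tails.
    """
    prior_heads = 1.0 - prior_tails
    joint_tails = prior_tails * p_obs_given_tails
    joint_heads = prior_heads * p_obs_given_heads
    # Bayes' rule: normalize the tails term by the total probability of the observation.
    return joint_tails / (joint_tails + joint_heads)


print(round(posterior_tails(), 2))  # 0.91
```

Conditioning on the exact number 7 instead of "n <= 10" gives the same answer, because the likelihood ratio is still 10 to 1 (1/10 under tails versus 1/100 under heads).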
The DA says that based on reasoning similar to the above, it is possible to assign upper and lower bounds to the total number of humans that will ever be born, using birth order in place of the cubicle number.
For example (using my own math now), if human birth order is assigned randomly, then you have 95% confidence that your birth falls within the middle 95% of all humans who will ever be born (i.e. between the 2.5% and 97.5% marks). If you assume that you are the 100 billionth human born (this is close to the current consensus on how many humans have ever been born), you can then calculate, with 95% confidence, the upper and lower bounds on the number of humans that will ever be born:
upper limit: 100 billion / 0.025 = 4 trillion
lower limit: 100 billion / 0.975 ≈ 102.6 billion
therefore p(102.6 billion < total humans ever to be born < 4 trillion) = 95%
You can choose your desired confidence level and derive the upper and lower bounds accordingly.
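To make the dependence on the confidence level concrete, here is a minimal Python sketch of the two-sided calculation. The name total_bounds and its arguments are hypothetical, chosen only for this example, and the sketch assumes (as the argument does) that your birth rank is equally likely to fall anywhere in the full sequence.

```python
def total_bounds(birth_rank, confidence=0.95):
    """Two-sided bounds on the total number of births, given your birth rank.

    Assumes your rank is uniformly distributed over the whole sequence, so
    that rank / total lies between tail and 1 - tail with the stated confidence.
    """
    tail = (1.0 - confidence) / 2.0
    lower = birth_rank / (1.0 - tail)  # you are near the end of the sequence
    upper = birth_rank / tail          # you are near the start of the sequence
    return lower, upper


lower, upper = total_bounds(100e9)
print(f"{lower / 1e9:.1f} billion to {upper / 1e12:.1f} trillion")
```

Raising the confidence level shrinks the tail and pushes the upper bound out quickly: at 99% confidence the tail is 0.5%, so the upper bound becomes 100 billion / 0.005 = 20 trillion.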
Alternatively, you can also choose to determine only the upper limit by saying that you have 95% confidence that you are not among the first 5% of humans born:
upper limit: 100 billion / 0.05 = 2 trillion
therefore p(total humans born < 2 trillion) = 95%
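The 95% figure rests entirely on the assumption that your birth rank is equally likely to be any position in the sequence. Under that assumption it can be checked exhaustively for a toy population. The sketch below uses a hypothetical total of 1,000 births and an illustrative function name, upper_bound_coverage; it counts how many of the possible ranks would produce an upper bound that really does exceed the true total.

```python
from fractions import Fraction


def upper_bound_coverage(total, tail=Fraction(1, 20)):
    """Fraction of birth ranks 1..total whose bound rank / tail exceeds total.

    Exact rational arithmetic is used so the boundary rank is not misjudged
    by floating-point rounding.
    """
    hits = sum(1 for rank in range(1, total + 1) if rank / tail > total)
    return Fraction(hits, total)


print(upper_bound_coverage(1000))  # 19/20
```

Only the first 5% of ranks (1 through 50 here) yield a bound that is too low, which is exactly the 5% of cases the one-sided argument accepts getting wrong.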
Discussion
The cubicle part of the argument is well-defined because there is a logical and discrete reference class of objects: a sequentially numbered group of cubicles in which each cubicle is equally likely (by definition) to be linked to the observer. In applying the DA to human births, you have to choose a reference class that is sequential and distributed such that each position in the sequence is equally likely to contain the observer. In his response to Korb and Oliver, Bostrom admits that:
"In my opinion, the problem of the reference class is still unsolved, and it is a serious one for the doomsayer."
If you make the statement "100 billion humans have been born thus far" and then base your DA on it, I think you raise some important questions.
At what point in our species' history is it appropriate to designate the starting point of "human birth #1"? Evolution works gradually, after all, and the concept of species is blurry. Was each of us equally likely to have been born as H. sapiens, H. neanderthalensis, a Denisovan, H. heidelbergensis, H. erectus, or some earlier form? If not, then why not? If yes, then the total number of births counted would increase dramatically and we still would lack a logical and discrete boundary.
Possible responses to the above that I have seen discussed:
"My chosen reference class contains only those individuals who would have been capable of understanding the DA in the first place."
Meaning what? That if the DA had been patiently explained to them in their native language at some point in their lives, they would have understood it? You might have been able to explain it to H. erectus if you spent enough time, or maybe not. Maybe H. neanderthalensis is a better candidate. How can you know for sure who would and wouldn't understand? How much education time is allowed? Thirty minutes, 1 day, several years of intensive study? Even then, you still don't have a logical and discrete cutoff to your reference class.
"My chosen reference class contains only those individuals who have actually read and understood the DA."
This seems circular: if the DA turns out not to remain a popular idea, the only doom it is predicting is its own. Even if it becomes so popular that almost everyone reads and understands it two centuries from now, then it's still only predicting its own memetic fitness (which may not correlate with the prosperity of humanity as we know it). Is there a useful reason for picking this as a reference class instead of "People who are Mormons" or "People who skateboard"?
Summary or tl;dr
The choice of reference class is a big part of the DA, perhaps the most important part, and it's been hand-waved away or completely ignored in the discussions I have seen. It's all neat and well-behaved when you're talking about sequential cubicles or numbered balls in urns, but without a good way to assign a reference class, I think the argument is weak at best. I would be interested to hear creative ideas for useful and well-bounded reference classes if you have them.
The reference class determines for whom the bell tolls. If you're 100 billionth in H. sapiens, that means that, given absolutely no other information, you should guess that H. sapiens will last about another 100 billion individuals. Although it is possible to choose fuzzy or ill-defined reference classes and get fuzzy or ill-defined answers, I wouldn't call that a "serious problem."
More serious is the fact that we have tons of extra information. For example, did you know that H. sapiens is a self-replicating life-form? The fact that we have trouble setting aside this knowledge (and more!) is probably where the conflict between our intuition and the doomsday argument comes from.
I think I see what you're saying about fuzzy classes yielding fuzzy results, and that doesn't mean that the results are invalid.
In your opinion, how would the extra information (that we're self-replicating, and whatever else) affect the argument?