If you rule out indexicals completely, how can you even begin to reason about the probability of a statement ("I am in a blue room") that uses an indexical?
We shouldn't rule out indexicals in your scenario, but we should understand their meaning in a non-indexical way.
In your scenario, where everyone in the pool of people exists, we can just suppose that each person has a unique identifier, such as a unique proper name. Then, for each proper name N, the person named "N" can reason according to the rule "Upon learning that N is in a room, guess that N is in a blue room." This allows them to achieve the 0.99 success rate that indexical reasoning allows.
[ETA: Note that this means that each person N is employing a different rule. This is reasonable because N will have learned that information regarding N is especially reliable. We can imagine minds that could go through this reasoning process without ever thinking to themselves "Hey, wait a minute — I myself am N."]
In real life, people share proper names. But we can still suppose that each person can be picked out uniquely with some set of non-indexical properties.
For example, there might be more than one person who is named "Bob". There might be more than one person who is named "Bob" and was born on January 8th, 1982. There might even be more than one person who is named "Bob", was born on January 8th, 1982, and has red hair. But, if we keep adding predicates, we can eventually produce a proper definite description that is satisfied by exactly one person in the pool.
This is what justifies the kind of indexical reasoning that works so well in your scenario.
What makes the scenario in the OP different is this: Some of the possible people in the "pool" are distinguished from the others only by whether they exist. The problem here is that existence is not a predicate (according to most analytic philosophers). Thus, "exists" is not among the properties that we can use to pick out a unique individual with a proper definite description. That's what makes it problematic to carry over indexical reasoning to the scenario in the OP.
Interesting. Thanks for clarifying that.
Regardless of whether "I" is a valid index in this case, though, "person P used the word 'I' and concluded 'I am in a blue room'" is certainly a valid predicate, even if person P's use of "I" was gibberish.
We can then say that 99% of people, if they concluded that gibberish, would have gone on to conclude the gibberish, "I was, in fact, right to conclude that I was in a blue room."
The self-indication assumption (SIA) states that
Given the fact that you exist, you should (other things equal) favor hypotheses according to which many observers exist over hypotheses on which few observers exist.
The reason this is a bad assumption might not be obvious at first. In fact, I think it's very easy to miss.
Argument for SIA posted on Less Wrong
First, let's take a look at an argument for SIA that appeared at Less Wrong (link). Two situations are considered.
1. We imagine that there are 99 people in rooms that have a blue door on the outside (1 person per room). One person is in a room with a red door on the outside. It was argued that you are in a blue door room with probability 0.99.
2. Same situation as above, but first a coin is flipped. If heads, the red door person is never created. If tails, the blue door people are never created. You wake up in a room and know these facts. It was argued that you are in a blue door room with probability 0.99.
So why is 1. correct and 2. incorrect? The first thing we have to be careful about is not treating yourself as special. The fact that you woke up just tells you that at least one conscious observer exists.
In scenario 1 we basically just need to know what proportion of conscious observers are in a blue door room. The answer is 0.99.
In scenario 2 you never would have woken up in a room if you hadn't been created. Thus, the fact that you exist is something we have to take into account. We don't want to estimate P(randomly selected person, whether or not they exist, is in a blue door room). That would be ignoring the fact that you exist. Instead, the fact that you exist tells us that at least one conscious observer exists. Again, we want to know what proportion of conscious observers are in blue door rooms. Well, there is a 50% chance (if the coin landed heads) that all conscious observers are in blue door rooms, and a 50% chance (if it landed tails) that all conscious observers are in red door rooms. Thus, the marginal probability of a conscious observer being in a blue door room is 0.5.
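The arithmetic behind the two scenarios can be sketched as follows. This is only an illustration of the quantity used in this post (the proportion of existing observers in blue door rooms, averaged over the coin flip in scenario 2). The names `blue_fraction`, `scenario_1`, `scenario_2` and `worlds` are made up for the sketch and do not come from the Less Wrong post.

```python
from fractions import Fraction

N_BLUE, N_RED = 99, 1  # room counts taken from the two scenarios


def blue_fraction(observers):
    """Fraction of the existing observers whose room has a blue door."""
    blue = sum(1 for colour in observers if colour == "blue")
    return Fraction(blue, len(observers))


# Scenario 1: all 100 people are created.
scenario_1 = blue_fraction(["blue"] * N_BLUE + ["red"] * N_RED)

# Scenario 2: a fair coin decides which group is created.
worlds = {
    "heads": ["blue"] * N_BLUE,  # the red door person is never created
    "tails": ["red"] * N_RED,    # the blue door people are never created
}
# Average the per-world proportion over the two equally likely worlds.
scenario_2 = sum(Fraction(1, 2) * blue_fraction(obs) for obs in worlds.values())

print(scenario_1)  # 99/100
print(scenario_2)  # 1/2
```

Note that the sketch averages over worlds, not over observers pooled across worlds. That choice is exactly the point under dispute, so the code shows what the argument here computes and does not settle the disagreement.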
The flaw in the more detailed Less Wrong proof (see the post) is in the move from step C to step D. The *you* being referred to in step A might not exist to be asked the question in step D. You have to take that into account.
General argument for SIA and why it's wrong
Let's consider the assumption more formally.
Assume that the number of people to be created, N, is a random draw from a discrete uniform distribution [1] on {1, 2, ..., Nmax}. Thus, P(N = k) = 1/Nmax, for k = 1, ..., Nmax. Assume Nmax is large enough so that we can effectively ignore finite sample issues (this is just for simplicity).
Assume M = Nmax*(Nmax+1)/2 possible people exist, and we arbitrarily label them 1, ..., M. After the size of the world, say N = n, is determined, we randomly draw n people from the M possible people.
After the data are collected we find out that person x exists.
We can apply Bayes' theorem to get the posterior probability:
P(N=k|x exists)=k/M, for k=1,...,Nmax.
The prior probability was uniform, but the posterior favors larger worlds. QED.
Well, not really.
The flaw here is that we conditioned on person x existing, but person x only became of interest after we saw that they existed (peeked at the data).
What we really know is that at least one conscious observer exists -- there is nothing special about person x.
So, the correct conditional probability is:
P(N=k|someone exists)=1/Nmax, for k=1,...,Nmax.
Thus, prior=posterior and SIA is wrong.
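Both posteriors can be checked with exact arithmetic. The following is a minimal sketch, assuming the n people are drawn without replacement (so that P(x exists | N = k) = k/M) and using a small `N_MAX` purely for illustration. The helper names `posterior`, `x_exists` and `someone_exists` are hypothetical.

```python
from fractions import Fraction
from math import comb

N_MAX = 6                      # small illustrative value; the text assumes Nmax is large
M = N_MAX * (N_MAX + 1) // 2   # number of possible people

# Uniform prior on the world size N.
prior = {k: Fraction(1, N_MAX) for k in range(1, N_MAX + 1)}


def posterior(likelihood):
    """Bayes' theorem: normalise prior(k) * likelihood(k) over k."""
    joint = {k: prior[k] * likelihood(k) for k in prior}
    total = sum(joint.values())
    return {k: joint[k] / total for k in joint}


def x_exists(k):
    # Assumption: k people drawn without replacement from M.
    # P(a fixed person x is drawn) = C(M-1, k-1) / C(M, k) = k / M.
    return Fraction(comb(M - 1, k - 1), comb(M, k))


def someone_exists(k):
    # Every world has k >= 1 people, so "someone exists" is certain.
    return Fraction(1)


post_x = posterior(x_exists)          # conditioning on a pre-labelled person
post_any = posterior(someone_exists)  # conditioning on "at least one observer"

for k in prior:
    print(k, post_x[k], post_any[k])
```

With these assumptions `post_x[k]` equals k/M and `post_any[k]` equals 1/Nmax for every k, which matches the two formulas above. Which likelihood is the right one to condition on is the substantive question. The code only confirms the algebra.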
Egotism
The flaw with SIA that I highlighted here is that it treats you as special, as if you were labeled ahead of time. But the reality is, no matter who was selected, they would think they are the special person. "But I exist, I'm not just some arbitrary person. That couldn't happen in a small world. It's too unlikely." In reality, the fact that I exist just means someone exists. I only became special after I already existed (peeked at the data and used it to construct the conditional probability).
Here's another way to look at it. Imagine that a random number between 1 and 1 trillion was drawn. Suppose 34,441 was selected. If someone then asked what the probability of selecting that number was, the correct answer is 1 in 1 trillion. They could then argue, "That's too unlikely an event. It couldn't have happened by chance." However, because they didn't identify the number(s) of interest ahead of time, all we can really conclude is that a number was drawn, and drawing a number was a probability 1 event.
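The difference between naming the number beforehand and naming it after the draw can be made concrete by enumerating every possible draw. This sketch scales the range down to 1,000 so the enumeration is cheap. `TARGET` and the two helper functions are hypothetical names introduced only for this illustration.

```python
from fractions import Fraction

N = 1000       # scaled-down stand-in for 1 trillion
TARGET = 441   # hypothetical number of interest, fixed BEFORE the draw


def chosen_in_advance(drawn):
    """The number of interest does not depend on what was drawn."""
    return TARGET


def chosen_after_peeking(drawn):
    """The number of interest is whatever happened to be drawn."""
    return drawn


def hit_probability(pick_number_of_interest):
    # Enumerate all equally likely draws and count the ones that
    # match the number of interest under the given rule.
    hits = sum(1 for drawn in range(1, N + 1)
               if drawn == pick_number_of_interest(drawn))
    return Fraction(hits, N)


p_advance = hit_probability(chosen_in_advance)
p_peeked = hit_probability(chosen_after_peeking)

print(p_advance)  # 1/1000
print(p_peeked)   # 1
```

A match is surprising only under the first rule. Under the second rule a match is guaranteed, which is the sense in which "a number was drawn" is a probability 1 event.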
I give more examples of this here.
I think Nick Bostrom is getting at the same thing in his book (page 125):
..your own existence is not in general a ground for thinking that hypotheses are more likely to be true just by virtue of implying that there is a greater total number of observers. The datum of your existence tends to disconfirm hypotheses on which it would be unlikely that any observers (in your reference class) should exist; but that’s as far as it goes. The reason for this is that the sample at hand—you—should not be thought of as randomly selected from the class of all possible observers but only from a class of observers who will actually have existed. It is, so to speak, not a coincidence that the sample you are considering is one that actually exists. Rather, that’s a logical consequence of the fact that only actual observers actually view themselves as samples from anything at all
Related arguments are made in this LessWrong post.
[1] For simplicity I'm assuming a uniform prior. The prior isn't the issue here.