Surprisingly, our current theories of anthropics don't seem to cover this.
You have a revolver with six chambers, one of which holds a bullet. You are offered $1 to spin the barrel, point the gun at your head, and pull the trigger. You remember doing this many times and surviving every time. You also remember many other people doing it many times, and dying about 1/6 of the time. Should you play another round?
It seems to me that the answer is no, but existing formal theories disagree. Consider two hypotheses: A says that everyone has a 1/6 chance of dying each round. B says that everyone else has a 1/6 chance of dying, but I survive for sure. Now A has a lot more prior probability, but each round I survived contributes a likelihood ratio of 5:6 against it. So if I have played often enough, I will have updated to mostly believing B. Neither the Self-Indication Assumption (SIA) nor the Self-Sampling Assumption (SSA) updates this any further: SIA doesn't, because there is exactly one of me in both worlds; SSA doesn't, because that one me is also 100% of my reference class. UDT-like approaches reason that in the A world you want to never play, and in the B world you want to always play. Further, if I remember playing enough rounds, almost all of my remaining measure will be in the B world, and so I should play, imitating the simple Bayesian answer.
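To make the arithmetic concrete, here is a minimal sketch of that naive Bayesian update; the prior odds are made up purely for illustration:

```python
# Naive Bayesian update between:
#   A: every round carries a 1/6 chance of death  (P(survive) = 5/6)
#   B: everyone else does, but I survive for sure (P(survive) = 1)
# The prior on A is chosen arbitrarily high for illustration.

prior_A = 0.999999
prior_B = 1 - prior_A

def posterior_B_after_surviving(n_rounds):
    """Posterior P(B | survived n rounds), updating naively on survival."""
    like_A = (5 / 6) ** n_rounds   # probability A assigns to my survival record
    like_B = 1.0                   # probability B assigns to it
    unnorm_A = prior_A * like_A
    unnorm_B = prior_B * like_B
    return unnorm_B / (unnorm_A + unnorm_B)

for n in [0, 50, 100, 150]:
    print(n, round(posterior_B_after_surviving(n), 4))
# Each survived round multiplies the odds A:B by 5:6, so with enough
# remembered rounds the posterior ends up almost entirely on B.
```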
I'm not sure how we got to this point. It seems like most of the initial anthropics problems were about birth-related uncertainty, and that framing stuck pretty well.
Problems for any future solution
Now one obvious way to fix this is to introduce a [death] outcome, which you can predict but which doesn't count towards the normalization factor when updating. Trying to connect this [death] with the rest of your epistemology would require some solution to embedding.
Worse than that, however, this would only stop you from updating on your survival. I think the bigger problem here is that we aren't learning anything (in the long term) from the arbitrarily large control group. After all, even if we don't update on our survival, that only means our odds ratio between A and B stays fixed. It's hardly a solution to the problem if "having the right prior" is doing all the work.
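As a toy illustration of that point (the specific update rule below is just one way I might cash out "doesn't count towards the normalization factor", not something taken from the literature):

```python
# If [death] outcomes are dropped rather than renormalized over, the
# surviving observer evaluates likelihoods conditional on there being an
# observation at all, so "I'm still alive" is certain under both
# hypotheses and carries no evidence.

def update_on_survival_excluding_death(p_A, p_B):
    """One survived round; survival itself is treated as uninformative."""
    like_A = 1.0  # P(I observe being alive | A, an observation is made)
    like_B = 1.0  # same under B
    z = p_A * like_A + p_B * like_B
    return p_A * like_A / z, p_B * like_B / z

p_A, p_B = 0.999999, 1e-6
for _ in range(1000):
    p_A, p_B = update_on_survival_excluding_death(p_A, p_B)
print(p_A / p_B)  # still ~1e6: the odds ratio between A and B never moves,
                  # so the prior is doing all the work
```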
Learning from the control group has its own problems, however. Consider for example the most obvious way of doing so: we observe that most things work out similarly for them as they do for us, and so we generalize this to Russian roulette. But this is not a solution at all, because how can we distinguish the hypothesis "most things generalize well from others to us, including Russian roulette" from "most things generalize well from others to us, but not Russian roulette"? This is more or less the same problem as distinguishing between A and B in the first place. And this generalizes: every way to learn about ourselves from others involves reasoning from something that isn't our frequency of survival to our frequency of survival. We can then imagine a world where that inference fails, and by the same argument we must be unable to update towards being in that world.
Note that the use of other humans here is not essential; a sufficient understanding of physics should, I think, be able to stand in for observing them. And to make things yet more difficult, there doesn't seem to be any metaphysical notion, like "what body your soul is in" or "bridging laws" or such, that a solution could fill in with something more reasonable. There is one particular gun, and whether a bullet will come out of its barrel is already settled by the physical facts.
Is this just the problem of induction repackaged? After all, we are in an environment that isn't fully episodic (given our potential death), so perhaps we just can't expect to figure everything out. That may be related, but I think this is worse. With the problem of induction, you can at least assume the world is regular and be proven wrong. Here, though, you can believe either that you are an exception to natural regularity or that you are not, and either way you will never be proven wrong. Though a revision of Humean possibility could help with both.
You have described some bizarre issues with SSA, and I agree that they are bizarre, but that's what defenders of SSA have to live with. The crucial question is:
The normal updates are factored into the SSA update. A formal reference would be the formula for P(H|E) on p. 173 of Anthropic Bias, which is the crux of the whole book. I won't reproduce it here because it needs a page of terminology and notation; instead I will give an equivalent procedure, which will hopefully be more transparently connected to the usual verbal statement of SSA, such as the one given at https://www.lesswrong.com/tag/self-sampling-assumption:
That link also provides a relatively simple illustration of such an update, which we can use as an example:
In this case, the reference class is not trivial: it includes N + 1 or N + 2 observers (observer-moments, to be more precise; and N = a trillion), of which only 1 or 2 learn that they are in the Sleeping Beauty problem. The effect of learning new information (that you are in the Sleeping Beauty problem or, in our case, that the gun didn't fire for the umpteenth time) is part of the SSA calculation as follows:
In normal situations the above procedure works fine with the trivial reference class, with the following proviso: assume the world is small or, alternatively, restrict the class further by including only observers on our Earth, say, or in our galaxy. In either case, if you ensure that at most one person, namely you, belongs to the class in every possibility i, then the above procedure reproduces the results of applying normal Bayes.
If the world is big and has many copies of you, then you can't use the (regular) trivial reference class with SSA; you will get ridiculous results. A classic example of this is observers (versions of you) measuring the temperature of the cosmic microwave background, with most of them getting correct values but a small, non-zero number getting incorrect values due to random fluctuations. Knowing this, our measurement of, say, 2.7 K wouldn't change our credence in 2.7 K versus some other value if we used SSA with the trivial class of copies of you who measured 2.7 K. That's because even if the true value were, say, 3.1 K, there would still be a non-zero number of you's who measured 2.7 K.
To fix this issue we would need to include in your reference class everyone who has the same background knowledge as you, irrespective of whether they made the same observation E you made. So all the you's who measured 3.1 K would then be in your reference class. The above procedure would then have you severely penalize the possibility i in which the true value is 3.1 K, because Q_i, the fraction of your reference class in possibility i that made your observation, would be tiny (most you's in your reference class would be ones who measured 3.1 K).
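To make the reference-class dependence concrete, here is a minimal sketch of that calculation as I read it: each possibility i is weighted by its prior times Q_i, the fraction of its reference-class observers who share your observation. The numbers are invented for illustration, not taken from Anthropic Bias:

```python
# SSA-style update for the CMB example: two possibilities for the true
# temperature, each containing many copies of you, a few of whom get a
# fluke measurement. Q_i is the fraction of the reference class in
# possibility i that made your actual observation ("I measured 2.7 K").

def ssa_posterior(prior, q):
    """P(possibility i | E) proportional to prior[i] * q[i]."""
    unnorm = {i: prior[i] * q[i] for i in prior}
    z = sum(unnorm.values())
    return {i: unnorm[i] / z for i in unnorm}

prior = {"true 2.7K": 0.5, "true 3.1K": 0.5}

# Trivial reference class: only copies of you who measured 2.7 K.
# In both possibilities everyone in that class measured 2.7 K, so Q_i = 1
# and the observation teaches you nothing.
q_trivial = {"true 2.7K": 1.0, "true 3.1K": 1.0}
print(ssa_posterior(prior, q_trivial))   # stays at 0.5 / 0.5: no update

# Broader reference class: everyone with your background knowledge,
# whatever they measured. Fluke measurements are rare, so Q_i is near 1
# when the true value is 2.7 K and tiny when it is 3.1 K.
q_broad = {"true 2.7K": 0.999, "true 3.1K": 0.001}
print(ssa_posterior(prior, q_broad))     # almost all mass on "true 2.7K"
```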
But again, I don't want to defend SSA; I think it's quite a mess. Bostrom does an amazing job defending it, but ultimately it's really hard to make it look respectable given all the bizarre implications, imo.