"Thread of subjective experience" was an aside (just one of the mechanisms that explains why we "find ourselves" in a world that behaves according to the Born rule), don't focus too much on it.
The core question is which physical mechanism (everything should be physical, right?) ensures that you will almost never see a string of a billion tails after a billion quantum coin flips, while the universe contains a quantum branch with you looking in astonishment at a string of a billion tails. Why should you expect that it will almost certainly not happen, when there's always a physical instance of you that will see it happen?
You'll have 2^1000000000 branches with exactly the same amplitude. You'll experience every one of them. Which physical mechanism will make it more likely for you to experience strings with roughly the same number of heads and tails?
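To put numbers on "roughly the same number of heads and tails": since all branches here have equal amplitude, counting strings is enough. Below is a minimal sketch (function names are mine, purely illustrative) that counts exactly for small n and uses the standard Hoeffding bound for a billion flips. It only quantifies how lopsided the count is; it doesn't say why the count should translate into expectations, which is the actual question.

```python
import math

def exact_fraction_outside(n: int, eps: float) -> float:
    """Fraction of all 2**n equally weighted strings whose share of heads
    differs from 1/2 by more than eps (exact count, feasible for small n)."""
    outside = sum(math.comb(n, k) for k in range(n + 1)
                  if abs(k / n - 0.5) > eps)
    return outside / 2 ** n

def log10_hoeffding_bound(n: int, eps: float) -> float:
    """log10 of the Hoeffding upper bound 2*exp(-2*n*eps**2) on that fraction."""
    return (math.log(2) - 2 * n * eps ** 2) / math.log(10)

def log10_single_string_fraction(n: int) -> float:
    """log10 of the fraction taken up by any one preselected string."""
    return -n * math.log10(2)

if __name__ == "__main__":
    print(exact_fraction_outside(1000, 0.05))       # exact, small n
    print(log10_hoeffding_bound(10 ** 9, 0.001))    # bound, a billion flips
    print(log10_single_string_fraction(10 ** 9))    # e.g. the all-tails string
```

For a billion flips, the strings whose share of heads is off by more than 0.1% make up less than 10^-800 of all strings, and yet every one of them is a branch with a fully conscious observer in it.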
In the Copenhagen interpretation it's trivial: when the quantum coin flipper writes down the result of a flip, the universe somehow samples from a probability distribution, and the rest is plain old probability theory. You don't expect to observe a string of a billion tails (or any other preselected string), because the you who observes this string almost never exists.
What happens in MWI?
I haven't fully understood your stance towards the many minds interpretation. Do you find it unnecessary?
I don’t think either of these Harrys is “preferred”.
And simultaneously you think that the existence of future Harries who observe events with probabilities approaching zero is not a problem, because current Harry will almost never find himself to be those future Harries. I don't understand what exactly that means.
Harries who observe those rare events exist and they wonder how they found themselves in those unlikely situations. Harries who hadn't found anything unusual exist too. Current Harry became all of those future Harries.
So, we have a quantum state of the universe that factorizes into states with different Harries. OK. What property distinguishes a universe where "Harry found himself in a tails branch" from a universe where "Harry found himself in a heads branch"?
You have already answered it: "I don’t think either of these Harrys is “preferred”." That is, there's no property of the universe that distinguishes those outcomes.
Let's get back to the initial question: what does it mean that "Harry will almost never find himself to be those future Harries"? To answer that, we need to jump from a single physical universe (containing a multitude of Harries who found themselves in branches of every possible probability) to a single one (or maybe a set) of those Harries and proclaim that, indeed, that Harry (or those Harries) found himself in a usual branch of the universe, and that all other Harries don't matter for some reason (their amplitudes are too low to matter despite them being fully conscious? That's the point that I don't understand).
The many minds interpretation solves this by proposing metaphysical threads of consciousness, thus adding a property that distinguishes outcomes where Harry observes different things. So we can say that indeed the vast majority of Harries' threads of consciousness ended up in probable branches.
I don't like this interpretation. Why don't we use a single thread of consciousness that adheres to the Born rule? Or why don't we get rid of threads of consciousness altogether and just use the Copenhagen interpretation?
So, my question is: how do you tackle this problem? I hope I've made it sufficiently coherent.
My own resolution is that either collapse is objective, or, due to imperfect decoherence, the vast majority of branches (which also have relatively low amplitude) interfere with each other, making it impossible for conscious beings to exist in them and, consequently, to observe them (this doesn't explain the billion quantum coin flips scenario in my comment below).
For example: “as quantum amplitude of a piece of the wavefunction goes to zero, the probability that I will ‘find myself’ in that piece also goes to zero”
What I really don't like about this formulation is the extreme vagueness of "I will find myself", which implies that there's some preferred future "I" out of many, who is defined not only by the observations he receives, but also by being a preferred continuation of subjective experience, defined by an unknown mechanism.
It can be formalized as the many minds interpretation, incurring an additional complexity penalty and undermining the surface simplicity of the assumption. The coexistence of infinitely many (measurement operators can produce continuous probability distributions) threads of subjective experience in a single physical system also doesn't strike me as "feeling more natural".
First, a factual statement that is true to the best of my knowledge: the LLM state that is used to produce the probability distribution for the next token is completely determined by the state of its input buffer (plus a bit of indeterminism due to parallel processing and the non-associativity of floating point arithmetic).
That is, an LLM can pass only a single token (around 2 bytes) to its future self. That follows from the above.
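The "around 2 bytes" figure is just the logarithm of the vocabulary size. A quick check, assuming a vocabulary of 50,257 tokens (the size of GPT-2's vocabulary, used here only as an illustrative number; other models differ):

```python
import math

# Assumption: 50,257 tokens, the size of GPT-2's vocabulary.
# It is used purely as an illustrative figure.
ASSUMED_VOCAB_SIZE = 50_257

def max_bits_per_token(vocab_size: int) -> float:
    """Upper bound on the information one emitted token can carry, in bits."""
    return math.log2(vocab_size)

if __name__ == "__main__":
    bits = max_bits_per_token(ASSUMED_VOCAB_SIZE)
    print(f"{bits:.2f} bits, about {bits / 8:.2f} bytes per token")
```

That comes out a little under 16 bits, and it's an upper bound: the token also has to look like plausible text, so the capacity actually usable for covert state is much lower.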
What comes next is a plausible (to me) speculation.
For humans, what's passed to our future self is most likely much more than a single token. That is, the state of the human brain that leads to writing (or uttering) the next word most likely cannot be derived from a small subset of the previous state plus the last written word (the state of the brain changes not only because we have written or said a word, but by other means too).
This difference can lead to the LLM using completely different processes to mimic human output, that is, to potential shoggothification. But to be a real shoggoth, the LLM also needs a way to covertly update its shoggoth state, that is, the part of its state that can lead to inhuman behavior. The output buffer is the only thing it has to maintain state, so the shoggoth state would have to be steganographically encoded in it, which severely limits its information density and update rate.
I wonder how a shoggoth state may arise at all, but it might be my lack of imagination.
Expanding a bit on the topic.
Exhibit A: flip a fair coin and move a suspended robot into a green or red room using a second coin with probabilities (99%, 1%) for heads, and (1%, 99%) for tails.
Exhibit B: flip a fair coin and create 99 copies of the robot in green rooms and 1 copy in a red room for heads, and reverse colors otherwise.
What causes the robot to see red instead of green in exhibit A? Physical processes that brought about a world where the robot sees red.
What causes a robot to see red instead of green in exhibit B? The fact that it sees red, nothing more. The physical instance of the robot that sees red in one possible world could, of course, be the instance that sees green in another possible world (physical causality surely is intact). But a robot-who-sees-red (that is, one of the instances who see red) cannot be made into a robot-who-sees-green by physical manipulations. That is, the subjective causality of seeing red is cut off from physical causes (in the case of multiple copies of an observer), and as such it cannot be used as a basis for probabilistic judgements.
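To make the difference concrete, here is a small sketch of the two exhibits (function names are mine, purely illustrative). In exhibit A, ordinary conditioning on a physical event works. In exhibit B there are two candidate calculations, and nothing physical picks between them: "some robot sees red" is true in both worlds, while weighting every copy equally is an extra assumption, not something read off the physics.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # the fair first coin

def exhibit_a_p_heads_given_red() -> Fraction:
    """Exhibit A: one robot; ordinary conditioning on 'the robot is in the red room'."""
    p_red_and_heads = HALF * Fraction(1, 100)
    p_red_and_tails = HALF * Fraction(99, 100)
    return p_red_and_heads / (p_red_and_heads + p_red_and_tails)

def exhibit_b_p_heads_given_some_copy_sees_red() -> Fraction:
    """Exhibit B: condition on the physical event 'at least one copy sees red',
    which happens with certainty in both worlds."""
    p_some_red_and_heads = HALF * 1
    p_some_red_and_tails = HALF * 1
    return p_some_red_and_heads / (p_some_red_and_heads + p_some_red_and_tails)

def exhibit_b_p_heads_copy_counting() -> Fraction:
    """Exhibit B with the ASSUMPTION that every red-seeing copy counts equally."""
    red_copies_heads = HALF * 1    # expected red-seeing copies via heads
    red_copies_tails = HALF * 99   # expected red-seeing copies via tails
    return red_copies_heads / (red_copies_heads + red_copies_tails)

if __name__ == "__main__":
    print(exhibit_a_p_heads_given_red())
    print(exhibit_b_p_heads_given_some_copy_sees_red())
    print(exhibit_b_p_heads_copy_counting())
```

Copy counting reproduces the 1/100 of exhibit A, but only because the weighting over copies was put in by hand.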
I guess that if I don't see a resolution of the Anthropic Trilemma in the framework of MWI in about 10 years, I'll be almost sure that MWI is wrong.
I have a solution that is completely underwhelming, but I can see no flaws in it, besides the complete lack of a definition of which part of the mental state should be preserved to still count as you, and the rejection of MWI (I also cannot see in it any useful insights into why we have what looks like continuous subjective experience).
(1) I use "you" in scare quotes to designate something beyond all the information available in the mental state, something that presumably is unique and moves continuously (or jumps) through time.
Let's iterate through questions of The Anthropic Trilemma.
Probabilistic calculations of your future experiences work as expected, if you add "conditional on me experiencing staying here and now".
It's not unlike the operator "do(X=x)" in graphical models, which cuts off all other causal influences on X.
Do you think the exploited flaw is universal or, at least, common?
Excellent story. But what about the "pull the plug" option? Did ALICE find a way to run itself efficiently on traditional datacenters that aren't packed with backprop and inference accelerators? And would shutting them down have required more political will than the government could muster at the time?
Citing https://arxiv.org/abs/cond-mat/9403051: "Furthermore if a quantum system does possess this property (whatever it may be), then we might hope that the inherent uncertainties in quantum mechanics lead to a thermal distribution for the momentum of a single atom, even if we always start with exactly the same initial state, and make the measurement at exactly the same time."
Then the author proceeds to demonstrate that this is indeed the case. I guess it partially answers the question: the quantum state thermalises, and you'll get a classical thermal distribution of results for at least some measurements, even when measuring the system in the same quantum state each time.
The smaller the initial uncertainty in energy, the faster the system thermalises. That is, to slow quantum thermalisation down you need to initialize the system with atoms in highly localized positions, but then you can't know their exact velocities and can't predict the classical evolution.
Let's flip a very unfair quantum coin with 1:2^1000000 heads-to-tails odds (preparing such a quantum state would require quite an engineering feat, but it's theoretically possible). You shouldn't expect to see heads if the quantum state is prepared correctly, but the post-flip universe (in MWI) contains a branch where you see heads. So, by your logic, you should expect to see both heads and tails even if the state is prepared correctly.
What I do not know is how it all ties together. Is MWI wrong? Is copying not equivalent to MWI branching (thanks to the no-cloning theorem, for example)? And so on.
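For scale, here is a quick calculation of how small the Born weight of the heads branch is, reading 1:2^1000000 as odds, so that P(heads) = 1/(1 + 2^1000000) (that reading and the function names are my assumptions). The numbers underflow ordinary floats, so the sketch works with base-10 logarithms.

```python
import math

def log10_p_heads(n: int) -> float:
    """log10 of P(heads) = 1 / (1 + 2**n) for heads-to-tails odds of 1 : 2**n."""
    # math.log10 accepts arbitrarily large Python integers.
    return -math.log10(1 + 2 ** n)

def log10_heads_amplitude(n: int) -> float:
    """log10 of the modulus of the heads amplitude, i.e. sqrt(P(heads))."""
    return log10_p_heads(n) / 2

if __name__ == "__main__":
    print(log10_p_heads(1_000_000))
    print(log10_heads_amplitude(1_000_000))
```

The weight is around 10^-301030, and the branch with that weight still contains a physical you looking at heads.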