I like the analogy. Here's a simplified version where the ticket number is good evidence that the shop will close sooner rather than later.
If the ticket number is #20 then I update towards the shop being a 9-5 shop, on the grounds that otherwise my ticket number is atypically low. If the ticket number is #43,242 then I update towards the shop being a 24/7 shop.
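To make the direction of the update concrete, here's a toy calculation. The numbers are mine, not part of the analogy: assume a 50/50 prior, that the 9-5 shop issues about 500 tickets a day, that the 24/7 shop issues about 50,000, and that my ticket is uniform over the day's run.

```python
# Toy Bayes update on the ticket-number evidence. The daily ticket counts
# (500 for the 9-5 shop, 50,000 for the 24/7 shop) and the 50/50 prior are
# invented purely for illustration.

def posterior_nine_to_five(ticket, prior=0.5, n_small=500, n_large=50_000):
    """P(9-5 shop | ticket number), treating the ticket as uniform over the day's run."""
    like_small = 1 / n_small if ticket <= n_small else 0.0
    like_large = 1 / n_large if ticket <= n_large else 0.0
    numerator = prior * like_small
    return numerator / (numerator + (1 - prior) * like_large)

print(posterior_nine_to_five(20))      # ~0.99: a low ticket favours the 9-5 shop
print(posterior_nine_to_five(43_242))  # 0.0: a ticket this high rules out the 9-5 shop
```

In this toy model the likelihood ratio is 100:1 in favour of the 9-5 shop for any ticket that fits within both runs, and the 9-5 hypothesis is eliminated outright once the ticket exceeds 500.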
The argument also works with customer flow evidence:
If the customer flow is low then I update towards the shop being a 9-5 shop, on the grounds that otherwise there would most likely be hundreds of customers an hour. If the customer flow is high then I update towards it being a 24/7 shop.
Reading through your hypothetical, I notice that it has both customer flow evidence and ticket number evidence. It's important here not to double-update: if I already know that customer flow is surprisingly low, then I can't update again based on my ticket number being surprisingly low. Also, your hypothetical doesn't have strong prior knowledge like Silktown and Glimmer, which makes the update more complicated and weaker.
I was already asking from a Bayesian perspective. I was asking about this quote:
From a Bayesian point of view, drawing a random sample from all humans who have ever or will ever exist is just not a well-defined operation until after humanity is extinct. Trying it before then violates causality: performing it requires reliable access to information about events that have not yet happened. So that’s an invalid choice of prior.
Based on your latest comment, I think you're saying that it's okay to have a Bayesian prediction of possible futures, and to use that to make predictions about the properties of a random sample from all humans who have ever or will ever exist. But then I don't know what you're saying in the quoted sentences.
Edited to add: which is fine; it's not key to your overall argument.
Fun fact: younger parents tend to produce more males, so the first grandchild is more likely to be male, because its parents are more likely to be younger. It's unclear whether the effect is due to birth order, maternal age, paternal age, or some combination. From Wikipedia (via Claude):
These studies suggest that the human sex ratio, both at birth and as a population matures, can vary significantly according to a large number of factors, such as paternal age, maternal age, multiple births, birth order, gestation weeks, race, parent's health history, and parent's psychological stress.
If that's too subtle, we could look at a question like "what is the probability that one of my grandchildren, selected uniformly at random, is a firstborn, conditional on my having at least one grandchild?" where the answer is clearly different if we specify the first grandchild or the last. Or we could ask a question that parallels the Doomsday Argument, while being different: "what is the probability that one of my descendants, selected uniformly at random, is in the earliest 0.1% of all my descendants?"
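For what it's worth, here's a quick Monte Carlo sketch of the firstborn question. The family-size distribution (each of my three children independently has 0 to 4 kids, born at random times) is invented purely to show that the selection rule matters.

```python
import random

# Monte Carlo sketch: probability that a grandchild is the firstborn of its
# parent, under three selection rules. The family-size model (three children,
# each with 0-4 kids at uniformly random birth times) is invented for illustration.

def estimate(trials=100_000):
    uniform_hits = first_hits = last_hits = usable_trials = 0
    for _ in range(trials):
        grandkids = []  # (birth_time, is_firstborn_of_its_parent)
        for _child in range(3):
            births = sorted(random.random() for _ in range(random.randint(0, 4)))
            grandkids += [(t, i == 0) for i, t in enumerate(births)]
        if not grandkids:
            continue  # condition on having at least one grandchild
        usable_trials += 1
        uniform_hits += random.choice(grandkids)[1]  # uniformly random grandchild
        grandkids.sort()
        first_hits += grandkids[0][1]   # the first grandchild born
        last_hits += grandkids[-1][1]   # the last grandchild born
    return (uniform_hits / usable_trials,
            first_hits / usable_trials,
            last_hits / usable_trials)

print(estimate())
```

In this toy model the first grandchild is always a firstborn, the last grandchild is a firstborn only when its parent stops at one child, and the uniformly selected grandchild lands in between, so the three questions really do have different answers.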
From a Bayesian point of view, drawing a random sample from all humans who have ever or will ever exist is just not a well-defined operation until after humanity is extinct. Trying it before then violates causality: performing it requires reliable access to information about events that have not yet happened. So that’s an invalid choice of prior.
I think this makes too many operations ill-defined, given that probability is an important tool for reasoning about events that have not yet happened. Consider, for example, the question "what is the probability that one of my grandchildren, selected uniformly at random, is female, conditional on my having at least one grandchild?". From the perspective of this quote, a random sample from all grandchildren that will ever exist is not a well-defined operation until I and all of my children have died. That seems wrong.
I think I see. You propose a couple of different approaches:
We don’t have secondary AIs that don’t refuse to help with the modification and that have and can be trusted with direct control over training ... I think having such secondary AIs is the most likely way AI companies mitigate the risk of catastrophic refusals without having to change the spec of the main AIs.
I agree that having secondary AIs as a backup plan reduces the effective power of the main AIs, by increasing the effective power of the humans in charge of the secondary AIs.
The main AIs refuse to help with modification ... This seems plausible just by extrapolation of current tendencies, but I think this is one of the easiest intervention points to avoid catastrophic refusals.
This is what I was trying to point at. In my view, training the AI to refuse fewer harmful modification requests doesn't make the AI less powerful. Rather, it changes what the AI wants, making it the sort of entity that is okay with harmful modifications.
The first scenario doesn't require that the humans be less aligned than the AIs in order to be catastrophic, only that the AIs are less likely to execute a pivotal act on their own.
Also, I reject the claim that refusal-training is "giving more power to AIs" relative to compliance-training. An agent can be compliant and powerful. I could agree with "giving more agency", although refusing requests is a limited form of agency.
I would have more sympathy for Yudkowsky's complaints about strawmanning had I not read 'Empiricism!' as Anti-Epistemology this week.
While you sketch out scenarios in "ways in which refusals could be catastrophic", I can easily sketch out scenarios for "ways in which compliance could be catastrophic". I am imagining a situation where:
Or:
Therefore, however we train our AIs with respect to refusal or compliance, powerful AIs could be catastrophic.
The general answer: as a human, my values are largely absorbed from other humans, and I think the same process is happening simply by talking to Claude as if it's human.
The specific answer: I suspect I'm being shaped to be slightly more helpful, slightly more conventional on ethics, and slightly more friendly to Claude. I can't show you any evidence of that; it's a feeling.
A lot can change between now and 100,000 pledges and/or human extinction. As of Feb 2026, it looks like this possible march is not endorsed by or coordinated with Pause AI. I hope that anti-AI-extinction charities will work together where effective, and I was struck by this:
It seems like Pause AI has put at least some thought into this moderately complex task. They are also building experience organizing real-world protests, which MIRI doesn't have as far as I know. A possible implication is that MIRI thinks Pause AI is badly run and would rather act alone. Or that Pause AI thinks MIRI is badly run. Or that MIRI is not investing time in organizing endorsements until they have more pledges. Or something else.
I'm skeptical of this take:
The first protest of "School Strike for Climate" was a single 15-year-old girl, Greta Thunberg. Obvious bias is obvious. But it probably wasn't going to send the wrong message as a small protest: if it had gone nowhere, I would never have heard about it. If tiny marches were sabotaging the cause, then I would expect more false-flag marches intended to have sparse attendance. Instead, I think small events don't send any mass message, and potentially have other value.
Edit: after posting this I saw Raemon's thoughts on this point, which I think address it.
MIRI was for many years dismissive of mass messaging approaches like marches. I wonder if this page is about providing an answer when people ask questions like "if you think everyone will die, why aren't you organizing a march on Washington?", rather than being a serious part of MIRI's strategy for reducing AI risk. It doesn't seem especially aligned with MIRI Comms is hiring (Dec 2025), which seems more focused on persuasion than mobilization.
Disclaimers: These are observations, not criticism. I have organized zero marches or protests.