It looks like Theorem 1 can be improved slightly, by dropping the "only if" condition. We can then code up something like Kolmogorov complexity by adding a probability transition from every site to our chosen UTM.
If you only want the weaker statement that there is no stationary distribution, it looks like there's a cheaper argument. Write $T$ for the (infinite) transition matrix, $\pi$ for the hypothetical stationary distribution, and $G$ for the symmetry group. Since the chain is aperiodic and irreducible, $\pi$ is unique. The state space is closed under the action of $G$, and (2) implies that for any $g \in G$, the map $x \mapsto gx$ is an automorphism of the Markov chain. Abusing notation, $g$ can then be considered as a permutation matrix with $gT = Tg$. Then $\pi g T = \pi T g = \pi g$, so $\pi g$ is also stationary and $\pi g = \pi$ by uniqueness. So $\pi$ is constant on orbits of $G$, which are all countably infinite. Hence $\pi$ is everywhere $0$, a contradiction.
The above still holds if (2) is restricted to only hold for a group $H$ such that every orbit under $H$ is infinite.
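As a toy sanity check of the orbit-constancy step on a finite chain (the matrices below are invented for illustration; the contradiction itself of course needs the orbits to be infinite, which no finite example can exhibit):

```python
import numpy as np

# Toy check: if a permutation sigma commutes with the transition matrix T,
# the unique stationary distribution is constant on sigma's orbits.

n = 6
sigma = np.roll(np.eye(n), 1, axis=0)       # cyclic-shift permutation: one orbit

# A circulant chain commutes with the cyclic shift.
kernel = np.array([0.5, 0.3, 0.1, 0.05, 0.03, 0.02])
T = np.array([np.roll(kernel, i) for i in range(n)])

assert np.allclose(sigma @ T, T @ sigma)     # the symmetry condition

# Stationary distribution by power iteration from an arbitrary start.
pi = np.zeros(n)
pi[0] = 1.0
for _ in range(10_000):
    pi = pi @ T

assert np.allclose(pi, pi @ sigma)           # pi is invariant under the symmetry
print(pi)                                    # uniform over the six states
```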
I think the above argument shows why (2) is too strong; we shouldn't expect the world to look the same if you pick a "wrong" (ie. complicated) UTM to start off with. Weakening (2) might mean saying something like asserting only stationarity. To do this, we might define the two measures, say $\mu$ and $\pi$, together (ie. finding a fixed point of a map from pairs $(\mu, \pi)$ to pairs $(\mu, \pi)$). In such a model, $\mu$ constrains the transition probabilities and $\pi$ is stationary; it's not clear how one might formalise a derivation of $\pi$ from $\mu$, but it seems plausible that there is a canonical way to do it.
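For what the "fixed point of a map on pairs" shape could look like, here is a very rough sketch; the base chain, the way $\mu$ enters the transitions, and the placeholder step deriving the next $\mu$ from $\pi$ are all invented, the last being exactly the part I don't know how to formalise:

```python
import numpy as np

# Very rough illustration: mu constrains the transition probabilities, pi is
# the stationary distribution of the resulting chain, and we look for a
# self-consistent pair.

rng = np.random.default_rng(0)
n = 6                                        # toy number of sites
B = rng.random((n, n))
B /= B.sum(axis=1, keepdims=True)            # arbitrary base chain

def transition_matrix(mu, eps=0.5):
    """Follow B, but with probability eps jump to a site drawn from mu."""
    return (1 - eps) * B + eps * np.outer(np.ones(n), mu)

def stationary(T, iters=5_000):
    """Stationary distribution by power iteration."""
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):
        pi = pi @ T
    return pi

mu = np.full(n, 1.0 / n)
for _ in range(100):
    pi = stationary(transition_matrix(mu))
    if np.allclose(pi, mu, atol=1e-12):
        break
    mu = pi                                  # placeholder "derivation" of mu from pi

print(mu)                                    # approximately self-consistent pair
```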
That sounds a rather odd argument to make, even at the time. Astronomy from antiquity was founded on accurate observations.
Astronomy and epistemology aren't quite the same. Predicting where Saturn would be on a given date requires accurate observation, and nobody objected to Copernicus as a calculational tool. For example, the Jesuits are teaching Copernicus in China in Chinese about 2 years after he publishes, which implies they translated and shipped it with some alacrity.
The heavens were classically held to be made of different stuff; quintessence (later called aether) was not like regular matter -- this is obvious from the inside, because it maintains perpetual motion where normal matter does not. A lot of optical phenomena (eg. twinkling stars, the surface of the moon) were not seen as properties of the objects in question but properties of regular four-element matter between us and them.
By a modern standard, the physics is weird and disjointed... but that is historically how it was seen.
The precise phrasing is deliberately a little tendentious, but the issue of the epistemological status of the telescope was raised by loads of people at the time. For a modern review with heavy footnotes, see eg Galileo, Courtier: The Practice of Science in the Culture of Absolutism, pp 95-100, (though the whole chapter is good)
For example, the first anti-Galilean tract is by Horky in 1610 and focussed mostly on the lack of reliability of the telescope. For another, Magini's letters (confirmed in Kepler and Galileo) tell of a "star party" in 1610 where Galileo attempted to convince a number of astronomers of the discovery of the Medicean (now Galilean) moons; no one else could see the moons and additionally the telescope produced doubled images of everything more distant than the moon.
There wasn't much dispute about terrestrial applications. Under Aristotle's physics everything above the moon is made of different stuff with different physics anyway, so any amount of accuracy when looking at stuff of the four elements doesn't allow one to induct to accuracy in observations of the heavens.
tl;dr: The side of rationality during Galileo's time would be to recognise one's confusion and accept that the models did not yet cash out in terms of a difference in expected experiences. That situation arguably holds until Newton's Principia; prior to that no one has a working physics for the heavens.
The initial heliocentric models weren't more accurate by virtue of being heliocentric; they were better by virtue of having had their parameters updated with an additional 400 years of observational data over the previous best-fit model (the Alfonsine tables from the 1250s). The geometry was similarly complicated; there was still a strong claim that only circular motions could be maintained indefinitely, and so you have to toss 60 or so circular motions in to get the full solar system on either model.
Basically everyone was already using the newer tables as calculational tools, and it had been known from ancient times that you could fix any point you wanted in an epicyclic model and get the same observational results. The dispute was about which object was in fact fixed. Kepler dates to the same time, and will talk about ellipses (and dozens of other potential curves) in place of circular motion from 1610, but he cannot efficiently predict where a planet will be. He's also not exactly a paragon of rationality; astrology and numerology drive most of his system, and he quite literally ascribes his algebraic slips to God.
A brief but important digression into Aristotle is needed; what he saw as key was that the motion of the planets is unceasing but changing, whereas all terrestrial motions cease eventually. He held that circular motions were the only kind of motion that could be sustained indefinitely, and even then, only by a certain special kind of perfect matter. The physics of this matter fundamentally differed from the physics of normal stuff in Aristotle. Roughly and crudely, if it can change then it has to have some kind of dissipative / frictional physics and so will run down.
Against that backdrop, Galileo's key work wasn't the Dialogue, but the Sidereus Nuncius. There had been two novae observed in the 40 years prior, and this had been awkward because a whole bunch of people (mostly neo-Platonists) were arguing that this showed the heavens changed, which is a problem for Aristotle. Now Galileo shows up and, using a device which distorts his vision, claims to be able to deduce:
From a viewpoint which sees a single unified material physics, these observations kill Aristotelian cosmology. You've got at least three centers of circular-ish motion, which means you can't mount the planets on transparent spheres to actually move them around. You have an indication that the Sun might be rotating, and is certainly dynamic. If you kill Aristotle's cosmology, you have to kill most of his physics, and thus a good chunk of his philosophy. That's a problem, because since Aquinas the Catholic Church had been deriving theology as a natural consequence of Aristotle in order to secure itself against various heresies. And now some engineer with pretensions is turning up, distorting his vision and claiming to upend the cart.
What Galileo does not have is a coherent alternative package of physics and cosmology. He claims to be able to show a form of circular inertia from first principles. He claims that this yields a form of relativity in motion which makes it difficult to discern your true motion without reference to the fixed stars. He claims that physics is kinda-sorta universal, based on his experience with cannon (which Aristotelian physics would dismiss because [using modern terminology] experiments where you apply forces yourself are not reproducible and so cannot yield knowledge). This means his physics has real issues explaining dissipative effects. He doesn't have action at a distance, so he can't explain why the planets do their thing (whereas there are physical models behind the Aristotelian / Ptolemaic ones).
He gets into some pro forma trouble over the book, because he doesn't put a disclaimer on it saying that he'll retract it if it's found to be heretical. Which is silly, and he gets his knuckles rapped over it. The book is "banned", which can mean one of two things, for there are two lists of banned books. One is "burn before reading" and the other is more akin to being in the Restricted Section; Galileo's work is the latter.
Then he's an ass in the Dialogue. Even that would not have been an issue, but at the time he's the court philosopher of the Grand Duke of Tuscany. The Grand Duke is a secular problem for the Pope; he has an army, he's not toeing the line, there's a worry that he'll annex the Papal States. So there's a need to pin his ears back, and Galileo is a sufficiently senior member of the court that the Duke won't ignore his arrest, nor will he go to war over it.
So the Inquisition cooks up a charge for political purposes, has him "tortured" (which is supposed to mean they /show/ him the instruments of torture, but they actually forget to), gets him to recant (in particular gets the Grand Duke to come beg for his release), and releases him to "house arrest" (where he is free to come, go, see whoever, write, etc). The drama is politics, rather than anything epistemological.
As to the disputes you mention, some had been argued through by the ancient Greeks. For example, everyone knew that measurements were imprecise, and so moving the earth merely required that the stars were distant. It was also plain that if you accepted Galileo's observations as being indicative of truth, then Aristotelian gravity was totally dead, because some stuff did not strive to fall (cometary tails were also known to be... problematic).
Now, Riccioli is writing 20 years later, in an environment where heliocentrism has become a definite thing with political and religious connotations, associated with neo-Platonist, anti-Aristotelian, anti-Papal thinking. This is troublesome because it strikes at the foundational philosophy underpinning the Church, and secular rulers in Europe are trying to strategically leverage this. Much like Aquinas, Riccioli has his bottom line /written/ already. He has to mesh this new stack of observational data with something which looks at least somewhat like Aristotle. Descartes is contracted at about the same time to attempt to rederive Catholicism from a new mixed Aristotelian / Platonist basis.
As a corollary, he's being quite careful to list every argument which anyone has made, and every refutation (there's a comparatively short summary here). Most of the arguments presented have counterpoints from the other side, however strained they might seem from a modern view. It's more akin to having 126 phenomena which need to be explained than anything else. They don't touch on the apparently changing nature of the planets (by this point cloud bands on Jupiter could be seen) and restrict themselves mostly to the physics of motion. There's a lot of duplication of the same fundamental point, and it's not a quantitative discussion. There are some "in principle" experiments discussed, but a fair few had been considered by Galileo and calculated to be infeasible (eg. observing 1 inch deflections in cannon shot at 500 yards, when the accuracy is more like a yard).
Obviously Newton basically puts a stop to the whole thing, because (modulo a lack of mechanism) he can give you a calculational tool which spits out Kepler and naturally fixes the center of mass. There are still huge problems; the largest is that even point-like stars appear to have small disks from diffraction, and until you know this you end up thinking every other star has to be larger than the entire solar system. And the apparent madness of a universal law is almost impossible to overstate. It's really ahistorical to think that a very modern notion of parsimony in physics could have been applied to Galileo and his contemporaries.
So, my observation is that without meta-distributions (or A_p), or conditioning on a pile of past information (and thus tracking /more/ than just a probability distribution over current outcomes), you don't have the room in your knowledge to be able to even talk about sensitivity to new information coherently. Once you can talk about a complete state of knowledge, you can begin to talk about the utility of long term strategies.
For example, in your scenario, one would have the same probability of being paid today if 20% of employers actually pay you every day, whilst 80% of employers never pay you. But in such an environment, it would not make sense to work a second day in 80% of cases. The optimal strategy depends on what you know, and to represent that in general requires more than a straight probability.
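A minimal sketch of that asymmetry; World B below is my own hypothetical contrast (every employer pays each day independently with probability 0.2), chosen only so that the single-day numbers agree:

```python
# World A: 20% of employers pay every day, 80% never pay.
# World B (hypothetical contrast): every employer pays each day
# independently with probability 0.2.

p_paid_day1_A = 0.20 * 1.0 + 0.80 * 0.0   # 0.2
p_paid_day1_B = 0.2                        # 0.2 -- the same single number

# A bare probability of 0.2 cannot distinguish the worlds, but the optimal
# second-day policy differs sharply once day one's outcome is observed.
p_paid_day2_A_given_unpaid = 0.0           # unpaid on day 1 => a never-payer
p_paid_day2_B_given_unpaid = 0.2           # memoryless: nothing was learned

print(p_paid_day1_A, p_paid_day1_B)
print(p_paid_day2_A_given_unpaid, p_paid_day2_B_given_unpaid)
```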
There are different problems coming from the distinction between choosing a long term policy to follow, and choosing a one shot action. But we can't even approach this question in general unless we can talk sensibly about a sufficient set of information to keep track of. There are two distinct problems, one prior to the other.
Jaynes does discuss a problem which is closer to your concerns (that of estimating neutron multiplication in a 1-d experiment; section 18.15, p. 579). He's comparing two approaches, which for my purposes differ in their prior A_p distribution.
The substantive point here isn't about EU calculations per se. Running a full analysis of everything that might happen and doing an EU calculation on that basis is fine, and I don't think the OP disputes this.
The subtlety is about what numerical data can formally represent your full state of knowledge. The claim is that a mere probability of getting the $2 payout does not. It's the case that on the first use of a box, the probability of the payout given its colour is 0.45 regardless of the colour.
However, if you merely hold onto that probability, then once you put in a coin and so learn something about the box, you can't update that probability to figure out what the probability of a payout on the second attempt is. You need to go back and also remember whether the box is green or brown. The point of Jaynes and the A_p distribution is that it actually does screen off all other information. If you keep track of it, you never need to worry about remembering the colour of the box, or the setup of the experiment. Just this "meta-distribution".
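As a small sketch, with payout mechanisms I've assumed purely so that the first-use probability comes out at 0.45 for both colours (the original setup may differ): suppose a green box either always pays or never pays (45% / 55% of green boxes respectively), while a brown box pays independently with probability 0.45 on each use. Then tracking the A_p distribution of the box in hand is all you need:

```python
# A_p for a green box: probability 0.45 that its propensity to pay is 1,
# probability 0.55 that it is 0.  A_p for a brown box: all mass at 0.45.
# (These mechanisms are assumptions chosen to match the 0.45 figure above.)
green_Ap = {1.0: 0.45, 0.0: 0.55}
brown_Ap = {0.45: 1.0}

def p_payout(Ap):
    """Predictive probability of a payout, given the A_p distribution."""
    return sum(p * w for p, w in Ap.items())

def update(Ap, paid):
    """Bayes-update the A_p distribution on one observed use of the box."""
    post = {p: w * (p if paid else 1 - p) for p, w in Ap.items()}
    z = sum(post.values())
    return {p: w / z for p, w in post.items()}

for name, Ap in (("green", green_Ap), ("brown", brown_Ap)):
    first = p_payout(Ap)                        # 0.45 for both colours
    second = p_payout(update(Ap, paid=True))    # 1.0 for green, 0.45 for brown
    print(name, first, second)
```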
Concretely, I have seen this style of test (for want of a better term, natural language code emulation) used as a screen by firms looking to find non-CS undergraduates who would be well suited to developing code.
Inasmuch as this test targets indirection, it is comparatively easy to write tests which target data-driven flow control or understanding of state machines. In such a case you read from a fixed sequence and emit a string of outputs. For a plausible improvement, get the user to log the full sequence of writes, so that you can see on which instruction things go wrong.
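As a hypothetical instance of that kind of question (the machine and the input string below are invented), the candidate hand-traces the loop and writes down every output; logging each write lets you see exactly which instruction their trace diverges on:

```python
# Hypothetical exercise: trace this by hand and write down every output.
# The write log records (step, action, value) so a marker can see exactly
# where a candidate's trace first goes wrong.

INPUT = "abbaab"

def run(sequence):
    state = "S0"
    log = []                                  # full sequence of writes
    for i, ch in enumerate(sequence):
        if state == "S0":
            state = "S1" if ch == "a" else "S0"
        elif state == "S1":
            if ch == "b":
                log.append((i, "emit", "X"))
                state = "S2"
        elif state == "S2":
            log.append((i, "emit", ch.upper()))
            state = "S0"
    return log

print(run(INPUT))   # [(1, 'emit', 'X'), (2, 'emit', 'B'), (5, 'emit', 'X')]
```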
There also seem to be aspects of coding which are not simply being technically careful about the formal function of code. The most salient to me would be taking an informally specified natural language problem and reducing it to operations one can actually do. Algorithmic / architectural thinking seems at least as rare as fastidiousness about code.
To my knowledge, it's not discussed explicitly in the wider literature. I'm not a statistician by training though, so my knowledge of the literature is not brilliant.
On the other hand, talking to working Bayesian statisticians about "what do you do if we don't know what the model should be" seems to reliably return answers of the broad form "throw that uncertainty into a two-level model, run the update, and let the data tell you which model is correct". Which is the less formal version of what Jaynes is doing here.
This seems to be a reasonable discussion of the same basic material, though in a setting of finitely many models rather than the continuum of values of p that Jaynes works with.
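A minimal sketch of the finitely-many-models version (models, prior, and data all invented): put a prior over a handful of candidate models, update on the data, and read off both the posterior over models and the mixture predictive:

```python
# Two candidate models for a coin, a level-two prior over them, and some
# invented data; the posterior over models is the discrete analogue of
# Jaynes's A_p distribution.

models = {"fair": 0.5, "biased": 0.8}        # P(heads) under each model
prior = {"fair": 0.5, "biased": 0.5}         # prior over the models themselves
data = [1, 1, 0, 1, 1, 1, 0, 1]              # invented flips (1 = heads)

def likelihood(p, flips):
    out = 1.0
    for x in flips:
        out *= p if x else (1 - p)
    return out

post = {m: prior[m] * likelihood(p, data) for m, p in models.items()}
z = sum(post.values())
post = {m: w / z for m, w in post.items()}

# Predictive probability of heads on the next flip, mixing over models.
p_next = sum(post[m] * models[m] for m in models)
print(post, p_next)
```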
Thank you for calling out a potential failure mode. I observe that my style of inquisition can come across as argumentative, in that I do not consistently note when I have shifted my view (instead querying other points of confusion). This is unfortunate.
To make my object level opinion changes more explicit:
I have had a weak shift in opinion towards the value of attempting to quantify and utilise weak arguments in internal epistemology, after our in person conversation and the clarification of what you meant.
I have had a much lesser shift in opinion of the value of weak arguments in rhetoric, or other discourse where I cannot assume that my interlocutor is entirely rational and truth-seeking.
I have not had a substantial shift in opinion about the history of mathematics (see below).
As regards the history of mathematics, I do not know our relative expertise, but my background prior for most mathematicians (including JDL_{2008}) has a measure >0.99 cluster that finds true results obvious in hindsight and counterexamples to false results obviously natural. My background prior also suggests that those who have spent time thinking about mathematics as it was done at the time fairly reliably do not have this view. It further suggests that on this metric, I have done more thinking than the median mathematician (against a background of Cantab. mathmos, I would estimate I'm somewhere above the 5th centile of the distribution). The upshot of this is that your recent comments have not substantively changed my views about the relative merit of Cauchy and Euler's arguments at the time they were presented; my models of historians of mathematics who have studied this do not reliably make statements that look like your claims wrt. the Basel problem.
I do not know what your priors look like on this point, but it seems highly likely that our difference in views on the mathematics factors through to our priors, and convergence will likely be hindered by our being merely human and having low-baud channels.
Even if it's the case that the statistics are as suggested, it would seem that a highly effective strategy is to ensure that there are multiple adults around all the time. I'll accept your numbers arguendo (though I think they're relevantly wrong).
If there's a 4% chance that one adult is an abuser, there's a 1/625 chance that two independent ones are, and one might reasonably assume that the other 96% of adults are unlikely to let abuse slide if they see any evidence of it. The failure modes are then things like abusers being able to greenbeard well enough that multiple abusers identify each other and then proceed to be all the adults in a given situation. Which is pretty conjunctive as failures go, especially in a world where you insist that you know all the adults personally from before you started a baugruppe rather than letting Bob (and his 5 friends who are new to you) all join.
You also mention "selection for predators", but that seems to run against the (admittedly folk) wisdom that children at risk of abuse are those that are isolated and vulnerable. Daycare centres are not the central tendency of abuse; quiet attics are.