"I always remember, [Hamming] would come into my office and try to solve a problem [...] I had a very big blackboard, and he’d start on one side, write down some integral, say, ‘I ain’t afraid of nothin’, and start working on it. So, now, when I start a big problem, I say, ‘I ain’t afraid of nothin’, and dive into it."
—Bruce MacLennan
Have you heard of René Thom's work on Structural Stability and Morphogenesis? I haven't been able to read the book yet[1], but my understanding[2] of its thesis is this: the "development of form" (i.e. morphogenesis broadly construed, biological or otherwise) depends on information from the structurally stable "catastrophe sets" of the potential driving (or derived from) the dynamics. Structurally stable ones precisely because information that is stable under infinitesimal perturbation is the only kind observable in nature.
René Thom puts all of this in a formal model and, using tools of algebraic topology, shows that these discrete catastrophes (under some conditions, such as the number of variables) have a finite classification, and thus (in the context of this morphological model) serve as a sort of finitary "sufficient statistic" of the developmental process.
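For concreteness, here are the two simplest of Thom's seven elementary catastrophes, in their standard textbook normal forms (my addition, so no claim that this matches the book's exact notation):

```latex
V_a(x) = x^3 + a x \quad \text{(fold)}, \qquad
V_{a,b}(x) = x^4 + a x^2 + b x \quad \text{(cusp)}
```

The catastrophe set is where $V' = V'' = 0$; for the cusp, eliminating $x$ projects this to the curve $8a^3 + 27b^2 = 0$ in the $(a,b)$ control plane, and structural stability is the claim that small perturbations of $V$ only deform this set, not destroy it.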
This seems quite similar to the point you're making: [insensitive / stable / robust] things are rare, but they organize the natural ontology of things because they're the only information that survives.
... and there seems to be a more speculative thesis of Thom's (presumably; again, I don't know this stuff), where geometric information about these catastrophes directly corresponds to functional / internal-structure information about the system (in Thom's context, the Organism whose morphogenic process we're modeling). This is presumably one of the intellectual predecessors of Structural Bayesianism, the thesis that there is a correspondence between the internal structure of Programs or the Learning Machine and the local geometry of some potential.
I don't think I have enough algebraic topology background yet to productively read this book. Everything in this comment should come with Epistemic Status: Low Confidence.
From discussions and reading distillations of Thom's work.
Thank you for the suggestion! That sounds like a good idea; this thread seems to have some good recommendations, so I'll check them out.
While learning algebraic topology, homotopy always felt like a very intuitive and natural sort of invariant to attach to a space, whereas for homology I don't have anywhere near as intuitive a handle on the concept or as strong a sense of its naturality. So I tried to collect some frames / results for homology I've learned, to see if they help convince my intuition that this concept is indeed something natural in mathspace. I'd be very curious to know if there are any other frames or Deeper Answers to "Why homology?" that I'm missing:
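One frame that helped me make homology feel mechanical rather than mysterious (my own toy computation, standard linear algebra over GF(2), not from any particular source): Betti numbers are just ranks and nullities of boundary matrices.

```python
# Betti numbers of the boundary of a triangle (a combinatorial circle),
# computed from its boundary matrix over GF(2).
# Vertices: 0, 1, 2; edges: (0,1), (1,2), (0,2).

def rank_gf2(rows):
    """Rank of a 0/1 matrix over GF(2) by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank, ncols = 0, len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

# d1: boundary map C_1 -> C_0; column j records the endpoints of edge j.
d1 = [
    [1, 0, 1],  # vertex 0 lies on edges (0,1) and (0,2)
    [1, 1, 0],  # vertex 1 lies on edges (0,1) and (1,2)
    [0, 1, 1],  # vertex 2 lies on edges (1,2) and (0,2)
]
r1 = rank_gf2(d1)
betti_0 = 3 - r1        # dim C_0 - rank d1 (no d0): connected components
betti_1 = (3 - r1) - 0  # dim ker d1 - rank d2; no 2-cells, so rank d2 = 0
print(betti_0, betti_1)  # one component, one loop
```

The punchline, for me, is that "cycles mod boundaries" is just nullity minus rank, which at least makes the invariant feel computable, even if not yet natural.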
Not exactly about adversarial error correction, but: there is a construction (Çapuni & Gács 2021) of a (class of) universal 1-tape (!!) Turing machines that can perform arbitrarily long computations subject to random noise in the per-step action. Despite the non-adversarial noise model, naive majority error correction (or at least their construction of it) only fixes bounded, local error bursts - meaning it doesn't work in the general case: even though majority vote reduces the error probability, the effective error rate is still positive, so something almost surely goes wrong (e.g. an error burst larger than what majority vote can handle) as the number of steps goes to infinity.
Their construction, in fact, looks like a hierarchy of simulated Turing machines, where each higher-level TM is simulated by the level below it but at a larger tape scale, so that it can resist larger error bursts - and the overall construction looks like "moving" the "live simulation" of the actual program we want to execute up the hierarchy over time, to coarser and more reliable levels.
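A back-of-the-envelope illustration of why one round of majority voting isn't enough (my own numbers, not from the paper): a 3-way majority vote shrinks the per-step error rate but leaves it positive, so over enough steps an uncorrected error becomes near-certain.

```python
# A 3-way majority vote fails when at least 2 of 3 copies are wrong:
# p' = 3 p^2 (1 - p) + p^3 = 3p^2 - 2p^3, much smaller than p but still > 0.

def majority3(p):
    """Failure probability of a 3-way majority vote with i.i.d. error rate p."""
    return 3 * p**2 * (1 - p) + p**3

p = 1e-3
p_eff = majority3(p)  # ~3e-6: reduced, but positive

# P(at least one uncorrected error somewhere in T steps) -> 1 as T grows.
for T in (10**4, 10**6, 10**8):
    fail = 1 - (1 - p_eff) ** T
    print(T, fail)
```

This is exactly the "effective error rate is still positive" problem: no fixed amount of redundancy survives an unbounded computation, which is why their construction needs the whole hierarchy of scales.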
Notes and reflections on the things I've learned while Doing Scholarship the last three weeks (i.e. studying math).
(EDIT (Nov 18): I will post these less frequently, maybe once every month or two, and also make them more self-contained, since journal-like content like this probably isn't all that useful for most people. I will perhaps make a blog for more regular learning updates.)
The past three weeks were busier than usual, so progress was slower this time, but here it is:
Chapter 6 continued: Sard's theorem
Chapter 8: Vector Fields
Chapter 10: Vector Bundles
Chapter 11: Cotangent Bundle
Chapter 13: Riemannian Manifold
Then I read some Bredon for Algebraic Topology.
There are a couple of easy ones, like low-rank structure, but I never really managed to get a good argument for why generic symmetries in the data would often be emulatable in real life.
Right, I expect emulability to be a specific condition enabled by a particular class of algorithms that a NN might implement, rather than a generic one satisfied by almost all weights of a given NN architecture[1]. Glad to hear that you've thought about this before; I've also been trying to find a more general setting to formalize this argument beyond the toy exponential model.
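For contrast, a minimal sketch of the kind of symmetry that *is* generic (standard ReLU rescaling, my own illustration, not from this discussion): positively rescaling a ReLU unit's incoming weights and inversely rescaling its outgoing weights leaves the network function unchanged for any weights whatsoever.

```python
# ReLU rescaling symmetry: for alpha > 0, relu(alpha * z) = alpha * relu(z),
# so (w_in -> alpha * w_in, w_out -> w_out / alpha) preserves the function
# computed by the network, regardless of the particular weight values.

def relu(z):
    return max(0.0, z)

def net(x, w_in, w_out):
    """Tiny two-layer scalar network: sum_j w_out[j] * relu(w_in[j] * x)."""
    return sum(wo * relu(wi * x) for wi, wo in zip(w_in, w_out))

w_in, w_out, alpha = [0.7, -1.3, 2.1], [0.5, 0.9, -0.4], 3.0
w_in2 = [alpha * w for w in w_in]
w_out2 = [w / alpha for w in w_out]

for x in (-2.0, -0.3, 0.0, 1.5):
    assert abs(net(x, w_in, w_out) - net(x, w_in2, w_out2)) < 1e-12
```

Symmetries like this hold at every point of weight space, so they carry no information about which algorithm the network implements; the interesting ones for the argument above are those that only appear on special solution classes.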
Other related thoughts[2]:
This is more about how I conceptually think they should behave (since my motivation is to use their non-genericity to argue why certain algorithms should be favored over others); there are probably interesting exceptions, i.e. symmetries that are generically emulatable due to properties of the NN architecture (e.g. depth).
Some of these ideas were motivated following a conversation with Fernando Rosas.
one goal whose attainment is not only impossible to observe
This part doesn't sound that unique? It's typical for agents to have goals (or more generally, values) that are not directly observable (cf. Human values are a function of Humans' latent variables), and very often they only have indirect evidence about the actualization of those goals / values - which may be indirect evidence for their actualization in a distant future where the agent may not even exist to observe it. For example, my philanthropic values extend over people I will never meet and whose well-being I will never observe.
Doomers predicted that the Y2K bug would cause massive death and destruction. They were wrong.
This seems like a misleading example of doomers being wrong (agree denotationally, disagree connotationally), since I think it's plausible that Y2K was not a big deal (to such an extent that "most people think it was a myth, hoax, or urban legend") precisely because of the mitigation efforts spurred by the doomsayers' predictions.
Notes and reflections on the things I've learned while Doing Scholarship the last two weeks (i.e. studying math).
Mostly the past two weeks were on differential geometry (Lee):
Rabbit holes that I could not afford to pursue:
Example of how reading books in parallel improves learning efficiency.
Why that long? The dimensionality reduction by projection is perhaps the nontrivial part because of Sard's theorem, but the obvious gluing should have sufficed to construct at least an immersion, albeit at the cost of an inefficient codomain dimension. Maybe the historically difficult part was the concept of a partition of unity, and the proof that partitions of unity always exist on manifolds?
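To spell out the gluing I have in mind (the standard compact-case argument, stated from memory, so low confidence on details): for a compact $n$-manifold $M$ with a finite chart cover $(U_i, \varphi_i)_{i=1}^k$ and a subordinate partition of unity $(\psi_i)$, the map

```latex
F(p) = \bigl(\psi_1(p)\,\varphi_1(p), \dots, \psi_k(p)\,\varphi_k(p),
             \psi_1(p), \dots, \psi_k(p)\bigr) \in \mathbb{R}^{k(n+1)}
```

is an injective immersion, and compactness upgrades it to an embedding - with wastefully large codomain dimension $k(n+1)$, which is exactly the inefficiency I meant above.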
Perhaps relevant: An Informational Parsimony Perspective on Probabilistic Symmetries (Charvin et al 2024), on applying information bottleneck approaches to group symmetries: