I'm currently going through Brilliant's course on "Knowledge and Uncertainty". I just got through the part where it explains what Shannon entropy is. I'm now watching a wave of realizations cascade in my mind. For instance, I now strongly suspect that the "deep law" I've been intuiting for years that makes evolution, economics, and thermodynamics somehow instances of the same thing is actually an application of information theory.
(I'm honestly kind of amazed I was able to follow as much of rationalist thought and Eliezer's writings as I did without any clue what the formal definition of information was. It looks to me like information is more central than Bayes' Theorem, and that it provides essential context for why and how that theorem is relevant for rationality.)
I'm ravenous to grok more. Sadly, though, I'm bumping into a familiar wall I've seen in basically all other technical subjects: There's something of a desert of obvious resources between "Here's an article offering a quick introduction to the general idea using some fuzzy metaphors" and "Here's a textbook that gives the formal definitions and proofs."
For instance, the book "Thinking Physics" by Lewis Carroll Epstein massively helps to fill this gap for classical physics, especially classical mechanics. By way of contrast, most intro to physics textbooks are awful at this. ("Here we derive the kinematic equation for an object's movement under uniform acceleration. Now calculate how far this object goes when thrown at this angle at this velocity." Why? Is this really a pathway optimized for helping me grok how the physical world works? No? So why are you asking me to do this? Oh, because it's easy to measure whether students get those answers right? Thank you, Goodhart.)
Another excellent non-example is the Wikipedia article on how entropy in thermodynamics is a special case of Shannon entropy. Its length is great as a kind of quick overview, but it's too short to really develop intuitions. And it leans too heavily on formalism instead of lived experience.
(For instance, it references shannons (= bits of information), but it gives no hint that what a shannon is measuring is the average number of yes/no questions, each with a 50/50 prior on the answer, that you have to ask to remove your uncertainty. Knowing that's what a shannon is (courtesy of Brilliant's course) gives me some hint about what a hartley (= the base-ten version instead of base two) probably is: I'm guessing it's the average number of questions with ten possible answers each, where the prior on each answer is 1/10, that you'd have to ask to remove your uncertainty. But then what's a nat (= the base-e version)? What does it mean for a question to have an irrational number of possible equally likely answers? I'm guessing you'd have to take a limit of some kind to make sense of this, but it's not immediately obvious to me what that limit is, let alone how to intuitively interpret what it's saying. The Wikipedia article doesn't even hint at this question, let alone start to answer it. It's quite happy just to show that the algebra works out.)
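To make my guess about the units concrete, here's a tiny sketch I put together (in Python; the "number of ideal questions" framing is just my own reading of the course, so take it with appropriate salt). It at least confirms that shannons, hartleys, and nats are the same quantity measured in different units, differing only by a constant conversion factor:

```python
import math

def entropy(probs, base):
    """Shannon entropy of a distribution, taking logs in the given base."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A toy distribution: a loaded four-sided die.
dist = [0.5, 0.25, 0.125, 0.125]

h_shannons = entropy(dist, 2)        # average number of ideal yes/no questions
h_hartleys = entropy(dist, 10)       # the same uncertainty, in base-ten units
h_nats     = entropy(dist, math.e)   # the same uncertainty, in base-e units

print(h_shannons)                    # 1.75
print(h_hartleys * math.log2(10))    # 1.75 again: 1 hartley = log2(10) ≈ 3.32 shannons
print(h_nats / math.log(2))          # 1.75 again: 1 nat = 1/ln(2) ≈ 1.44 shannons
```

So numerically a nat is just a rescaling, which reassures me the algebra hangs together, but it still doesn't tell me what a nat *means* in terms of questions, which is exactly the kind of intuition I'm looking for.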
I want to learn to see information theory in my lived experience. I'm fine with technical details, but I want them tied to intuitions. I want to grok this. I don't care about being able to calculate detailed probabilities or whatever except inasmuch as my doing those exercises actually helps with grokking this.
Even a good intuitive explanation of thermodynamics as seen through the lens of information theory would be helpful.
Any suggestions?
Oh, I would certainly love that. Statistical mechanics looks like it's magic, and it strikes me as absolutely worth grokking, and yeah I haven't found any entry point into it other than the Great Formal Slog.
I remember learning about "inner product spaces" as a graduate student, and memorizing structures and theorems about them, but it wasn't until I had already finished something like a year of grad school that I found out that the intuition behind inner products was "What kind of thing is a dot product in a vector space? What would 'dot product' mean in vector spaces other than the Euclidean ones?" Without that guiding intuition, the whole thing becomes a series of steps of "Yep, I agree, that's true and you've proven it. I don't know why we're proving that or where we're going, but okay. One more theorem to memorize."
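To illustrate the kind of guiding intuition I mean (this is just my own toy sketch, not anything from a course or textbook): once you ask what a dot product would mean for functions instead of arrows, you can literally compute "lengths" of functions and "angles" between them, the same way you would for vectors in the plane:

```python
import math

# Toy sketch: treat real functions on [0, 2*pi] as "vectors", and define an
# inner product by numerically approximating the integral of f(x) * g(x).
N = 100_000
xs = [2 * math.pi * (i + 0.5) / N for i in range(N)]
dx = 2 * math.pi / N

def inner(f, g):
    """Approximate <f, g> = integral of f(x) * g(x) over [0, 2*pi]."""
    return sum(f(x) * g(x) for x in xs) * dx

def norm(f):
    """The 'length' of a function under this inner product."""
    return math.sqrt(inner(f, f))

def cos_angle(f, g):
    """The 'cosine of the angle' between two functions, exactly as for arrows."""
    return inner(f, g) / (norm(f) * norm(g))

print(cos_angle(math.sin, math.cos))                   # ~0: sin and cos are "perpendicular"
print(cos_angle(math.sin, lambda x: 3 * math.sin(x)))  # ~1: "same direction"
```

With that picture in hand, the theorems stop feeling like arbitrary steps and start feeling like statements about lengths and angles.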
I wonder if most "teachers" of formal topics either assume the guiding intuitions are obvious or implicitly think they don't matter. And maybe for truly gifted researchers they don't? But at least for people like me, they're damn close to all that matters.