Followup to: Explainers Shoot High, Illusion of Transparency
My first true foray into Bayes For Everyone was writing An Intuitive Explanation of Bayesian Reasoning, still one of my most popular works. This is the Intuitive Explanation's origin story.
In December of 2002, I'd been sermonizing in a habitual IRC channel about what seemed to me like a very straightforward idea: How words, like all other useful forms of thought, are secretly a disguised form of Bayesian inference. I thought I was explaining clearly, and yet there was one fellow, it seemed, who didn't get it. This worried me, because this was someone who'd been very enthusiastic about my Bayesian sermons up to that point. He'd gone around telling people that Bayes was "the secret of the universe", a phrase I'd been known to use.
So I went into a private IRC conversation to clear up the sticking point.
And he still didn't get it.
I took a step back and explained the immediate prerequisites, which I had thought would be obvious -
He didn't understand my explanation of the prerequisites.
In desperation, I recursed all the way back to Bayes's Theorem, the ultimate foundation stone of -
He didn't know how to apply Bayes's Theorem to update the probability that a fruit is a banana, after it is observed to be yellow. He kept mixing up p(b|y) and p(y|b).
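For concreteness, the banana update can be sketched in a few lines of Python. The numbers here are invented purely for illustration (they are not from the original conversation), and the function name `posterior` is likewise just a label for this sketch. The point is only that p(b|y) and p(y|b) are different quantities, related through the prior and the overall chance of seeing yellow.

```python
def posterior(prior_b, p_y_given_b, p_y_given_not_b):
    """Return p(b|y) using Bayes's Theorem: p(b|y) = p(y|b) * p(b) / p(y)."""
    # Total probability of observing yellow, over bananas and non-bananas.
    p_y = p_y_given_b * prior_b + p_y_given_not_b * (1 - prior_b)
    return p_y_given_b * prior_b / p_y


# Hypothetical numbers, assumed for this example only.
prior_b = 0.2           # p(b): fraction of fruit that are bananas
p_y_given_b = 0.9       # p(y|b): chance a banana is yellow
p_y_given_not_b = 0.25  # p(y|not b): chance a non-banana fruit is yellow

p_b_given_y = posterior(prior_b, p_y_given_b, p_y_given_not_b)

print(f"p(y|b) = {p_y_given_b:.3f}")  # how often bananas are yellow
print(f"p(b|y) = {p_b_given_y:.3f}")  # how often yellow fruit are bananas
```

With these made-up inputs, most bananas are yellow (p(y|b) = 0.9), yet fewer than half of the yellow fruit are bananas (p(b|y) is about 0.474), because bananas are a minority of the fruit to begin with.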
It seems like a small thing, I know. It's strange how small things can trigger major life-realizations. Any former TAs among my readers are probably laughing: I hadn't realized, until then, that instructors got misleading feedback. Robin commented yesterday that the best way to aim your explanations is feedback from the intended audience, "an advantage teachers often have". But what if self-anchoring also causes you to overestimate how much understanding appears in your feedback?
I fell prey to a double illusion of transparency. First, I assumed that my words meant what I intended them to mean - that my listeners heard my intentions as though they were transparent. Second, when someone repeated back my sentences using slightly different word orderings, I assumed that what I heard was what they had intended to say. As if all words were transparent windows into thought, in both directions.
I thought that if I said, "Hey, guess what I noticed today! Bayes's Theorem is the secret of the universe!", and someone else said, "Yes! Bayes's Theorem is the secret of the universe!", then this was what a successful teacher-student interaction looked like: knowledge conveyed and verified. I'd read Pirsig and I knew, in theory, about how students learn to repeat back what the teacher says in slightly different words. But I thought of that as a deliberate tactic to get good grades, and I wasn't grading anyone.
This may sound odd, but until that very day, I hadn't realized why there were such things as universities. I'd thought it was just rent-seekers who'd gotten a lock on the credentialing system. Why would you need teachers to learn? That was what books were for.
But now a great and terrible light was dawning upon me. Genuinely explaining complicated things took months or years, and an entire university infrastructure with painstakingly crafted textbooks and professional instructors. You couldn't just tell people.
You're laughing at me right now, academic readers; but think back and you'll realize that academics are generally very careful not to tell the general population how difficult it is to explain things, because it would come across as condescending. Physicists can't just say, "What we do is beyond your comprehension, foolish mortal" when Congress is considering their funding. Richard Feynman once said that if you really understand something in physics you should be able to explain it to your grandmother. I believed him. I was shocked to discover it wasn't true.
But once I realized, it became horribly clear why no one had picked up and run with any of the wonderful ideas I'd been telling people about Artificial Intelligence.
If I wanted to explain all these marvelous ideas I had, I'd have to go back, and back, and back. I'd have to start with the things I'd figured out before I was even thinking about Artificial Intelligence, the foundations without which nothing else would make sense.
Like all that stuff I'd worked out about human rationality, back at the dawn of time.
Which I'd considerably reworked after receiving my Bayesian Enlightenment. But either way, I had to start with the foundations. Nothing I said about AI was going to make sense unless I started at the beginning. My listeners would just decide that emergence was a better explanation.
And the beginning of all things in the reworked version was Bayes, to which there didn't seem to be any decent online introduction for newbies. Most sources just stated Bayes's Theorem and defined the terms. This, I now realized, was not going to be sufficient. The online sources I saw didn't even say why Bayes's Theorem was important. E. T. Jaynes seemed to get it, but Jaynes spoke only in calculus - no hope for novices there.
So I mentally consigned everything I'd written before 2003 to the trash heap - it was mostly obsolete in the wake of my Bayesian Enlightenment, anyway - and started over at what I fondly conceived to be the beginning.
(It wasn't.)
And I would explain it so clearly that even grade school students would get it.
(They didn't.)
I had, and have, much left to learn about explaining. But that's how it all began.
I see QED as a bit like stating the axioms of a mathematical theory. You can, in principle, derive the whole theory from the axioms, but in practice it takes generations of ingenuity to come up with the tools to do that. We take courses in mathematics not just to learn the axioms, but also, and primarily, to learn the vast library of tricks that let us do something useful with the axioms.
Similarly, I remember my first or second physics course, either mechanics or electromagnetism. The inside of the cover had, as I recall, all the "axioms", the fundamental laws from which everything could be derived. Those fit inside the cover. But, just as in a mathematical subject, the main body of the subject was the library of tricks that let us actually make specific predictions from those fundamental laws.
Feynman, as I recall, was very up front in QED about what it did and did not contain. He was explicit about it not including the tricks that we would need to learn to apply the fundamental principles to real predictions about real situations.
However, I would not really call the book "vague" or even "hand-waving", any more than I would call the inside cover of my physics textbook "hand-waving" or even "not physics". It was seriously lacking, yes, admittedly so. But not at all in the way that, say, quantum mechanics popularizations typically are. Popularizations include neither the axioms (fundamental laws) of the theory, nor the tricks, but instead are filled with metaphor and impressionistic talk and not a small amount of pop philosophy. Not the same thing at all as QED (I mean QED the book, not the subject of quantum electrodynamics).