Next Monday I am supposed to introduce a bunch of middle school students to Bayes' theorem. 

I've scoured the Internet for basic examples where Bayes' theorem is applied. Alas, all the explanations I've come across are, I believe, difficult for the average middle school student to grasp.

So what I am looking for is a straightforward explanation of Bayes' theorem that uses the least amount of Mathematics and words possible. (Also, my presentation has to be under 3 minutes.)

I think that it would be efficient in terms of learning for me to use coins or cards, something tangible to illustrate what I'm talking about.

What do you think? How should I teach 'em Bayes' ways?

PS: I myself am new to Bayesian probability.


This is my favorite explanation of the theorem so far:

http://oscarbonilla.com/2009/05/visualizing-bayes-theorem/

But I doubt you can explain it to middle school students in only 3 min. If I were you, I wouldn't discuss the theorem itself, just the cancer patient problem. Have the students try to figure out the answer for themselves, and then surprise them with the real answer (and justify it by talking about a population of 1 million people or whatever; your explanation doesn't have to use probabilities, although the problem statement could).

Actually, you might want to come up with a different example than the standard one so students who happen to encounter the standard one later on will appreciate it. (I was turned off by Eliezer's Bayesian theorem explanation initially, because it started off by challenging me to solve the standard disease example, which I already knew the trick for.)

Nice one, I like it!

But there's something I fail to understand: where's the 9.6% rendered?

"9.6% of the area outside of event A." - wait, doesn't that little area outside A represent the women with cancer?

Pretty sure the 9.6% is the section of the green circle that doesn't overlap with the red circle.

Middle-school students may be a little hazy on what a theorem is, or have a notion of it that derives only from memorizing the teacher's password. ("Math has theorems, science has theories.")

Probability is a mathematical object called a measure, which means it obeys exactly the same rules as area or volume. This is why the "visualizing Bayes' theorem" link is exactly true. Probabilities are like circles (or other shapes) with area equal to their probability, and these circles overlap when two things happen together. So I think the Venn diagram explanation might help students remember it.

I was under the impression that ET Jaynes did not like the circle diagram because it implied an infinitude of outcomes:

http://www-biba.inrialpes.fr/Jaynes/cc02m.pdf

Hm, good point. For example, for his statements "It will rain today" and "the roof will leak," the points in the Venn diagram you'd draw to show how these probabilities overlap don't correspond to anything real. On the other hand, it's really useful to picture this stuff, and you can imagine "chunking up" your space into regions corresponding to the different discrete outcomes (like a bar graph), and the exact same rules are followed, except now it seems a bit more meaningful.

P(A and B) = P(B and A)
P(A and B) = P(A) * P(B|A)

=>

P(A and B) = P(A) * P(B|A)
P(B and A) = P(A) * P(B|A)
P(B) * P(A|B) = P(A) * P(B|A)
P(A|B) = P(A) * P(B|A) / P(B)

P(B and A) = P(A) * P(B|A)

To make it clearer, shouldn't this step be = P(B) * P(A|B)?

Also, are middle school pupils in the U.S. familiar with the notation? Maybe one should state it in English instead?

The probability of A if B is known to occur is equal to the probability of A times the probability of B if A is known to occur divided by the probability of B.

ETA

P(A and B) = P(A) * P(B|A)

The probability of A and B to occur at the same time is the same as the probability of A to occur alone times the probability of B to occur under the condition that A is known to occur.
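Since the original question asks for something tangible like cards, here is a minimal sketch that checks both statements above by simply counting cards in a standard 52-card deck. The choice of events (A = "the card is a king", B = "the card is a face card") and the helper names `prob`, `is_king`, and `is_face` are illustrative assumptions, not something from the thread.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))


def is_king(card):
    # Event A (illustrative choice): the card is a king.
    return card[0] == "K"


def is_face(card):
    # Event B (illustrative choice): the card is a jack, queen, or king.
    return card[0] in {"J", "Q", "K"}


def prob(event, given=None):
    # Probability by counting: restrict the deck to cards where `given`
    # holds (if provided), then take the fraction where `event` holds.
    pool = [c for c in deck if given is None or given(c)]
    return Fraction(sum(1 for c in pool if event(c)), len(pool))


p_a = prob(is_king)                          # P(A)
p_b = prob(is_face)                          # P(B)
p_b_given_a = prob(is_face, given=is_king)   # P(B|A)
p_a_given_b = prob(is_king, given=is_face)   # P(A|B)
p_a_and_b = prob(lambda c: is_king(c) and is_face(c))  # P(A and B)

# Product rule: P(A and B) = P(A) * P(B|A)
assert p_a_and_b == p_a * p_b_given_a
# Bayes' theorem: P(A|B) = P(A) * P(B|A) / P(B)
assert p_a_given_b == p_a * p_b_given_a / p_b
```

In words: 4 of the 12 face cards are kings, so once you know the card is a face card, the chance it is a king is 1 in 3, and the formula gives the same number as counting does.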

Spot the typo.

Don't see one. Could you please tell me where?

I think the second line after the => was supposed to be P(B and A) = P(B) * P(A|B).

No, it's a substitution on the left hand side. I substituted P(A and B) for P(B and A)

Possibly relevant: what specific grade(s) are these students in, and are they in any sort of gifted program, or is it a normal middle school population?

I'm guessing 6th or 7th grade, average flock.

This isn't an approach I've seen much before, so it may not be wise to try it your first time (I also recommend adapting this explanation), but you could focus on the idea that Bayes' theorem applies when you have two competing hypotheses and you get evidence that is more probable under one hypothesis than the other.

When you get evidence, you keep track of the probability of each hypothesis separately, but what matters is their normalized probability. (I'll use frequencies since those are easier for people to manipulate than probabilities.) You might start off with the knowledge that the chance someone has a rare disease is 100 out of a million, but the chance that someone has a common disease is 9,800 out of a million. Everyone with the rare disease goes to the doctor, but only half of the people with the common disease go to the doctor, and no healthy people visit the doctor- so in a city of one million people, 100 people with the rare disease visit the doctor, and 4,900 people with the common disease visit the doctor, and the probability someone at the doctor's office has the rare disease is 2%.

Then the doctor runs a test- it gives an A result 99 times out of a hundred for people with the rare disease, and a B result 1 time out of a hundred for people with the rare disease. It gives an A result 2 times out of a hundred for people with the common disease, and a B result 98 times out of a hundred. Now we have 99 people with rare and A, 1 person with rare and B, 98 people with common and A, and 4802 people with common and B. Of the people who got an A result, they have about a 50% chance to have either disease- but of people with a B result, almost all of them have the common disease.
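The bookkeeping in this example can be sketched in a few lines of Python, as a way to check the counts rather than as a definitive implementation. The numbers come from the comment above; the names `at_doctor`, `likelihood`, `update`, and `normalize` are made up for illustration.

```python
from fractions import Fraction

# People at the doctor's office, per one million city residents.
at_doctor = {
    "rare": Fraction(100),                 # everyone with the rare disease visits
    "common": 9_800 * Fraction(1, 2),      # half of those with the common disease visit
}

# Chance of each test result under each hypothesis.
likelihood = {
    "rare": {"A": Fraction(99, 100), "B": Fraction(1, 100)},
    "common": {"A": Fraction(2, 100), "B": Fraction(98, 100)},
}


def update(counts, result):
    # Keep track of each hypothesis separately: scale its head count
    # by how often that hypothesis produces the observed result.
    return {h: counts[h] * likelihood[h][result] for h in counts}


def normalize(counts):
    # Turn head counts into shares that add up to 1.
    total = sum(counts.values())
    return {h: c / total for h, c in counts.items()}


before_test = normalize(at_doctor)   # rare disease share before any test
after_a = update(at_doctor, "A")     # head counts among people with result A
after_b = update(at_doctor, "B")     # head counts among people with result B

print(float(before_test["rare"]))          # share with the rare disease at the office
print(float(normalize(after_a)["rare"]))   # share with the rare disease given result A
print(float(normalize(after_b)["rare"]))   # share with the rare disease given result B
```

Using `Fraction` keeps the head counts exact, so the intermediate numbers match the ones in the text (99 and 98 for result A, 1 and 4802 for result B) without floating-point noise.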

The main mental strategy that Bayes' theorem helps people with is "keep multiple hypotheses in your head at once," and you may want to emphasize that to people just hearing about it.

You test 1,000 people for cancer. Of the people tested, only 10 actually have cancer. If the person has cancer, there's a 90% chance that the test will catch it. It will also claim that 1% of healthy people have cancer.

This means that of the 990 healthy people, about 10 will get marked as having cancer (1% of 990 is 9.9). It also means that of the 10 ill people, only 9 will be noticed. So your odds of actually having cancer, given a positive test result, are only about 50/50.
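For anyone who wants to check the arithmetic, here is a small sketch using the numbers from this example. The function name `positive_test_breakdown` and its parameter names are hypothetical. Note that the exact expected number of false positives is 9.9, so the share of positive results that are real comes out slightly under one half.

```python
from fractions import Fraction


def positive_test_breakdown(tested, sick, hit_rate, false_alarm_rate):
    # Split everyone who tests positive into people who really are ill
    # (true positives) and healthy people flagged by mistake (false positives).
    healthy = tested - sick
    true_positives = sick * hit_rate
    false_positives = healthy * false_alarm_rate
    share_really_ill = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, share_really_ill


# 1,000 people tested, 10 have cancer, 90% hit rate, 1% false alarm rate.
tp, fp, share = positive_test_breakdown(
    tested=1000,
    sick=10,
    hit_rate=Fraction(90, 100),
    false_alarm_rate=Fraction(1, 100),
)

print(f"true positives:  {float(tp)}")
print(f"false positives: {float(fp)}")
print(f"chance of cancer given a positive test: {float(share):.1%}")
```

Changing `sick` or `false_alarm_rate` and rerunning shows how strongly the answer depends on how rare the condition is, which is the point of the example.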

I mostly teach this to people who can follow the actual math without blinking, but I've found it's the fastest way to give a very basic explanation of what Bayes means. It's a specific, concrete example, and it's also one that feels intuitively useful - you've now learned about false positives and false negatives, and how this affects the meaning of actual results.

From there, you can go into the math, but I've found most people who are bad at math can still work it out this way, and most people who are good at math can quickly derive the formulas from the example :)

I'd probably go into some cool examples of what else it's been used for, possibly with worked examples if you have the time - 3 minutes is probably a bit short for anything beyond a single concrete example and a few points on why it's cool (breaking the Enigma code in WW2, locating lost submarines, etc. :))

Possibly a handout with a few fun / cool problems in case kids want to practice, and maybe some pointers towards literature on the subject, but I suspect a pre-college audience won't generally be receptive to such a thing. You're probably better off just hooking their interest and trusting that the ones who find it cool will have the sense to look it up on Google :)

I have a post here including a short primer on Bayes' theorem.

http://lesswrong.com/r/discussion/lw/8lr/logodds_or_logits/