Ars Technica are holding a competition for people to make a science video up to 3 minutes long "to explain a scientific concept in terms that a high school science class would not only understand, but actually be interested in watching". Prizes in three categories: biology, physics, and mathematics. Deadline is December 25. More details here.
Anyone want to have a go at Bayes' theorem? Cognitive bias? Defeating death? Invisible purple dragons?
I suck at making videos and I hate my spoken voice, but here's an approximate transcript of what I would tell the kids.
Imagine that some terrible illness kills most of the men affected by it, but spares most of the women. Also assume that most affected women take a certain drug, while most men don't. And on top of that, assume that the drug is completely useless: it doesn't affect your chances of survival either way. Men die from the illness more because of some tiny physiological difference between men and women. And women take the drug more just because it's marketed toward women more.
In this situation, if you didn't have the lucky idea of counting men and women separately and instead counted them together, you'd conclude that the drug is pretty damn effective, because taking it is strongly correlated with survival! So, just by splitting people into groups in clever ways, you can make apparent relationships of cause and effect appear and vanish. This is known in statistics as Simpson's paradox. It is often observed in practice, like in the famous Berkeley sex bias case of 1973, when the university's overall admission figures appeared biased toward men, while most individual departments, looked at separately, showed either no significant bias or a slight bias toward admitting women.
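To make the drug story concrete, here is a small sketch with made-up numbers. The counts are purely hypothetical (100 men, 100 women, chosen only to match "most women take the drug, most men don't" and "most women survive, most men don't"), and the helper name `survival_rate` is mine.

```python
# Hypothetical counts, invented to match the story: (survivors, total) per group.
# The drug is useless by construction: within each sex the survival rate
# is the same whether or not the drug was taken.
counts = {
    ("men", "drug"): (4, 20),
    ("men", "no drug"): (16, 80),
    ("women", "drug"): (64, 80),
    ("women", "no drug"): (16, 20),
}

def survival_rate(groups):
    """Fraction of survivors across the listed (sex, treatment) groups."""
    survived = sum(counts[g][0] for g in groups)
    total = sum(counts[g][1] for g in groups)
    return survived / total

# Counting men and women together: the drug looks great.
pooled_drug = survival_rate([("men", "drug"), ("women", "drug")])
pooled_no_drug = survival_rate([("men", "no drug"), ("women", "no drug")])
print("pooled:", pooled_drug, "vs", pooled_no_drug)

# Counting them separately: the drug makes no difference at all.
for sex in ("men", "women"):
    print(sex + ":", survival_rate([(sex, "drug")]), "vs", survival_rate([(sex, "no drug")]))
```

With these invented counts, 68 of the 100 drug takers survive against 32 of the 100 who skipped it, yet within each sex the two rates are identical.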
Sometimes you can get around the problem by running experiments. If you change the cause yourself and the outcome changes with it, you can be far more confident that you're looking at a real relationship, not a statistical illusion. But it's hard to imagine an ethical way to test the hypothesis that smoking kills. In such situations, if you can only observe real-world trends, your best bet is to use many "control variables", like counting men and women separately in my original example.
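Here is a rough simulation of why an experiment helps, under assumptions that are entirely mine: survival chances of 80% for women and 20% for men, drug uptake of 80% among women and 20% among men in the "real world", and a coin flip deciding who gets the drug in the experiment. The function name `simulate` and all the numbers are hypothetical.

```python
import random

def simulate(randomized, n=100_000, seed=0):
    """Return survival rates keyed by took_drug (True/False).

    The drug has no effect in this model: survival depends only on sex.
    """
    rng = random.Random(seed)
    tally = {True: [0, 0], False: [0, 0]}  # took_drug -> [survivors, total]
    for _ in range(n):
        is_woman = rng.random() < 0.5
        if randomized:
            # Experiment: a coin flip decides who takes the drug.
            took_drug = rng.random() < 0.5
        else:
            # Observation: women mostly take it, men mostly don't.
            took_drug = rng.random() < (0.8 if is_woman else 0.2)
        survived = rng.random() < (0.8 if is_woman else 0.2)
        tally[took_drug][0] += survived
        tally[took_drug][1] += 1
    return {took: s / t for took, (s, t) in tally.items()}

observed = simulate(randomized=False)
experiment = simulate(randomized=True)
print("observed:  ", observed)
print("experiment:", experiment)
```

In the observational run the drug takers should survive far more often, while in the randomized run the two rates should come out nearly equal, because the coin flip breaks the link between sex and taking the drug.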
This seems impossible, and judging by the data (posted on Wikipedia), that isn't what happened.
[edit] Hm, I appear to not have been thinking clearly. See comment below.