Here's the new thread for posting quotes, with the usual rules:
- Please post all quotes separately, so that they can be voted up/down separately. (If they are strongly related, reply to your own comments. If strongly ordered, then go ahead and post them together.)
- Do not quote yourself.
- Do not quote comments/posts on LW/OB.
- No more than 5 quotes per person per monthly thread, please.
Could we start that reading with the classic Bayes' Theorem example? Suppose 1% of women have breast cancer, that mammograms detect 80% of cancers that are present, and that 9.6% of mammograms on women without cancer come back as false positives. Suppose woman A gets a mammogram which indicates cancer. What is the probability she has cancer?
p(A|X) = p(X|A)p(A) / (p(X|A)p(A) + p(X|~A)p(~A)) = (0.8)(0.01) / ((0.8)(0.01) + (0.096)(0.99)) ≈ 7.8%. Hooray?
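For anyone who wants to check the arithmetic, here's a quick Python sketch of that calculation (the variable names are mine, the numbers are from the example above):

```python
# Plugging the example's numbers into Bayes' theorem.
p_cancer = 0.01              # prior: 1% of women have breast cancer
p_pos_given_cancer = 0.80    # sensitivity: 80% of cancers are detected
p_pos_given_healthy = 0.096  # false positive rate: 9.6%

p_positive = (p_pos_given_cancer * p_cancer
              + p_pos_given_healthy * (1 - p_cancer))
p_cancer_given_positive = p_pos_given_cancer * p_cancer / p_positive
print(f"{p_cancer_given_positive:.3%}")  # 7.764%, i.e. roughly 7.8%
```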
Now suppose women B, C, D, E, F... Z, AA, AB, AC, AD, etc., the entire patient list getting screened today, all test positive for cancer. Is the probability that woman A has cancer still 7.8%? Bayes' rule, with the priors above, still says "yes"! You need more complicated prior probabilities (e.g. what are the odds that the test equipment is malfunctioning?) before your evidence can tell you what's actually likely to be happening. But those more complicated, more accurate priors would have (very slightly) changed our original p(A|X) as well!
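To make that "more complicated prior" concrete, here's a sketch of one such extension. I'm inventing the numbers: add a hypothesis that the machine is broken and always reports positive, with a made-up prior of 1 in 10,000. Conditioning on n patients in a row testing positive:

```python
# Hedged sketch: extend the model with a "machine is broken" hypothesis.
# The 1e-4 prior on "broken" is invented for illustration.
p_broken_prior = 1e-4
p_cancer = 0.01
sens, fpr = 0.80, 0.096

# P(a random patient tests positive | machine working), assuming independence
p_pos_working = sens * p_cancer + fpr * (1 - p_cancer)  # ~0.103

def posterior_broken(n_positives):
    like_working = p_pos_working ** n_positives  # all n positive, machine working
    like_broken = 1.0                            # broken machine: always positive
    return (like_broken * p_broken_prior
            / (like_broken * p_broken_prior
               + like_working * (1 - p_broken_prior)))

for n in (1, 2, 4, 8):
    print(n, round(posterior_broken(n), 4))
# 1 0.001, 2 0.0093, 4 0.4701, 8 0.9999
```

Even with a 1-in-10,000 prior, a short run of positives makes "broken" the leading hypothesis, and woman A's cancer probability slides back toward the 1% base rate, since a positive from a broken machine is no evidence at all.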
It's not that Bayesian updating is wrong. It's just that Bayes' theorem can never produce a non-zero posterior from a zero prior, and to make any practical problem tractable, everybody ends up implicitly assigning zero prior probability to huge swaths of hypothesis space.
It's not assuming zero probability. It's assuming independence. Under the original model it's possible for all the women to get positives while only 1% actually have breast cancer; it's just that a better prior would assign that outcome a much higher probability (by way of hypotheses like a malfunctioning machine).
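A quick Monte Carlo sketch of that independence claim (my code, using the original model's numbers): draw a million independent women, and about 7.8% of those who test positive actually have cancer.

```python
# Simulate the original model: each woman's cancer status and test result
# are drawn independently with the example's probabilities.
import random

random.seed(0)
N = 1_000_000
positives = 0
cancers_among_positives = 0
for _ in range(N):
    has_cancer = random.random() < 0.01
    tests_positive = random.random() < (0.80 if has_cancer else 0.096)
    if tests_positive:
        positives += 1
        if has_cancer:
            cancers_among_positives += 1
print(cancers_among_positives / positives)  # ≈ 0.078
```

Under this model an all-positive screening day is possible, just astronomically unlikely (probability about 0.103^N), which is why a reasonable reasoner confronted with one reaches for the malfunction hypothesis instead.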