Related to: Bayes' Theorem Illustrated, What is Bayesianism?, An Intuitive Explanation of Bayes' Theorem
(Bayes' theorem is something Bayesians need to use more often than Frequentists do, but Bayes' theorem itself isn't Bayesian. This post is meant to be a light introduction to the difference between Bayes' theorem and Bayesian data analysis.)
Bayes' Theorem
Bayes' theorem is just a way to get (e.g.) p(B|A) from p(A|B), p(A|not B) and p(B). The classic example of Bayes' theorem is diagnostic testing. Suppose someone either has the disease (D+) or does not have the disease (D-) and either tests positive (T+) or tests negative (T-). If we knew the sensitivity P(T+|D+), specificity P(T-|D-) and disease prevalence P(D+), then we could get the positive predictive value P(D+|T+) using Bayes' theorem:
P(D+|T+) = P(T+|D+) P(D+) / [ P(T+|D+) P(D+) + (1 - P(T-|D-)) (1 - P(D+)) ]
For example, suppose we know the sensitivity is 0.9, the specificity is 0.8 and the disease prevalence is 0.01. Then,
P(D+|T+) = (0.9)(0.01) / [ (0.9)(0.01) + (0.2)(0.99) ] = 0.009 / 0.207 ≈ 0.043
This answer is not Bayesian or frequentist; it's just correct.
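The calculation above is easy to check in a few lines of Python. This is only a minimal sketch; the function name `positive_predictive_value` is made up for illustration and is not from any library.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """P(D+|T+) from P(T+|D+), P(T-|D-) and P(D+) via Bayes' theorem."""
    true_pos = sensitivity * prevalence                # P(T+|D+) P(D+)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(T+|D-) P(D-)
    return true_pos / (true_pos + false_pos)


ppv = positive_predictive_value(sensitivity=0.9, specificity=0.8, prevalence=0.01)
print(round(ppv, 3))  # 0.043
```

Even with a fairly accurate test, most positives are false positives here, because the disease is rare.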
Diagnostic testing study
Typically we will not know P(T+|D+) or P(T-|D-). We would consider these unknown parameters. Let's denote them by Θsens and Θspec. For simplicity, let's assume we know the disease prevalence P(D+) (we often have a lot of data on this).
Suppose 1000 subjects with the disease were tested, and 900 of them tested positive. Suppose 1000 disease-free subjects were tested and 200 of them tested positive. Finally, suppose 1% of the population has the disease.
Frequentist approach
Estimate the two parameters (sensitivity and specificity) using their sample proportions (900/1000 = 0.9 and 800/1000 = 0.8) and plug them into Bayes' formula above. This results in a point estimate for P(D+|T+) of 0.043. A standard error or confidence interval could be obtained using the delta method or bootstrapping.
Even though Bayes' theorem was used, this is not a Bayesian approach.
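As a rough illustration of the bootstrapping option, here is a sketch of a percentile bootstrap using only the standard library. The helper names, the number of resamples and the seed are my own choices for the example, not part of the original analysis. Resampling 1000 binary outcomes with replacement is the same as drawing from a binomial with the observed proportion, which is what `resampled_proportion` does.

```python
import random


def ppv(sens, spec, prev):
    """Plug sensitivity, specificity and prevalence into Bayes' theorem."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)


def resampled_proportion(successes, n, rng):
    """Proportion of successes in one bootstrap resample of n binary outcomes."""
    p = successes / n
    return sum(rng.random() < p for _ in range(n)) / n


def bootstrap_ppv(n_boot=1000, seed=0):
    """Point estimate and 95% percentile bootstrap interval for P(D+|T+)."""
    rng = random.Random(seed)
    prev = 0.01                                # prevalence, assumed known
    point = ppv(900 / 1000, 800 / 1000, prev)
    draws = sorted(
        ppv(
            resampled_proportion(900, 1000, rng),  # resampled sensitivity
            resampled_proportion(800, 1000, rng),  # resampled specificity
            prev,
        )
        for _ in range(n_boot)
    )
    lower = draws[int(0.025 * n_boot)]
    upper = draws[int(0.975 * n_boot) - 1]
    return point, lower, upper


point, lower, upper = bootstrap_ppv()
print(f"estimate {point:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

The interval endpoints will vary a little with the seed and the number of resamples.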
Bayesian approach
The Bayesian approach is to specify prior distributions for all unknowns. For example, we might specify independent uniform(0,1) priors for Θsens and Θspec. However, we should expect the test to do at least as well as guessing (guessing would mean randomly selecting 1% of people and calling them T+). In addition, we expect Θsens>1-Θspec. So, I might go with a Beta(4,2.5) distribution for Θsens and Beta(2.5,4) for Θspec.
Combining these priors with the data yields a posterior distribution for P(D+|T+) with posterior median 0.043 and 95% credible interval (0.038, 0.049). In this case, the Bayesian and frequentist approaches give the same results (not surprising since the priors are relatively flat and there are a lot of data). However, the methodology is quite different.
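One way to get such a posterior, sketched here under the assumption that the counts are binomial, is to use the fact that a Beta prior with binomial data gives a Beta posterior: Beta(a, b) updated with s successes and f failures becomes Beta(a + s, b + f). The post does not say how its numbers were computed, so this is just an illustration. The function name, draw count and seed are arbitrary choices.

```python
import random


def ppv(sens, spec, prev):
    """Plug sensitivity, specificity and prevalence into Bayes' theorem."""
    true_pos = sens * prev
    false_pos = (1 - spec) * (1 - prev)
    return true_pos / (true_pos + false_pos)


def posterior_ppv(n_draws=20000, seed=0):
    """Monte Carlo summary of the posterior of P(D+|T+)."""
    rng = random.Random(seed)
    prev = 0.01  # prevalence, assumed known
    draws = []
    for _ in range(n_draws):
        # Prior Beta(4, 2.5) + 900 positives, 100 negatives among the diseased
        sens = rng.betavariate(4 + 900, 2.5 + 100)
        # Prior Beta(2.5, 4) + 800 negatives, 200 positives among the disease-free
        spec = rng.betavariate(2.5 + 800, 4 + 200)
        draws.append(ppv(sens, spec, prev))
    draws.sort()
    median = draws[n_draws // 2]
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    return median, lower, upper


median, lower, upper = posterior_ppv()
print(f"median {median:.3f}, 95% credible interval ({lower:.3f}, {upper:.3f})")
```

Each draw of (Θsens, Θspec) is pushed through Bayes' theorem, so the uncertainty in the two parameters carries over to P(D+|T+).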
Example that illustrates benefit of Bayesian data analysis
(example edited to focus on credible/confidence intervals)
Suppose someone shows you what looks like a fair coin (you confirm heads on one side, tails on the other) and makes the claim: "This coin will land with heads up 90% of the time."
Suppose the coin is flipped 5 times and lands with heads up 4 times.
Frequentist approach
"A 95% confidence interval for the Binomial parameter is (.38, .99) using the Agresti-Coull method." Because 0.9 is within the confidence limits, the usual conclusion would be that we do not have enough evidence to rule it out.
Bayesian approach
"I don't believe you. Based on experience and what I know about the laws of physics, I think it's very unlikely that your claim is accurate. I feel very confident that the probability is close to 0.5. However, I don't want to rule out something a little bit unusual (like a probability of 0.4). Thus, my prior for the probability of heads is a Beta(30,30) distribution."
After seeing the data, we update our belief about the binomial parameter. The 95% credible interval for it is (0.40, 0.64). Thus, a value of 0.9 is still considered extremely unlikely.
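The update can be sketched as follows: a Beta(30,30) prior combined with 4 heads and 1 tail gives a Beta(34,31) posterior, and the credible interval is read off from its 2.5% and 97.5% quantiles. The sketch below approximates those quantiles by simulation with the standard library. The function name, draw count and seed are illustrative choices.

```python
import random


def coin_posterior_summary(heads=4, tails=1, prior_a=30, prior_b=30,
                           n_draws=50000, seed=0):
    """95% credible interval and P(theta >= 0.9) under a Beta prior."""
    rng = random.Random(seed)
    a, b = prior_a + heads, prior_b + tails  # conjugate update: Beta(34, 31)
    draws = sorted(rng.betavariate(a, b) for _ in range(n_draws))
    lower = draws[int(0.025 * n_draws)]
    upper = draws[int(0.975 * n_draws) - 1]
    prob_claim = sum(d >= 0.9 for d in draws) / n_draws
    return lower, upper, prob_claim


lower, upper, prob_claim = coin_posterior_summary()
print(f"95% credible interval ({lower:.2f}, {upper:.2f})")
print(f"posterior P(theta >= 0.9) is about {prob_claim:.4f}")
```

Five flips barely move a prior this concentrated: the posterior mean shifts only from 0.5 to 34/65 ≈ 0.52.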
This illustrates the idea that, from a Bayesian perspective, implausible claims require more evidence than plausible claims. Frequentists have no formal way of including that type of prior information.