Clarity comments on Frequentist Magic vs. Bayesian Magic - Less Wrong

41 Post author: Wei_Dai 08 April 2010 08:34PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (79)

You are viewing a single comment's thread. Show more comments above.

Comment author: wnoise 09 April 2010 01:58:22PM *  2 points [-]

It's really more than that. Consider the great anti-Bayesian Cosma Shalizi. He's shown that the use of a prior is really equivalent to a method of smoothing, of regularization on your hypothesis space, trading off (frequentist) bias and variance. And everyone has a regularization scheme, even if they claim it is "don't: 'let the data speak for themselves'".

Your assertion that the machine is exactly 90% biased is exactly equivalent to an evenly-split point-mass prior at 10% and 90%. A Bayesian with that prior would exactly reproduce your reasoning. You're absolutely right that it is in no way an incomplete prior -- it exactly specifies everything. One could consider it an overconfident prior, but if you do in fact know that it's either 10% or 90%, and have no idea which it's perfectly appropriate. The choice of which Frequentist statistical techniques to use is pretty much isomorphic to the choice of Bayesian priors.

EDIT: On the bias/variance trade-off, I just realized that the Frequentist prescription for binary frequency estimation s/(s+f) is, rather than a maximum entropy prior, a maximum variance prior. (To be specific it is an improper prior, the limiting case of the beta distribution, with both parameters going to zero. Although this has support everywhere in [0,1], if you try to normalize it, the integral going to infinity kills everything except delta spikes at 0 and 1. But normalizing after conditioning on data gives a reasonable prior, with the maximum likelihood estimate being the standard estimate.)

Comment author: Clarity 19 August 2015 12:10:32PM 1 point [-]

Cosma's writings for those interested