Ron_Fern comments on (Subjective Bayesianism vs. Frequentism) VS. Formalism - Less Wrong
Comments (106)
I'm going by what I've read of Jaynes, Yudkowsky, and books by a couple of other writers on Bayesian statistics.
I don't believe there are any legitimate issues with Bayesian statistics, because Bayes's rule is derived from basic desiderata of rationality which I find entirely convincing, and it seems to me that the maximum entropy principle is the best computable approximation to Solomonoff induction (although I'd appreciate other opinions on that).
There may be legitimate issues with people failing to apply the simple mathematical laws of probability theory correctly, because the correct application can get very complicated - but that is not an issue with Bayesian statistics per se. I'm sure that in many cases, the wisest thing to do might be to use frequentist methods, but being a Bayesian does not prohibit someone from applying frequentist methods when they are a convenient approximation.
I'm pretty sure almost all frequentist methods are derivable from Bayes, or are close approximations of Bayes. Do they have any tool which is radically un-Bayesian?
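To make the "derivable from Bayes" claim concrete, here is a minimal sketch of one textbook case: the maximum-likelihood estimate of a binomial proportion coincides with the posterior mode under a uniform Beta(1, 1) prior. The function names and the counts (7 successes in 20 trials) are illustrative assumptions, not something from the thread.

```python
def mle_proportion(k, n):
    """Frequentist maximum-likelihood estimate of a binomial proportion."""
    return k / n


def beta_posterior_mode(k, n, a=1.0, b=1.0):
    """Mode of the Beta(a + k, b + n - k) posterior under a Beta(a, b) prior.

    The closed form below is only valid when both posterior parameters exceed 1.
    """
    post_a = a + k
    post_b = b + n - k
    return (post_a - 1) / (post_a + post_b - 2)


# Hypothetical data: 7 successes out of 20 trials.
k, n = 7, 20

flat = beta_posterior_mode(k, n)                       # uniform prior
informative = beta_posterior_mode(k, n, a=5.0, b=5.0)  # prior centred on 0.5

print(mle_proportion(k, n), flat, informative)
```

With the uniform prior the two estimates agree; a prior centred on 0.5 pulls the mode toward 0.5, which is the sense in which the frequentist answer is one special case of the Bayesian one. This only illustrates a single method and says nothing about whether every frequentist tool has such a correspondence.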
See paulfchristiano's examples elsewhere in this thread.
Another example would be support vector machines, which work really well in practice but aren't Bayesian (although it's possible that they are actually Bayesian and I just can't figure out what prior they correspond to).
There are also neural networks, which are sort of Bayesian but (I think?) not really. I'm not actually that familiar with neural nets (or SVMs for that matter) so I could just be wrong.
ETA: Every non-dominated decision procedure is either a Bayesian procedure or a limit of Bayesian procedures (which I think could alternatively be thought of as a Bayesian procedure with a potentially improper prior). So in that sense, for any frequentist procedure that is not Bayesian, there is another procedure that gets at least as much expected utility in every possible world and strictly more in at least one, and is therefore strictly better. The only problem is that this is again an abstract statement about decision procedures, and doesn't take into account the computational difficulty of actually finding the better procedure.
This paper is the closest I've ever seen to a fully Bayesian interpretation of SVMs; mind you, the authors still use "pseudo-likelihood" to describe the data-dependent part of the optimization criterion.
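One way to see why "pseudo-likelihood" is the honest term: if you treat exp(-hinge loss) as the probability of a label given the SVM's raw output f, the probabilities of the two labels do not sum to a constant, so it is not a properly normalized likelihood. The sketch below just checks that numerically; the function names are my own, and it is not taken from the linked paper.

```python
import math


def hinge(margin):
    """Standard SVM hinge loss as a function of the margin y * f."""
    return max(0.0, 1.0 - margin)


def pseudo_likelihood(y, f):
    """exp(-hinge loss): the would-be 'probability' of label y given output f."""
    return math.exp(-hinge(y * f))


def total_mass(f):
    """Sum over both labels; a true likelihood would give the same value for every f."""
    return pseudo_likelihood(+1, f) + pseudo_likelihood(-1, f)


# The total mass changes with f, so exp(-hinge) is not a normalized likelihood.
masses = {f: total_mass(f) for f in (0.0, 1.0, 2.0)}
for f, mass in masses.items():
    print(f, mass)
```

Because the total depends on f, any Bayesian reading of the SVM objective has to account for that f-dependent normalizer somehow, which is roughly where the difficulty in finding "the prior they correspond to" comes from.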
Neural networks are just a kind of non-linear model. You can do Bayesian inference on them if you want.
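As a toy illustration of doing Bayes on a non-linear model, here is a sketch with a one-weight "network" y = tanh(w * x), a Gaussian prior on w, Gaussian observation noise, and a brute-force grid over w in place of any real inference algorithm. Everything here (the model, the prior scale, the noise level, the data) is an assumption chosen for illustration; real networks have far too many weights for a grid and need approximations such as MCMC or variational methods.

```python
import math


def model(w, x):
    """A one-parameter non-linear model: a single tanh unit."""
    return math.tanh(w * x)


def log_prior(w, scale=2.0):
    """Unnormalized log density of a zero-mean Gaussian prior on the weight."""
    return -0.5 * (w / scale) ** 2


def log_likelihood(w, data, noise=0.05):
    """Unnormalized Gaussian log-likelihood of the observations given w."""
    return sum(-0.5 * ((y - model(w, x)) / noise) ** 2 for x, y in data)


# Hypothetical noise-free toy data generated with a true weight of 1.5.
data = [(x, math.tanh(1.5 * x)) for x in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0)]

# Grid approximation to the posterior over w on [-5, 5].
grid = [i / 100 for i in range(-500, 501)]
log_post = [log_prior(w) + log_likelihood(w, data) for w in grid]
peak = max(log_post)
weights = [math.exp(lp - peak) for lp in log_post]  # subtract peak for stability
total = sum(weights)
posterior = [wt / total for wt in weights]

posterior_mean = sum(w * p for w, p in zip(grid, posterior))
map_weight = grid[log_post.index(peak)]


def predict(x):
    """Posterior-averaged prediction, rather than a single point estimate of w."""
    return sum(p * model(w, x) for w, p in zip(grid, posterior))


print(posterior_mean, map_weight, predict(0.75))
```

The point is only that nothing about the model being non-linear blocks the prior-times-likelihood recipe; the hard part in practice is computational.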