Interesting talk on Bayesians and frequentists

jsteinhardt

I recently started watching an interesting lecture by Michael Jordan on Bayesians and frequentists; he's a pretty successful machine learning expert who takes both views in his work. You can watch it here: http://videolectures.net/mlss09uk_jordan_bfway/. I found it interesting because his portrayal of frequentism is quite different from the standard portrayal on LessWrong. In his account, the distinction isn't about whether probabilities are frequencies or beliefs; it's about trying to get a good model (the Bayesian approach) versus trying to get rigorous guarantees of performance in a class of scenarios (the frequentist approach). So I wonder why the meme on LessWrong is that frequentists think probabilities are frequencies; in practice it seems to be more about how you approach a given problem. In fact, frequentists seem more "rational", as they're willing to use any tool that solves a problem instead of constraining themselves to methods that obey Bayes' rule.

In practice, it seems that while Bayes is the main tool for epistemic rationality, instrumental rationality should oftentimes be frequentist at the top level (with epistemic rationality, guided by Bayes, in turn guiding the specific application of a frequentist algorithm).

For instance, in many cases, once I have a sufficiently constrained search space, I should be willing to try different things until one of them works, without worrying about understanding why the specific thing I did worked (think shooting a basketball, or riffle shuffling a deck of cards). In practice, it seems like epistemic rationality is important for constraining a search space, and after that some sort of online learning algorithm can be applied to find the optimal action from within that search space. Of course, this doesn't apply when you only get one chance to do something, or when extreme precision is required, but those situations are uncommon in everyday life.
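To make the "constrain the search space, then learn online" idea concrete, here is a minimal sketch of one simple online learning rule, epsilon-greedy. It is only an illustration: the function name `epsilon_greedy`, the success probabilities, and the parameter values are all made up for the example, and the candidate actions stand in for whatever options survive the epistemic filtering step.

```python
import random

def epsilon_greedy(success_probs, trials=5000, epsilon=0.1, seed=0):
    """Try actions repeatedly and learn which one works best.

    success_probs: hypothetical true success rate of each candidate action
                   (unknown to the learner; used only to simulate outcomes).
    Returns (index of the action with the best estimate, pull counts, estimates).
    """
    rng = random.Random(seed)
    n_actions = len(success_probs)
    counts = [0] * n_actions      # how often each action was tried
    means = [0.0] * n_actions     # running estimate of each success rate

    for _ in range(trials):
        if rng.random() < epsilon:
            # Explore: try a random action from the constrained set.
            action = rng.randrange(n_actions)
        else:
            # Exploit: repeat whatever has worked best so far.
            action = max(range(n_actions), key=lambda i: means[i])

        # Simulated outcome: 1.0 on success, 0.0 on failure.
        reward = 1.0 if rng.random() < success_probs[action] else 0.0

        # Incremental update of the running average for this action.
        counts[action] += 1
        means[action] += (reward - means[action]) / counts[action]

    best = max(range(n_actions), key=lambda i: means[i])
    return best, counts, means

if __name__ == "__main__":
    best, counts, means = epsilon_greedy([0.2, 0.5, 0.8])
    print("best action:", best)
    print("times tried:", counts)
```

Note that the learner never models why an action succeeds; it only tracks how often each one has worked, which is the point of the basketball example.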

The main point of this thread is to raise awareness of the actual distinction between Bayesians and frequentists, and of why it's reasonable to be both, since LessWrong seems strongly Bayesian and there isn't even a good discussion of the fact that other methods exist.

> the frequentist method says to use success/total

This is false (as explained in the linked-to video). If nothing else, the frequentist answer depends on the loss function (as does the Bayesian answer, although the posterior distribution is a way of summarising the answer simultaneously for all loss functions).
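One way to see the dependence on the loss function is a small sketch, assuming squared-error loss and a worst-case (minimax) criterion; both are choices made here for illustration, not something taken from the lecture. Under those assumptions, the textbook minimax estimator of a binomial proportion is (x + √n/2)/(n + √n) rather than x/n. The helper names below (`risk`, `mle`, `minimax_sq`, `worst_case_risk`) are hypothetical.

```python
from math import comb, sqrt

def risk(estimator, n, p):
    """Exact expected squared error of estimator(x, n) when x ~ Binomial(n, p)."""
    return sum(
        comb(n, x) * p**x * (1 - p)**(n - x) * (estimator(x, n) - p) ** 2
        for x in range(n + 1)
    )

def mle(x, n):
    """The 'success/total' estimator."""
    return x / n

def minimax_sq(x, n):
    """Minimax estimator under squared-error loss: shrinks toward 1/2."""
    return (x + sqrt(n) / 2) / (n + sqrt(n))

def worst_case_risk(estimator, n, grid=101):
    """Largest risk over an evenly spaced grid of p values in [0, 1]."""
    return max(risk(estimator, n, i / (grid - 1)) for i in range(grid))

if __name__ == "__main__":
    n = 16
    print("worst-case risk, success/total:", worst_case_risk(mle, n))
    print("worst-case risk, minimax:      ", worst_case_risk(minimax_sq, n))
```

For n = 16 the worst-case risk of success/total is 1/(4n) = 1/64 ≈ 0.0156 (attained at p = 0.5), while the shrunk estimator has constant risk 1/(4(1 + √n)²) = 0.01. A different loss or a different criterion would single out a different estimator again.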

I think you're taking the frequentist interpretation of what a probability is and trying to forcibly extend it to the entire frequentist decision theory. As far as the "frequentist interpretation of probability" goes, I have never met a single statistician who even explicitly identified "probabilities as frequencies" as a possible belief to hold, much less claimed to hold it themselves. As far as I can tell, this whole "probabilities as frequencies" thing is unique to LessWrong.

Everyone I've ever met who identified as a frequentist meant "not strictly Bayesian". Whenever a method was identified as frequentist, it either meant "not strictly Bayesian" or else that it was adopting the decision theory described in Michael Jordan's lecture.

In fact, the frequentist approach (not as you've defined it, but as the term is actually used by statisticians) is used precisely because it works extremely well in certain circumstances (for instance, cross-validation). This is, I believe, what Mike is arguing for when he says that a mix of Bayesian and frequentist techniques is necessary.
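For readers who haven't seen it, here is a rough sketch of k-fold cross-validation, used to choose between two toy models. Everything in it is an assumption for illustration: the helper names (`k_fold_indices`, `fit_mean`, `fit_line`, `cv_error`), the contiguous fold split, and the synthetic data. The procedure needs no prior or posterior; it just measures held-out prediction error.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def fit_mean(xs, ys):
    """Model 1: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Model 2: ordinary least-squares line through the training data."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

def cv_error(fit, xs, ys, k=5):
    """Mean squared prediction error on held-out folds."""
    total = 0.0
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        model = fit(train_x, train_y)
        total += sum((model(xs[i]) - ys[i]) ** 2 for i in fold)
    return total / len(xs)

if __name__ == "__main__":
    # Synthetic data: a line plus a small alternating perturbation.
    xs = [i / 10 for i in range(20)]
    ys = [2 * x + 1 + 0.1 * (-1) ** i for i, x in enumerate(xs)]
    print("CV error, mean model:", cv_error(fit_mean, xs, ys))
    print("CV error, line model:", cv_error(fit_line, xs, ys))
```

Whichever model has the lower held-out error gets picked. In practice the folds are usually shuffled first; the contiguous split is used here only to keep the sketch deterministic.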
