Interesting talk on Bayesians and frequentists

jsteinhardt

I recently started watching an interesting lecture by Michael Jordan on Bayesians and frequentists; he's a pretty successful machine learning expert who takes both views in his work. You can watch it here: http://videolectures.net/mlss09uk_jordan_bfway/. I found it interesting because his portrayal of frequentism is quite different from the standard portrayal on lesswrong. In his account, the distinction isn't about whether probabilities are frequencies or beliefs; it's about trying to get a good model versus trying to get rigorous guarantees of performance in a class of scenarios. So I wonder why the meme on lesswrong is that frequentists think probabilities are frequencies; in practice it seems to be more about how you approach a given problem. In fact, frequentists seem more "rational", as they're willing to use any tool that solves a problem instead of constraining themselves to methods that obey Bayes' rule.

In practice, it seems that while Bayes is the main tool for epistemic rationality, instrumental rationality should oftentimes be frequentist at the top level (with epistemic rationality, guided by Bayes, in turn guiding the specific application of a frequentist algorithm).

For instance, in many cases I should be willing, once I have a sufficiently constrained search space, to try different things until one of them works, without worrying about understanding why the specific thing I did worked (think shooting a basketball, or riffle shuffling a deck of cards). In practice, it seems like epistemic rationality is important for constraining a search space, and after that some sort of online learning algorithm can be applied to find the optimal action from within that search space. Of course, this doesn't work when you only get one chance to do something, or when extreme precision is required, but such situations are uncommon in everyday life.
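To make the "try things until one works" step concrete, here is a minimal sketch, assuming an epsilon-greedy rule (one simple online learning method among many; nothing in the talk prescribes it). The function names, the three candidate actions, and the success probabilities are all hypothetical and chosen only for illustration.

```python
import random


def epsilon_greedy(actions, try_action, rounds=5000, epsilon=0.1, seed=0):
    """Choose among a fixed set of candidate actions by trial and error.

    `actions` is the already-constrained search space. `try_action(a, rng)`
    returns 1 on success and 0 on failure. The learner keeps no model of
    *why* an action works, only running success rates.
    """
    rng = random.Random(seed)
    counts = {a: 0 for a in actions}
    successes = {a: 0 for a in actions}

    def rate(a):
        # Observed success rate so far (0.0 for an action never tried).
        return successes[a] / counts[a] if counts[a] else 0.0

    for _ in range(rounds):
        untried = [a for a in actions if counts[a] == 0]
        if untried:
            a = untried[0]              # try every action at least once
        elif rng.random() < epsilon:
            a = rng.choice(actions)     # occasionally explore at random
        else:
            a = max(actions, key=rate)  # otherwise repeat the best so far
        counts[a] += 1
        successes[a] += try_action(a, rng)

    return max(actions, key=rate), {a: rate(a) for a in actions}


# Hypothetical example: three release angles for a basketball shot, with
# made-up success probabilities that the learner never sees directly.
TRUE_SUCCESS = {"flat": 0.2, "medium": 0.7, "high": 0.3}


def shoot(angle, rng):
    return 1 if rng.random() < TRUE_SUCCESS[angle] else 0


best, rates = epsilon_greedy(list(TRUE_SUCCESS), shoot)
print(best, rates)
```

The division of labour matches the paragraph above: deciding that only these three angles are worth trying is the epistemic step, and the loop then finds a good one without ever explaining why it works.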

The main point of this thread is to raise awareness of the actual distinction between Bayesians and frequentists, and why it's actually reasonable to be both, since it seems like lesswrong is strongly Bayesian and there isn't even a good discussion of the fact that there are other methods out there.

Thanks for your reference; it is good to get down to some more specific examples.

Most AI techniques are model based by necessity: it is not possible to generalise from samples unless the sample is used to inform the shape of a model which then determines the properties of other samples. In effect, AI is model fitting. Bayesian techniques are one scheme for updating a model from data. I call them incomplete because they leave a lot of the intelligence in the hands of the user.
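As a minimal sketch of how much is left to the user, assume the simplest possible case: a coin-like process modelled as Bernoulli with a Beta prior. Both the likelihood and the prior are choices made by a human before any data arrives; the update itself is just counting. The function names here are hypothetical.

```python
from fractions import Fraction


def beta_bernoulli_update(alpha, beta, observations):
    """Update a Beta(alpha, beta) prior on a success probability.

    Each observation is 1 (success) or 0 (failure). Because the Beta prior
    is conjugate to the Bernoulli likelihood, the posterior is again a Beta
    distribution whose parameters are the prior counts plus the data counts.
    """
    for x in observations:
        if x:
            alpha += 1
        else:
            beta += 1
    return alpha, beta


def posterior_mean(alpha, beta):
    # Mean of a Beta(alpha, beta) distribution.
    return Fraction(alpha, alpha + beta)


# Assumed uniform prior Beta(1, 1), then three successes and one failure.
alpha, beta = beta_bernoulli_update(1, 1, [1, 1, 0, 1])
print(alpha, beta, posterior_mean(alpha, beta))  # 4 2 2/3
```

Nothing in the update tells you whether a Bernoulli model was the right shape for the problem; that decision sits entirely outside the code.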

For example, in the referenced thesis the author designs a model of transformations on handwritten letters that (thanks to the author's intelligence) is similar to the set of transformations applied to numeric characters. The primary reason the technique is effective is that the author has constructed a good transformation. The only way to determine whether this is true is through experimentation. I doubt the Bayesian updating is contributing significantly to the results; if another scheme such as an SVM were chosen, I would expect it to produce similar recognition results.

The point is that the legitimacy or otherwise of the model parameter updating scheme is relatively insignificant in comparison to the difficulty of selecting a good model in the first place. As far as I am aware, since there is a potentially infinite set of models, Bayesian techniques cannot be applied to select between them, which leaves the real intelligence to be provided by the user in the form of the model. In contrast, SVMs are an attempt to construct experimentally useful models from samples and so are much closer to being intelligent in the sense of being able to produce good results with limited human interaction. However, neither technique addresses the fundamental difficulty of replicating the intelligence used by the author in creating the transformation in the first place. Fixating on a particular approach to model updating when model selection is not addressed is to miss the point: it may be meaningful for gambling problems, but for real AI challenges the difference it makes appears to be irrelevant to actual performance.

I would love to discuss what the real challenges of GAI are and explore ways of addressing them, but often the posts on LW seem to focus on seemingly obscure game-theory or gambling-based problems which don't appear to be bringing us closer to a real solution. If the model selection problem can't be addressed, then whatever we want an AI to value, there is no way to guarantee that it won't create an internal model that finds something similar (like paperclips) and decide to optimise for that instead.

Silently downvoting criticism of Bayesian probability without justification is not helpful either.

Model selection is definitely one of the biggest conceptual problems in GAI right now (I would say that planning once you have a model is of comparable importance / difficulty). I think the way to solve this sort of problem is by having humans carefully pick a really good model (flexible enough to capture even unexpected situations while still structured enough to make useful predictions). Even with SVMs you are implicitly assuming some sort of structure on the data, because you usually transform your inputs into some higher-dimensional space consisting of...
