Vaniver comments on Open thread, Jan. 19 - Jan. 25, 2015 - Less Wrong

3 Post author: Gondolinian 19 January 2015 12:04AM


Comment author: iarwain1 19 January 2015 09:34:52PM 1 point

General question: I've read somewhere that there's a Bayesian approach to at least partially justifying simplicity arguments / Occam's Razor. Where can I find a good accessible explanation of this?

Specifically: Say you're presented with a body of evidence and you come up with two sets of explanations for that evidence. Explanation Set A consists of one or two elegant principles that explain the entire body of evidence nicely. Explanation Set B consists of hundreds of separate explanations, each of which explains only a small part of the evidence. Assuming your priors for each individual explanation are about equal, is there a Bayesian explanation for our intuition that we should bet on Explanation Set A?

What about if your prior for each individual explanation in Set B is higher than the priors for the explanations in Set A?
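As a toy numerical sketch of the intuition (the numbers here are mine, purely illustrative): if the explanations in Set B are roughly independent, the prior for the whole set is the product of the individual priors, so many separate explanations can lose to a couple of elegant principles even when each individual prior in Set B is higher.

```python
# Illustrative numbers only: 2 principles vs. 100 separate explanations.
p_each_A = 0.50   # prior for each of the 2 elegant principles in Set A
p_each_B = 0.90   # prior for each of the 100 explanations in Set B

# Joint prior of a conjunction of (assumed independent) claims is the
# product of the individual priors.
joint_A = p_each_A ** 2     # 0.25
joint_B = p_each_B ** 100   # roughly 2.7e-5

print(joint_A, joint_B, joint_A > joint_B)
```

The independence assumption is doing real work here; correlated explanations wouldn't multiply down so brutally, but the qualitative point survives.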

Example:

Say you're discussing Bible Criticism with a religious friend who believes in the traditional notion of complete Mosaic authorship but who is at least somewhat open to alternatives. To your friend, the prior for Mosaic authorship is much higher than the prior for a documentary or fragmentary hypothesis. (If you want numbers, say your friend's prior is 0.95 in favor of Mosaic authorship.)

Now you present the arguments, many of which (if I understand them correctly) boil down to simplicity arguments:

  • Mosaic authorship requires either a huge number of tortured explanations for individual verses, or it requires saying "we don't know" or "God kept it secret for some reason". Documentary-type hypotheses, on the other hand, postulate a few basic principles and use them to explain virtually everything.
  • Several different lines of local internal evidence often point to exactly the same conclusions. For example, an analysis of the repetitions within a story might lead us to divide up the verses between authors in a certain way, while at the same time an independent stylistic analysis leads us to virtually the same thing. So we again have a single explanation set that resolves multiple sets of difficulties, which again is simpler / more elegant than the alternative of proposing numerous individual explanations to resolve each difficulty, or just throwing up our hands and saying God keeps lots of secrets.

The question is, is your friend justified in rejecting your simplicity-based arguments based on his high priors? What about if his priors were lower, say .6 in favor of Mosaic authorship? What about if he held 50-50 priors?
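One way to frame the question above is as a single Bayes update in odds form. The likelihood ratio below (100:1 favoring a documentary hypothesis D over Mosaic authorship M) is an invented number, not an estimate of the actual strength of the critical arguments, but it shows how the same evidence moves the three priors mentioned:

```python
def posterior_M(prior_M, lr_D_over_M):
    """Posterior of M after evidence whose likelihood ratio favors D."""
    odds_M = prior_M / (1 - prior_M)   # convert prior to odds
    odds_M /= lr_D_over_M              # evidence shifts odds toward D
    return odds_M / (1 + odds_M)       # convert back to probability

# Hypothetical likelihood ratio of 100:1 for D over M:
for prior in (0.95, 0.6, 0.5):
    print(prior, "->", round(posterior_M(prior, 100), 4))
```

On this framing the friend's high prior buys only a modest delay: strong enough evidence overwhelms a 0.95 prior, so the real disagreement has to be about how strong the evidence is (the likelihood ratio), not just about the prior.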

Comment author: Vaniver 19 January 2015 10:13:45PM 2 points

I think you'll get somewhere by searching for the phrase "complexity penalty." The idea is that the prior probability we assign to any explanation depends on how many terms / free parameters it has. For your particular example, I think you need to argue that your friend's prior probability should be different from what it is.

I think it's easier to give a 'frequentist' explanation of why this makes sense, though, by looking at overfitting. The uncertainty in a model's parameter estimates roughly depends on the number of sample points per parameter, so the fewer parameters a model has, the better we expect each of those parameters to generalize. One way to think about this: the more free parameters a model has, the more explanatory power it gets "for free," so we need to penalize the model to account for that. Consider the Akaike information criterion and Bayesian information criterion.
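To make the AIC/BIC penalty concrete, here is a sketch using the standard least-squares forms AIC = n·ln(RSS/n) + 2k and BIC = n·ln(RSS/n) + k·ln(n), where lower is better. The residual sums of squares are made-up numbers: the 10-parameter model fits the data somewhat better raw, but the penalty term makes the 2-parameter model win on both criteria.

```python
import math

def aic(n, rss, k):
    # Least-squares form of the Akaike information criterion.
    return n * math.log(rss / n) + 2 * k

def bic(n, rss, k):
    # Bayesian information criterion: penalty grows with ln(n).
    return n * math.log(rss / n) + k * math.log(n)

n = 50
simple   = {"k": 2,  "rss": 48.0}  # elegant model: slightly worse fit
flexible = {"k": 10, "rss": 40.0}  # many-parameter model: better raw fit

for crit in (aic, bic):
    s = crit(n, simple["rss"], simple["k"])
    f = crit(n, flexible["rss"], flexible["k"])
    print(crit.__name__, round(s, 2), round(f, 2), "simple wins:", s < f)
```

Note BIC's k·ln(n) penalty is harsher than AIC's 2k once n > 7 or so, which is why BIC tends to pick sparser models on large samples.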