DanielVarga comments on [Link] Better results by changing Bayes’ theorem - Less Wrong Discussion

3 Post author: XiXiDu 09 March 2012 07:38PM

Comment author: DanielVarga 10 March 2012 12:11:09AM  4 points

Excellent explanation. I would add that the source of this overconfidence is not a mystery at all. Models for estimating Pr(f|e) are so ridiculously simplistic that a layperson would laugh us out of the room if we explained them to her in plain English instead of formulas. For example, Pr(f|e) was sometimes defined as the probability that we can produce f from e by first applying a randomly chosen lexicon translation to each word of e, and then doing a random local reordering of the words. Here the whole responsibility of finding a reordering that leads to a grammatical English sentence rests on the shoulders of Pr(e). It's almost like the translation model spits out a bag of words, and the language model has to assemble them into a chain of words. (The simple example above is far from being state of the art, but the actual state of the art is not that much more realistic either.)
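
To make the "bag of words" picture concrete, here is a minimal Python sketch of that caricature. Everything in it is an assumption made up for illustration: the three-word lexicon, the bigram scores, the `swap_prob` parameter and the function names are hypothetical and not taken from any real translation system. The first function samples f from e by word-by-word lexicon lookup followed by random swaps of neighbouring words; the second plays the role of Pr(e) and has to pick the one grammatical ordering out of an unordered bag of English words.

```python
import itertools
import random

# Hypothetical toy lexicon (English word -> candidate foreign words).
LEXICON = {
    "the": ["le", "la"],
    "dog": ["chien"],
    "sleeps": ["dort"],
}

# Hypothetical bigram scores standing in for the language model Pr(e).
BIGRAM_SCORE = {
    ("<s>", "the"): 0.5,
    ("the", "dog"): 0.4,
    ("dog", "sleeps"): 0.3,
    ("sleeps", "</s>"): 0.6,
}


def sample_translation(e_words, rng, swap_prob=0.3):
    """Sample f from a caricature of Pr(f|e).

    Each word of e gets a randomly chosen lexicon translation, then
    neighbouring words are randomly swapped (a crude local reordering).
    """
    f_words = [rng.choice(LEXICON[w]) for w in e_words]
    i = 0
    while i < len(f_words) - 1:
        if rng.random() < swap_prob:
            f_words[i], f_words[i + 1] = f_words[i + 1], f_words[i]
            i += 2  # skip the swapped pair so each word moves at most one slot
        else:
            i += 1
    return f_words


def lm_score(words, floor=0.01):
    """Score a word sequence with the toy bigram model; unseen bigrams get `floor`."""
    padded = ["<s>", *words, "</s>"]
    score = 1.0
    for prev, cur in zip(padded, padded[1:]):
        score *= BIGRAM_SCORE.get((prev, cur), floor)
    return score


def best_ordering(bag):
    """Let the language model alone assemble a bag of words into a sentence."""
    return max(itertools.permutations(bag), key=lm_score)


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_translation(["the", "dog", "sleeps"], rng))
    print(best_ordering(["sleeps", "the", "dog"]))
```

Note that `sample_translation` knows nothing about grammar: every ordering it can reach is as good as any other from its point of view. All of the preference for "the dog sleeps" over "sleeps the dog" comes from `lm_score`, which is the division of labour described above. The brute-force search over permutations is only workable for a toy bag this small.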