DanielLC comments on [Link] Better results by changing Bayes’ theorem - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Two possible explanations come to mind:
1. They are defining "better" as something other than a higher expected number of correctly translated words.
2. The probabilities they're using are biased, and the change reverses the bias.
I haven't watched the video, but are they using expected value at all, or are they just picking the most likely word? Accidentally choosing a suboptimal common word seems likely to produce a better translation than accidentally choosing a suboptimal uncommon word, so this effect might just be making their algorithm behave more like expected-utility maximization and less like picking by raw probability.
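To make the distinction concrete, here is a toy sketch of how "pick the most likely word" and "pick the word with the highest expected utility" can disagree. Everything in it is an assumption made up for illustration: the candidate words, the probabilities, and the `wrong_word_credit` table (how much a translation is still worth if that word is emitted but turns out to be wrong). None of it comes from the video.

```python
# Toy comparison: most-probable word vs. highest-expected-utility word.
# All numbers are hypothetical and chosen only to illustrate the point.

# Assumed probability that each candidate is the correct translation.
posterior = {"colossal": 0.40, "big": 0.35, "large": 0.25}

# Assumed partial credit (0..1) when the word is emitted but is NOT correct.
# Common words get more credit because they fail more gracefully.
wrong_word_credit = {"colossal": 0.1, "big": 0.6, "large": 0.6}


def most_likely(posterior):
    """Return the candidate with the highest probability of being correct."""
    return max(posterior, key=posterior.get)


def expected_utility(word, posterior, wrong_word_credit):
    """Utility is 1.0 if the word is correct, partial credit otherwise."""
    p = posterior[word]
    return p * 1.0 + (1.0 - p) * wrong_word_credit[word]


def best_by_expected_utility(posterior, wrong_word_credit):
    """Return the candidate that maximizes expected utility."""
    return max(
        posterior,
        key=lambda w: expected_utility(w, posterior, wrong_word_credit),
    )


if __name__ == "__main__":
    print("most likely:", most_likely(posterior))
    print("best expected utility:",
          best_by_expected_utility(posterior, wrong_word_credit))
    for w in posterior:
        print(w, round(expected_utility(w, posterior, wrong_word_credit), 2))
```

With these made-up numbers the uncommon word is the single most likely candidate, but the common word wins on expected utility because it costs less when it is wrong. Anything that shifts weight toward common words would push a most-likely-word picker in the same direction.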