Random thing that I can't recall seeing on LW: Suppose A is evidence for B, i.e. P(B|A) > P(B), where P(A) and P(B) are both nonzero. Then by Bayes, P(A|B) = P(A)P(B|A)/P(B) > P(A)P(B)/P(B) = P(A), i.e. B is evidence for A. In other words, the is-evidence-for relation is symmetric.
For instance, this means that the logical fallacy of affirming the consequent (A implies B, and B is true, therefore A) is actually probabilistically valid in a weak sense: B does not prove A, but it does make A more likely. "If Socrates is a man then he'll probably die; Socrates died, therefore it's more likely he's a man."
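A quick numeric check of the symmetry, as a minimal sketch: the joint distribution below is made up purely for illustration (it is not data about anything), and the helper names `marginal` and `conditional` are my own. Exact fractions are used so the comparisons aren't muddied by floating-point rounding.

```python
from fractions import Fraction

# Hypothetical joint distribution over (A, B); the numbers are invented for illustration.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}


def marginal(joint, index):
    """Probability that the event at position `index` (0 for A, 1 for B) is true."""
    return sum(p for outcome, p in joint.items() if outcome[index])


def conditional(joint, given_index):
    """Probability that both events are true, divided by P(event at `given_index`)."""
    return joint[(True, True)] / marginal(joint, given_index)


p_a = marginal(joint, 0)
p_b = marginal(joint, 1)
p_b_given_a = conditional(joint, 0)  # P(B|A)
p_a_given_b = conditional(joint, 1)  # P(A|B)

a_supports_b = p_b_given_a > p_b  # A is evidence for B
b_supports_a = p_a_given_b > p_a  # B is evidence for A

print(a_supports_b, b_supports_a)
# Both ratios equal P(A and B) / (P(A) P(B)), which is why the relation is symmetric.
print(p_b_given_a / p_b, p_a_given_b / p_a)
```

Note that the two ratios P(B|A)/P(B) and P(A|B)/P(A) are always equal, so A raises B's probability by the same factor that B raises A's, even though the absolute changes in probability can differ.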
I would be interested to see the results of some clustering algorithm on the comment data. It may be that long comments can be classified into high-karma and low-karma groups, and we could then analyze what the differences between them are. If it is possible to extract features of high-quality posts, then those features can be the goal, instead of just the length.
I also think it's dangerous to focus too strongly on karma, because karma score is only a rough approximation of actual quality. For example, I believe many short comments that only ask for some clarification are generally more important than is reflected by their karma.