Is the point of karma to provide feedback to the poster or the potential audience? It seems like these are both being used to determine whether to upvote. Maybe there should be explicit;y separate karma scores for different goals.
Perhaps posts should be evaluated on more comprehensive criteria: Helpful to me Relative value compared to other posts Feedback on post quality to author
I'm not really sure where the line is between not enough things to rate and too many though. But I've never been comfortable with single number systems.
Given LW’s keen interest in bias, it would seem pertinent to be aware of the biases engendered by the karma system. Note: I used to be strictly opposed to comment scoring mechanisms, but witnessing the general effectiveness in which LWers use karma has largely redeemed the system for me.
In “Social Influence Bias: A Randomized Experiment” by Muchnik et al, random comments on a “social news aggregation Web site” were up-voted after being posted. The likelihood of such rigged comments receiving additional up-votes were quantified in comparison to a control group. The results show that users were significantly biased towards the randomly up-voted posts:
At the end of their five month testing period, the comments that had artificially received an up-vote had an average rating 25% higher than the control group. Interestingly, the severity of the bias was largely dependent on the topic of discussion:
The herding behavior outlined in the paper seems rather intuitive to me. If before I read a post, I see a little green ‘1’ next to it, I’m probably going to read the post in a better light than if I hadn't seen that little green ‘1’ next to it. Similarly, if I see a post that has a negative score, I’ll probably see flaws in it much more readily. One might say that this is the point of the rating system, as it allows the group as a whole to evaluate the content. However, I’m still unsettled by just how easily popular opinion was swayed in the experiment.
This certainly doesn't necessitate that we reprogram the site and eschew the karma system. Moreover, understanding the biases inherent in such a system will allow us to use it much more effectively. Discussion on how this bias affects LW in particular would be welcomed. Here are some questions to begin with:
Notes:
In the paper, they mentioned that comments were not sorted by popularity, therefore “mitigating the selection bias.” This of course implies that the bias would be more severe on forums where comments are sorted by popularity, such as this one.
For those interested, another enlightening paper is “Overcoming the J-shaped distribution of product reviews” by Nan Hu et al, which discusses rating biases on websites such as amazon. User gwern has also recommended a longer 2007 paper by the same authors which the one above is based upon: "Why do Online Product Reviews have a J-shaped Distribution? Overcoming Biases in Online Word-of-Mouth Communication"