OrphanWilde comments on Measuring open-mindedness - Less Wrong Discussion
Let's say 25% of your users are inherently "optimate", 50% are inherently "populare", and 25% aren't really either.
Would your algorithm sort the people who don't strongly agree with either side with the "optimates", since their preferences are closer to the "optimate" group than the "populare" group? And would that produce the effect you're seeing, since half the "optimate" group are upvoting more or less equally?
In principle, this is possible. The system assigns each user a number corresponding to his/her position on the “left-right” (“populare-optimate”) axis. If, based on their votes, 25% of users are assigned “-10”, 50% are assigned “10” and 25% are assigned “0”, then the average is “2.5” which would make those with “0” into “left-wingers”.
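The arithmetic above can be checked with a minimal sketch. It assumes the system splits users at the mean position, which is how the example reads; the function names and the "below"/"above" labels are hypothetical and are not taken from the actual system.

```python
def mean_position(groups):
    """Weighted mean of positions; groups is a list of (share, position) pairs."""
    return sum(share * position for share, position in groups)


def side(position, center):
    """Assumed rule: a user is labelled by which side of the mean they fall on."""
    return "below" if position < center else "above"


# Shares and positions from the example: 25% at -10, 50% at 10, 25% at 0.
groups = [(0.25, -10), (0.50, 10), (0.25, 0)]
center = mean_position(groups)

print(center)
for _, position in groups:
    print(position, side(position, center))
```

Under that assumed rule, the users at 0 land on the same side as the users at -10, because the larger group at 10 pulls the mean above zero.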
At least in our first group (where the effect was strongest and the distribution was pretty close to Gaussian), this is not what happened.
Is it correct to say that you're assigning each user to one of the two categories based on the same variable you're analyzing - the distribution (or more specifically the clustering) of the votes? (My reading suggests the system is producing the vectors you're noticing through clustering, and then you're naming the vectors?)
Yes.
Okay, why have you elevated the hypothesis of open-mindedness?
Without looking at the data, I couldn't say with certainty what the dominant cause is, but I can say with reasonable confidence that your clustering algorithm, with its built-in assumption of a roughly even divide on both sides of its vectors, is responsible for at least part of it.
The prime issue is that you are algorithmically creating the data - the clusters - you're drawing inferences on. Your algorithm should be your most likely candidate for -any- anomalies. You definitely shouldn't get attached to any conclusions, especially if they're favorable to the group of people you more closely identify with. (It's my impression that the "open-mindedness" conclusion -is- favorable to the people you identify with, given that you give it higher elevation than the possibility that the opposing side is producing better arguments.)
Suppose people are divided by some arbitrary criterion (e.g., blondes vs. brunettes) and then it turns out that blondes upvote brunettes much more often than vice versa. You could still ask the same question.
Regarding elevation, I simply wanted a short, easy-to-understand title, and it did not occur to me that it would be perceived as prejudicial.
Except in this case you're grouping on the same behavior you're measuring - given that you're doing statistical analysis on what is essentially traffic-analysis grouped data, I can't think of a trivial example to compare to. That's bound to lead to some variable dependency issues.
And I think you did realize that, given your care in not naming names or sides. I'm not attacking you; I'm suggesting you should be cautious in drawing conclusions. You want to measure - so you're not taking it as a given, which is good skepticism - but you skipped skepticism of your techniques.
Suppose, for the sake of the argument, that my own data is totally wrong and consider the same question for a purely hypothetical case:
Group A upvotes only its own comments. Group B upvotes preferentially its own comments. Is there a way to tell whether the difference lies in the comment quality or the characters of the group members?
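One way to make the question concrete is a toy model. Everything in it is an assumption made for illustration, not something from the data discussed in this thread: each voter group has a hypothetical overall "generosity", each author group a hypothetical "quality", and each voter group an "openness" discount applied only to the other group's comments. It also uses a softer version of the hypothetical, in which Group A upvotes Group B rarely rather than never.

```python
import math


def upvote_rates(generosity, quality, openness):
    """Aggregate upvote rate for every (voter group, author group) pair.

    Assumed toy model: rate = generosity[voter] * quality[author],
    further multiplied by openness[voter] when the author is in the other group.
    """
    rates = {}
    for voter in ("A", "B"):
        for author in ("A", "B"):
            rate = generosity[voter] * quality[author]
            if voter != author:
                rate *= openness[voter]
            rates[(voter, author)] = rate
    return rates


# Story 1: equal comment quality; the groups differ only in openness.
character_story = upvote_rates(
    generosity={"A": 1.0, "B": 1.0},
    quality={"A": 0.4, "B": 0.4},
    openness={"A": 0.125, "B": 0.5},
)

# Story 2: equal openness; A writes better comments but upvotes less overall.
quality_story = upvote_rates(
    generosity={"A": 0.5, "B": 1.0},
    quality={"A": 0.8, "B": 0.4},
    openness={"A": 0.25, "B": 0.25},
)

same_observations = all(
    math.isclose(character_story[pair], quality_story[pair])
    for pair in character_story
)
print(same_observations)
```

In this toy model both stories yield the same four aggregate rates, so the group-level rates alone cannot separate them. That says nothing about whether finer-grained data, such as per-comment votes, could.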