Let me be an Excel sidekick among statistical analysis heroes.
I saw the OKCupid stuff as well, so I ran a quick test in Excel to see whether the variance in attractiveness contributes to the decision to meet beyond the attractiveness mean. Here's what I got from a regression, with apologies for the hideous formatting:
          | Coefficients | Standard Error | t Stat       | P-value
Intercept | -0.569931558 | 0.042946471    | -13.27074239 | 4.65749E-35
avg_attr  | 0.156634411  | 0.005238302    | 29.90175402  | 2.6299E-117
attr_std  | 0.028596624  | 0.012485497    | 2.290387431  | 0.022377128
The dependent variable is match percent (the percent of people who decided they want to date the ratee), avg_attr is the mean and attr_std the standard deviation of the physical attractiveness ratings. attr_std is not the attractiveness to STDs ;-)
As we can see, the coefficient for attractiveness deviation is significantishly positive. On its own, the deviation actually has a small negative correlation with match and a larger negative correlation with mean attractiveness. This means that there is more consensus on the attractiveness of prettier people. Holding attractiveness constant, variance, which is visible for a single rater as an "unusual look", increases the chances that people will want to date you. Put some flowers in your hair!
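For anyone who wants to rerun this kind of regression outside Excel, here is a minimal sketch in Python with NumPy. It fits match percent on avg_attr and attr_std by ordinary least squares and returns coefficients, standard errors and t statistics. The data below is synthetic and its generating coefficients are invented purely so the example runs; the helper name `ols` is also my own, not anything from the original analysis.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares with an intercept.

    Returns (coefficients, standard errors, t statistics),
    with the intercept as the first entry of each array.
    """
    X1 = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)   # least-squares fit
    resid = y - X1 @ beta
    dof = len(y) - X1.shape[1]                      # residual degrees of freedom
    sigma2 = resid @ resid / dof                    # residual variance estimate
    cov = sigma2 * np.linalg.inv(X1.T @ X1)         # covariance of the estimates
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se

# Synthetic stand-in data (ASSUMPTION: not the speed dating data).
rng = np.random.default_rng(0)
n = 500
avg_attr = rng.normal(6.0, 1.0, n)       # mean attractiveness rating per ratee
attr_std = rng.uniform(0.5, 2.5, n)      # spread of ratings per ratee
match_pct = -0.5 + 0.15 * avg_attr + 0.03 * attr_std + rng.normal(0, 0.15, n)

beta, se, t_stat = ols(np.column_stack([avg_attr, attr_std]), match_pct)
for name, b, s, t in zip(["Intercept", "avg_attr", "attr_std"], beta, se, t_stat):
    print(f"{name:10s} {b: .4f} {s:.4f} {t: .2f}")
```

Since match percent is bounded between 0 and 1, a linear fit like this is only a rough first look.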
Consider the two statements:
Most people would agree that there's some truth to each of these statements. At Thing of Things Ozy wrote:
This post explores the question of the extent to which each of the two statements is true, using data from a study of speed dating events conducted by Raymond Fisman and Sheena Iyengar.
The basic facts that I describe here are:
There's much more to say about how to interpret the group consensus and its implications, which I'll go into in a later post.
Each event involved ~15 men and ~15 women, and everybody of a given gender went on speed dates with everyone of the opposite gender. Each participant on each date rated his or her partner on a number of dimensions, including attractiveness, on a scale from 1 to 10. For the purpose of this post, I focused on how attractive raters found a ratee relative to other ratees. For this reason, I scaled each rater's ratings so that the averages are the same for all raters of a given gender.
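One simple way to carry out this kind of adjustment is sketched below. It assumes the adjustment is an additive shift (each rater's ratings are moved up or down so that every rater ends up with the same average); a multiplicative rescaling would be another reasonable reading of "scaled". The function name and the array layout (rows are raters, columns are ratees, NaN where a pair didn't meet) are assumptions made for illustration.

```python
import numpy as np

def align_rater_means(ratings):
    """Shift each rater's ratings so all raters share the same mean.

    ratings: 2-D array-like, rows = raters of one gender, columns = ratees,
             NaN where a rater did not rate a ratee.
    """
    ratings = np.asarray(ratings, dtype=float)
    rater_means = np.nanmean(ratings, axis=1, keepdims=True)  # one mean per rater
    overall_mean = np.nanmean(ratings)                        # mean over all ratings
    # Remove each rater's personal offset, then restore the common level.
    return ratings - rater_means + overall_mean

# Toy example: the second rater is simply more generous than the first.
raw = [[5, 6, 7],
       [8, 9, 10]]
print(align_rater_means(raw))
```

After the shift, both toy raters give identical adjusted ratings, which is the point: only the relative ordering of ratees within a rater's own ratings is kept.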
Gender differences
One sees essentially the same phenomena when the raters are men and the ratees are women as one does when the genders are reversed. There is however one very important difference: the average of the ratings that men gave women was ~6.5, and the average of the ratings that women gave men was ~5.9. The standard deviations were (interestingly) the same in both cases, and in terms of standard deviations, women were rated 0.5 SD higher than men were. This fact may have profound ramifications. I've pictured the distributions of average attractiveness ratings of men and of women below:
The main difference between the distributions is that the one for women is shifted to the right relative to the one for men. The shapes of the distributions are also a little bit different, but one can verify that the difference is within the range of what one would expect by chance.
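The post doesn't say which check was used here. One way to run such a check is a permutation test, sketched below: center each gender's averages to remove the shift, then ask how often a random relabeling of the pooled values produces a difference in spread at least as large as the observed one. The function name, the choice of standard deviation as the shape statistic, and the synthetic data are all assumptions.

```python
import numpy as np

def permutation_pvalue(a, b, stat, n_perm=2000, seed=0):
    """Two-sided permutation p-value for |stat(a) - stat(b)|."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(stat(a) - stat(b))
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)           # random relabeling of the groups
        diff = abs(stat(perm[:len(a)]) - stat(perm[len(a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)             # add-one correction avoids p = 0

# Synthetic stand-ins for per-ratee average ratings (ASSUMPTION: not real data).
rng = np.random.default_rng(0)
women_avg = rng.normal(6.5, 1.0, 270)
men_avg = rng.normal(5.9, 1.0, 270)

# Remove the shift first so that only the shape (here: spread) is compared.
p = permutation_pvalue(women_avg - women_avg.mean(),
                       men_avg - men_avg.mean(),
                       np.std)
print(f"p-value for a difference in spread: {p:.3f}")
```

A large p-value would be consistent with the shape difference being chance; swapping `np.std` for another statistic (skewness, say) tests a different aspect of shape.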
Hierarchical modeling
We're interested in what the average ratings would be if a sufficiently large number of raters rated a given ratee.
The ratees who are rated highest and lowest are also the ratees whose ratings are most likely to be unrepresentative of the entire population's consensus on their attractiveness: there's regression to the mean.
A methodology that allows us to correct for this is Bayesian hierarchical modeling, which involves simultaneously estimating the "true" distribution of average attractiveness ratings of all hypothetical ratees together with the true average attractiveness ratings of the particular ratees in the dataset. The default assumption in Bayesian hierarchical modeling is that the true distribution is a normal distribution with mean and standard deviation to be determined. The histograms above suggest that this is close to being true in our setting.
If we use Bayesian hierarchical modeling to generate refined estimates for the averages, we get distributions that look something like the following:
Note that, in contrast with the actual averages, the refined estimates are never below 4.5 or above 8: the participants weren't rated by enough people for us to be confident that any participant is that far away from average.
The standard deviations of the distributions are nearly identical: 0.6 points on the 10 point scale.
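The shrinkage described in this section can be illustrated with a much simpler stand-in for a full Bayesian hierarchical model: an empirical Bayes normal-normal estimate. To be clear, this is not the model fit for the post. It assumes each ratee's observed average equals a "true" average drawn from a normal population plus sampling noise, estimates the population variance by the method of moments, and pulls each observed average toward the overall mean in proportion to how noisy it is. All names and the synthetic data are assumptions.

```python
import numpy as np

def shrink_means(ratings):
    """Empirical Bayes shrinkage of per-ratee average ratings.

    ratings: 2-D array-like, rows = raters, columns = ratees, NaN = no rating.
    Returns (shrunken averages, shrinkage weights in [0, 1]).
    """
    ratings = np.asarray(ratings, dtype=float)
    n = np.sum(~np.isnan(ratings), axis=0)             # ratings per ratee
    ybar = np.nanmean(ratings, axis=0)                 # observed averages
    # Pooled estimate of how much raters disagree about a single ratee.
    within_var = np.nanmean(np.nanvar(ratings, axis=0, ddof=1))
    sampling_var = within_var / n                      # noise in each average
    mu = ybar.mean()                                   # population mean
    # Method of moments: spread of averages = true spread + sampling noise.
    tau2 = max(ybar.var(ddof=1) - sampling_var.mean(), 0.0)
    weight = tau2 / (tau2 + sampling_var)              # 1 = trust data, 0 = trust mu
    return mu + weight * (ybar - mu), weight

# Synthetic example (ASSUMPTION: made-up numbers, 17 raters x 200 ratees).
rng = np.random.default_rng(0)
true_means = rng.normal(6.0, 0.6, size=200)
ratings = true_means + rng.normal(0.0, 1.0, size=(17, 200))

raw = ratings.mean(axis=0)
shrunk, weight = shrink_means(ratings)
print("raw range:   ", raw.min().round(2), raw.max().round(2))
print("shrunk range:", shrunk.min().round(2), shrunk.max().round(2))
```

The refined estimates span a narrower range than the raw averages, which is the same qualitative effect as the extremes disappearing from the refined distributions.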
The distribution of ratings for a fixed person
The image below shows the ratings of 18 women by 17 men.
One sees that, with the exception of the ratees in columns 10 and 16, every ratee had at least one rater who perceived her attractiveness to be noticeably above average and at least one rater who perceived her attractiveness to be noticeably below average.
The graph below shows the median rating (black), maximum rating (red) and minimum rating (blue) for all ratees in the study, together with best fit curves:
Here too, one sees that there are very few people who are consistently rated as being above average or below average.
This is consistent with the fact that the standard deviation of the ratings that an individual was given was roughly the same as the standard deviation of average ratings of the population of ratees. I've plotted the standard deviations for individual ratees below:
We see that the standard deviations have a strong central tendency, with mean equal to ~0.7 points.
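The per-ratee quantities used in this section (median, maximum, minimum and standard deviation of the ratings each person received) are straightforward to compute. Here is a minimal sketch, assuming the same hypothetical layout as elsewhere: rows are raters, columns are ratees, NaN where no rating exists.

```python
import numpy as np

def ratee_summaries(ratings):
    """Per-ratee median, max, min and sample standard deviation of ratings."""
    ratings = np.asarray(ratings, dtype=float)
    return {
        "median": np.nanmedian(ratings, axis=0),
        "max": np.nanmax(ratings, axis=0),
        "min": np.nanmin(ratings, axis=0),
        "std": np.nanstd(ratings, axis=0, ddof=1),   # ddof=1: sample std
    }

# Toy example: three raters, two ratees. Raters disagree about the first
# ratee and agree completely about the second.
toy = [[5, 6],
       [7, 6],
       [9, 6]]
for name, values in ratee_summaries(toy).items():
    print(name, values)
```

Averaging the "std" entries across ratees gives the kind of mean standard deviation quoted above.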
The average standard deviation being 0.7 points overstates the variability in perceptions of an individual's attractiveness. Some reasons for this are:
In order to estimate the true standard deviation of the distribution of perceptions of a given person's attractiveness, I examined the relative predictive power of:
(i) Our refined estimate of the group consensus on ratees' attractiveness
(ii) The extent to which a rater's rating deviates from this estimate
in the context of predicting a rater's decisions as to whether or not to see a ratee again.
I found that 60% of the predictive power comes from the group consensus and 40% of the predictive power comes from deviations from the group consensus, suggesting that the standard deviation of variation in perceptions of a ratee's attractiveness is about 2/3 that of the standard deviation of the group consensus across ratees. In terms of points on a 10 point scale, this is about 0.45 points.
To be continued...
In subsequent posts, I'll describe how the data bears on the following questions: