The Role of Attractiveness in Mate Selection: Individual Variation

JonahS

This post reports on a portion of my analysis of Fisman and Iyengar's speed dating dataset which bears on the question of how people select romantic partners.

Note: I made very substantial edits to the second to last section of this post after posting it, addressing questions of generalizability. I've also cross-posted to my blog.

Summary

Participants rated one another on several dimensions. The majority of variation in the ratings is captured by the average of the different rating types: some people were regarded as good overall, and others were regarded as not good overall.
The second most important source of variation in the ratings given to participants is that some were regarded as more attractive and fun than they were intelligent/sincere, and for others, the situation was reversed.
Broadly, when people had to choose between partners who were seen as attractive and fun and partners who were seen as intelligent and sincere, they had a moderately strong preference for partners who were seen as attractive and fun.
Individuals varied substantially in how they responded to the tradeoff, with some showing a very strong preference for people who were seen as attractive and fun, and others showing virtually no such preference.

The speed dating context may be unusual in that people make a decision on whether or not to see somebody again after only 4 minutes of interaction. On the other hand, some people do meet their partners in contexts such as bars and speed dating events where decisions are made based on brief interactions. To this extent, the empirical phenomena in data from the study are relevant to understanding mate selection in general.

The Predictive Power of Attractiveness

In How Subjective Is Attractiveness? I described how the group consensus on somebody's attractiveness explained 60% of the variance in people's perceptions of attractiveness. My original purpose in writing it was as background for a discussion of how much attractiveness influenced people's decisions as to whether or not to see their partners again.

I touched on this in Predictors of Selectivity and Desirability at Speed Dating Events. The group consensus on attractiveness is highly predictive of how often people wanted to see somebody again. I remember being slightly shocked upon first viewing the graphs below:

If we average over all participants, we find that participants of above average attractiveness had twice as many suitors as participants of below average attractiveness.

There are questions of how the group consensus on attractiveness should be interpreted: for example, how much it's determined by physical appearance as opposed to other characteristics. But up to that ambiguity, the question of whether the connection between attractiveness and desirability was causal is a semantic one — the group consensus on attractiveness picked up on some characteristic that resulted in certain people having many more suitors than others. If we define attractiveness to be whatever that characteristic is, then the connection is causal by definition.

Despite the strong predictive power of the group consensus on attractiveness, there was substantial variability in how much people's decisions were influenced by attractiveness, whether measured by group consensus or by their own assessment. While 98% of participants had perceptions of attractiveness that overlapped with those of the others in the group, only 93% of participants made decisions that were correlated with the consensus of others on their partners' attractiveness.

Individual responsiveness to attractiveness

To visualize the distribution of the degree to which people's decisions were influenced by their partners' attractiveness, for each individual we form the angle between two vectors: the participant's decisions, and the average attractiveness of his or her partners. We then plot these angles. The two vectors are in some ways qualitatively different, so the angles don't give a good sense for how much somebody's decisions were influenced by attractiveness in absolute terms, but they're helpful for thinking about how influenced people were relative to others.

An angle of 0 degrees represents perfect correlation while an angle of 90 degrees represents the person's decisions being orthogonal to the group's consensus on his or her partners' attractiveness. Angles greater than 90 degrees represent negative correlation. One can see that the angle was about 90 degrees for a small but significant fraction of participants, while for others the angle is very small, approaching 0 degrees.
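As a rough illustration of the angle described above, here is a minimal sketch. It assumes both vectors are mean-centered before the angle is taken, so that the cosine equals the Pearson correlation (this matches the reading of 0, 90 and more than 90 degrees given above, but the post does not state the exact preprocessing). The function name and the sample numbers are hypothetical.

```python
import numpy as np

def decision_attractiveness_angle(decisions, consensus_attractiveness):
    """Angle in degrees between one participant's yes/no decisions and the
    group-consensus attractiveness of the partners they met.

    Both vectors are mean-centered first (an assumption), so the cosine of
    the angle is the Pearson correlation between them.
    """
    d = np.asarray(decisions, dtype=float)
    a = np.asarray(consensus_attractiveness, dtype=float)
    d = d - d.mean()
    a = a - a.mean()
    norm = np.linalg.norm(d) * np.linalg.norm(a)
    if norm == 0.0:
        # Someone who said yes (or no) to everyone has no defined angle.
        return float("nan")
    cos_theta = np.clip(np.dot(d, a) / norm, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))

# Made-up example: yes to the two partners the group rated highest.
example_angle = decision_attractiveness_angle([1, 1, 0, 0], [8.0, 7.0, 4.0, 3.0])
```

A participant with very few dates can land near 0 or 90 degrees by chance, which is one reason the raw spread of angles overstates the true variation.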

The actual preferences of the participants surely vary less than the above graph suggests if it's taken at face value: the difference between those at the extremes and those in the middle would shrink with

A larger sample of dates per person
Better estimates of group consensus (based on ratings from a larger number of raters).

Still, the graph renders it plausible that the weight that people gave to attractiveness varied a lot, even if the variation was smaller than it is in the graph.

We could proceed to make "best guess" estimates of what the true distribution is, but we can get greater insight into what's going on by first adopting a shift in perspective.

Overall desirability and the tradeoffs

Participants rated each other on attractiveness, fun, ambition, intelligence and sincerity, as well as overall likeability. Ratings on the different dimensions were all correlated, sometimes strongly. (More here). This is partially explained by perceptions of somebody on one dimension influencing perceptions of the person on other dimensions (the Halo Effect). It could be partially explained by actual correlations between the underlying traits being measured. I'll explore possible explanations in greater detail in the future. From the point of view of understanding how people's preferences vary, the main point is that though we have 6 rating types, we have fewer than 6 independent pieces of information: ratings of intelligence aren't just ratings of intelligence, ratings of ambition aren't just ratings of ambition, etc.

We would like to throw out the redundant information so that we can focus on the essentials. A method that facilitates this is principal component analysis (PCA), an automated procedure that takes the 6 ratings as inputs and returns an output of 6 weighted averages of the ratings (called "principal components") that are independent of one another. The key point is that the procedure often compresses much of the information present in all of the variables into the first few principal components (something that it is designed to do), so that we can discard the other principal components with little cost, reducing the number of variables that we need to consider.

If we apply PCA to the 6 ratings, the first combination that the procedure gives is a weighted average where each rating gets almost equal weight:

good = 4*(Attractiveness) + 5*(Like) + 4*(Fun) + 4*(Intelligence) + 4*(Ambition) + 3*(Sincerity)

This can be thought of as corresponding to overall favorable impressions of somebody, so I named it "good." It captures roughly 60% of the information that was in the original ratings.

The second weighted average that PCA gives is not nearly as symmetric:

tradeoff = 4.5*(Attractiveness) + 3*(Like) + 3*(Fun) - 6*(Intelligence) - 2*(Ambition) - 5*(Sincerity)

This principal component reflects the fact that, once the variation captured by the first principal component is accounted for, the second largest source of variation is where the people being rated fall on a spectrum between the two poles

attractive, fun and likable <-------------> sincere, intelligent and ambitious

The first cluster of traits is more closely connected with mainstream romance than the second cluster of traits, which are thought of as positive, but less relevant.

The "tradeoff" combination captures roughly 20% of the information in the original ratings. So together, the first two principal components capture 80% of the information in the original ratings. We could look at the rest of the combinations that PCA gives us, but doing so would complicate the analysis without telling us much more.
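For readers who want to see the mechanics, here is a minimal sketch of PCA on six rating columns. It runs on synthetic data generated inside the snippet, not on the speed dating dataset, so the weights and percentages it produces will not match the ones quoted above. Standardizing each rating before the decomposition is an assumption; the post does not say how the ratings were scaled.

```python
import numpy as np

RATING_TYPES = ["attractiveness", "like", "fun",
                "intelligence", "ambition", "sincerity"]

def pca(ratings):
    """Return (components, explained_variance_ratio) for an (n, k) array.

    Each row of `components` holds the weights of one principal component,
    ordered from most to least variance explained.
    """
    x = np.asarray(ratings, dtype=float)
    x = (x - x.mean(axis=0)) / x.std(axis=0)  # standardize (an assumption)
    _, singular_values, vt = np.linalg.svd(x, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return vt, explained

# Synthetic ratings: a shared "overall impression" factor plus a contrast
# between the first three and the last three rating types, plus noise.
rng = np.random.default_rng(0)
n = 500
overall = rng.normal(size=(n, 1))
contrast = rng.normal(size=(n, 1))
signs = np.array([[1, 1, 1, -1, -1, -1]])
fake_ratings = overall + 0.6 * contrast * signs + 0.5 * rng.normal(size=(n, 6))

components, explained = pca(fake_ratings)
first_two_share = explained[0] + explained[1]
```

On data with this structure, the first component weights all six ratings with the same sign and the second contrasts the two clusters, which is the same qualitative pattern as the "good" and "tradeoff" components.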

Individual differences in romantic preferences

Having extracted the two principal components "good" and "tradeoff", we can examine how participants vary with respect to how their decisions depend on their partners' levels of each. Participants didn't vary very much with respect to their responsiveness to the "good" dimension. It's more interesting to examine how people differed with respect to preferences on the "tradeoff" dimension.

As background context, if we're content not to take into account differences in romantic preferences, we can model the probability of a participant's decision being yes by using a linear model for the log odds ratio:

LOR ~ 2*good + tradeoff + (general willingness to see partners again)

The fact that we're adding the tradeoff term rather than subtracting it corresponds to people tending to favor attractive and fun partners over intelligent and sincere partners, when forced to choose.

To individualize the model while attempting to correct for the variation that one would expect by chance, I followed Andrew Gelman's suggestion and used Bayesian hierarchical modeling. We replace the equation above with

LOR ~ 2*good + (personal tradeoff coefficient)*tradeoff + (general willingness to see partners again)

where "personal tradeoff coefficient" is a constant that depends on the individual making the decision.
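Both versions of the model can be read as a function from a partner's "good" and "tradeoff" scores to a probability of a yes decision. The sketch below assumes natural-log odds and scores on whatever scale the fitted coefficients refer to; neither is spelled out in the post, and the function and argument names are hypothetical. Setting `tradeoff_coef` to 1.0 gives the pooled model, and any other value gives an individualized one.

```python
import math

def yes_probability(good, tradeoff, willingness, tradeoff_coef=1.0):
    """Probability of a yes under
        log odds = 2*good + tradeoff_coef*tradeoff + willingness.

    Assumes the log odds are natural logs (an assumption, not stated in
    the post). `willingness` is the decider's general willingness to see
    partners again.
    """
    log_odds = 2.0 * good + tradeoff_coef * tradeoff + willingness
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical partner with a neutral "good" score and a high "tradeoff"
# score, seen by two deciders who differ only in their personal coefficient.
p_strong_preference = yes_probability(0.0, 1.0, 0.0, tradeoff_coef=1.8)
p_no_preference = yes_probability(0.0, 1.0, 0.0, tradeoff_coef=0.0)
```

With a coefficient of zero the "tradeoff" score has no effect at all, which is the situation at the far left of the distribution of personal coefficients.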

The plot below shows the distribution of best guess estimates for the personal tradeoff coefficients. The title of the plot is a loose description of the "tradeoff" principal component, the precise definition of which I gave above.

The lefthand tail corresponds to some people having exhibited virtually no preference for attractive and fun partners over intelligent and sincere partners. The righthand tail corresponds to some people's preference being almost twice as strong as average.

What this means in tangible terms

In my first draft of this post, I postponed discussion of statistical significance until later, but I subsequently realized that I could address it succinctly.

I formed the graphs below by:

Estimating participants' coefficients based on the first 65% of the dates that they went on. These dates are the train set for our model.
Forming "high" and "low" groups of participants according to whether their coefficients were in the top or bottom third.
Restricting consideration to those dates that were not in the first 65% of dates. These dates are the test set for our model.

Thus, the dates that I used to estimate the coefficients are completely disjoint from the dates that I used to form the graphs, so that we get unbiased estimates for the romantic preferences that the two groups of people would show in contexts similar to those of the study.
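The split described above can be sketched as follows. This assumes "the first 65%" means the first 65% of each participant's dates in the order they occurred, and that each participant's dates are held in a list; the data layout and function names are hypothetical. Estimating the coefficients themselves is not shown.

```python
def split_dates(dates_by_participant, train_fraction=0.65):
    """Split each participant's dates into a train part (the first
    `train_fraction` of them) and a test part (the rest)."""
    train, test = {}, {}
    for participant, dates in dates_by_participant.items():
        cut = int(len(dates) * train_fraction)
        train[participant] = dates[:cut]
        test[participant] = dates[cut:]
    return train, test

def high_low_groups(coef_by_participant):
    """Return the participants whose estimated coefficients fall in the
    bottom third ("low") and the top third ("high")."""
    ranked = sorted(coef_by_participant, key=coef_by_participant.get)
    third = len(ranked) // 3
    return {
        "low": set(ranked[:third]),
        "high": set(ranked[len(ranked) - third:]),
    }

# Hypothetical usage: the coefficients would be estimated from `train`
# only, and the graphs drawn from `test` only.
dates = {"p1": list(range(10)), "p2": list(range(20))}
train, test = split_dates(dates)
groups = high_low_groups({"a": 0.1, "b": 0.5, "c": 0.9,
                          "d": 1.2, "e": 1.5, "f": 2.0})
```

Because the groups are defined from the train dates alone, any difference between them that shows up on the test dates cannot come from reusing the same decisions twice.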

The first graph shows the frequency with which people's decision was 'yes' as a function of their partners' attractiveness level.

The slope is slightly larger for the group with high coefficient: you can see that the initial difference between the two groups in selectivity shrinks as one passes from partners with low attractiveness to high attractiveness.

The visual appearance of the graph understates the difference between the two groups: the high group virtually never expressed interest in people at the lowest part of the attractiveness spectrum, whereas people in the low group were several times more likely to. This comes across more clearly if we replace the percentage on the y-axis with the corresponding log odds ratio. Here "odds" has the same meaning that it does in gambling (e.g. roulette) and "log" refers to "logarithm." In the graph below, the 0 on the y-axis corresponds to decisions being yes 50% of the time, and an increase of 1 along the y-axis corresponds to the odds of a yes decision increasing by 2x:

From this, one sees that while the high group was ~4x more selective than the low group when it came to partners at the low end of the attractiveness spectrum, it was only ~1.5x as selective as the low group when it came to partners at the high end of the attractiveness spectrum.
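To make the conversion concrete, here is a small sketch of how a yes-frequency maps to log odds and how two groups can be compared by their odds. It assumes a base-2 logarithm, which is what "an increase of 1 corresponds to 2x" suggests, though the post does not name the base. The frequencies used are made up for illustration and are not read off the graphs.

```python
import math

def log_odds(p, base=2.0):
    """Log odds of a yes decision that occurs with frequency p.

    Base 2 is an assumption: with it, +1 on this scale doubles the odds.
    """
    return math.log(p / (1.0 - p), base)

def odds_ratio(p_a, p_b):
    """How many times larger the odds of a yes are for group A than B."""
    return (p_a / (1.0 - p_a)) / (p_b / (1.0 - p_b))

# Made-up frequencies: one group says yes 50% of the time to some set of
# partners, the other only 20% of the time.
even_odds = log_odds(0.5)          # 0 on the y-axis
ratio = odds_ratio(0.5, 0.2)       # the first group's odds are 4x higher
```

A gap that looks small in percentage terms near 0% or 100% can correspond to a large ratio of odds, which is why the log odds graph shows the group difference more clearly.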

The corresponding graphs with attractiveness replaced by intelligence and sincerity are

(Note the difference in scales on the axes: there was much less variation in perceptions of sincerity and intelligence than there was in perceptions of attractiveness.)

One sees that past a certain point, the high group is not responsive to increasing sincerity and intelligence, whereas the low group is.

Of course, the high group and the low group don't differ most with respect to their responsiveness to attractiveness, or intelligence, or sincerity as individual traits. They differ the most in how they respond to a tradeoff between attractiveness/fun and intelligence/sincerity. The graph that depicts this is:

In passing from partners for whom the tradeoff term is lowest to partners for whom it's highest, the odds of being selected by members of the low group increase by 5.5x, whereas the odds of being selected by the members of the high group increase by only 1.4x.

The differences between the groups correspond to generalizable phenomena. In fact, I knew that the differences are statistically robust and generalizable before even doing a train/test split as I did above. What made it obvious to me is that the tradeoff coefficient correlates with many other features of the participants that were collected prior to the events...

To Be Continued...

The question now arises: who are the people who lie at the two ends of the continuum between relative preference for attractiveness/fun and relative preference for intelligence/sincerity? How did they spend their time? What career paths did they pursue? How did members of the opposite sex view them?

I'll offer partial answers to these questions in my next post. Readers who are intrigued can take a look at the survey instrument for a list of features present in the dataset, and guess which features correlated with the personal tradeoff coefficient.

I would like to know the third principal component. Even if you don't go further and calculate a lot based on it, it could be somewhat interesting.

Sure, it's a sincerity vs. ambition axis (very low on the other rating types) explaining an additional 10% of the variance. It correlates only very weakly with the other features available, aside from career and field of study. There is potentially statistically significant individual variation in responsiveness to it, but the variance is only about 25% as great as in the case of the "tradeoff" principal component.

Which end of that axis do most people prefer?

The sign of the coefficient isn't statistically significant if one looks at the entire population rather than individualizing the model. I think it's probably not picking up on anything deep, and that the variation in individual responsiveness may reflect spurious correlations that are specific to the context of the study.

I'd be interested to see how the 'goal' category in the survey aligned with the tradeoff coefficient. I can see people looking for a lot of different things depending on whether they are looking for a quick fun date, or a long-term relationship.

The distribution of stated goals was

Seemed like a fun night out (~40%)
To meet new people (~40%)
To get a date (~10%)
Looking for a serious relationship (~3%)
To say I did it (~ 6%)
Other (~ 4%)

The fact that the percentage who stated that they were looking for a serious relationship is so small is itself a puzzle – my best guess is that people's answers were biased by the order in which the choices were presented.

Those who chose answer (4) did in fact have lower tradeoff coefficient, but the effect size is small enough so that given the small sample size, it's not statistically significant. The other answers correlate with the tradeoff coefficient to varying degrees, but the effect sizes are small, and none are statistically significant at the 1% level.

what made it obvious to me is that the tradeoff coefficient correlates with many other features of the participants that were collected prior to the events...

Did you look at how one's position on the tradeoff axis impacted their personal tradeoff coefficient? I'd weakly expect a positive relationship (weakly because I can see lots of arguments that point different directions).

Yes, this is in fact the case. The two things have a number of correlates in common.

Yo, attraction guy, any reason nerds like us like glasses?

The data has evidence for your claim, while not offering an explanation. Demographics didn't predict very much about perceptions of attractiveness, but though I didn't document it, I found that people in academia perceived people in academia to be slightly more attractive than others perceived people in academia to be, even after controlling for differences in rating scales, etc. I might write about this later.

I'm not in academia. But women with glasses are more attractive to me regardless.

Yes, but it's still relevant evidence.

What's the shaded area in the very first plot? Usually this area is one deviation around the fit line, but here it's clearly way too small to be that.

I don't have a good answer to that. The curve was generated using LOESS, which I haven't studied, and I assume that the shaded area has an interpretation in that framework.

I suspect it's parameter uncertainty rather than data uncertainty--that is, instead of showing the fit plus/minus one stdev so you can check that about two-thirds of the data points fall in that rectangle, it's giving you a sense of what family of fit lines all fit the data 'well enough' (i.e. within some distance of the best fit).

That's probably it. When fitting a line using MCMC you'll get an anticorrelated blob of probabilities for slope and intercept, and if you plot one deviation in the fit parameters you get something that looks like this. I'd guess this is a non-parametric analogue of that. Notice how both grow significantly at the edges of the plots.

It seems that I've been the only commenter who has not gotten a reply from the author. Thank you very much.

I genuinely appreciate your interest. I'm not selectively discriminating against you: note that you left your comments long after most commenters did – replying to comments was high priority to me then, but I have a lot of other things on my plate at the moment. I'll reply when I get a chance.

I’m sorry for the delay, but I came across these articles accidentally through Google, long after their date of publication. I didn't even know about this website until recently.

I guess you're not interested in performing more analysis of mate choice, since you're a mathematician and I guess you were focused mainly on just developing statistical analysis of data.

Anyway, speaking as someone with an education in evolutionary biology, I feel that a synthesis of descriptions of human mating systems is needed. Quantifying the shape and strength of mating preferences is a vital component of the study of sexual selection and reproductive options, but the influence of experimental design on these estimates is often unclear in the studies I've been reading.

Online dating offers a platform for obtaining large samples from a naturalistic setting, which should confer high external reliability. In speed dating, by contrast, all the interactions are enforced: they are not the result of courtship. There are other issues too, such as low population density (number of daters) and operational sex ratios that are artificially kept near 1:1. Speed dating eliminates the pre-selection component of human mating, whereas in other systems, such as natural human leks (nightclubs, bars, etc.) and online dating, attracting attention is the first goal. Attention is elicited through the display of signals that excite the interest of possible mates. In the mating field, non-verbal solicitation is mainly done by the female, as a basis for the male's decision to approach her.

What most studies tell us is that physical attractiveness (the independent variable) is the limiting factor for both sexes (other attributes act as dependent variables), so I'm going to focus on this parameter to address other issues.

Attractiveness ratings:

In this article you illustrate how revealed preferences (i.e., preferences inferred through a speed dating event) can be used to investigate the nature of mate preferences. You describe how revealed preferences can be estimated and how the reliability of these estimates can be established. Then revealed preference estimates were used to explore the level of consensus in judgments of who is and is not attractive and whether revealed preferences are systematically related to self-reported mate preferences and personality traits.

Some of the graphs are pretty opaque to me. I'd like to ask whether it would be possible to show other types of graphs that would make the data clearer.

Participants of both genders showed substantial consensus in judgments of whom they found attractive and unattractive, but which sex showed higher consensus? Is the standard deviation in your speed dating study of attraction ratings for a specific opposite-sex face on average smaller when looking at a specific gender?

It seems that in most studies women have a higher variance in ratings of sex-objects than men (Jankowiak et al. 1992; Townsend & Wasserman 1997). But this should be taken with a grain of salt because attractiveness rankings have much higher variation when ranking males as opposed to females.

Schulman & Hoskins (1986) found that ratings of female photos had statistically significantly lower variance than ratings of male photos, for both male and female raters. Thus, the effect could partially be that both sexes are worse at judging the attractiveness of males.

There is never going to be rigorous agreement on any kind of informal attractiveness metric, so the subjective discussions are missing the point. And here’s where something interesting begins to happen with this whole rating system. The more imbalanced the mating dynamic becomes, the more asymmetric – in terms of their distribution between the male and female populations – these rankings become.

In their study there is a notable absence of individuals at the extremes of attractiveness. Rather, future work might best reveal decision rules by manipulating the distribution of quality among potential mates; such manipulations would predict whether people, mainly females, are using sample-based or threshold-based decision rules. So, it comes to a point that I've usually observed in my own experiments: male ‘ratings’ are bottom heavy in distribution, while female ‘ratings’ are top heavy (meaning there are more female 7s than male 7s, by virtue of the fact that a female 7 has a greater probability of attracting a male 7 than the reverse). This does not seem to be the case in your study, though.

Furthermore, it would also be interesting to know the assortative/disassortative mating coefficients. How do perceptions of male attractiveness differ from perceptions of female attractiveness? I know that a speed dating event does not represent a potentially robust source of attractiveness data, and it is clear that the site’s audience may not be very representative of the population as a whole. Still, I’d like it if you could address one aspect of that problem by attempting to determine whether and how the distribution of male attractiveness in your speed dating sample differs from the distribution of female attractiveness (the female/male population distribution). It seems that your graph (which does not support the claim that females are more selective, given that rating skew is a corollary of selectivity, and which has too many confounders in the data to rely upon too strongly) differs substantially from those found here:

http://onlinelibrary.wiley.com/doi/10.1111/jomf.12072/full

http://blog.okcupid.com/index.php/your-looks-and-online-dating/

And what are the distributions of yes/no decisions? Intuitively, several answers to this question seem plausible. On one hand, it seems anecdotally to be true that in our study there are hardly any extremely attractive people, many average-looking people, and few extremely unattractive people. Such logic could lead one to predict a normal attractiveness distribution, as in your findings.

But the OkCupid blog and the Kreager/Cavanag study, for example, find this Gaussian distribution only in the male population distribution, since women rate 80% of guys as worse-looking than medium.

I'd like to know whether participants’ ratings of hypothetical partners, for example, reflect whom they would actually choose to date (yes/no). I don’t understand the distribution of decisions/ attractiveness angles chart. What's the relationship between individuals’ own physical attractiveness (as rated by other users) and the attractiveness of the people they wanted to meet?

“men’s decision were yes for 48% of the dates in the sample, and women’s decisions were yes for 33% of the dates in the sample.”

I'd like to know whether participants’ ratings of hypothetical partners, for example, reflect whom they would actually choose to date (yes/no), i.e., how the percentage (and number) of acceptances is distributed across each attractiveness range of males and females in this system. For example, for a woman rated 4 in attractiveness, what is the total number of yeses she gave across all male daters? And for a woman rated 6? And of those acceptances, what percentage went to men rated 1…4, 5, 6, and so on? In other words, how are their offers distributed across the spectrum of quality, given that the optimal threshold depends on the attributes of prospective mates (and on her own quality) and on the distribution of their quality?

On the other hand, it would be interesting to try to find out here whether there is a genetically determined threshold (threshold-based decisions) or any other unlearned threshold (sample-based decisions). These considerations would also reveal a simple algorithm by which the threshold could be learned. Peter M. Todd et al. tried to run that test. See http://141.14.165.6/CogSci09/papers/547/paper547.pdf

First, it's important to know, by analyzing yes/no rates, whether less attractive people are accepting less attractive dates (my own analysis of data from online dating suggests that this is not the case) or are focusing on the most desirable opposite-sex individuals. And, by analyzing attractiveness data, whether less attractive individuals' assessments are higher than those of the most attractive ones (i.e. whether less attractive people do not delude themselves into thinking that their dates are more physically attractive than others perceive them to be). True, the absence of highly attractive individuals (the top of the beauty scale) in the study sample could be a conditioning problem.

It would be necessary to introduce several highly attractive daters (>8 points) into the sample to see whether these data tend to remain constant or, conversely, betray a predictable pattern demonstrating a near-universal preference for this very narrow range of male/female physical phenotypes.

The absence of subjects who can be classified as highly attractive (above 8) is a mistake. A speed dating event does not represent a potentially robust source of attractiveness data (small sample size). It is also clear that the site’s audience may not be very representative of the population as a whole. Most people are within the medium spectrum, and only a handful are good-looking.

I would say that real mate choice (in broader mating leks) is concentrated in a narrow population range, especially in female choice, since the most reliable data and investigations (online dating and field courtship) agree on this. And what this tells us is that since physical attractiveness was a limiting factor for BOTH sexes, and women are MORE selective in assessing attractive males, women are MORE likely (than men) to cull prospects according to assessments of physical attractiveness, with women tending to fixate on the top ~10-20% of males (see: Freakonomics data, http://jonmillward.com/blog/attraction-dating/cupid-on-trial-a-4-month-online-dating-experiment/, or my own experiment: https://sirtyrionlannister.wordpress.com/2014/02/23/female-mating-skew-ii-supported-by-online-dating-experiment/). Considering the bottom 80% of males as, inexplicably, less than average (see OK Cupid data), the variance in that top 10-20% tends to split a lot of trivial hairs (making the differences harder to quantify with respect to an attractiveness ranking system).

Congratulations on your work; it is a great statistical analysis. Anyway, I would like to raise some questions and suggestions:

Do you consider that speed dating is a mating system with enough ecological validity?

Online dating and speed dating are a new line of studies, which is a recent, highly artificial cultural construct. It is true that the speed dating method has increased ecological validity compared to traditional self-report methods (and superior identification of variables compared to archival methods like data-mining printed personal ads), but its ecological validity is limited compared to other mating leks like online dating and mating field research, since speed dating manipulates the normal parameters of courtship (i.e. selective interactions) sufficiently to ‘confuse’ our perceptions.

A lot of inconsistencies in a lot of speed dating findings could be due in part to the variation in experimental context and potential lack of ecological validity. I always suggest working with data from other kinds of studies, drawn from more natural mating leks, which attempt to overcome these limitations in an ecologically valid experimental paradigm: online dating research and, mainly, field courtship research.

You should work with online dating because it provides an ecologically valid or true-to-life context for examining the risks, uncertainties and rewards of initiating real relationships with real people at an unprecedented scale. I think the confusion happens because speed-dating precludes selective interactions which would normally limit spurious signals of interest, keeping them to a minimum (ie. that courtship interactions imply genuine interest is generally a justified assumption in other contexts). But, in the case of speed-dating, the lack of selective interactions ensures a high incidence of spurious signalling, which is (apparently) difficult to falsify.

Moreover, because speed-dating is such an unconventional courtship practice (i.e. everyone meets everyone else), the non-rotating sex (traditionally females) may be confusing friendly/courteous body language for genuine signals of interest (since such signals DO tend to imply interest in other settings – but not necessarily in a speed dating context, where these behaviors are procedurally manipulated). It could result in overconfidence (i.e. assuming a genuine level of interest from everyone who sits down with you and ‘smiles’, etc.), and thus higher selectivity, for the non-rotating sex (again, traditionally females). It is simple to test this hypothesis, but I won’t be holding my breath.

Online dating offers a platform for obtaining large samples from a naturalistic setting, which should confer high external reliability, and actual patterns of communication in a real online dating system are similar to those occurring in real life. In speed dating, everyone meets everyone else. By contrast, a sequential aspiration-level model of mate choice applies within human mating leks such as online dating and field dating. First an individual assesses a sexual prospect’s trait values in sequence, from those most easily perceived through those that take longest to judge. Individuals assign an aspiration level to each of four traits (visual scanning, conversation, dating, exclusive dating, marriage), and pursue further courtship only with prospects who exceed each successive aspiration level. These aspiration-level ‘hurdles’ can be set at different heights depending on the individual’s sexual strategy and mating goals, and can be set relative to one’s own trait values, or relative to a population percentile value.

It might be adaptive to filter prospects through a series of aspiration levels, using physical appearance to decide whom to talk to, conversation to decide whom to form a short-term relationship with, and psychological compatibility to decide whom to form a long-term relationship with. In this sequential aspiration model, mate choice sets up a series of hurdles, and the courting individual tries to jump over them.

Potential tool leveraging relative pragmatism and honesty of the LW community: "Hot or Not?" or attractiveness-rating app for members, done by the opposite gender, focused on physical attractiveness and specific criticism on what works and what doesn't.

Routes around anxiety/weirdness of doing this IRL, specifically the honest commentary part.

There was some discussion of the subject around this thread from 4 years ago (I don't remember exactly where – it could have been in the comments, or in a subsequent post).

Interesting that the people who were more responsive to attractiveness were also more selective in general. Any idea why this is?

Sure, the two most probable explanations seem to be:

The people in the high group were themselves considered more desirable, and so could afford to be more choosy.
The people in the high group tended to rate others as less attractive, corresponding to somebody needing to be at a higher percentile in order for people in the high group to be attracted to them.

These two things may seem to be completely orthogonal, but they're not... Giving low ratings to other people's attractiveness was correlated with being attractive oneself. I suspect that there's an element of acclimation /hedonic treadmill – when somebody has the option of dating many people, the less attractive of them start to seem less attractive than they would be if one didn't have so many options.

And the low group's apparent substantially greater responsiveness to sincerity and intelligence once one passes the 50th percentile hints that the difference between the two groups isn't cleanly captured by "desirable people who can be choosy vs. undesirable people who have to settle," though there does appear to be a component of that, and there are also statistical subtleties around the question.