Epistemic status: speculating about things I'm not familiar with; hoping to be educated in the comments. This post is a question, not an answer.
ETA: this comment thread seems to be leading towards the best answer so far.
There's a question I've seen many times, most recently in Scott Alexander's latest links thread. That variant goes like this:
Old question “why does evolution allow homosexuality to exist when it decreases reproduction?” seems to have been solved, at least in fruit flies: the female relatives of gayer fruit flies have more children. Same thing appears to be true in humans. Unclear if lesbianism has a similar aetiology.
Obligate male homosexuality greatly harms reproductive fitness. And so, the argument goes, there must be some other selection pressure, one great enough to overcome the drastic effect of not having any children. The comments on that post list several other proposed answers, all of them suggesting a tradeoff vs. a benefit elsewhere: for instance, that it pays to have some proportion of gay men who invest their resources in their nieces and nephews instead of their own children.
But how do we know if this is a valid question - if the situation really needs to be explained at all?
For obvious political and social reasons, it's hard to be sure how many people are homosexual. Note that we are interested only in obligate homosexuality - bisexuals presumably don't have strongly reduced fitness. The Wikipedia article doesn't really distinguish obligate homosexuality from bisexuality, pansexuality, or even transsexuality. The discussion in the SSC comments used an (unsourced?) range of 1%-3%, which seems at least consistent with other sources, so let's run with that.
The rate of major birth defects in the US, as reported by the CDC, is also about 3%. This counts both developmental and genetic problems, and includes everything from anencephaly (invariably fatal) through Down syndrome (severe but survivable) to cleft palates (minor). But most of these, at least 1.5% of births, were always fatal before modern medicine, and many of the others reduced fitness (via mate selection, if nothing else). Various other defects and diseases, which only manifest later in life, are also thought to be influenced or determined during early development. And so is sexual preference.
(Whether homosexuality is a developmental disorder is not the point; I'm comparing the effect of selection pressure on fatal teratology with its effect on reduced-fitness homosexuality.)
Embryological development is a complex and fragile process, and there are many ways for it to go wrong. We don't wonder how it is possible that selection pressure allows anencephaly to occur in 1 in 4859 births. There are certainly direct causes of anencephaly, explanations of why it happens when it does, but (I think) we don't a priori expect them to be due to tradeoffs yielding benefits elsewhere. It's just as plausible that the tradeoffs involved are against even worse (counterfactual) problems elsewhere - or that there are just no available mutations that don't have these or equally severe problems.
Could it be that linking sexual preference to biological sex is, for some complex developmental reason, fragile enough that it goes wrong despite all selection pressure to the contrary, that it has no redeeming qualities from the viewpoint of evolution, and that is all there is to it?
When faced with any phenotype with reduced fitness, how can we judge if there is something to be explained - a beneficial tradeoff elsewhere to search for - or merely a hard problem evolution couldn't solve completely? And is there a way to quantify this question, relating it to the known mathematical models of genetics?
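One possible starting point for quantifying this, offered as a rough sketch rather than an answer, is the textbook mutation-selection balance model. There, a deleterious allele that arises by mutation at rate mu per generation and is selected against with coefficient s settles at a frequency of roughly mu/s if it is dominant and sqrt(mu/s) if it is recessive. If the observed frequency of a phenotype is far above what plausible values of mu and s allow, that hints at a compensating benefit worth searching for; if it is within range, "fragile development" may be explanation enough. The function names and the fitness cost plugged in below are illustrative assumptions, not measurements, and the model ignores polygenic and non-genetic developmental causes.

```python
import math


def equilibrium_allele_freq(mu: float, s: float, recessive: bool) -> float:
    """Approximate mutation-selection balance frequency of a deleterious allele.

    mu: mutation rate toward the deleterious allele, per gamete per generation.
    s:  selection coefficient against affected individuals (0 < s <= 1).
    """
    if not (0 < s <= 1) or mu < 0:
        raise ValueError("need 0 < s <= 1 and mu >= 0")
    ratio = mu / s
    # Recessive alleles hide in heterozygotes, so they reach sqrt(mu / s);
    # dominant alleles are exposed immediately and stay near mu / s.
    return math.sqrt(ratio) if recessive else ratio


def affected_fraction(mu: float, s: float, recessive: bool) -> float:
    """Expected fraction of affected individuals under Hardy-Weinberg proportions."""
    q = equilibrium_allele_freq(mu, s, recessive)
    if recessive:
        return q * q  # only homozygotes show the phenotype
    return 1 - (1 - q) ** 2  # carriers of at least one copy show the phenotype


def required_mutation_rate_recessive(phenotype_freq: float, s: float) -> float:
    """Invert the recessive case: q^2 = mu / s, so mu = phenotype_freq * s."""
    return phenotype_freq * s


if __name__ == "__main__":
    # Hypothetical inputs: 2% prevalence (inside the 1%-3% range used above)
    # and an assumed, made-up fitness cost of s = 0.5.
    mu_needed = required_mutation_rate_recessive(0.02, 0.5)
    print(f"total mutation rate needed: {mu_needed:.4f} per gamete per generation")
```

With these made-up inputs the required rate comes out at about one new mutation per hundred gametes, which, as far as I understand, is far above typical single-gene mutation rates. Whether a large mutational target or plain developmental noise could supply that much is exactly the open question.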
Notes:
1. I'm posting this in the spirit of recent suggestions to post more and accept lower quality of (our own) posts to Discussion.
2. I'm going to sleep now and will start replying to comments about 10 hours from now; sorry for the inconvenience.
Lots of things are simple. If the world is not simple, inference is impossible. Many things turn out to be straightforward; as complex and intricate a phenomenon as AIDS is, 'HIV causes AIDS' is much more accurate than 'AIDS is not determined by any one factor but by a combination of genetic, hormonal, and environmental influences; in recent years, biologically-based theories have been favored by experts...' In statistical modeling, it's far from surprising to discover that a few variables have most of the predictive value and that it's only the last few percent which require extreme complexity to predict or model.
You have explained why inference is hard in biology :-)
A technical term for the "problem" is pleiotropy.
Many small-scale biological features are re-used over and over; if they break, many things can break a bit. Primary ciliary dyskinesia is an example of this. In the meantime, many complex adaptive structures (like "the ability to hear") are caused by more than one subcomponent, so any of several different subcomponents breaking can produce a symptomatically similar disruption of...