NancyLebovitz comments on [LINK] Why I'm not on the Rationalist Masterlist - Less Wrong Discussion

21 Post author: Apprentice 06 January 2014 12:16AM

Comments (866)

Comment author: NancyLebovitz 06 January 2014 04:00:57PM 7 points [-]

I'm not sure I understand your question, but eliminating the left tail of a bell curve would change the average but not necessarily extend the right tail.

Comment author: Douglas_Knight 06 January 2014 08:48:16PM *  2 points [-]

What exactly happens depends on the model, but I think it would be very difficult to build a model with nonzero heritability that produced a bell curve and where truncating the left tail did not affect the right tail.

Usually bell curves occur from the sum of many small discrete variables. That appears to be true for IQ. Under this model, any form of selection has basically the same effect, at least in the long term. If the old equilibrium had random mating and the next generation is also produced by random mating, then a new bell curve will be produced in the very next generation. If the old distribution were due to assortative mating, and that continues, it will take longer to reach equilibrium. But it will affect the right tail eventually.

Added: no, it takes more than a generation to reach equilibrium.

Comment author: Eugine_Nier 07 January 2014 05:29:50AM 5 points [-]

What exactly happens depends on the model, but I think it would be very difficult to build a model with nonzero heritability that produced a bell curve and where truncating the left tail did not affect the right tail.

Well, since IQ is forced to be a bell curve by definition, the fact that it is a bell curve doesn't count as evidence for anything.

Comment author: jsalvatier 07 January 2014 06:44:49PM 2 points [-]

IQ tests are normalized (so they have a median of 100 and standard deviation of 15, but they are not forced to be normally distributed), so I think the distributional properties can be evidence for something.

Comment author: Douglas_Knight 07 January 2014 07:39:52PM 3 points [-]

I think you are mistaken and they simply are forced to be bell curves.

But even if IQ is an affine transformation of the number of questions answered correctly, the simple act of adding up the questions is likely to produce a bell curve, so its appearance is not much evidence.
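
A minimal sketch of the "adding up the questions" point, assuming 50 independent questions whose pass probabilities are spread evenly between 0.1 and 0.9 (both numbers are illustrative, not taken from any real test). It computes the exact distribution of the raw score and measures how far it is from a normal density with the same mean and variance.

```python
from statistics import NormalDist

def raw_score_pmf(pass_probs):
    """Exact distribution of the number of correct answers for independent questions."""
    pmf = [1.0]  # with zero questions, the score is 0 with certainty
    for p in pass_probs:
        nxt = [0.0] * (len(pmf) + 1)
        for score, mass in enumerate(pmf):
            nxt[score] += mass * (1 - p)   # this question answered wrong
            nxt[score + 1] += mass * p     # this question answered right
        pmf = nxt
    return pmf

# Hypothetical test: 50 questions, pass probabilities from 0.1 to 0.9.
pass_probs = [0.1 + 0.8 * i / 49 for i in range(50)]
pmf = raw_score_pmf(pass_probs)

mean = sum(pass_probs)
variance = sum(p * (1 - p) for p in pass_probs)
normal = NormalDist(mean, variance ** 0.5)

# Largest pointwise difference between the exact pmf and the normal density.
max_gap = max(abs(mass - normal.pdf(score)) for score, mass in enumerate(pmf))
print(f"mean={mean:.1f} sd={variance ** 0.5:.2f} max gap to normal density={max_gap:.4f}")
```

The close match depends on the questions being independent of one another, which is an assumption of this sketch rather than a property of real tests.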

Comment author: Viliam_Bur 08 January 2014 11:43:21AM *  5 points [-]

I confirm that IQ tests are forced to be bell curves; at least those using the methodology I learned at university.

Calibrating the test (giving it to many people) returns information like: "50% of test subjects can solve 23 problems of these 50" and "98% of test subjects can solve 41 problems of these 50".

Then the next step is to put these data in the bell curve, saying: "therefore 23/50 means 0 sigma = 100 IQ" and "therefore 41/50 means 2 sigma = 130 IQ".

But you can't assume that this is linear. To explain it simply, let's assume that the more intelligent person always solves a superset of the problems the less intelligent person solved. Therefore, any person with IQ between 100 and 130 would solve all the 23 "easy" problems, some of the 18 "hard" problems, and none of the 9 "impossible" problems. But how many exactly -- that depends on how difficult exactly those "hard" problems are. Maybe they are relatively easy, and a person with IQ 115 will solve all of them; and maybe they are relatively hard, and a person with IQ 115 will solve none of them. But that is a fact about the test, not about the intelligence distribution of the population. Therefore this fact should be removed in the normalization.
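
A sketch of what such a percentile-based mapping could look like, assuming a mid-rank percentile rule and an invented calibration sample; `build_iq_table` is a hypothetical helper, not the procedure of any particular published test. The calibration sample here is deliberately flat (one person per raw score), yet the resulting IQ values follow the bell curve because only the rank of each raw score is used.

```python
from collections import Counter
from statistics import NormalDist

IQ_SCALE = NormalDist(mu=100, sigma=15)

def build_iq_table(calibration_scores):
    """Map each raw score seen in calibration to an IQ via its percentile rank."""
    n = len(calibration_scores)
    counts = Counter(calibration_scores)
    table, below = {}, 0
    for score in sorted(counts):
        # Mid-rank percentile: everyone strictly below, plus half of those tied.
        percentile = (below + 0.5 * counts[score]) / n
        table[score] = IQ_SCALE.inv_cdf(percentile)
        below += counts[score]
    return table

# Hypothetical calibration sample with a flat raw-score distribution.
flat_sample = list(range(1, 101))
table = build_iq_table(flat_sample)
print(round(table[50], 1), round(table[98], 1))
```

Equal steps in raw score do not become equal steps in IQ under this mapping: one extra correct answer near the top of the range moves the IQ more than one extra correct answer near the middle.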

Comment author: Eugine_Nier 10 January 2014 02:49:39AM -2 points [-]

Therefore this fact should be removed in the normalization.

Perhaps, but it doesn't follow that the new normalization should be Gaussian. One test I'd like to see is what happens when you give a test calibrated for one population to a different one.

Comment author: Viliam_Bur 10 January 2014 09:38:59AM *  1 point [-]

If the test is normalized for a population A, then if we give it to a population B, the results don't have to be Gaussian. The normalization occurs only once, when the relationship between the raw scores and the IQ values is defined. Later the existing definition can be reused.

You would get a somewhat different shape when you a) calibrate the test for population A and then measure population B, or b) calibrate the test for A+B and then measure population B.

Probably the most correct way to compare two populations would be to skip the normalization step and just compare the histograms of raw scores for both populations. (I am not good enough at math to say how exactly.)

Also, I am not sure how much such comparison would depend on the specific test. Let's imagine that we have one population with average IQ 100 and other population with average IQ 120. If we give them a test consisting of IQ-110-hard questions, the two populations will probably seem more different than if we give them a test consisting of a mix of IQ-80-hard and IQ-140-hard questions.
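
A rough sketch of this thought experiment, borrowing the simplification from earlier in the thread that a person solves a question exactly when their IQ is at least the question's difficulty. The standard deviation of 15 for both populations and the half-and-half question mix are assumptions made for illustration.

```python
from statistics import NormalDist

def expected_fraction_solved(population, difficulties):
    """Average share of questions solved, if a question is solved iff IQ >= difficulty."""
    return sum(1 - population.cdf(d) for d in difficulties) / len(difficulties)

pop_100 = NormalDist(100, 15)  # assumed SD of 15
pop_120 = NormalDist(120, 15)  # assumed SD of 15

medium_test = [110] * 10            # every question is IQ-110-hard
mixed_test = [80] * 5 + [140] * 5   # half IQ-80-hard, half IQ-140-hard

gap_medium = (expected_fraction_solved(pop_120, medium_test)
              - expected_fraction_solved(pop_100, medium_test))
gap_mixed = (expected_fraction_solved(pop_120, mixed_test)
             - expected_fraction_solved(pop_100, mixed_test))
print(f"gap on IQ-110 test: {gap_medium:.3f}, gap on mixed test: {gap_mixed:.3f}")
```

Under these assumptions the two populations differ by roughly half of the questions on the first test and by under a tenth of them on the second, even though the underlying populations are the same in both cases.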

Comment author: NancyLebovitz 10 January 2014 05:19:45PM 1 point [-]

This backs my general notion that for a lot of measurements (especially of people?), we need graphs, not single numbers.

Comment author: Eugine_Nier 11 January 2014 07:46:14AM *  -1 points [-]

Also, I am not sure how much such comparison would depend on the specific test. Let's imagine that we have one population with average IQ 100 and other population with average IQ 120. If we give them a test consisting of IQ-110-hard questions, the two populations will probably seem more different than if we give them a test consisting of a mix of IQ-80-hard and IQ-140-hard questions.

You can compare by looking at which percentile of population B the median of population A corresponds to.

Edit: also once you've compared several populations this way, you can try to see if there is a way to normalize the test such that the distributions for all the populations have similar shapes.
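
As a sketch, that comparison can be computed directly from raw scores without any normalization step. The score lists below are invented placeholders, and `percentile_in` is a hypothetical helper written for this example.

```python
from bisect import bisect_right
from statistics import median

def percentile_in(population_scores, value):
    """Percentage of population_scores that are less than or equal to value."""
    ordered = sorted(population_scores)
    return 100.0 * bisect_right(ordered, value) / len(ordered)

# Invented raw scores for two populations taking the same test.
scores_a = [18, 20, 22, 23, 25, 27, 30]
scores_b = [22, 24, 26, 28, 30, 33, 35, 38]

median_a = median(scores_a)
print(f"median of A = {median_a}, which sits at percentile "
      f"{percentile_in(scores_b, median_a):.1f} of B")
```

Because it only uses ranks, this measure does not change under any monotone rescaling of the raw scores.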

Comment author: Lumifer 08 January 2014 03:46:33PM *  -1 points [-]

Then the next step is to put these data in the bell curve, saying: "therefore 23/50 means 0 sigma = 100 IQ" and "therefore 41/50 means 2 sigma = 130 IQ".

This is NOT forcing the outcome to be a bell curve. This is just normalizing to a given mean and standard deviation, a linear operation that does not change the shape of the distribution.

Consider a hypothetical case where an IQ test consists of 100 questions and 100 people take it. These hundred people all get a different number of questions correct -- from 1 to 100: the distribution of the number of correct answers is flat or uniform over [1 .. 100]. Now you normalize the mean to 100 and one standard deviation to 15 -- and yet the distribution remains flat and does not magically become a bell curve.
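
A small sketch of this hypothetical, taking "normalize" to mean the affine rescaling described here (subtract the mean, divide by the standard deviation, then scale to 100 and 15). The spacing between neighbouring scores stays constant, so the rescaled distribution is exactly as flat as the raw one.

```python
from statistics import mean, pstdev

raw = list(range(1, 101))  # one person per score, 1 through 100
mu, sigma = mean(raw), pstdev(raw)

# Affine rescaling to mean 100 and standard deviation 15.
iq = [100 + 15 * (x - mu) / sigma for x in raw]

gaps = [b - a for a, b in zip(iq, iq[1:])]
print(f"mean={mean(iq):.1f} sd={pstdev(iq):.1f} "
      f"min gap={min(gaps):.4f} max gap={max(gaps):.4f}")
```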

Comment author: Viliam_Bur 08 January 2014 04:58:36PM *  4 points [-]

These hundred people all get a different number of questions correct -- from 1 to 100: the distribution of the number of correct answers is flat or uniform over [1 .. 100].

This is a fact about the test.

Now you normalize the mean to 100 and one standard deviation to 15 -- and yet the distribution remains flat and does not magically become a bell curve.

Maybe it was wrong for me to use the word "normalization" in this context, but no, the distribution of raw scores is not mapped linearly to the distribution of IQs. It is mapped onto the bell curve.

Otherwise every intelligence test would produce a different intelligence curve, because inventing 100 questions such that they get the same distribution of raw scores as some other set of 100 questions, that would be an impossible task. (Just try to imagine how you would try to obtain the set of 100 questions for which the distribution of raw scores is linear. Keep in mind that every testing on many real subjects costs you a lot of money, and on a few subjects you won't get statistical significance.)

Comment author: Lumifer 08 January 2014 05:04:15PM -1 points [-]

the distribution of raw scores is not mapped linearly to the distribution of IQs. It is mapped onto the bell curve.

Could you provide links showing this to be the case?

because inventing 100 questions such that they get the same distribution of raw scores as some other set of 100 questions, that would be an impossible task.

There is a helpful theorem.

Comment author: [deleted] 08 January 2014 06:40:02PM 1 point [-]

There is a helpful theorem.

It assumes that all the variables you're summing are independent.
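
A sketch of why that assumption matters, using a made-up population: half the test-takers pass each of 50 questions with probability 0.2, the other half with probability 0.8. Within each group the questions are independent, but across the whole population they are correlated through group membership, and the raw-score distribution has two humps instead of one bell.

```python
from math import comb

N_QUESTIONS = 50  # illustrative test length

def binomial_pmf(n, p, k):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def mixed_pmf(k):
    """Raw-score distribution for an equal mix of a weak and a strong group."""
    return (0.5 * binomial_pmf(N_QUESTIONS, 0.2, k)
            + 0.5 * binomial_pmf(N_QUESTIONS, 0.8, k))

# Compare the two group peaks (10 and 40) with the midpoint (25).
for k in (10, 25, 40):
    print(k, f"{mixed_pmf(k):.6f}")
```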

Comment author: Douglas_Knight 07 January 2014 06:53:59AM 0 points [-]

Oh, yeah. But I think it is probably true that it is difficult to build a model of a continuous trait in which truncation of one tail does not affect the equilibrium of the other tail.

The more relevant point is additive heritability (aka h^2 or narrow-sense heritability). Any model will have some, so my condition of having any is not helpful. But if a trait has a lot, that means the trait is pretty close to counting genes, hence the distribution must be a bell curve. But that doesn't mean that it is a constraint on models.

Comment author: Douglas_Knight 07 January 2014 01:49:39AM *  0 points [-]

Here's a short-term analysis that may be more convincing.

I assume perfect heritability and pm's choice of 50% selection, both to make the effects larger. I assume additive genetics because that's what we expect from the assumption of a bell curve. The far right tail is largely produced by two parents who are both in the right half, or even on the tail. The farther right you go, the more true this is. Assume mating is at random. For each person who could have a right-tail child, if only they found the right mate, eliminating the half of the population that wouldn't do doubles their odds of finding an appropriate mate and thus of having a right-tail child. Thus, the right tail is twice as big. The further out we go, the closer it is to twice as big. If everyone has twice as many children to make up for the population being cut in half, then the tail is four times as big.

If there is strong assortative mating, the people on the right tail weren't going to have children with the left half, and the first effect doesn't apply, since the selection only eliminates pairings that weren't going to happen. Indeed, assortative mating is very similar to truncation selection, so combining the two is redundant in the first generation.

In the first generation, the left tail does not look at all Gaussian. In the long term, it does become Gaussian. In the short term the right tail becomes thicker, but in the long term the variance has gone down and the right tail becomes smaller, starting at two standard deviations from the original mean.

Comment author: pianoforte611 06 January 2014 06:03:06PM *  2 points [-]

If you did that then after one or two generations, regression to the mean would set the average IQ right back to where it was (almost). If you eliminated enough of the left tail over several generations to actually change the average to a stable higher value, then the right tail would be extended.

Like I said I'm not commenting on the effect of the Holocaust because I don't know anything about it.

Comment author: private_messaging 06 January 2014 08:02:06PM *  3 points [-]

If UberHitler kills everyone with IQ<100, that raises the average IQ without increasing the number of people with high IQ. After a few generations, you are back to a Gaussian with a smaller variance (you lost some genetic diversity) and a slightly larger mean, which means that at some IQ level that is sufficiently high you have fewer people with that IQ.

Comment author: IlyaShpitser 06 January 2014 08:08:37PM 2 points [-]

I am not following how killing people who do poorly on a test does not evoke the evolution demon, eventually.

Comment author: private_messaging 06 January 2014 08:27:33PM 0 points [-]

The average increased, that's your evolution. If you let many generations pass, for the mutations to happen and genetic diversity to restore, you will get the variance back as well.

Comment author: Eugine_Nier 07 January 2014 06:04:15AM 0 points [-]

Assuming random mating, you'll already get higher IQ kids in the next generation since people with exceptionally high IQ are more likely to mate.

Comment author: pianoforte611 06 January 2014 11:24:34PM *  3 points [-]

The reversal test makes this sound a bit strange:

If you have a population with an average IQ of 100 and you add in an equal number of people with an IQ of 80 then after a generation, you will have a Gaussian with a larger variance. Hence there will be more geniuses due to more genetic variation.

Surely you don't believe that? I realize that this isn't a perfect reversal but that sounds very odd to me.

Anyway, here is the crude model of intelligence that I am working with - I admit I'm not an expert on this topic, and I have some reading up to do on the genetic basis of intelligence. Intelligence is a polygenic trait that can be roughly (very roughly) modeled as a bunch of genetic sites, each with either a plus or a minus allele (keeping it simple with just 2 possibilities). The more plus alleles you have, the more likely you are to have a high IQ (genes and intelligence aren't perfectly correlated). Populations with a higher average IQ have a higher concentration of plus alleles, so the chance of receiving many of them is increased. But if you take away all of the people who due to bad luck received a very large number of minus alleles, you haven't altered the concentration of alleles in the gene pool that much - this is part of why regression to the mean occurs. But if you consistently select for people with a higher concentration of plus alleles, then the odds of any one child having a lot of plus alleles increase in the population. This is how artificial selection works for any polygenic trait. Corn kernels are huge because the people who cultivated corn selected for the biggest corn kernels - yes, there was a loss of genetic diversity, and yes, there was a decrease in the variance, but what was nevertheless observed were corn kernels bigger than any corn before.
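
A toy calculation under this crude model, with made-up numbers: 100 independent sites, and "a lot of plus alleles" taken to mean 65 or more. It compares the chance of landing in that upper tail when the plus-allele frequency in the gene pool is 0.50 versus 0.55. The site count, threshold, and frequencies are all assumptions for illustration.

```python
from math import comb

def upper_tail(n_sites, plus_freq, threshold):
    """Chance of drawing at least `threshold` plus alleles across independent sites."""
    return sum(
        comb(n_sites, k) * plus_freq**k * (1 - plus_freq) ** (n_sites - k)
        for k in range(threshold, n_sites + 1)
    )

before = upper_tail(100, 0.50, 65)  # original gene pool
after = upper_tail(100, 0.55, 65)   # after sustained selection for plus alleles
print(f"before={before:.5f} after={after:.5f} ratio={after / before:.1f}")
```

A modest shift in allele frequency changes the far tail by a large factor, while a one-off removal of individuals that barely moves the allele frequency would barely move this number.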

Comment author: private_messaging 06 January 2014 11:49:44PM *  1 point [-]

Surely you don't believe that? I realize that this isn't a perfect reversal but that sounds very odd to me.

It would happen in your model, if there is no perfect overlap between the set of sites in one population and the set of sites in the other population. With two populations, you have more sites. The smartest possible mega-genius is from the mixed population and has + alleles on each site; none of the original populations can have a genius this smart at all.

To see that on less extreme rarity (and approximately for a large number of alleles), write down the ratio of two Gaussians with different means and variances. Simplify. Observe that the ratio of the larger variance Gaussian to the smaller variance Gaussian gets arbitrarily high far from the mean.
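
A numerical sketch of that exercise. For Gaussians with standard deviations s1 > s2, the log of the density ratio is a quadratic in x with leading coefficient 1/(2*s2^2) - 1/(2*s1^2) > 0, so the ratio eventually grows without bound whatever the means are. The specific parameters below (100/15 versus 103/13) are illustrative assumptions.

```python
from statistics import NormalDist

wide = NormalDist(100, 15)    # larger variance, lower mean (assumed values)
narrow = NormalDist(103, 13)  # smaller variance, higher mean (assumed values)

def density_ratio(x):
    """Density of the wide Gaussian divided by density of the narrow one."""
    return wide.pdf(x) / narrow.pdf(x)

for x in (115, 130, 145, 160, 175):
    print(x, f"{density_ratio(x):.2f}")
```

Near the means the narrower, higher-mean curve dominates; far enough out, the wider curve wins by an ever larger factor.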

Comment author: pianoforte611 07 January 2014 12:03:09AM *  1 point [-]

Okay but that is an incredibly weak claim - I'm not interested in switching all of the plus alleles on because additivity starts to break down and having an IQ of say 500 isn't particularly meaningful. For any reasonable definition of genius, artificially selecting for the smartest members of a population (what super-Hitler is doing), will increase the number of them.

Comment author: Douglas_Knight 07 January 2014 03:50:26AM 2 points [-]

Assume total heritability, random mating, additive genetics, and a single 50% truncation event. In the first generation, the right tail becomes 4x larger as a proportion of the population, but it gets smaller in equilibrium. The new mean is 0.8 standard deviations above the old mean. The new standard deviation is 0.6 times the old one. When it reaches equilibrium and becomes a Gaussian with those parameters, the crossover where the old population had a thicker tail than the new is about two standard deviations. At three standard deviations, the new distribution is only 1/10 of the old distribution. But I don't know how much time it takes to get there.
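
These figures can be checked with a short sketch. The 0.8 and 0.6 coincide with the mean and standard deviation of the upper half of a standard normal, sqrt(2/pi) and sqrt(1 - 2/pi); the code assumes, as the comment does, that the equilibrium distribution is a Gaussian with those parameters, and measures everything in units of the old standard deviation.

```python
from math import pi, sqrt
from statistics import NormalDist

old = NormalDist(0, 1)
new_mean = sqrt(2 / pi)      # mean of the surviving upper half, about 0.80
new_sd = sqrt(1 - 2 / pi)    # its standard deviation, about 0.60
new = NormalDist(new_mean, new_sd)

def tail(dist, x):
    """Share of the distribution lying above x."""
    return 1 - dist.cdf(x)

# Tails are equal where (x - new_mean) / new_sd == x.
crossover = new_mean / (1 - new_sd)
ratio_at_3 = tail(new, 3) / tail(old, 3)
print(f"crossover={crossover:.2f} SD, new/old tail at 3 SD={ratio_at_3:.3f}")
```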

Comment author: pianoforte611 08 January 2014 04:44:09PM *  0 points [-]

Thank you, I'm pretty surprised by that result. Two questions: does assortative mating merely slow down that process? And is there any way to increase both the average and the standard deviation?

Comment author: private_messaging 17 January 2014 06:19:34PM *  1 point [-]

You need new mutations to increase the standard deviation, that takes a lot of time and a big population size.

Also, having a genetic disorder applies larger selection pressure to the other genes.

If we are to think of some real 'eugenic' population bottleneck, such as WW2 related, the correlation between intelligence and survival is, frankly, shit, plus a lot of small, geographically co-located sub-populations where a bunch of beneficial genes have been slowly increasing in prevalence get completely wiped out, with loss of all copies of that gene.

Bottom line is, selective breeding of larger corn kernels works quickly because nature hasn't been breeding for larger corn kernels to begin with; it has been breeding optimum kernel sizes, and to get large kernels you're just selecting genetic disorders. There's nothing that you can wreck about the brain that would turn you into a genius, but there are plenty of things you can wreck about growth that would make corn kernels big.

Comment author: Lumifer 17 January 2014 06:35:57PM *  1 point [-]

You need new mutations to increase the standard deviation, that takes a lot of time and a big population size.

Or just some mutagens.

Comment author: Douglas_Knight 09 January 2014 04:06:33AM 0 points [-]

If you truncate less of the tail, it takes more generations to move the mean, but I believe that by the time it moves the same distance, the variance shrinks less.

If you have a randomly mating population, apply assortative mating for a few generations, apply one generation of selection, and let it mix randomly, it costs less variance for the same mean than if you don't do assortative mating. That's because assortative mating is a kind of selection, so this is like several generations of selection. If you start and end with an equilibrium of assortative mating, I'm not sure what happens. Also, assortative mating increases the variance, so you have to distinguish between the variance of the population and the variance of the population that would result if you switched to random mating.

Comment author: private_messaging 07 January 2014 12:34:42AM 0 points [-]

I made a weak claim (all sites) to make it easier for you to see how that works within your own additive model. Of course, you don't have to have plus alleles on all locations for a genius to be more common in the mixed population than in the original populations.

For any reasonable definition of genius (someone with an IQ of 160+), artificially selecting for the smartest members of a population (what super-Hitler is doing), will increase the number of them.

This would depend on the population sizes involved, number of locations, and overlap between locations.

Comment author: gjm 06 January 2014 11:40:02PM 0 points [-]

What is the process by which you expect the mean to regress enough to leave you with a thinner upper tail than before UberHitler did his thing?