gwern comments on Problems in Education - Less Wrong Discussion

65 Post author: ThinkOfTheChildren 08 April 2013 09:29PM

Comments (318)

Comment author: gwern 11 April 2013 09:13:54PM * 2 points

I don't think the sample of experiments reviewed is large enough to evaluate sample size versus effect size; throw out the outliers and there's nothing left.

The first Rosenthal meta-analysis used 345 studies. That is pretty big. And the individual studies listed in table 17.1 have large n, ranging from 79 to 5000+.

I'm now heavily concerned about the validity of the IQ test used; however, that's more due to the 8 point increase in the control group, when no increase is expected.

No, that's not a problem that should concern you. Children's IQ scores are less stable than older people's, test-retest effects will give you a few IQ points (that's why one uses controls), and children are constantly growing.

What should concern you is that the researchers involved were willing to pass on and champion a result driven solely by obviously impossible, nonsensical, meaningless data. A kid going from an IQ of 18 to 122? Or from 113 to 211? This can't even be explained by incompetence in failing to exclude scores from kids refusing to cooperate, because tests in general (much less the specific test they used!) are never normed from 18 to 211. (How do you get a sample big enough to norm as high as 7.4 standard deviations?)
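As a rough check on the 7.4 figure, here is a minimal sketch. It assumes the conventional IQ scale with mean 100 and standard deviation 15 (the thread does not say which scale the test used) and a normal distribution of scores; the function names are made up for illustration.

```python
import math

MEAN, SD = 100.0, 15.0  # assumed scale; not stated in the thread


def z_score(iq, mean=MEAN, sd=SD):
    """Distance of an IQ score from the mean, in standard deviations."""
    return (iq - mean) / sd


def upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


z_high = z_score(211)
z_low = z_score(18)
rarity = 1 / upper_tail(z_high)  # roughly "one person in this many"

print(f"211 is {z_high:.1f} SD above the mean")
print(f"18 is {abs(z_low):.1f} SD below the mean")
print(f"a score of 211 or higher would be about 1 in {rarity:.1e}")
```

Under those assumptions, norming a test that far out would need a sample far larger than the world's population.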

Worrying about the control's gains and not the actual data is like reading a physics paper reporting that they measured the speed of several neutrinos at 50 hogsheads per millifortnight, and saying 'Hm, yes, but are they sure they properly corrected for GPS clock skew and accurately recorded the flight time of their control photons?'

Comment author: Decius 12 April 2013 04:06:20AM 0 points

Unstable IQ scores should provide a net zero; an average increase of half a standard deviation across the entire population already means that the norms are fucked.

Therefore, the IQ test used simply wasn't properly normed; if we assume that it was equally improperly normed for all students in the study, we still see an increase of 4 points based on teachers being told to expect more. Whether an increase of 4 points is statistically significant on that (improperly normed) test is a new question.

Comment author: gwern 12 April 2013 03:42:49PM 1 point

Unstable IQ scores should provide a net zero; an average increase of half a standard deviation across the entire population already means that the norms are fucked.

Only if you make the very strong assumptions that there is no systematic bias or selection effect or regression to the mean or anything which might cause the instability to favor an increase.
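To illustrate just one of these mechanisms, here is a toy simulation of regression to the mean. Every number in it is hypothetical: the ability spread, the measurement noise, and especially the selection rule (the thread does not say how the Pygmalion sample was chosen). No one's underlying ability changes between the two sittings, yet a group picked for low pretest scores shows an average gain on retest.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

N = 100_000
TRUE_SD = 13.0   # hypothetical spread of stable ability
NOISE_SD = 7.5   # hypothetical per-sitting measurement error
CUTOFF = 90.0    # hypothetical selection rule: pretest below 90

# Each child has a fixed true score; each sitting adds independent noise.
true_scores = [random.gauss(100.0, TRUE_SD) for _ in range(N)]
pretest = [t + random.gauss(0.0, NOISE_SD) for t in true_scores]
posttest = [t + random.gauss(0.0, NOISE_SD) for t in true_scores]

gains = [post - pre for pre, post in zip(pretest, posttest)]

# Average gain over everyone versus over the low-pretest subgroup.
gain_all = statistics.fmean(gains)
gain_selected = statistics.fmean(
    [g for g, pre in zip(gains, pretest) if pre < CUTOFF]
)

print(f"mean gain, whole population: {gain_all:+.2f}")
print(f"mean gain, selected on low pretest: {gain_selected:+.2f}")
```

The whole-population gain hovers near zero, while the selected group's gain is clearly positive. A practice effect on retest would add to both figures, which is one reason to compare against controls rather than against zero.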

Plus you ignored my other points.

Plus we already know from the pairs of before-afters that these researchers are either incredibly incompetent or actively dishonest.

Plus we already know biases in analysis or design or data collection can be introduced much more subtly. Gould's brainpacking problems are only the latest example.

Therefore, the IQ test used simply wasn't properly normed; if we assume that it was equally improperly normed for all students in the study,

Which claim and assumption we will make because we are terminally optimistic, and to borrow from the '90s, "I want to believe!"

we still see an increase of 4 points based on teachers being told to expect more. Whether an increase of 4 points is statistically significant on that (improperly normed) test is a new question.

Wow, you still aren't giving up on the Pygmalion study? Just let it go already. You don't even have to give up on your wish for self-fulfilling expectations - there are plenty of followup studies which turned in your desired significant effects.

Comment author: Decius 12 April 2013 04:16:27PM -1 points

Only if you make the very strong assumptions that there is no systematic bias or selection effect or regression to the mean or anything which might cause the instability to favor an increase.

What effects could cause an increase of 8 points on a properly normed test across the board? Why would there be a significant benefit to being in the control group of this study?

Plus we already know from the pairs of before-afters that these researchers are either incredibly incompetent or actively dishonest.

You can rule out that they were using a test which produced the scores that they recorded, perhaps by using raw score rather than normed output. You can rule out every other explanation for why the recorded results aren't valid scores. You can even rule out that they were competently dishonest, since competent dishonesty would be nontrivial to detect; your only possible conclusion is incompetence, which isn't evidence which should change your priors.

Incompetence is the social equivalent of the null hypothesis, and there is very rarely any significant evidence against it.

Therefore, the IQ test used simply wasn't properly normed; if we assume that it was equally improperly normed for all students in the study,

Which claim and assumption we will make because we are terminally optimistic, and to borrow from the '90s, "I want to believe!"

Assuming only incompetence as you have, the expected result would be equally erratic for all students. You can assign any likelihood to the assumption that the incompetence was the primary factor and that dishonesty doesn't modify it significantly, but you have already concluded systemic incompetent dishonesty across a large number of studies.

Wow, you still aren't giving up on the Pygmalion study? Just let it go already. You don't even have to give up on your wish for self-fulfilling expectations - there are plenty of followup studies which turned in your desired significant effects.

As you say, it's been confirmed by other studies. I'm not insisting that a particular study was done correctly, I'm explaining why their conclusions being true is consistent with the errors in their study. (Which means that a study with those flaws would be expected to reach the same conclusions, if those conclusions were true)

Comment author: gwern 12 April 2013 05:37:19PM * 1 point

What effects could cause an increase of 8 points on a properly normed test across the board? Why would there be a significant benefit to being in the control group of this study?

I already gave you three separate explanations for why an increase is possible, even in controls.

your only possible conclusion is incompetence, which isn't evidence which should change your priors. Incompetence is the social equivalent of the null hypothesis, and there is very rarely any significant evidence against it.

I have no idea what you mean by this, and I think that if one accepts their incompetence, the best thing to do is to ignore their data as having been poisoned in unknown ways - maliciousness, ideology, and stupidity often being difficult to tell apart.

Assuming only incompetence as you have, the expected result would be equally erratic for all students.

Why is that? Since IQ interventions almost universally fail, the result a competent study should find is no effect or a small one. Our prior for any result like 'we increased IQ by 8 points' ought to be very low, as in well below 1%, because hundreds of interventions have failed to pan out and 8 points is astounding, practically on the level of iodization, and the followups confirm that there is only a much, much smaller effect. Any incompetence is going to lead to an extreme result. Like what they found.

As you say, it's been confirmed by other studies.

'Confirmed'? Well, what counts as a replication is an active debate. Near the same magnitude, or just having the same sign? If someone publishes a study claiming to find a weight loss drug that will drop 100 pounds, and exhaustive replications find that the true estimate is actually 1 pound, has the original claim been "confirmed"? After all, both estimates are non-zero and both estimates have the same sign...

Comment author: Decius 13 April 2013 01:01:18AM -1 points

So, "systematic bias or selection effect or regression to the mean" can result in average properly normed IQ scores increasing by 8 points? Doesn't the normalizing process (when done properly) force the average score to remain constant?

Comment author: gwern 13 April 2013 01:26:57AM 1 point

Doesn't the normalizing process (when done properly) force the average score to remain constant?

What normalizing process? You mean the one the paid psychometricians go through years before any specific test is purchased by researchers like the ones doing the Pygmalion study? Yeah, I suppose so, but that's irrelevant to the discussion.

Comment author: Decius 13 April 2013 02:17:37AM -1 points

Right, because the entire population going up half a SD in a year isn't unusual at all, and the test purchased for use in this study was normalized the way one would expect it to be, despite the fact that it had results that are impossible if it were normalized in that manner.

Comment author: gwern 13 April 2013 02:51:06AM * 2 points

...'entire population'?

Alright, I have to admit I have no idea what test you are now referring to. I thought we were discussing the Pygmalion results in which a small sample of elementary school students turned in increased IQ scores, which could be explained by a number of well-known and perfectly ordinary processes.

But it seems like you're talking about something else entirely and may be thinking of country-level Flynn effects or something; I have no idea what.

Comment author: Decius 14 April 2013 12:36:38AM 0 points

The PitC study showed an 8 point IQ increase in the control group. You offered those three explanations and said that they explained why that wasn't particularly unusual, and my understanding of normed IQ tests is that they are expected to remain constant over short times.