Kaj_Sotala comments on Problems in Education - Less Wrong
Comments (318)
Seconded. A relatively low correlation I could believe, but none? As a friend pointed out, this would imply that if there's a math prodigy in the class, the teacher would be just as likely to recommend advanced classes as they would be to recommend the student needing extra help with basic stuff? I could accept prodigies slacking off due to boredom and therefore sometimes getting mistaken for people with bad skills, but 50-50?
Would you believe that many teachers use 'effort' as an explicit factor in assigning grades? As in, someone who understands the material without putting forth visible effort is assigned a lower grade than someone who visibly struggles to have the same level of understanding.
They are officially required to (in Slovakia). But it is just one of many confusing, sometimes mutually contradictory, mostly applause-light criteria.
(From my memory: The grades have to reflect knowledge, they have to reflect effort, they have to be motivating, and they have to respect human rights, whatever that means. And a dozen other conditions.)
Yes, this is both explicitly required and explicitly forbidden by the rules. Welcome to the world of the educational system!
One more data point; was it a politician with no educator qualifications that wrote the requirements?
I don't know, but my guess would be that a group of bureaucrats with zero educational experience from the Department of Education prepared the document, and some minister just signed it, because it seemed okay (it contained all the applause lights).
This is as much effort as I'm willing to put in, but you were probably right to be sceptical; it seems unlikely to be zero.
On the whole, the results revealed high levels of validity for the teacher-judgment measures. The studies revealed, however, some variability across teachers in accuracy levels and suggested the operation of certain other moderator variables.
Teachers' ratings of children's academic abilities, classroom skills, and personal-social characteristics were highly predictive of subsequent school achievement. Although teachers tended to be more positive toward girls, within each sex ratings were predictive of future performance. Teacher ratings were more predictive of scholastic success than ratings made by mothers. (MV)
See also: Pygmalion in the Classroom.
It's entirely reasonable that teachers' ratings of children's academic abilities etc. cause future achievement.
No, it's not. Did you read Carl's comment in this same thread?
It's been demonstrated by controlled research that students whose teachers expect them to outperform their peers do so, even when the teachers' expectations are not founded on fact.
The Jussim et al review of that literature is worth reading. Expectations do seem to have causal impact, but the effect is usually small relative to measures of past performance and ability, and teacher expectations tend to reflect past performance more.
The review covers some serious challenges to the effect sizes claimed by Rosenthal and coauthors, such as effect sizes declining with sample size and publication bias. Or, regarding the original Pygmalion/Oak School experiment:
As an aside, Rosenthal pioneered meta-analysis in psychology because the effect only replicated a third of the time in the published literature (despite the presence of publication bias and QRPs). In doing so he promulgated a test for publication bias which implicitly assumed the absence of any publication bias, and so almost always output the conclusion that no publication bias was present. These methods were eagerly adopted by the parapsychology community, as the same methodology that appeared to show strong expectancy effects also appeared to show ESP in the ganzfeld psychic experiment, as Rosenthal (1986) agreed.
Since I think that the ESP literature reflects the scale of apparent effect that can be shown in the absence of a real effect, purely through publication bias, experimenter bias, optional stopping, and other questionable research practices, this makes me suspicious of the stronger claims about expectation effects.
I don't think the sample of experiments reviewed is large enough to evaluate sample size versus effect size; throw out the outliers and there's nothing left.
I'm now heavily concerned about the validity of the IQ test used; however, that's more due to the 8 point increase in the control group, when no increase is expected. I'll have to dig further, exclude any of the controls with out-of-band scores and redo the math.
One result of the meta-analysis, however, is that experimentally induced changes to teacher expectation have a small causal effect on student performance; another result is that non-induced teacher expectations correlate well with performance in the same year, and less well with long-term performance. I would rephrase that as 'Teacher expectations of student performance in their class tend to be accurate, but correlate poorly with student performance in other classes.'
In any case, thanks for the link. I'm going to have to spend some time determining how much I should change my mind with this new evidence, but my gut feeling is that the objectively worst possible data (my own experience with performing well when expected to perform well, and performing poorly when expected to perform poorly), will continue to dominate my personal opinion on the matter.
Upvoted for candor.
The first Rosenthal meta-analysis used 345 studies. That is pretty big. And the individual studies listed in table 17.1 have large n, ranging from 79 to 5000+.
No, that's not a problem that should concern you. Children's IQ scores are less stable than older people's scores, test-retest effects will give you a number of IQ points (that's why one uses controls), and children are constantly growing.
What should concern you is that the researchers involved were willing to pass on and champion a result driven solely by obviously impossible nonsensical meaningless data. A kid going from 18 IQ to 122? or 113 to 211? This can't even be explained by incompetence in failing to exclude scores from kids refusing to cooperate, because tests in general (much less the specific test they used!) are never normed from 18 to 211. (How do you get a sample big enough to norm as high as 7.4 standard deviations?)
Worrying about the control's gains and not the actual data is like reading a physics paper reporting that they measured the speed of several neutrinos at 50 hogsheads per millifortnight, and saying "Hm, yes, but are they sure they properly corrected for GPS clock skew and did accurately record the flight time of their control photons?"
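For readers who want to check the "7.4 standard deviations" figure, here is a minimal sketch. It assumes the conventional IQ scale (mean 100, SD 15), which the thread does not state but which reproduces the number quoted above; the function names are made up for illustration.

```python
import math

# Assumption: conventional IQ scale (mean 100, SD 15). The thread does not say
# which scale the study's test used; this choice reproduces the quoted 7.4.
MEAN, SD = 100.0, 15.0

def z_score(iq):
    """Distance of an IQ score from the mean, in standard deviations."""
    return (iq - MEAN) / SD

def upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# The before/after scores mentioned above.
for iq in (18, 113, 122, 211):
    print(iq, round(z_score(iq), 2))

# Share of a normally distributed population scoring above 211.
print(upper_tail(z_score(211)))
```

Under these assumptions a score of 211 sits about 7.4 SD above the mean, and the tail probability is far below one in a trillion, which is the sense in which no realistic norming sample could reach that high.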
Unstable IQ scores should provide a net zero; an average increase of half a standard deviation across the entire population already means that the norms are fucked.
Therefore, the IQ test used simply wasn't properly normed; if we assume that it was equally improperly normed for all students in the study, we still see an increase of 4 points based on teachers being told to expect more. Whether an increase of 4 points is statistically significant on that (improperly normed) test is a new question.
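The arithmetic behind the "4 points" can be sketched as a simple difference of gains. The 8-point control gain is the figure quoted earlier in the thread; the 12-point gain for the flagged students is not quoted anywhere here and is only what 8 + 4 implies. The function name is hypothetical.

```python
def net_effect(treated_gain, control_gain):
    """Gain attributable to the expectation manipulation: treated minus control."""
    return treated_gain - control_gain

control_gain = 8.0   # control-group increase quoted in the thread
treated_gain = 12.0  # assumption: implied by 8 + 4, not quoted directly

print(net_effect(treated_gain, control_gain))

# A norming error that shifts every student's gain by the same amount cancels.
shared_offset = 3.0  # arbitrary illustrative value
print(net_effect(treated_gain + shared_offset, control_gain + shared_offset))
```

Note that the cancellation only holds if the norming error really is the same for both groups, which is exactly the assumption under dispute.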
Only if you make the very strong assumptions that there is no systematic bias or selection effect or regression to the mean or anything which might cause the instability to favor an increase.
Plus you ignored my other points.
Plus we already know from the pairs of before-afters that these researchers are either incredibly incompetent or actively dishonest.
Plus we already know biases in analysis or design or data collection can be introduced much more subtly. Gould's brainpacking problems are only the latest example.
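The point about selection and regression to the mean can be illustrated with a toy simulation. Everything in it is an assumption chosen for illustration (the score model, the noise levels, the cutoff of 90); it says nothing about what actually happened in the study, only that unstable scores need not net out to zero once a subgroup is selected on its first score.

```python
import random
from statistics import mean

def simulate(n=20000, seed=0, cutoff=90.0):
    """Return (mean gain of low first-test scorers, mean gain of everyone).

    Toy model: each score is a stable ability plus independent test noise,
    and nobody's ability changes between the two tests.
    """
    rng = random.Random(seed)
    selected_gains, all_gains = [], []
    for _ in range(n):
        ability = rng.gauss(100, 12)         # stable component
        first = ability + rng.gauss(0, 9)    # test 1 = ability + noise
        second = ability + rng.gauss(0, 9)   # test 2 = ability + fresh noise
        all_gains.append(second - first)
        if first < cutoff:                   # selected on a noisy low score
            selected_gains.append(second - first)
    return mean(selected_gains), mean(all_gains)

selected, overall = simulate()
print("mean gain, selected low scorers:", round(selected, 2))
print("mean gain, whole population:    ", round(overall, 2))
```

In this model the whole population shows roughly zero average gain, while the group selected for low first scores shows a clearly positive one with no real change in ability.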
Which claim and assumption we will make because we are terminally optimistic, and to borrow from the '90s, "I want to believe!"
Wow, you still aren't giving up on the Pygmalion study? Just let it go already. You don't even have to give up on your wish for self-fulfilling expectations - there are plenty of followup studies which turned in your desired significant effects.
That's going further than I did. It's a reasonable prior, and the evidence is at least consistent with weak effects.
Eh. Decius was clearly thinking of, and still is thinking of, substantial and long-lasting effects rather than the almost trivially small disappearing effects confirmed by the followups and meta-analyses. That is a completely unreasonable view to hold after reading that review, and I would suggest that even that small nonzero effect is dubious since it seems that few to none of the studies fully accounted for the accuracy issue and there is obviously publication bias at play.
I wonder if the teachers making the predictions were the same ones who then taught the students, and examined them to determine the outcome.
Here's a review of the literature on teacher self-fulfilling prophecies from Lee Jussim, who is skeptical but finds that they occur and are of nontrivial magnitude, more so for grades than for standardized tests, although they dissipate quickly and teacher judgments are more driven by accuracy than stereotypes in the aggregate.