The Jussim et al. review of that literature is worth reading. Expectations do seem to have a causal impact, but the effect is usually small relative to measures of past performance and ability, and teacher expectations tend to reflect past performance more than they shape later outcomes.
The review covers some serious challenges to the effect sizes claimed by Rosenthal and coauthors, such as effect sizes declining with sample size and publication bias. Or, regarding the original Pygmalion/Oak School experiment:
Snow (1995) also pointed out that the intelligence test used in Pygmalion was only normed for scores between 60 and 160. If one excluded all scores outside this range, the expectancy effect disappeared. Moreover, there were five "bloomers" with wild IQ score gains: 17-110, 18-122, 133-202, 111-208, and 113-211. If one simply excluded these five bizarre gains, the difference between the bloomers and the controls evaporated.
As an aside, Rosenthal pioneered meta-analysis in psychology because the effect only replicated a third of the time in the published literature (despite the presence of publication bias and questionable research practices, or QRPs). In doing so he promulgated a test for publication bias which implicitly assumed the absence of any publication bias, and so almost always output the conclusion that no publication bias was present. These methods were eagerly adopted by the parapsychology community, as the same methodology that appeared to show strong expectancy effects also appeared to show ESP in the ganzfeld psychic experiment, as Rosenthal (1986) agreed.
Since I think that the ESP literature reflects the scale of apparent effect that can be shown in the absence of a real effect, purely through publication bias, experimenter bias, optional stopping, and other questionable research practices, this makes me suspicious of the stronger claims about expectation effects.
I don't think the sample of experiments reviewed is large enough to evaluate sample size versus effect size; throw out the outliers and there's nothing left.
I'm now heavily concerned about the validity of the IQ test used, though that's more due to the 8-point increase in the control group, where no increase is expected. I'll have to dig further, exclude any of the controls with out-of-band scores, and redo the math.
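For anyone who wants to try the same re-check, here is a minimal sketch of the "exclude out-of-band scores and redo the math" step. It assumes per-student (pre, post) score pairs are available, which they are not in this thread; the only numbers used are the five extreme bloomer pairs and the 60-160 normed range quoted from Snow (1995) above, read here as pre-to-post pairs. The names `in_band` and `mean_gain` are made up for illustration.

```python
from statistics import mean

# Normed range of the test, as quoted from Snow (1995) above.
NORM_MIN, NORM_MAX = 60, 160


def in_band(score):
    """True if a score falls inside the range the test was normed for."""
    return NORM_MIN <= score <= NORM_MAX


def mean_gain(pairs, in_band_only=False):
    """Mean (post - pre) gain over (pre, post) pairs.

    With in_band_only=True, any pair with a pre or post score outside
    the normed range is dropped first. Returns None if no pairs remain.
    """
    if in_band_only:
        pairs = [(pre, post) for pre, post in pairs
                 if in_band(pre) and in_band(post)]
    if not pairs:
        return None
    return mean(post - pre for pre, post in pairs)


# The five extreme "bloomer" pairs quoted above, assumed to be (pre, post).
extreme_bloomers = [(17, 110), (18, 122), (133, 202), (111, 208), (113, 211)]

print(mean_gain(extreme_bloomers))                     # gain with every pair kept
print(mean_gain(extreme_bloomers, in_band_only=True))  # gain after the band filter
```

Every one of the five quoted pairs has at least one score outside 60-160, so the filter removes all of them. Running the same function over the full bloomer and control lists, with and without the filter, would show how much of the control group's increase depends on out-of-band scores.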
One result of the meta-analysis, however, is that experimentally-induced changes to teacher expectation have a small causal effect on student ...
The post will be returning to Main after a rewrite by the company's writing staff. Citations galore.