Here is a piece I saw that Benjamin Todd adapted from THINK's module on charity assessment. Some of you may recall the network's recent launch.
Lots of social interventions end up doing more harm than good. Many more make no difference at all, and are just a waste of resources. At times, we’ve probably argued with friends about which interventions we’d like to see, and which we wouldn’t. But are we any good at judging what’s likely to work?
Here’s a cool bit of content adapted from THINK. Try and guess which of these eight programs made a difference, which had no effect, and which made things worse.
cipergoth said that it should be emphasised that this isn't a trick question where the answer is they all worked or none did.
Round #1: Scared Straight
Program description: “In the 1970s, inmates serving life sentences at a New Jersey (USA) prison began a program to ‘scare’ or deter at‐risk or delinquent children from a future life of crime. The program, known as ‘Scared Straight’, featured as its main component an aggressive presentation by inmates to juveniles visiting the prison facility. The presentation depicted life in adult prisons, and often included exaggerated stories of rape and murder … The program received considerable and favorable media attention and was soon replicated in over 30 jurisdictions nationwide … Although the harsh and sometimes vulgar presentation in the earlier New Jersey version is the most famous, inmate presentations are now sometimes designed to be more educational than confrontational but with a similar crime prevention goal. Some of these programs featured interactive discussions between the inmates and juveniles, also referred to as ‘rap sessions.’”(2)
Did the program decrease the rate of juvenile crime?
Round #2: Nurse‐Family Partnership
Program description: “The Nurse‐Family Partnership program provides nurse home visits to pregnant women with no previous live births, most of whom are i) low‐income, ii) unmarried, and iii) teenagers. The nurses visit the women approximately once per month during their pregnancy and the first two years of their children’s lives. The nurses teach i) positive health related behaviors, ii) competent care of children, and iii) maternal personal development (family planning, educational achievement, and participation in the workforce). The program costs approximately $12,500 per woman over the three years of visits (in 2010 dollars).”(6)
Did the program improve the quality of child care?
Round #3: Drug Abuse Resistance Education (DARE)
Program description: “DARE is a highly‐structured substance‐abuse prevention program taught by uniformed police officers … The program is typically provided over the course of 10‐20 weekly hour‐long sessions, during which the police officers use lectures, class discussion, role plays, and homework assignments to i) teach students about substance use and its effects, ii) teach students decision‐making and peer pressure resistance skills, and iii) boost students’ self‐esteem. Prior to teaching, the police officers take an 80‐hour training course on teaching techniques, classroom management, and the DARE curriculum … DARE costs approximately $130 (in 2004 dollars) per student and, as of 2001, was operating in 75% of American school districts.”(8)
Did the program decrease the rate of drug use?
Rounds #4 and #5: 21st Century Community Learning Centers
Program description: “21st Century Community Learning centers is a large ($1 billion per year) US Department of Education program which funds optional after‐school programs for elementary and middle school students in mostly high‐poverty schools. Key goals of the program are to i) provide students with a safe place after school, and ii) improve their academic performance. Recipients of program funds (ie, school districts and/or non‐profit educational/community organizations) are required to provide academically focused “extended learning activities” (e.g., instructional enrichment programs, tutoring, or homework assistance). Most centers also offer enrichment/recreational activities such as martial arts, sports, dance, art and/or music … (Elementary school) centers vary in the activities they offer and other key features, and thus comprise a range of after‐school interventions rather than a single intervention. In a typical center,
students may spend an hour doing homework and having a snack, an hour on additional academic activity (eg, a lesson or working in a computer lab), and an hour doing recreational or cultural activities;
the center’s staff are a mixture of certified teachers, instructional aides, and representatives of community youth organizations;
the center is open 4‐5 days per week for three hours after school, and serves approximately 85 students per day; and
the average student attends the center 2‐3 days per week.
Centers spend approximately $1,000 (in 2005 dollars) on each enrolled student per year.”(10)
Did the program increase the students’ academic achievement?
Did the program reduce behavioural problems at the schools?
Round #6: Even Start Family Literacy program
Program description: “The Even Start program is intended to ‘help break the cycle of poverty and illiteracy by improving the educational opportunities of the nation’s low‐income families by integrating early childhood education, adult literacy or adult basic education, and parenting education into a unified family literacy program’. In 2000‐2001, there were 855 Even Start projects serving 31,896 families … Even Start grantees had considerable flexibility in designing services to meet the needs of the low‐income families, but all were required to offer four services:
adult education to develop basic educational and literacy skills;
early childhood education services to provide developmentally appropriate services to help prepare children for school;
parenting education to help parents support the educational growth of their children; and
parent‐child literacy activities.”(13)
Did the program increase literacy?
Round #7: Big Brothers Big Sisters
Program description: “Big Brothers Big Sisters’ community‐based mentoring program matches youths aged 6‐18, predominantly from low‐income, single‐parent households, with adult volunteer mentors who are typically young (20‐34) and well‐educated (the majority are college graduates) … The mentor and youth typically meet for 2‐4 times per month for at least a year, and engage in activities of their choosing (e.g., studying, cooking, playing sports). The typical meeting lasts 3‐4 hours … For the first year, Big Brothers Big Sisters case workers maintain monthly contact with the mentor, as well as the youth and his or her parent, to insure a positive mentor‐youth match, and to help resolve any problems in the relationship. Mentors are encouraged to form a supportive friendship with the youths, as opposed to modifying the youth’s behavior or character… In 2008, Big Brothers Big Sisters served 255,000 youths and 470 agencies nationwide. The national average cost of making and supporting a match is approximately $1,300 in 2009 dollars.”(14)
Did the program decrease drug use and violent behavior?
Round #8: Top 16 Educational Software
Program description: “In the No Child Left Behind Act of 2002, Congress called for a rigorous study of the effectiveness of educational technology for improving student academic achievement … In fall 2003, developers and vendors of educational technology products responded to a public invitation and submitted products for possible inclusion in the national study. Mathematica Policy Research, Inc. staff selected 40 of the 160 submissions for further review by two panels of outside experts, one for reading products and one for math products … In January 2004, (the US Department of Education) considered the panel’s recommendations and selected 16 products for the study. In selecting products, (the US Department of Education) grouped them into four areas:
early reading (first grade),
reading comprehension (fourth grade),
pre‐algebra (sixth grade), and
algebra (ninth grade).
The products ranged widely in their instructional approaches and how long they had been in use. In general, however, the criteria weighted the selection towards products that had evidence of effectiveness from previous research, or, for newer products, evidence that their designs were based on approaches found to be effective by research. Twelve of the sixteen products had received awards or been nominated for awards (some as recently as 2006) by trade associations, media, teachers, or parents.”(15)
Did the program improve test scores?
Here are the answers!
Round #1: Scared Straight
Negative! Several randomized controlled trials have shown that Scared Straight had a negative effect. Going through Scared Straight made children more likely to commit crimes in the future (3). Fun fact: Scared Straight programs are still being run today (4), and people promote them as being effective, despite the fact that they are harmful (5).
Round #2: Nurse‐Family Partnership
Positive! Three randomized controlled trials have shown that the Nurse‐Family Partnership had a positive effect. The program led to a reduction in child abuse/neglect, child injuries (20‐50% reduction) and an improvement in cognitive/educational outcomes for children of mothers with low mental health/confidence/intelligence (e.g., 6 percentile point increase in grade 1‐6 in reading/math achievement) (7).
Round #3: Drug Abuse Resistance Education (DARE)
No effect! Two randomized controlled trials have shown that DARE did not have an effect on the rate of drug use among participants. The rate of drug use did not increase or decrease (9).
Round #4: 21st Century Community Learning Centers
No effect! A randomized controlled trial has shown that the 21st Century Community Learning Centers had no effect on participating students’ academic performance. Students who participated were neither helped nor harmed by the program (11).
Round #5: 21st Century Community Learning Centers
Negative! A randomized controlled trial has shown that the 21st Century Community Learning Centers caused an increase in the behavioral problems of participating students (12).
Round #6: Even Start Family Literacy Program
No effect! A randomized controlled trial on a subset of Even Start programs found no evidence of an increase or decrease in literacy in parents or children (17).
Round #7: Big Brothers Big Sisters
Positive! A randomized controlled trial has shown that Big Brothers Big Sisters caused youths to be 46% less likely to have started using illegal drugs, 27% less likely to have started using alcohol, and 32% less likely to have hit someone in the previous year, and to have skipped fewer days of school during the past year (18).
Round #8: Top 16 Educational Software
No effect! The study described was a randomized controlled trial, and showed that the software did not make a noticeable difference in any of the categories. It did not help or hurt with 1) early reading (first grade), 2) reading comprehension (fourth grade), 3) pre‐algebra (sixth grade), or 4) algebra (ninth grade) (19).
How did you do?
If you got 7-8 right, there’s less than a 1% chance of doing that well by guessing alone. If you got 5-6 right, pure guessing would manage that only about 8.5% of the time, so it might be skill. If you got 1-4 right, then you did no better than randomly guessing. If you got zero right … we could get useful information by always doing the opposite of what you do.
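For anyone curious where those percentages come from: the post doesn't say how they were calculated, but they match a simple binomial model. The sketch below assumes a guesser picks uniformly at random among three answers (positive, no effect, negative) on each of the eight rounds, independently. The names `prob_exact` and `prob_range` are just illustrative helpers, not anything from THINK.

```python
from fractions import Fraction
from math import comb

N_ROUNDS = 8
# Assumption (not stated in the post): three equally likely answers per round,
# so a pure guesser is right with probability 1/3.
P_CORRECT = Fraction(1, 3)


def prob_exact(k, n=N_ROUNDS, p=P_CORRECT):
    """Probability that a pure guesser gets exactly k of n rounds right."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_range(lo, hi, n=N_ROUNDS, p=P_CORRECT):
    """Probability that a pure guesser scores between lo and hi, inclusive."""
    return sum(prob_exact(k, n, p) for k in range(lo, hi + 1))


if __name__ == "__main__":
    for label, lo, hi in [("7-8", 7, 8), ("5-6", 5, 6), ("1-4", 1, 4), ("0", 0, 0)]:
        print(f"{label} right by guessing: {float(prob_range(lo, hi)):.2%}")
```

Under that assumption, a score of 7-8 comes out at 17/6561 (about 0.26%) and a score of 5-6 at 560/6561 (about 8.5%), which agrees with the figures quoted. Strictly speaking these are the chances of scoring that well if you were guessing, not the chance that you were guessing.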
The effects of social interventions are extremely complex. All of these programs sound good, but unintended consequences can get in the way. It’s very difficult to work out what’s going to be successful ahead of time. Instead, we need to test, measure the results, and take it from there.
I thought Round #2 would have no effect, and I expected Round #5 to have no effect rather than a negative one, so I got 6 out of 8 correct. How well did you do?
I recommend checking out the links and references. Gwern's comment there was also interesting.
I used to be excited about the idea of harnessing the high intellectual ability and strong norms of politeness on LW to reach accurate insight about various issues that are otherwise hard to discuss rationally. However, more recently I've become deeply pessimistic about the possibility of having a discussion forum that wouldn't be either severely biased and mind-killed or strictly confined to technical topics in math and hard sciences.
It looks like even if a forum approaches this happy state of affairs, the way old Overcoming Bias and early LessWrong arguably did for some time, this can happen only as a brief and transient phenomenon. (In fact, it isn't hard to identify the forces that inevitably make this situation unstable.) So, while OB ceased to be much of a discussion forum long ago, LW is currently in the final stages of turning into a forum that still has unusual smarts and politeness, but where on any mention of controversial issues, battle lines are immediately drawn and genuine discussion ceases, just like elsewhere. (Even if the outcome may still look very calm and polite by the usual internet standards.)
The trouble is, the only way a "no-mindkillers" rule can improve things is if it's done in an extreme form and with ruthless severity, by reducing the permissible range of topics to strictly technical questions in some areas of math and hard science and consistently banning everything else. The worst possible outcome is to institute a partial "no-mindkillers" rule, which would work under a pretense that rational and unbiased discussion of a broad range of topics outside of math and hard sciences is possible without bringing up any controversial and charged issues, and without giving serious consideration to disreputable and low-status views. This would lead to an entrenched standard of cargo-cult "rationality" that incorporates all the biases, delusions, and taboos of the respectable opinion wholesale, under a pretense of a neutral, pragmatic, and unbiased restriction of irrelevant and distracting controversial topics.
Thus, it seems to me like the only realistic possibilities at this point are: (1) increasing ideological confrontations and mind-killing, (2) enforcement of the above-described cargo-cult rationality standards, and (3) reduction of discussion topics to strictly technical questions, backed by far stricter, MathOverflow-type standards. None of these looks like a fulfilment of LW's mission statement, but (2) seems to me like the worst failure scenario from its point of view.
Are you concluding too hastily about the cause of deterioration? In the early days, OB had two major voices with conflicting ideologies. I think that's what lent it greater intellectual excitement. I do not think it a matter of ideological alignments being absent in the golden age. Rather, it allowed space for discussion of fundamental differences--as op...