You know how people make public health decisions about food fortification, and medical decisions about taking supplements, based on things like the Recommended Daily Allowance? Well, there's an article in Nutrients titled A Statistical Error in the Estimation of the Recommended Dietary Allowance for Vitamin D. This paper says the following about the info used to establish the US recommended daily allowance for vitamin D:
The correct interpretation of the lower prediction limit is that 97.5% of study averages are predicted to have values exceeding this limit. This is essentially different from the IOM’s conclusion that 97.5% of individuals will have values exceeding the lower prediction limit.
The whole point of looking at averages is that individuals vary a lot due to a bunch of random stuff, but if you take an average of a lot of individuals, that cancels out most of the noise, so the average varies hardly at all. How much variation there is from individual to individual determines the population variance. How much variation you'd expect in your average from sample to sample, due to statistical noise, determines the variance of the sample mean (its square root is often called the standard error).
When you look at frequentist confidence intervals, they are generally expressing how big the ordinary range of variation is for your average. For instance, 90% of the time, your average will be no farther from the "true" average than the boundaries of your confidence interval are from your average. This is relevant for answering questions like, "Does this trend look a lot bigger than you'd expect from random chance?" The whole point of looking at large samples is that the errors have a chance to cancel out, leading to very small random variation in the mean relative to the variation in the population. This allows us to be confident that even fairly small differences in the mean are unlikely to be due to random noise.
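The way noise in the mean shrinks with sample size is easy to check directly. Here's a minimal simulation, with made-up numbers: individuals drawn from a population with standard deviation 15, averaged in studies of 100 people each.

```python
import random
import statistics

random.seed(0)

# Hypothetical population: individual values vary widely (SD = 15).
population_sd = 15
n = 100            # sample size per "study"
num_studies = 2000

# Run many simulated studies and record each study's average.
sample_means = [
    statistics.mean(random.gauss(100, population_sd) for _ in range(n))
    for _ in range(num_studies)
]

# The averages vary far less than the individuals do:
# theory says the SD of the mean is population_sd / sqrt(n) = 1.5.
print(statistics.stdev(sample_means))  # ~1.5, versus 15 for individuals
```

The spread of the study averages is a tenth of the spread of the individuals, which is exactly the quantity a confidence interval is built from.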
The error here was taking the statistical properties of the mean and assuming that they applied to the population. In particular, the IOM looked at the dose-response curve for vitamin D and came up with a distribution for the average response to vitamin D dosage. Based on their data, 97.5% of similar studies would be expected to find that 600 IU of vitamin D is enough for the average person.
They concluded from this that 97.5% of people get enough vitamin D from 600 IU.
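The gap between those two claims is easy to see in a simulation. The numbers below are entirely hypothetical, chosen only to illustrate the shape of the mistake: individual responses to a fixed dose average 55 nmol/L with a person-to-person SD of 15, against an adequacy target of 50 nmol/L.

```python
import random
import statistics

random.seed(1)

# Hypothetical numbers: individual serum responses to a fixed dose
# average 55 nmol/L, with a lot of person-to-person spread (SD = 15).
mu, sigma, n = 55.0, 15.0, 100

individuals = [random.gauss(mu, sigma) for _ in range(10000)]
study_means = [
    statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(2000)
]

target = 50.0  # adequacy threshold (nmol/L)

# Nearly all *study averages* clear the target...
frac_means_ok = sum(m > target for m in study_means) / len(study_means)
# ...but far fewer *individuals* do.
frac_people_ok = sum(x > target for x in individuals) / len(individuals)

print(frac_means_ok)   # close to 1.0
print(frac_people_ok)  # only ~63%
```

A dose that is "enough for the average" in essentially every study can still leave a third of the population short; which fraction you care about depends entirely on whether you're making a claim about studies or about people.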
This is not an arcane detail. This is confusing the attributes of a population, with the attributes of an average. This is bad. This is real, real bad. In any sane world, this is mathematical statistics 101 stuff. I can imagine that someone who's heard about a margin of error a lot doesn't understand this stuff, but anyone who has to actually use the term should understand this.
Political polling is a simple example. Let's say that a poll shows 48% of Americans voting for the Republican and 52% for the Democrat, with a 5% margin of error. This means that 95% of polls like this one are expected to have an average within 5 percentage points of the true average. This does not mean that 95% of individual Americans have somewhere between a 43% and 53% chance of voting for the Republican. Most of them are almost definitely decided on one candidate, or the other. The average does not behave the same as the population. That's how fundamental this error is – it's like saying that all voters are undecided because the population is split.
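For reference, the margin of error quoted in a poll is just a property of the sample average: the standard error of a proportion scaled to a 95% interval. A quick sketch, assuming a poll size of 400 (the sample size is made up; it's chosen so the margin comes out near the 5 points in the example):

```python
import math

# Poll of n respondents; p is the observed Democratic share.
n, p = 400, 0.52

# Standard error of a sample proportion, and the conventional
# 95% margin of error (1.96 standard errors).
se = math.sqrt(p * (1 - p) / n)
margin = 1.96 * se
print(round(margin, 3))  # ~0.049, i.e. about a 5-point margin of error
```

Note that every quantity here describes the poll's average; nothing in the calculation says anything about the distribution of opinions across individual voters.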
Remember the famous joke about how the average family has two and a half kids? It's a joke because no one actually has two and a half kids. That's how fundamental this error is – it's like saying that there are people who have an extra half child hopping around. And this error caused actual harm:
The public health and clinical implications of the miscalculated RDA for vitamin D are serious. With the current recommendation of 600 IU, bone health objectives and disease and injury prevention targets will not be met. This became apparent in two studies conducted in Canada where, because of the Northern latitude, cutaneous vitamin D synthesis is limited and where diets contribute an estimated 232 IU of vitamin D per day. One study estimated that despite Vitamin D supplementation with 400 IU or more (including dietary intake that is a total intake of 632 IU or more) 10% of participants had values of less than 50 nmol/L. The second study reported serum 25(OH)D levels of less than 50 nmol/L for 15% of participants who reported supplementation with vitamin D. If the RDA had been adequate, these percentages should not have exceeded 2.5%. Herewith these studies show that the current public health target is not being met.
Actual people probably got hurt because of this. Some likely died.
This is also an example of scientific journals serving their intended purpose of pointing out errors, but it should never have gotten this far. This is "send a coal-burning engine under the control of a drunk engineer into the Taggart tunnel while the ventilation and signals are broken" levels of negligence. I think of the people using numbers as the reliable ones, but that's not actually enough – you have to think with the numbers, you have to be trying to get the right answer, you have to understand what the numbers mean.
I can imagine making this mistake in school, when it's low stakes. I can imagine making this mistake on my blog. I can imagine making this mistake at work if I'm far behind on sleep and on a very tight deadline. But if I were setting public health policy? If I were setting the official RDA? I'd try to make sure I was right. And I'd ask the best quantitative thinkers I know to check my numbers.
The article was published in 2014, and as far as I can tell, as of the publication of this blog post, the RDA is unchanged.
(Cross-posted from my personal blog.)
Anna Salamon, executive director of CFAR (named with permission), recently wrote to me asking for my thoughts on fundraisers using matching donations. (Anna, together with co-writer Steve Rayhawk, has previously written on community norms that promote truth over falsehood.) My response made some general points that I wish were more widely understood:
- Pitching matching donations as leverage (e.g. "double your impact") misrepresents the situation by overassigning credit for funds raised.
- This sort of dishonesty isn't just bad for your soul, but can actually harm the larger world - not just by eroding trust, but by causing people to misallocate their charity budgets.
- "Best practices" for a charity tend to promote this kind of dishonesty, because they're precisely those practices that work no matter what your charity is doing.
- If your charity is impact-oriented - if you care about outcomes rather than institutional success - then you should be able to do substantially better than "best practices".
So I'm putting an edited version of my response here.
Passing this announcement along from GiveWell:
GiveWell is holding an event at our offices in San Francisco for Bay Area residents who are interested in Effective Altruism. The evening will be similar to the research events we hold periodically for GiveWell donors: it will include presentations and discussion about GiveWell’s top charity work and the Open Philanthropy Project, as well as a light dinner and time for mingling. We’re tentatively planning to hold the event in the evening of or . We hope to be able to accommodate everyone who is interested, but may have to limit places depending on demand. If you would be interested in attending, please fill out this form. We hope to see you there!
Discussion article for the meetup : DC EA meetup / Petrov day dinner
On this day in 1983, in an unparalleled feat of Effective Altruism, Stanislav Petrov declined to destroy the world: http://lesswrong.com/lw/jq/926_is_petrov_day/
We'll celebrate his achievement by getting together for food, drinks, and not destroying the world.
Food and drinks will be provided, though please feel free to help.
Ben Hoffman will give a brief talk about the test run of the project to comment on proposed regulations. Most of the night will be free discussion of anything we want.
7:00 - 7:30 PM: Arrive
7:30 - 8:00 PM: Talk on DC EA projects
8:00 - 9:00 PM: Dinner, drinks, discussion
9:00 - 9:30 PM: Petrov Day ritual
9:30 PM - late: Free discussion
Discussion article for the meetup : Rationality Practice - Be Specific
Being specific can help you notice when you don't know what you're talking about, and avoid unnecessary miscommunication and arguments over definitions.
Let's come up with some ways to teach ourselves the habit of being specific, and giving and thinking through concrete examples. Related: http://lesswrong.com/lw/bc3/sotw_be_specific/
Help, having a brain blank. I can come up w examples of times something happened, but not times something didnt-happen. What heuristic?
— Kate Donovan (@donovanable) April 29, 2014
If I tell 100 people not to think of an elephant, what's the single thing they're all most likely to think about over the next five minutes, aside from sex?
An elephant, of course.
Negation and oppositeness are perfectly intelligible semantic concepts - in general, no one is confused about what "Don't think of an elephant" means - or, more generally, "Don't do [X]," where X is any intelligible behavior. And people would know how to comply, if [X] were a physical action like sitting down. But even if they wanted to, they don't know how to not think of an elephant - even though that's a behavior they exhibit most of their waking lives, and in some sense on purpose.
Even with physical actions that we are not only admonished to refrain from, but have a strong personal interest in not doing, we feel an impulse to do them anyway. Standing on a narrow ledge, afraid of falling, you might feel a strong urge to jump. Why?
Because a part of your mind that is trying to take care of you is thinking, as hard as it can, "Don't jump!" And there's another part of your mind, whose job it is to fetch ideas related to the things you're interested in. This fetcher doesn't understand words like "don't," but it does understand that you're very interested in the idea of jumping off that ledge, so it helpfully suggests ways to do so.
This can be a big problem if you're trying to find ways not to do something, or for something not to happen.
It is not possible to find ways for something not to happen.
Knowing this, how should we use our brains differently than we did before? For obvious reasons, I am not just going to tell you to avoid thinking of the things you want in terms of negations. Instead, I'm going to tell you some stories of how I used techniques designed with this in mind, to win at life.
The Case of the Missing Car Keys
A few days ago, I was on my way to an eagerly anticipated debate presided over by the incomparable Leah. I had gotten my scheduled prior weekend chores out of the way, and even had time to stop by the local Le Pain Quotidien for a leisurely brunch (for which the service was no more intolerably slow than usual, but this time they apologized without prompting and comped about half the meal), and read a chapter of Global Catastrophic Risks. In short, everything was going horribly right. Right in precisely that way that makes the bad news so upsetting by contrast.
This was the day I discovered that I am not smart enough to hold onto car keys, but I am smart enough to avoid getting defensive and starting a fight about it. They fell out of my pocket, either on the sidewalk or at the restaurant, or at the Whole Foods where I had plenty of time to pick up snacks for the event. I retraced my steps and asked after the keys at both places I'd been. No luck. I got back to the debate location just in time, and despondent. It didn't ruin the debate for me, since that was a pleasant and engrossing distraction with lots of happy people talking about interesting things, but afterwards I had to ask my girlfriend to come bring me the spare key so I could bring the car home.
Not only was I upset that I lost time waiting for the keys, and feeling bad about myself for losing them, and anticipating the hassle of going to the dealer to get another extra key (if that's even possible) - but I also put my girlfriend in a bad mood, which made me expect to be criticized for losing the keys. My brain was looking for ways to preemptively blame her. (There were plausible ways to argue it, but nothing that could be accurately described as her fault to anyone except my increasingly desperate defensive brain.)
I managed to suppress that particular comment preemptively blaming her, but on the car ride home, she brought up a few more things that could have turned into fights. But I (just barely) managed to say, "let's talk about these things if you still think that's a problem when we're both in better moods."
Haha, fightbrain, YOU LOSE! (For now.)
I would have totally failed at this as recently as a couple of months ago. What changed?
Well, over the past few months, I've been meditating for about 10 minutes a day, on average. More recently I even set up a Beeminder goal for this. I'm not meditating for spiritual insights or inner calm - I'm meditating to train my mind to do what I want. In particular, I'm practicing this pattern:
Me: I'm going to focus on X.
My brain: Y! Y! Y Y Y Y Y Y Y Y!
Me: I notice that I'm thinking about Y. Now let's think about X.
Over and over again, for as long as it takes. Not fighting the passing thought - not responding to "Y" with "not-Y" (which as we now know just gets parsed as "Y") - but gently redirecting my attention back to X, where X can be the feeling of my breath as it moves through the bottom of my nostrils, or the task of bringing the car safely home.
I still had to expend some WILLPOWER, which is evil, and means I'm not as good at this as I want to be, but in the past I would have lost and picked a fight. This time I won, and put off the conversations about what happened and what needed to change until I could engage productively.
Another thing I did in between getting upset and having a calm conversation about the keys, was talk with people whom my brain did not want to get mad at. People totally uninvolved with the conflict. This got my brain into a mode of thinking about my losing the car keys that had nothing to do with blaming or being blamed or defending or attacking - I was just explaining what happened and thinking about how I could hold onto my car keys better in the future.
(If you have ideas, I want to hear them! My pocket obviously isn't reliable. I'm likely enough to lose a bag that it's no better. A carabiner can come off, and a regular clip is even worse. I've considered using a combination padlock to hold the keys onto my belt, but that seems like more hassle than it's worth.)
How I Come Up With Ideas When I Can't Come Up With Any Ideas
Let's say I have something I want to do, and I can't think of any good ways it can be done. Like improving my emotional vocabulary - I want to figure out what exercises I can do that will increase the number of emotions I can recognize and name in the moment, and the rate at which I remember them afterwards. At first I thought I couldn't think of anything good.
Then I tried to come up with ten terrible ideas.
My working model of how this happens is that I implicitly have a stack of ideas, and my idea-fetcher assumes that the top of the stack is probably the best idea, so when I query my mind for "ideas about how to do X" the fetcher inspects the top item, finds it terrible, and decides that there are no ideas. If I ask again, the fetcher goes back to the stack, inspects the same top item, judges it unacceptable, and returns "no results" again.
So why does asking for terrible ideas fix this? Because it's not actually possible to query my mind for terrible ideas. Appending the word "terrible" doesn't actually suppress the good ideas - it just stops me from suppressing the bad ones. And once I've retrieved the top idea from the stack (even though it often is pretty terrible), my fetcher will turn up something different when I query it again. So I can inspect the second, and third, etc. Often, in my list of ten "terrible" ideas, some will obviously be good ones, and some others will be bad but improvable. And you can make a lot more improvements to a bad idea you are considering, than a bad idea you aren't even thinking of.
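The stack-and-fetcher picture above can be made concrete in a toy sketch (every name here is hypothetical; this is the metaphor as code, not a claim about how minds work). A picky query only ever inspects the top of the stack, so one bad idea on top blocks everything; asking for "terrible" ideas disables the filter and lets you pop through the whole stack.

```python
# Toy model of the "idea fetcher": ideas sit on a stack with quality
# scores, and the top of the stack is at the end of the list.
stack = [("improvable idea", 4), ("good idea", 7), ("terrible idea", 1)]

def fetch_good_idea(stack, threshold=5):
    """A picky query: inspect only the top item, reject it if it's bad."""
    if stack and stack[-1][1] >= threshold:
        return stack[-1][0]
    return None  # "no results" -- even though good ideas sit lower down

def fetch_terrible_ideas(stack, n=10):
    """Asking for *terrible* ideas drops the filter, so we pop through
    the stack and actually get to inspect every idea."""
    out = []
    while stack and len(out) < n:
        out.append(stack.pop())
    return out

print(fetch_good_idea(stack))       # None -- the top idea is terrible
print(fetch_terrible_ideas(stack))  # all three ideas, good ones included
```

The point of the sketch is the asymmetry: the filtered query returns nothing, while the unfiltered one surfaces the good and improvable ideas along with the terrible one that was blocking them.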
A few months ago, I asked Carl Shulman for ideas about how to build the forecasting and reasoning skills necessary to judge the importance of different existential risks, and he gave me about fifteen different really good ideas in about five minutes. It felt like magic, and I regret to report that at the time, it didn't occur to me to ask him how he was so good at coming up with ideas. But I think he was just using some version of this technique - at any rate, looking back, it doesn't feel like it would have been impossible for me to come up with those ideas anymore. My censors are off. I have the Intent To Solve The Problem. I will accept even terrible ideas.
Swim Parallel to the Shore
Let's say I am going into a social interaction and am nervous that it will be awkward because I'm not good with strangers. We now know that "don't be awkward" is not a query that will produce useful plans. Even "be socially skilled" is a problem - if you're worried about being awkward, you don't necessarily have as strong and vivid an image of what a generic successful conversation looks like - but you sure know what an awkward one looks like. Even if the explicit verbal instruction you give your mind is "tell me how to be socially skilled in this conversation," it will get parsed as "tell me how to be not awkward," and your fetcher will in turn parse that as "be awkward" and helpfully suggest ways to accomplish that goal.
Instead, you might want to make the other person laugh, or get some information from them, or ask them for a favor, or just let them know that you like them and want to be their friend. Pick a goal - or more than one - that is sideways relative to awkwardness, and optimize for that. Your conversation won't be perfect, but it will be a lot less awkward than if you spend all your energy thinking about how to be awkward.
Do the same thing you're supposed to do when you're swimming in the ocean, and the undertow threatens to draw you out to sea. They don't just tell you not to fight the tide, though - they tell you to swim orthogonally to it, parallel to the shore. Pick a new direction, and optimize for that.
An Alternative Approach: Flip The Sign
Kate unsurprisingly has her own interesting take on this. She talks about flipping ideas around so if you don't want X, then you can create a positive goal that's the complement of X. For example, she turns the aversive goal "I don’t want to be the sort of person who avoids things because they’re emotionally weighty" into the positive goal "I want to be the sort of person who tackles emotionally weighty conflicts".
I think this is likely to be a problem because your brain may be stupid but it's also smart. It can sometimes tell when your oh-so-positive wording is just a tricky way of circumlocuting a negation. I'd expect more success with something like, "I want to be compassionate during emotionally weighty conflicts," since that goal pushes sideways, not against the aversion.
You the reader should be happy we disagree, since it means you're more likely to have found a technique that will work for you. If one of our ideas doesn't work for you, try the other. If one works, try the other anyway. Try lots of things! Then keep doing the ones that work.
This person seems to have the virtue of non-compartmentalization. What rationalist skill can we learn from this? Maybe looking for ways to extend a strong belief from one domain into another where it's more testable?
- There is a substantial flaw or missing element to my model that someone will point out.
- Many readers, who are bad at small talk because they don't see the point, will get better at it as a result of acquiring understanding.
I grew up in a Jewish household, so I didn't have Santa Claus to doubt - but I did have the tooth fairy.
It was hard for me to believe that a magical being I had never seen somehow knew whenever any child lost their tooth, snuck into their house unobserved without setting off the alarms, for unknown reasons took the tooth, and for even less fathomable reasons left a dollar and a note in my mom's handwriting.
On the other hand, the alternative hypothesis was no less disturbing: my parents were lying to me.
Of course I had to know which of these terrible things was true. So one night, when my parents were out (though I was still young enough to have a babysitter), I noticed that my tooth was coming out and decided that this would be...
A Perfect Opportunity for an Experiment.
I reasoned that if my parents didn't know about the tooth, they wouldn't be able to fake a tooth fairy appearance. I would find a dollar and note under my pillow if, but only if, the tooth fairy were real.
I solemnly told the babysitter, "I lost my tooth, but don't tell Mom and Dad. It's important - it's science!" Then at the end of the night I went to my bedroom, put the tooth under the pillow, and went to sleep. The next morning, I woke up and looked under my pillow. The tooth was gone, and in its place there was a dollar and a note from the "tooth fairy."
This could have been the end of the story. I could have decided that I'd performed an experiment that would come out one way if the tooth fairy were real, and a different way if the tooth fairy were not. But I was more skeptical than that. I thought, "What's more likely? That a magical creature took my tooth? Or that the babysitter told my parents?"
I was furious at the possibility of such an egregious violation of experimental protocol, and never trusted that babysitter in the lab again.
An Improvement in Experimental Design
The next time, I was more careful. I understood that the flaw in the previous experiment had been failure to adequately conceal the information from my parents. So the next time I lost a tooth, I told no one. As soon as I felt it coming loose in my mouth, I ducked into the bathroom, ran it under the tap to clean it, wrapped it in a tissue, stuck it in my pocket, and went about my day as if nothing had happened. That night, when no one was around to see, I put the tooth under my pillow before I went to sleep.
In the morning, I looked under the pillow. No note. No dollar. Just that tooth. I grabbed the incriminating evidence and burst into my parents' bedroom, demanding to know:
"If, as you say, there is a tooth fairy, then how do you explain THIS?!"
What can we learn from this?
The basic design of the experiment was ideal: it tested a binary hypothesis and was expected to perfectly distinguish between the two possibilities. However, if I had known then what I know now about rationality, I could have done better.
As soon as my first experiment produced an unexpected positive result, just by learning that fact, I knew why it had happened and what I needed to fix to produce strong evidence. Before the first experiment would have been the perfect opportunity to apply the "Internal Simulator," as CFAR calls it: imagining in advance getting each of the two possible results and asking what I'd think afterwards (do I think the experiment worked? do I wish I'd done something differently?), in order to correct those errors in advance instead of finding them through a costly experiment (I had a limited number of baby teeth!).