Fact Posts: How and Why

sarahconstantin

LESSWRONG
LW

Fact Posts: How and Why — LessWrong

Bonus: Best Essays of LessWrong

268 Fact Posts: How and Why

by sarahconstantin

2nd Dec 2016

4 min read

268

The most useful thinking skill I've taught myself, which I think should be more widely practiced, is writing what I call "fact posts." I write a bunch of these on my blog. (I write fact posts about pregnancy and childbirth here.)

To write a fact post, you start with an empirical question, or a general topic. Something like "How common are hate crimes?" or "Are epidurals really dangerous?" or "What causes manufacturing job loss?"

It's okay if this is a topic you know very little about. This is an exercise in original seeing and showing your reasoning, not finding the official last word on a topic or doing the best analysis in the world.

Then you open up a Google doc and start taking notes.

You look for quantitative data from conventionally reliable sources. CDC data for incidences of diseases and other health risks in the US; WHO data for global health issues; Bureau of Labor Statistics data for US employment; and so on. Published scientific journal articles, especially from reputable journals and large randomized studies.

You explicitly do not look for opinion, even expert opinion. You avoid news, and you're wary of think-tank white papers. You're looking for raw information. You are taking a sola scriptura approach, for better and for worse.

And then you start letting the data show you things.

You see things that are surprising or odd, and you note that.

You see facts that seem to be inconsistent with each other, and you look into the data sources and methodology until you clear up the mystery.

You orient towards the random, the unfamiliar, the things that are totally unfamiliar to your experience. One of the major exports of Germany is valves? When was the last time I even thought about valves? Why valves, what do you use valves in? OK, show me a list of all the different kinds of machine parts, by percent of total exports.

And so, you dig in a little bit, to this part of the world that you hadn't looked at before. You cultivate the ability to spin up a lightweight sort of fannish obsessive curiosity when something seems like it might be a big deal.

And you take casual notes and impressions (though keeping track of all the numbers and their sources in your notes).

You do a little bit of arithmetic to compare things to familiar reference points. How does this source of risk compare to the risk of smoking or going horseback riding? How does the effect size of this drug compare to the effect size of psychotherapy?

You don't really want to do statistics. You might take percents, means, standard deviations, maybe a Cohen's d here and there, but nothing fancy. You're just trying to figure out what's going on.

It's often a good idea to rank things by raw scale. What is responsible for the bulk of deaths, the bulk of money moved, etc? What is big? Then pay attention more to things, and ask more questions about things, that are big. (Or disproportionately high-impact.)

You may find that this process gives you contrarian beliefs, but often you won't, you'll just have a strongly fact-based assessment of why you believe the usual thing.

There's a quality of ordinariness about fact-based beliefs. It's not that they're never surprising -- they often are. But if you do fact-checking frequently enough, you begin to have a sense of the world overall that stays in place, even as you discover new facts, instead of swinging wildly around at every new stimulus. For example, after doing lots and lots of reading of the biomedical literature, I have sort of a "sense of the world" of biomedical science -- what sorts of things I expect to see, and what sorts of things I don't. My "sense of the world" isn't that the world itself is boring -- I actually believe in a world rich in discoveries and low-hanging fruit -- but the sense itself has stabilized, feels like "yeah, that's how things are" rather than "omg what is even going on."

In areas where I'm less familiar, I feel more like "omg what is even going on", which sometimes motivates me to go accumulate facts.

Once you've accumulated a bunch of facts, and they've "spoken to you" with some conclusions or answers to your question, you write them up on a blog, so that other people can check your reasoning. If your mind gets changed, or you learn more, you write a follow-up post. You should, on any topic where you continue to learn over time, feel embarrassed by the naivety of your early posts. This is fine. This is how learning works.

The advantage of fact posts is that they give you the ability to form independent opinions based on evidence. It's a sort of practice of the skill of seeing. They likely aren't the optimal way to get the most accurate beliefs -- listening to the best experts would almost certainly be better -- but you, personally, may not know who the best experts are, or may be overwhelmed by the swirl of controversy. Fact posts give you a relatively low-effort way of coming to informed opinions. They make you into the proverbial 'educated layman.'

Being an 'educated layman' makes you much more fertile in generating ideas, for research, business, fiction, or anything else. Having facts floating around in your head means you'll naturally think of problems to solve, questions to ask, opportunities to fix things in the world, applications for your technical skills.

Ideally, a group of people writing fact posts on related topics, could learn from each other, and share how they think. I have the strong intuition that this is valuable. It's a bit more active than a "journal club", and quite a bit more casual than "research". It's just the activity of learning and showing one's work in public.

Fact postsScholarship & LearningPracticalWorld Modeling

Frontpage

268

Pain is not the unit of Effort

94 comments598 karma

The correct response to uncertainty is *not* half-speed

46 comments285 karma

New Comment

34 comments, sorted by

top scoring

Click to highlight new comments since: Today at 10:07 AM

[-]Scott Alexander9y620

Some additional thoughts:

Don't underestimate Wikipedia as a really good place to get a (usually) unbiased overview of things and links to more in-depth sources.
The warning against biased sources is well-taken, but if you're looking into something controversial, you might have to just read the biased sources on both sides, then try to reconcile them. I've found it helpful to find a seemingly compelling argument, google something like "why X is wrong" or "X debunked" into Google, and see what the other side has to say about it. Then repeat until you feel like both sides are talking past each other or disagreeing on minutiae. This is important to do even with published papers!
Success often feels like realizing that a topic you thought would have one clear answer actually has a million different answers depending on how you ask the question. You start with something like "did the economy do better or worse this year?", you find that it's actually a thousand different questions like "did unemployment get better or worse this year?" vs. "did the stock market get better or worse this year?" and end up with things even more complicated like "did employment as measured in percentage of job-seekers finding a job within six months get better" vs. "did employment as measured in total percent of workforce working get better?". Then finally once you've disentangled all that and realized that the people saying "employment is getting better" or "employment is getting worse" are using statistics about subtly different things and talking past each other, you use all of the specific things you've discovered to reconstruct a picture of whether, in the ways important to you, the economy really is getting better or worse.

[-]Viliam9y90

Don't underestimate Wikipedia as a really good place to get a (usually) unbiased overview of things and links to more in-depth sources.

Always look at the Talk page, to get an overview about what kind of information is being removed from the article.

Connotational disclaimer: I don't mean to imply that the removed information is always or even usually true. It's just, sometimes the opposing views are at least mentioned in the article in a "criticism" section, but sometimes they are removed without a trace ("the article is already too long" can be a convenient excuse, especially when one side can make it too long by adding many irrelevant details).

[-]Jiro9y60

Don't underestimate Wikipedia as a really good place to get a (usually) unbiased overview of things and links to more in-depth sources.

Don't overestimate it, either.

[-]Manfred9y460

I would recommend just correctly estimating everything, really.

[-]Lumifer9y20

Yes, but there is the bias-variance trade-off.

[-][anonymous]9y370

I have about six of these floating around in my drafts. This makes me think that maybe I should post them; I didn't think they were that interesting to anyone but me.

Recently, I spent about ten hours reading into a somewhat complicated question. It was nice to get a feel for the topic, first, before I started badgering the experts and near-experts I knew for their opinions. I was surprised at how close I got to their answers.

[-]Scott Alexander9y150

I have about six of these floating around in my drafts. This makes me think that maybe I should post them; I didn't think they were that interesting to anyone but me.

Please!

[-]sleepingthinker9y00

I am the same way, although I think that I have much more than 6 drafts of these types of posts. :) I have hundreds in fact! I often start writing on something, and then switch to a different topic without finishing my essay on the first one!

It's the first time I see the concept of a "fact-post", however in my experience writing posts on history is a good practice for this. Of course, "history" is often biased and many history books have slants based on ideologies, biases or other perspectives, but there are such things as dates, names, events...etc. which are facts and if you start putting them in chronological order, you can arrive at good fact posts.

Once you start digging a bit deeper and writing more in depth history posts, you also start noticing your inherent biases a lot more. Oftentimes you might skip over some fact, event or name just because it doesn't fit with your internal vision of the world. For example, I have a hard time accepting that some dinosaurs had feathers, since I have already formed a preconceived ideal type of what dinosaurs looked like in my head and when I write about dinosaurs, i conveniently try to skip recent paleontological findings pointing to evidence that indeed some types of dinosaurs had feathers at least on parts of their bodies.

However since I write these things down, I am forced to internally confront this inherent bias and maybe over time it lessens.

[-]Richard Korzekwa9y100

I took a course in graduate school in which I interpreted a series of assignments as doing basically this (instead of something much more boring), but a bit less quantitative and a bit less focused on a single question. I found it to be so enjoyable that I spent entirely too much time researching things like trash-fired power plants and the problems associated with moving 400-ton power transformers around the country, and not nearly enough time doing actual science for my dissertation. The papers would need a little rewriting to be made into blog posts, but they're short and I should be able to make most of them work.

Or I could just learn more cool shit and write new ones.

[-]RomeoStevens9y90

Very good tips. Especially the part about getting more scope sensitive, this has positive flow through effects to other areas of life when practiced.

conventionally reliable sources

this step can be trickier than it seems at first blush because of optimization for looking prestigious and reliable by orgs that use ideologically motivated data gathering techniques. As a first pass, if the org seems to data gather to fuel some sort of lobbying effort you should only trust the data they gather that is unrelated to their area of interest (that they collected incidentally) ht Jim Babcock.

[-]Benquo9y80

I found this helpful, and it makes me more likely to write this kind of post.

One thing I found especially helpful was a sense of the relationship between the exploration process and the finished product. I often see fact posts with a central and then evidence, and this can create the impression that this is the order in which one should proceed. First decide on a question, and then find the data that answer it.

Seems like you're suggesting there's a lot of value from doing things the other way around. Once you're interested in a question, look at the data, and see what it wants to tell you, changing the questions you ask to fit what the data can tell you, rather than torturing the data to answer your question.

[-]Vladimir_Nesov9y70

For some topics, finding an undergraduate textbook (or sometimes a handbook) is much more efficient, and similarly screens off conventionally unreliable sources. Not as good for original seeing, but it's not clear that it's any use, other than as an exercise, before enough background is in place (even though that background may bias towards the status quo). But when the absolutely standard material is at hand, this looks like a plausible move for organizing the rest.

[-]Lumifer9y120

finding an undergraduate textbook (or sometimes a handbook) is much more efficient

For hard sciences, yes. For soft sciences, no.

[-]Vladimir_Nesov9y00

In what way (how does the difference work)? My experience is almost exclusively with hard sciences.

[-]Lumifer9y100

Well, the discussion of the differences between the hard and the soft sciences is a complicated topic.

But very crudely, the soft sciences have to deal with situations which never exactly repeat, so their theories and laws are always approximate and apply "more or less". In particular, this makes it hard to falsify theories which leads to proliferation of just plain bullshit and idiosyncratic ideas which cannot be proven wrong and so continue their existence. Basically you cannot expect that a social science will reliably converge on truth the way a hard science will.

So if you pick, say, an undergraduate textbook in economics, what it tells you will depend on which particular textbook did you pick. Two people who read two different econ textbooks might well end up with very different ideas of how economics work and there is no guarantee that either of them will explain the real-world data well.

[-]alex_zag_al9y10

the soft sciences have to deal with situations which never exactly repeat

This is also true of evolutionary biology--I think it's not widely recognized that evolutionary biology is like the soft sciences in this way.

[-]AnnaSalamon9y50

You can basically believe the contents of an intro chemistry textbook; you have to be much more careful with the contents of an intro psychology or sociology textbook.

[-]NatashaRostova9y20

Check out Yvain's sequence on Game Theory. I've actually studied game theory at a grad level, and had nothing to learn from what he wrote. But he opened it up in a fun/interesting/well-written way, which was specifically written for this audience, and addressed relevant interests here more than a textbook.

It's challenging to imagine a sequence on introductory chemistry that would have the same appeal. Having said that, I'm sure a sufficiently educated/talented writer could do one on intro chem.

[-]johnlawrenceaspden9y20

"Asimov on Chemistry" was a childhood favourite of mine.

[-]Lumifer9y10

So, undergrad textbooks. Let me quote Andrew Gelman (a professor of statistics at Columbia):

Dear Major Academic Publisher,

You just sent me, unsolicited, an introductory statistics textbook that is 800 pages and weighs about 5 pounds. It’s the 3rd edition of a book by someone I’ve never heard of. That’s fine—a newcomer can write a good book. The real problem is that the book is crap. It’s just the usual conventional intro stat stuff. The book even has a table of the normal distribution on the inside cover! How retro is that?

The book is bad in so many many ways, I don’t really feel like going into it. There’s nothing interesting here at all, the examples are uniformly fake, and I really can’t imagine this is a good way to teach this material to anybody. None of it makes sense, and a lot of the advice is out-and-out bad (for example, a table saying that a p-value between 0.05 and 0.10 is “moderate evidence” and that a p-value between 0.10 and 0.15 is “slight evidence”). This is not at all the worst thing I saw; I’m just mentioning it here to give a sense of the book’s horrible mixture of ignorance and sloppiness.

I could go on and on. But, again, I don’t want to do so.

I can’t blame the author, who, I’m sure, has no idea what he is doing in any case. It would be as if someone hired me to write a book about, ummm, I dunno, football. Or maybe rugby would be an even better analogy, since I don’t even know the rules to that one.

[-]Vladimir_Nesov9y20

Yes, it's worthwhile to spend quite a lot of time on choosing which textbooks to study seriously, at least a fraction of the time needed to study them, and to continually reevaluate the choice as you go on.

[-]VipulNaik9y50

I like the spirit of the suggestion here, but have at least two major differences of opinion regarding:

The automatic selection of venue: I think that blogs are only a place of "last resort" for facts and not the goto place. I would suggest venues like Wikipedia (when it's notable enough and far enough away from original research), wikiHow and Wikia (for cases somewhat similar to Wikipedia but suited to the specifics of those sites), and domain-specific sharing fora as better choices in some contexts.
The filtering out of opinion and biased sources: I think separating out factual sources from opinion-based sources is harder than it looks, that many numbers, esp. in the social sciences, are based on a huge amount of interpretation conventions that you can't fully grok without diving into the associated opinion pieces from different perspectives, and that epistemic value is greater when you integrate it all. That said, a "facts-only" approach can be a nice starting point for bringing priors into a conversation.

Automatic selection of venue

Collecting and organizing facts is great not just for the fact-gatherer but also for others who can benefit from the readymade process. In some cases, your exploration heavily includes personal opinion or idiosyncratic selection of direction. For these cases, posting to a personal site or blog, or a shared discussion forum for the topic, is best. In other cases, a lot of what you've uncovered is perfectly generic. In such cases, places like Wikipedia, wikiHow, Wikia, or other wikis and fact compendiums can be good places to share your facts. I've done this quite a bit, and also sponsored others to do similar explorations. This provides more structure and discipline to the exercise and significantly increases the value to others.

Filtering out of opinion and biased sources

There are a few different aspects to this.

First, the numbers you receive don't come out of thin air; they are usually a result of several steps of recording and aggregation. Understanding and interpreting how this data is aggregated, what it means on the ground, etc. are things that require both an understanding of the mathematical/statistical apparatus and of the real-world processes involved. Opinion pieces can point to different ways of looking at the same numbers.

For instance, if you just download a table of fertility rates and then start opining on how population is changing, you're likely to miss out on the complex dynamics of fertility calculations, e.g., all the phenomena such as tempo effects, population momentum, etc. You could try deriving all these insights yourself (which isn't that hard, just takes several days of looking at the numbers and running models) or you could start off by reading existing literature on the subject. Opinionated works often do a good job of covering these concepts, even when they come to fairly wrong conclusions, just because they have to cover the basics to even have an intelligent conversation.

Moreover, there are many cases where people's opinions ARE the facts that you are interested in. To take the example of fertility, let's say that you uncover the concepts of ideal, desired, and expected fertility and are trying to use them to explain how fertility is changing. How will you understand who men's and women's ideal fertility numbers are changing over time? Surveys only go so far and are fairly inadequate. Opinion pieces can shed important light on the opinions of the people writing them, and comments on them can be powerful indicators of the opinions of the commenters. In conjunction with survey data, this could give you a richer sense of theories of fertility change.

It's also hard to keep your own biases, normative or factual, out of the picture.

My experience and view is that it's better to read opinion pieces from several different perspectives to better identify your own biases and control for them, as well as get a grounding in the underlying conceptual lexicon. This could be done before, after, or in conjunction with the lookup of facts -- each approach has its merits.

[-]Commander Zander9y50

This could be a cool thing for meetups to experiment with, too.

[-]hamnox9y40

I find myself very confused about how to tell which journals are reputable. Do you have a good heuristic (or list) for finding this out?

[-]btrettel9y50

Learning of the reputation of the journal from someone knowledgeable about its field is the most reliable way I can think of for someone outside the field of interest.

Impact factors seem inappropriate to me, as they can vary wildly between fields and even wildly among subfields. A more specialized, but still high quality, journal could have a much lower impact factor than a more general journal, even if the two are at roughly the same average quality. Also, some foreign language journals can be excellent despite having low impact factors for the field of the journal. This unfortunately is true even for cover-to-cover or partial translations of those foreign language journals.

You also could learn what signs to avoid. Some journals publish nonsense, and that's usually pretty obvious after looking at a few articles. (Though, in some fields it can be hard to separate nonsense from parody.)

Beyond the previous recommendations, it's probably better to focus on the merits of the article rather than the journal. I can immediately think of one article in particular which has a non-obvious major flaw that is published in a small journal that I consider excellent. This error should have been caught in review, but it was not, probably because catching the error requires redoing math the authors skipped over in the article. (In intend to eventually publish a paper on this error after I finish my correction to it.)

[-]Manfred9y40

The worst part is that there's a lot the journal doesn't protect you from, no matter how reputable. Data shown in the paper can be assumed to be presented in the prettiest possible way for the specific data set they got, and interpretation of the data can be quite far off base and still get published if none of the reviewers happen to be experts on the particular methods or theories.

[-]waveman9y20

Ask people in the field, if you know someone.

[-]ChristianKl9y00

Looking at the impact factor of the journal can be a way to filter out junk journals.

[-]Douglas_Knight9y50

Right, the highest impact factors are noise mining.

[-]Elo9y30

This sounds like the feynman notebook method - www.calnewport.com/blog/2015/11/25/the-feynman-notebook-method/

Also see this post - http://lesswrong.com/lw/mmu/how_to_learn_a_new_area_x_that_you_have_no_idea/

For relevant procedural systems of evaluating what you know.

[-]Gucci Ming2y10

You explicitly do not look for opinion, even expert opinion. You avoid news, and you're wary of think-tank white papers. You're looking for raw information. You are taking a sola scriptura approach, for better and for worse.

What do you think about reading opinion pieces from different viewpoints, but at the same time trying to validate those statements? Would that help us save time? What could be the downside of doing so?

[-]TJ5y10

Great post. Thanks, that was inspiring! I'm a beginning (science) blogger and I have recently been involved in EA. This post really helped for writing my first article. I still have to figure out many practical things, like picking the right platform and finding out how to reach out to others. It would be great if you could write something about your own experience on this.

Cheers!

[-]schizoidboy9y10

Nice post. I think a similar type of post would be a theory post, where facts might still exist (except for a priori theories) but the investigation is to try to understand some philosophy or realm of thought. I think these are important because it keeps the fact posts in context. For example, if I were a middle class person in the soviet union researching facts on pregnancy and childbirth by going to libraries in Moscow in the 1950s, the available facts (e.g. what scientific test was run versus not run) might be different.

[-]feverus9y00

A metaphor: Knowledge is a jigsaw puzzle, and the search for truth is a process of trial and error fitting new pieces alongside those already have. The more pieces you have in place, the quicker you can accept or reject new ones; the more granular the detail you perceive in their edges, the better you can identify the exact shape of holes in the puzzle and make new discoveries.

And if there's a misshapen piece you absolutely refuse to move it will screw up the entire puzzle and you'll never get it right. This method is great - generally reliable sources which fit together are free pieces which act as your foundation to even get started.

Unfortunately, it's often easy and natural to force contradicting new data into your existing model even if it really doesn't fit - patching the conflicts without ever really noticing the dissonance, and overfitting your theory without actually restructuring your beliefs. One useful trick for checking yourself: explicitly asking yourself "what do I expect this figure / fact to be or say?" on each step of the project before you look it up. If you go in with reasonably certain expectations and the data reads wildly out of bounds, maybe you've found a major hole in your understanding of the issue, maybe the info is bad, or maybe that figure is saying something very different than you interpreted it.

Moderation Log