Kawoomba comments on Causal Diagrams and Causal Models - Less Wrong
Well, in some sense this is why causal inference is hard. Most of the time if you see independence that really does mean there is nothing there. The reasonable default is the null hypothesis: there is no causal effect. However, if you are poking around because you suspect there is something there, then not seeing any correlations does not mean you should give up. What it does mean is you should think about causal structure and specifically about confounders.
What people do about confounders is:
(a) Try to measure them somehow (epidemiology, medicine). If you can measure confounders you can adjust for them, and then the effect cancellation will go away.
(b) Try to find an instrumental variable (econometrics). If you can find a good instrument, you can get a causal effect with some parametric assumptions, even if there are unmeasured confounders.
(c) Try to randomize (statistics). This explicitly cuts out all confounding.
(d) You can sometimes get around unmeasured confounders by using strong mediating variables by means of "front-door" type methods. These methods aren't really well known, and aren't commonly used.
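To make (a), (b) and (c) concrete, here is a minimal simulation sketch, not anything from the original discussion. Everything in it is an assumption chosen for illustration: the linear Gaussian model, the coefficients, and the names `simulate`, `z`, `u`, `x`, `y`. The confounder `u` is tuned so that the marginal association between `x` and `y` cancels exactly, even though `x` has a real effect on `y`.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (len(a) - 1)

def simulate(n=20000, effect=1.0, seed=0):
    """Toy linear model (an assumption, not the commenter's setup)."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]  # instrument: touches y only via x
    u = [rng.gauss(0, 1) for _ in range(n)]  # confounder of x and y
    x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    # var(x) = 3 and cov(x, u) = 1, so a confounder coefficient of
    # -3 * effect makes cov(x, y) = 3 * effect - 3 * effect = 0.
    conf = -3.0 * effect
    y = [effect * xi + conf * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

    # Naive regression slope of y on x: close to zero (effect cancellation).
    naive = cov(x, y) / cov(x, x)

    # (a) Measure u and adjust: coefficient on x in the regression y ~ x + u.
    sxx, suu, sxu = cov(x, x), cov(u, u), cov(x, u)
    adjusted = (cov(x, y) * suu - cov(u, y) * sxu) / (sxx * suu - sxu ** 2)

    # (b) Instrumental variable, pretending u is unmeasured (Wald ratio).
    # This relies on the linear model, i.e. a parametric assumption.
    iv = cov(z, y) / cov(z, x)

    # (c) Randomize: x_r is assigned independently of u.
    x_r = [rng.gauss(0, 1) for _ in range(n)]
    y_r = [effect * xi + conf * ui + rng.gauss(0, 1) for xi, ui in zip(x_r, u)]
    randomized = cov(x_r, y_r) / cov(x_r, x_r)

    return {"naive": naive, "adjusted": adjusted, "iv": iv,
            "randomized": randomized}

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name:>10}: {value:+.3f}")
```

Under these assumptions the naive slope should come out near zero, while the other three estimates should land near the true effect of 1. Front-door adjustment, (d), is left out to keep the sketch short.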
There is no royal road: getting rid of confounders is the entire point of causal inference. People have been thinking of clever ways to do it for close to a hundred years now. If you have infinite samples, and know where unobserved confounding is, there is an algorithm for getting the causal effect from observational data by being sneaky. This algorithm only succeeds sometimes, and if it doesn't, there is no other way in general to do it (i.e. it's "complete"). More in my thesis, if you are curious.
As far as I can tell, epidemiology and medicine are mostly doing (c), in the form of RCTs (which are the gold standard of medical evidence, other than meta-reviews). There are other study designs such as most variants of case-control studies and cohort studies which do take the (a) approach, but they aren't considered to be the same level of evidence as randomized controlled trials.
Quite rightly -- if we randomize, we don't care what the underlying causal structure is, we just cut all confounding out anyways. Methods (a), (b), (d) all rely on various structural assumptions that may or may not hold. However, even given those assumptions figuring out how to do causal inference from observational data is quite difficult. The problem with RCTs is expense, ethics, and statistical power (hard to enroll a ton of people in an RCT).
Epidemiology and medicine do a lot of (a); look for the keywords "g-formula", "g-estimation", "inverse probability weighting", "propensity score", "marginal structural models", "structural nested models", "covariate adjustment", "back-door criterion", etc.
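For a feel of what two of those keywords look like in the simplest possible case, here is a hedged sketch of inverse probability weighting and the g-formula (standardization) with one measured binary confounder. The data-generating numbers and the names `simulate_ipw`, `c`, `a`, `y` are all made up for illustration; real analyses use fitted propensity models and far more covariates.

```python
import random

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def simulate_ipw(n=20000, effect=2.0, seed=1):
    """Toy example: binary confounder c, binary treatment a, outcome y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = 1 if rng.random() < 0.5 else 0
        # Treatment is much more likely when c = 1 (confounding).
        a = 1 if rng.random() < (0.8 if c else 0.2) else 0
        y = effect * a + 3.0 * c + rng.gauss(0, 1)
        rows.append((c, a, y))

    # Naive contrast: biased, because treated units have c = 1 more often.
    naive = (mean(y for _, a, y in rows if a == 1)
             - mean(y for _, a, y in rows if a == 0))

    # Propensity score P(a = 1 | c), estimated by stratum frequencies.
    prop = {c0: mean(a for c, a, _ in rows if c == c0) for c0 in (0, 1)}

    def weighted_mean(arm):
        """Normalized inverse-probability-weighted mean outcome in one arm."""
        num = den = 0.0
        for c, a, y in rows:
            if a == arm:
                w = 1.0 / (prop[c] if arm == 1 else 1.0 - prop[c])
                num += w * y
                den += w
        return num / den

    ipw = weighted_mean(1) - weighted_mean(0)

    # g-formula: average the stratum-specific contrasts over P(c).
    p_c = {c0: mean(1.0 if c == c0 else 0.0 for c, _, _ in rows)
           for c0 in (0, 1)}
    g_formula = sum(
        p_c[c0] * (mean(y for c, a, y in rows if c == c0 and a == 1)
                   - mean(y for c, a, y in rows if c == c0 and a == 0))
        for c0 in (0, 1))

    return {"naive": naive, "ipw": ipw, "g_formula": g_formula}

if __name__ == "__main__":
    for name, value in simulate_ipw().items():
        print(f"{name:>10}: {value:.3f}")
```

In this toy setup the naive contrast should sit well above the true effect of 2 (around 3.8 in expectation), while both adjusted estimates should be close to 2. Because the propensity model here is saturated, the two adjusted estimates should agree up to rounding.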
People talk about "controlling for other factors" when discussing associations all the time, even in non-technical press coverage. They are talking about (a).
True, true. "Gold standard" or "preferred level of evidence" versus "what's mostly conducted given the funding limitations". However, to make it into a guideline, there are often RCT follow-ups for hopeful associations uncovered by the lesser study designs.
I, of course, know all of those. The letters, I mean.