gwern comments on Thermodynamics of Intelligence and Cognitive Enhancement - Less Wrong

8 Post author: CasioTheSane 03 April 2014 11:17PM

Comment author: gwern 10 September 2014 01:14:12AM 0 points

Epidemiology seems to call this "overadjustment"; for example, "Overadjustment Bias and Unnecessary Adjustment in Epidemiologic Studies":

We define overadjustment bias as control for an intermediate variable (or a descending proxy for an intermediate variable) on a causal path from exposure to outcome. We define unnecessary adjustment as control for a variable that does not affect bias of the causal relation between exposure and outcome but may affect its precision. We use causal diagrams and an empirical example (the effect of maternal smoking on neonatal mortality) to illustrate and clarify the definition of overadjustment bias, and to distinguish overadjustment bias from unnecessary adjustment. Using simulations, we quantify the amount of bias associated with overadjustment. Moreover, we show that this bias is based on a different causal structure from confounding or selection biases.
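To make the quoted definition concrete, here is a minimal simulation sketch of controlling for an intermediate variable. It is not the paper's simulation or its smoking example: the linear Gaussian model, the unit coefficients, and the names `simulate_overadjustment` and `_cov` are all assumptions chosen for illustration. The exposure A affects the outcome Y only through the intermediate M, so the total effect is 1 by construction.

```python
import random


def _cov(xs, ys):
    """Covariance of two equal-length lists (divides by n)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


def simulate_overadjustment(n=100_000, seed=0):
    """Hypothetical chain A -> M -> Y with unit coefficients.

    Returns the OLS coefficient on A without and with M in the regression.
    """
    rng = random.Random(seed)
    a, m, y = [], [], []
    for _ in range(n):
        a_i = rng.gauss(0, 1)
        m_i = a_i + rng.gauss(0, 1)  # exposure -> intermediate
        y_i = m_i + rng.gauss(0, 1)  # intermediate -> outcome
        a.append(a_i)
        m.append(m_i)
        y.append(y_i)

    saa, smm, sam = _cov(a, a), _cov(m, m), _cov(a, m)
    say, smy = _cov(a, y), _cov(m, y)

    # Regression of Y on A alone: slope = cov(A, Y) / var(A).
    unadjusted = say / saa
    # Regression of Y on A and M: coefficient on A from the 2x2 normal equations.
    adjusted = (smm * say - sam * smy) / (saa * smm - sam ** 2)
    return unadjusted, adjusted


if __name__ == "__main__":
    unadjusted, adjusted = simulate_overadjustment()
    print(f"coefficient on A, no adjustment:   {unadjusted:.3f}")
    print(f"coefficient on A, adjusting for M: {adjusted:.3f}")
```

Under these assumptions the unadjusted coefficient should land near the total effect of 1, while the coefficient after adjusting for M should land near 0, because M carries the whole effect.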

Comment author: IlyaShpitser 21 September 2014 07:42:05PM * 2 points

Hi, sorry I missed this post earlier. Yes, this is sometimes called overadjustment. Their definition of overadjustment is incomplete -- it misses the case where a variable is associated with both exposure and outcome and is not an intermediate variable, yet adjusting for it increases bias anyway. This case has a different name, M-bias, and occurs for instance in this graph:

A -> Y <- H1 -> M <- H2 -> A

Say H1 and H2 are unobserved, A is our exposure (treatment), and Y is our outcome. The right thing to do here is to not adjust for M: M is a collider on the path A <- H2 -> M <- H1 -> Y, so that path is blocked as long as you leave M alone and opens up once you condition on it. It's called "M-bias" because the part of this graph involving the H variables kind of looks like an M, if you draw it using the standard convention of unobserved confounders on top.
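The M-bias graph above can be checked with a small simulation. This is only a sketch under assumptions that are not in the comment: a linear Gaussian model, unit coefficients on every arrow, and the hypothetical names `simulate_m_bias` and `_cov`. The true effect of A on Y is set to 1.

```python
import random


def _cov(xs, ys):
    """Covariance of two equal-length lists (divides by n)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n


def simulate_m_bias(n=100_000, true_effect=1.0, seed=0):
    """Hypothetical linear version of A -> Y <- H1 -> M <- H2 -> A.

    H1 and H2 are generated but never used in estimation (they are unobserved).
    Returns the OLS coefficient on A without and with M in the regression.
    """
    rng = random.Random(seed)
    a, m, y = [], [], []
    for _ in range(n):
        h1 = rng.gauss(0, 1)  # unobserved
        h2 = rng.gauss(0, 1)  # unobserved
        a_i = h2 + rng.gauss(0, 1)                       # H2 -> A
        m_i = h1 + h2 + rng.gauss(0, 1)                  # H1 -> M <- H2
        y_i = true_effect * a_i + h1 + rng.gauss(0, 1)   # A -> Y <- H1
        a.append(a_i)
        m.append(m_i)
        y.append(y_i)

    saa, smm, sam = _cov(a, a), _cov(m, m), _cov(a, m)
    say, smy = _cov(a, y), _cov(m, y)

    # Regression of Y on A alone.
    unadjusted = say / saa
    # Regression of Y on A and M: coefficient on A from the 2x2 normal equations.
    adjusted = (smm * say - sam * smy) / (saa * smm - sam ** 2)
    return unadjusted, adjusted


if __name__ == "__main__":
    unadjusted, adjusted = simulate_m_bias()
    print(f"coefficient on A, no adjustment:   {unadjusted:.3f}")
    print(f"coefficient on A, adjusting for M: {adjusted:.3f}")
```

With these particular coefficients the population value of the adjusted coefficient works out to 0.8 rather than 1, while the unadjusted coefficient is unbiased. The size of the bias depends entirely on the assumed coefficients; only its existence is the point.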


But there is a wider problem here. Sometimes you think of what you are doing as 'adjusting for confounders,' when in reality the adjustment formula is the wrong formula altogether and you need a different one. This happens for example in longitudinal studies (with a non-genetic treatment that is vulnerable to confounders over time). In such studies you want to use something called the g-computation algorithm instead of adjusting for confounders.
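A toy two-time-point example may help show how the two formulas differ. Everything in it is an assumption made for illustration, not something from the comment: the variable names (A0 and A1 are treatments at two times, L1 is an observed covariate measured between them, U is unobserved), all the probabilities, and the function names. A0 is randomized, L1 depends on A0 and U, A1 depends only on L1, and Y depends on A0, A1 and U, so the true effect of always treating versus never treating is 2. The sketch uses the simplest nonparametric form of the g-formula, which weights the L1 strata by P(L1 | A0), and compares it with the usual adjustment formula, which weights them by P(L1).

```python
import random
from collections import defaultdict


def simulate(n=200_000, seed=0):
    """Draw (a0, l1, a1, y) rows from a hypothetical longitudinal model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = int(rng.random() < 0.5)                       # unobserved
        a0 = int(rng.random() < 0.5)                      # randomized first treatment
        l1 = int(rng.random() < 0.2 + 0.3 * a0 + 0.4 * u)  # A0 -> L1 <- U
        a1 = int(rng.random() < 0.2 + 0.6 * l1)           # L1 -> A1
        y = a0 + a1 + 2.0 * u + rng.gauss(0, 1)           # true joint effect is 2
        rows.append((a0, l1, a1, y))
    return rows


def compare_formulas(rows):
    """Return (g-formula estimate, standard-adjustment estimate) of
    E[Y(1,1)] - E[Y(0,0)], using cell means and empirical frequencies."""
    n = len(rows)
    y_sum, y_cnt = defaultdict(float), defaultdict(int)
    l_cnt, a0_cnt = defaultdict(int), defaultdict(int)
    for a0, l1, a1, y in rows:
        y_sum[(a0, l1, a1)] += y
        y_cnt[(a0, l1, a1)] += 1
        l_cnt[(a0, l1)] += 1
        a0_cnt[a0] += 1

    def mean_y(a0, l1, a1):
        return y_sum[(a0, l1, a1)] / y_cnt[(a0, l1, a1)]

    def g_formula(a0, a1):
        # Weight strata of L1 by P(L1 = l | A0 = a0).
        return sum(mean_y(a0, l, a1) * l_cnt[(a0, l)] / a0_cnt[a0] for l in (0, 1))

    def standard_adjustment(a0, a1):
        # Weight strata of L1 by the marginal P(L1 = l).
        return sum(mean_y(a0, l, a1) * (l_cnt[(0, l)] + l_cnt[(1, l)]) / n
                   for l in (0, 1))

    g = g_formula(1, 1) - g_formula(0, 0)
    adj = standard_adjustment(1, 1) - standard_adjustment(0, 0)
    return g, adj


if __name__ == "__main__":
    g, adj = compare_formulas(simulate())
    print(f"g-formula estimate:           {g:.3f}")
    print(f"standard adjustment estimate: {adj:.3f}")
```

In this toy model the g-formula recovers the effect of 2, while adjusting for L1 in the usual way gives roughly 1.73, because L1 is affected by the earlier treatment and shares the unobserved cause U with the outcome. Leaving L1 out is no fix either, since L1 confounds A1.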

I guess if I were to name the resulting bias, it would be "causal model misspecification bias." That is, you are adjusting for confounders in a particular way because you think the true causal model is a certain way, but you are wrong about that -- the model is actually different and the causal effect requires a different approach from what you are using.


I have a paper with Tyler VanderWeele and Jamie Robins that characterizes exactly what has to be true on the graph for adjustment to be valid for causal effects. So you will get bias from adjustment (for a particular set) if and only if the condition in the paper does not hold for your model.