Houshalter comments on Causal Diagrams and Causal Models - Less Wrong

61 Post author: Eliezer_Yudkowsky 12 October 2012 09:49PM

Comment author: Daniel_Burfoot 12 October 2012 05:57:56PM 25 points

After reading this post I was stunned. Now I think the central conclusion is wrong, though I still think it is a great post, and I will go back to being stunned if you convince me the conclusion is correct.

You've shown how to identify the correct graph structure from the data. But you've erred in assuming that the directed edges of the graph imply causality.

Imagine you did the same analysis, except instead of using O="overweight" you use W="wears size 44 or higher pants". The data would look almost the same. So you would reach an analogous conclusion: that wearing large pants causes one not to exercise. This seems obviously false unless your notion of causality is very different from mine.

In general, I think the following principle holds: inferring causality requires an intervention; it cannot be discovered from observational data alone. A researcher who hypothesized that W causes not-E could round up a bunch of people, have half of them wear big pants, observe the effect of this intervention on exercise rates, and then conclude that there is no causal effect.
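The pants example can be made concrete with a small simulation. This is only a sketch under made-up assumptions: the probabilities, the function names (`correlation`, `sample_population`) and the `randomize_pants` flag are all hypothetical and not taken from the post. It assumes being overweight drives both pants size and exercise, and that pants size has no effect on exercise at all.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences of 0/1 or numeric values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / (var_x * var_y) ** 0.5


def sample_population(n, rng, randomize_pants=False):
    """Return (large_pants, exercises) lists for n simulated people.

    Assumed structure: overweight -> large pants, overweight -> less exercise.
    With randomize_pants=True, pants size is assigned by coin flip instead,
    which models the intervention described in the comment.
    """
    large_pants, exercises = [], []
    for _ in range(n):
        overweight = rng.random() < 0.4
        if randomize_pants:
            pants = rng.random() < 0.5
        else:
            pants = rng.random() < (0.9 if overweight else 0.1)
        # Exercise depends only on weight, never on pants size.
        exercise = rng.random() < (0.3 if overweight else 0.7)
        large_pants.append(pants)
        exercises.append(exercise)
    return large_pants, exercises


if __name__ == "__main__":
    rng = random.Random(0)
    w, e = sample_population(100_000, rng)
    print("observational corr(W, E):", round(correlation(w, e), 3))
    w, e = sample_population(100_000, rng, randomize_pants=True)
    print("interventional corr(W, E):", round(correlation(w, e), 3))
```

Under these assumptions the observational correlation between large pants and exercise is clearly negative, while randomly assigning pants drives it to roughly zero, which is the gap between correlation and causal effect that the comment is pointing at.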

Comment author: Houshalter 19 August 2015 12:33:55AM 0 points

He addressed that in the third footnote:

"Or there might be some hidden third factor, a gene which causes both fat and non-exercise. By Occam's Razor this is more complicated and its probability is penalized accordingly, but we can't actually rule it out. It is obviously impossible to do the converse experiment where half the subjects are randomly assigned lower weights, since there's no known intervention which can cause weight loss."

The model assumes that those are the only relevant variables. Given that assumption, we can prove that weight causes exercise, and that it can't be the other way around.

If there are unobserved variables, it's possible that they cause both weight and exercise. However, that wasn't one of the hypotheses anyone believed beforehand; they were arguing about whether weight causes exercise or exercise causes weight.

Second, even if there is an unobserved variable, the data still suggests that exercising more will not improve your weight. Internet use affects exercise, so if exercise affected weight at all, internet use would indirectly cause weight gain and therefore correlate with weight. In the data, it does not.
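To see why this argument has teeth, here is a rough simulation of the two candidate structures. It is a sketch, not the article's analysis: the probabilities and function names are invented for illustration. In the first model, internet use and weight each affect exercise independently; in the second, internet use affects exercise and exercise affects weight.

```python
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences of 0/1 or numeric values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / (var_x * var_y) ** 0.5


def simulate_weight_causes_exercise(n, rng):
    """Assumed structure: internet -> exercise <- overweight."""
    internet = [rng.random() < 0.5 for _ in range(n)]
    overweight = [rng.random() < 0.4 for _ in range(n)]
    # Both parents independently lower the chance of exercising.
    exercise = [rng.random() < 0.7 - 0.3 * i - 0.3 * o
                for i, o in zip(internet, overweight)]
    return internet, overweight, exercise


def simulate_exercise_causes_weight(n, rng):
    """Assumed structure: internet -> exercise -> overweight."""
    internet = [rng.random() < 0.5 for _ in range(n)]
    exercise = [rng.random() < 0.7 - 0.4 * i for i in internet]
    # Exercising lowers the chance of being overweight.
    overweight = [rng.random() < 0.6 - 0.4 * e for e in exercise]
    return internet, overweight, exercise


if __name__ == "__main__":
    rng = random.Random(0)
    i, o, _ = simulate_weight_causes_exercise(100_000, rng)
    print("weight -> exercise: corr(internet, overweight) =",
          round(correlation(i, o), 3))
    i, o, _ = simulate_exercise_causes_weight(100_000, rng)
    print("exercise -> weight: corr(internet, overweight) =",
          round(correlation(i, o), 3))
```

With these invented numbers, internet use and weight come out essentially uncorrelated in the first model and clearly correlated in the second, so checking that one correlation tells the two structures apart.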

The whole point of the article is this trick: bringing in a weird, seemingly unrelated variable like internet use lets us discover the direction of causation. According to common knowledge about statistics, that shouldn't be possible without randomized controlled experiments.