Daniel_Burfoot comments on Causal Diagrams and Causal Models - Less Wrong

61 Post author: Eliezer_Yudkowsky 12 October 2012 09:49PM

Comments (274)

Comment author: Daniel_Burfoot 12 October 2012 05:57:56PM *  25 points [-]

After reading this post I was stunned. Now I think the central conclusion is wrong, though I still think it is a great post, and I will go back to being stunned if you convince me the conclusion is correct.

You've shown how to identify the correct graph structure from the data. But you've erred in assuming that the directed edges of the graph imply causality.

Imagine you did the same analysis, except instead of using O="overweight" you use W="wears size 44 or higher pants". The data would look almost the same. So you would reach an analogous conclusion: that wearing large pants causes one not to exercise. This seems obviously false unless your notion of causality is very different from mine.

In general, I think the following principle holds: inferring causality requires an intervention; it cannot be discovered from observational data alone. A researcher who hypothesized that W causes not-E could round up a bunch of people, have half of them wear big pants, observe the effect of this intervention on exercise rates, and then conclude that there is no causal effect.

Comment author: IlyaShpitser 12 October 2012 08:45:39PM *  19 points [-]

You are correct -- conditional independence tests alone do not make directed edges causal. You need something called the faithfulness assumption, plus additional (causal) assumptions, which Eliezer glossed over. Without causal assumptions and with only faithfulness, all you are recovering is the structure of a statistical, rather than a causal, model. Without faithfulness, conditional independence tests do not imply anything. This is a subtle issue, actually.

There is no magic -- you do not get causality without causal assumptions.

Comment author: eurg 20 October 2012 10:55:55PM 1 point [-]

Is this another variation on the theme that one needs to assume the possibility of inductive reasoning to make an argument for it (or assume Occam's Razor to argue for it)? Also, the specific example he gave seems to me like an instance of "given very skewed data, the best guesses are still wrong" (a variation of that came up here at some point, regarding bets and opponents who have superior information). Or are you thinking of something more subtle?

Comment author: IlyaShpitser 31 October 2012 06:24:10PM *  2 points [-]

Even if you assume that we can do induction (and assume faithfulness!), conditional independence tests simply do not select among causal models. They select among statistical models, because conditional independences are properties of joint distributions (statistical, rather than causal objects). Linking those joint distributions with something causal relies on causal assumptions.
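A minimal sketch of that point, with made-up probabilities that are purely illustrative (the names `p_x`, `p_y_given_x` and the numbers are assumptions, not from the post): a model factored as X -> Y and a model factored as Y -> X can encode exactly the same joint distribution, so no test computed from the joint distribution can tell them apart.

```python
from fractions import Fraction as F

# Hypothetical model "X -> Y": specify P(X) and P(Y | X).
p_x = {0: F(3, 5), 1: F(2, 5)}
p_y_given_x = {0: {0: F(3, 4), 1: F(1, 4)},
               1: {0: F(1, 3), 1: F(2, 3)}}

# Joint distribution implied by the X -> Y factorization.
joint_forward = {(x, y): p_x[x] * p_y_given_x[x][y]
                 for x in (0, 1) for y in (0, 1)}

# Re-factor the same joint as "Y -> X": compute P(Y) and P(X | Y).
p_y = {y: sum(joint_forward[(x, y)] for x in (0, 1)) for y in (0, 1)}
p_x_given_y = {y: {x: joint_forward[(x, y)] / p_y[y] for x in (0, 1)}
               for y in (0, 1)}

# Joint distribution implied by the Y -> X factorization.
joint_backward = {(x, y): p_y[y] * p_x_given_y[y][x]
                  for x in (0, 1) for y in (0, 1)}

# Both factorizations describe the identical joint distribution.
print(joint_forward == joint_backward)
```

Exact fractions are used so the comparison is not blurred by floating-point rounding. The two graphs differ only in what they claim would happen under an intervention, which is information the joint distribution does not contain.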

I think the biggest lesson to learn from Pearl's book is to keep statistical and causal notions separate.

Comment author: eurg 05 November 2012 02:21:20PM 0 points [-]

Thanks for clarifying!

Comment author: jimrandomh 12 October 2012 06:14:52PM 1 point [-]

In this case, the true structure would be O->E, O->W, I->E. If O is unobserved, then you confuse a fork for an arrow, but I'm not sure you can actually get an arrow pointing the wrong way just by omitting variables.
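A rough simulation of that structure, as a sketch under assumed numbers (every probability below is invented for illustration): O drives both E and W, I drives E, and O is never recorded. The observed variables I and W then show the same signature the post relies on, namely independence overall and dependence once you condition on E.

```python
import random

def phi(pairs):
    """Pearson (phi) correlation between two 0/1 variables."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    return cov / (mean_x * (1 - mean_x) * mean_y * (1 - mean_y)) ** 0.5

def sample_population(n=100_000, seed=0):
    """Draw (internet, exercise, big_pants) rows; overweight stays hidden."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        overweight = rng.random() < 0.4   # O, never recorded
        internet = rng.random() < 0.5     # I, independent of O
        # E depends on both O and I (made-up probabilities).
        exercise = rng.random() < 0.8 - 0.3 * overweight - 0.3 * internet
        # W depends only on O: pants size tracks weight with 10% noise.
        big_pants = rng.random() < (0.9 if overweight else 0.1)
        rows.append((int(internet), int(exercise), int(big_pants)))
    return rows

rows = sample_population()
marginal = phi([(i, w) for i, _, w in rows])
among_exercisers = phi([(i, w) for i, e, w in rows if e == 1])
print(f"corr(I, W) overall:          {marginal:+.3f}")
print(f"corr(I, W) among exercisers: {among_exercisers:+.3f}")
```

The overall correlation comes out near zero, while the correlation among exercisers is clearly negative. A structure-learning procedure that sees only I, E and W would read that pattern as W -> E <- I, even though in the generating process W has no effect on E at all.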

Comment author: Houshalter 19 August 2015 12:33:55AM 0 points [-]

He addressed that in the third footnote.

"Or there might be some hidden third factor, a gene which causes both fat and non-exercise. By Occam's Razor this is more complicated and its probability is penalized accordingly, but we can't actually rule it out. It is obviously impossible to do the converse experiment where half the subjects are randomly assigned lower weights, since there's no known intervention which can cause weight loss."

The model assumes that those are the only relevant variables. Given that assumption, we can prove that weight affects exercise, and that it can't be the other way around.

If there are unobserved variables, it's possible that one of them affects both weight and exercise. However, that wasn't one of the hypotheses anyone believed beforehand; they were arguing about whether weight causes exercise or exercise causes weight.

Second, even if there is an unobserved variable, the data still suggest that exercising more will not improve your weight. Internet use affects exercise, so if exercise affected weight at all, internet use would indirectly cause weight gain and would therefore correlate with weight. It doesn't.
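That argument can be checked on toy data. The sketch below compares two hypothetical generating processes, with all probabilities invented for illustration: one where exercise affects weight (I -> E -> O) and one where weight and internet use both affect exercise (O -> E <- I). Only the first produces a correlation between internet use and weight.

```python
import random

def phi(pairs):
    """Pearson (phi) correlation between two 0/1 variables."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / n
    return cov / (mean_x * (1 - mean_x) * mean_y * (1 - mean_y)) ** 0.5

def internet_weight_corr(exercise_affects_weight, n=100_000, seed=0):
    """Simulate one of the two structures and return corr(I, O)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        internet = rng.random() < 0.5
        if exercise_affects_weight:
            # Chain I -> E -> O: internet lowers exercise, exercise lowers weight.
            exercise = rng.random() < (0.3 if internet else 0.7)
            overweight = rng.random() < (0.2 if exercise else 0.6)
        else:
            # Collider O -> E <- I: weight is drawn without reference to I or E.
            overweight = rng.random() < 0.4
            exercise = rng.random() < 0.8 - 0.3 * internet - 0.3 * overweight
        pairs.append((int(internet), int(overweight)))
    return phi(pairs)

chain_corr = internet_weight_corr(exercise_affects_weight=True)
collider_corr = internet_weight_corr(exercise_affects_weight=False)
print(f"corr(I, O) if exercise affects weight: {chain_corr:+.3f}")
print(f"corr(I, O) if weight affects exercise: {collider_corr:+.3f}")
```

In the chain, the correlation is clearly positive; in the collider, it is near zero. This only shows that the two structures leave different fingerprints in the data. It does not by itself rule out hidden variables, nor does it supply the causal assumptions discussed elsewhere in this thread.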

The whole point of the article is this trick: bringing in a seemingly unrelated variable like internet use lets us discover the direction of causation. According to common knowledge about statistics, that shouldn't be possible without randomized controlled experiments.