Eliezer_Yudkowsky comments on Philosophy Needs to Trust Your Rationality Even Though It Shouldn't - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
By (a) I mean that you can sometimes recover the true graph exactly, even without observing confounders. Actually, this was sort of known already (see the FCI algorithm, or even the IC* algorithm in Pearl's book), but we can do a lot better than that. For example, suppose the true graph is:
a -> b -> c -> d, with a <- u1 -> c and a <- u2 -> d, where u1 and u2 are unobserved and very complicated. Then we can figure out the true graph exactly by independence-type techniques, without having to observe u1 and u2. Note: the marginal distribution p(a,b,c,d) that came from this graph has no conditional independences at all (checkable by d-separation on a,b,c,d), so typical techniques fail.
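To make the "no conditional independences" claim concrete, here is a sketch of a d-separation checker using the moral-ancestral-graph method (the code and variable names are mine, not from the comment). It confirms that in the graph above, no pair among a, b, c, d is d-separated by any subset of the other observed variables:

```python
from itertools import combinations

def d_separated(edges, xs, ys, zs):
    """True iff xs and ys are d-separated given zs in the DAG given by `edges`."""
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
        parents.setdefault(u, set())
    # 1. Keep only ancestors of xs, ys, zs.
    anc, frontier = set(), set(xs) | set(ys) | set(zs)
    while frontier:
        n = frontier.pop()
        if n not in anc:
            anc.add(n)
            frontier |= parents.get(n, set())
    # 2. Moralize: link co-parents of each node, then drop edge directions.
    und = {n: set() for n in anc}
    for v in anc:
        ps = parents.get(v, set()) & anc
        for p in ps:
            und[p].add(v)
            und[v].add(p)
        for p in ps:
            for q in ps:
                if p != q:
                    und[p].add(q)
    # 3. Delete zs; xs and ys are d-separated iff they are now disconnected.
    reach, frontier = set(), set(xs) - set(zs)
    while frontier:
        n = frontier.pop()
        if n not in reach:
            reach.add(n)
            frontier |= {m for m in und[n] if m not in zs}
    return not (reach & set(ys))

# a -> b -> c -> d, with latent confounders a <- u1 -> c and a <- u2 -> d.
G = [('a', 'b'), ('b', 'c'), ('c', 'd'),
     ('u1', 'a'), ('u1', 'c'), ('u2', 'a'), ('u2', 'd')]

# Every pair of observed variables is d-connected given every subset of the rest.
obs = ['a', 'b', 'c', 'd']
seps = [(x, y, zs)
        for x, y in combinations(obs, 2)
        for r in range(3)
        for zs in combinations([v for v in obs if v not in (x, y)], r)
        if d_separated(G, {x}, {y}, set(zs))]
print(seps)  # [] -- no conditional independences among a, b, c, d
```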
(b) is, I guess, "a subtle issue" -- but my point is about careful language use, and about keeping causal and statistical issues clear and separate.
A "Bayesian network" (or "belief network" -- I don't like the word "Bayesian" here because it confuses the issue: you can use frequentist techniques with belief networks if you want, and in fact a lot of folks do) is a joint distribution that factorizes according to a DAG. That's it. Nothing about causality. If there is a joint density representing a causal process where a is a direct cause of b, which is a direct cause of c, then this joint density will factorize with respect to both
a -> b -> c
and
a <- b <- c
but only the former graph is causal; the latter is not. Both graphs form a "Bayesian network" with the joint density (since the density factorizes with respect to both graphs), but only one graph is a causal graph. If you want to talk about causal models, then in addition to saying that there is a Markov factorization, you also need to say something else -- something that makes parents into direct causes. Usually people say something like:
for every x, p(x | pa(x)) = p(x | do(pa(x))), or mention the g-formula, or the truncated factorization of do(.), or "the causal Markov condition."
But this is something that (a) you need to say explicitly, (b) involves language beyond standard probability theory, because there is a do(.), and (c) is controversial to some people. What is do(.)? It refers to a hypothetical experiment/intervention.
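The symmetry above can be checked numerically (the specific numbers here are made up for the sketch): build a binary joint as p(a) p(b|a) p(c|b), then recover the reversed factorization p(c) p(b|c) p(a|b) from the same joint. The two factorizations reproduce the density exactly, so the joint alone cannot tell the two graphs apart:

```python
import numpy as np

# The "causal" factorization p(a) p(b|a) p(c|b) for binary a, b, c.
pa   = np.array([0.3, 0.7])
pb_a = np.array([[0.8, 0.2],
                 [0.4, 0.6]])      # row: a, column: b
pc_b = np.array([[0.9, 0.1],
                 [0.25, 0.75]])    # row: b, column: c
p = np.einsum('a,ab,bc->abc', pa, pb_a, pc_b)   # joint p(a,b,c)

# The "anti-causal" factorization p(c) p(b|c) p(a|b), from the same joint.
pc   = p.sum(axis=(0, 1))                    # p(c)
pb_c = p.sum(axis=0) / pc                    # p(b|c): p(b,c) / p(c)
pa_b = p.sum(axis=2) / p.sum(axis=(0, 2))    # p(a|b): p(a,b) / p(b)
q = np.einsum('c,bc,ab->abc', pc, pb_c, pa_b)

print(np.allclose(p, q))  # True: the density factorizes with respect to both graphs
```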
If all you are learning is a graph that gives you a Markov factorization, you have no business making claims about interventions -- interventions are a separate magisterium. You can assume that the unknown graph from which the data came is causal -- but you need to say this explicitly, this assumption will be controversial to some people, and by making that assumption you are, I think, committing yourself to the use of interventionist/potential outcome language (just to describe what it means for a data generating graph to be causal).
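Here is a small simulation sketch of that separation (the structural equations and parameters are mine, chosen only for illustration): with a latent confounder u of b and c, conditioning on b and intervening on b give different answers, i.e. p(c | b) differs from p(c | do(b)):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000

def draw_c(u, b):
    # Mechanism for c: depends on both the latent u and on b.
    return rng.random(len(u)) < 0.2 + 0.5 * u + 0.2 * b

# Observational world: u -> b, u -> c, b -> c.
u = (rng.random(n) < 0.5).astype(float)            # latent confounder
b = (rng.random(n) < 0.1 + 0.8 * u).astype(float)  # b listens to u
c = draw_c(u, b)
p_c_given_b1 = c[b == 1].mean()                    # p(c=1 | b=1)

# Interventional world do(b=1): same mechanisms for u and c, but the
# u -> b edge is cut and b is forced to 1.
u2 = (rng.random(n) < 0.5).astype(float)
b2 = np.ones(n)
c2 = draw_c(u2, b2)
p_c_do_b1 = c2.mean()                              # p(c=1 | do(b=1))

print(round(p_c_given_b1, 2), round(p_c_do_b1, 2))  # ~0.85 vs ~0.65
```

Conditioning on b=1 also tells you u is probably 1 (which raises c on its own), while do(b=1) leaves the distribution of u alone -- hence the gap.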
I have no problems with you doing Bayesian updating and getting posteriors over causal models -- I just wanted to get more precision on what a causal model is. A causal model is not a density factorizing with respect to a DAG -- that's a statistical model. A causal model makes assertions that relate hypothetical experiments, like p(x | do(pa(x))), with observed data, like p(x | pa(x)). So your Bayesian updating is operating in a world that contains more than just probability theory (which is a theory of standard joint densities, without any mention of do(.) or hypothetical experiments). You can in fact augment probability theory with a logical description of interventions; see, for example, this paper:
http://www.jair.org/papers/paper648.html
If your notion of causal model does not relate do(.) to observed data, then I don't know what you mean by a causal model. It's certainly not what I mean by it.
Irrelevant question: Isn't (b || d) | a, c?
No, because b -> c <-> a <-> d is an open path if you condition on c and a.
Ah, right.