
Causal graphs and counterfactuals

3 Post author: Stuart_Armstrong 30 August 2016 04:12PM

Problem solved: I found what I was looking for in "An Axiomatic Characterization of Causal Counterfactuals", thanks to Evan Lloyd.

Basically, the approach makes every endogenous variable a deterministic function of the exogenous variables and of the other endogenous variables, and pushes all the stochasticity into the exogenous variables.

 

Old post:

A problem that's come up with my definitions of stratification.

Consider a very simple causal graph, with a single arrow from A to B:

In this setting, A and B are both booleans, and A=B with 75% probability (independently of whether A=0 or A=1).

I now want to compute the counterfactual: suppose I assume that B=0 when A=0. What would happen if A=1 instead?

The problem is that P(B|A) seems insufficient to solve this. Let's imagine the process that outputs B as a probabilistic mix of functions, that takes the value of A and outputs that of B. There are four natural functions here:

  • f0(x) = 0
  • f1(x) = 1
  • f2(x) = x
  • f3(x) = 1-x

Then one way of modelling the causal graph is as a mix 0.75f2 + 0.25f3. In that case, knowing that B=0 when A=0 implies that P(f2)=1, so if A=1, we know that B=1.

But we could instead model the causal graph as 0.5f2 + 0.25f1 + 0.25f0. In that case, knowing that B=0 when A=0 implies that P(f2)=2/3 and P(f0)=1/3. So if A=1, B=1 with probability 2/3 and B=0 with probability 1/3.

And we can design the node B, physically, to be one or the other of these two distributions over functions, or anything in between (the general formula is (0.5+x)f2 + x f3 + (0.25-x)f1 + (0.25-x)f0 for 0 ≤ x ≤ 0.25). But it seems that the causal graph does not capture that.
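To make the arithmetic concrete, here is a minimal Python sketch of the family above. The names family, p_b_equals_a and counterfactual_b1 are my own labels for illustration, not anything from Pearl's papers. It checks that every member of the family gives the same P(B=A|A) = 3/4, while the counterfactual answer moves with x.

```python
from fractions import Fraction

# The four deterministic functions from A to B listed above.
FUNCS = {
    "f0": lambda a: 0,
    "f1": lambda a: 1,
    "f2": lambda a: a,
    "f3": lambda a: 1 - a,
}

def family(x):
    """Mixture weights from the general formula, for 0 <= x <= 1/4."""
    x = Fraction(x)
    q = Fraction(1, 4)
    return {"f2": 2 * q + x, "f3": x, "f1": q - x, "f0": q - x}

def p_b_equals_a(mix, a):
    """Observational quantity: P(B = A | A = a) under the mixture."""
    return sum(w for name, w in mix.items() if FUNCS[name](a) == a)

def counterfactual_b1(mix, a_obs=0, b_obs=0, a_new=1):
    """P(B = 1 had A been a_new), given that A = a_obs and B = b_obs were observed."""
    # Keep only the functions consistent with the observation, then renormalise.
    posterior = {n: w for n, w in mix.items() if FUNCS[n](a_obs) == b_obs}
    total = sum(posterior.values())
    return sum(w for n, w in posterior.items() if FUNCS[n](a_new) == 1) / total

for x in (Fraction(0), Fraction(1, 8), Fraction(1, 4)):
    mix = family(x)
    print(x, p_b_equals_a(mix, 0), p_b_equals_a(mix, 1), counterfactual_b1(mix))
```

In this sketch the counterfactual probability works out to (0.5+x)/0.75, so it runs from 2/3 at x=0 up to 1 at x=0.25, even though the observational column never changes.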

Owain Evans has said that Pearl has papers covering these kinds of situations, but I haven't been able to find them. Does anyone know any publications on the subject?

Comments (2)

Comment author: Manfred 30 August 2016 11:38:26PM 0 points

This sounds like a question of how you're choosing to define a causal node. Is it something that's a fixed function of its parents? In which case your hypotheses about the function from A to B are hypotheses over different causal graphs. Or should the function from parents to node be a parameter that you represent inside a causal graph? In which case you need some representation of this distribution.

Either way, I agree that you need more than what you started with to capture the counterfactuals you're thinking of here.

Comment author: Manfred 31 August 2016 03:54:52AM *  1 point

I still basically agree with my retracted comment; I'd just like to note that, taken at face value, your two equations for B given A really are the same.

The counterfactual difference comes from an implied random variable that decides which branch of the equation we're "using" (in the implied causal process that goes from A to B), and which can remember this information during counterfactual reasoning. But of course it is a simple thing to make this implied random variable an explicit node in your causal graph. This is probably the best resolution.
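As a rough illustration of making that variable explicit, here is a small Python simulation. It is only a sketch under assumptions of mine: the node name U, the helper estimate_counterfactual, and a 50/50 prior on A are all made up for the example. U picks which of f0..f3 is in force, B is then a deterministic function of A and U, and the counterfactual is estimated by keeping the sampled worlds that match the evidence (A=0, B=0) and re-running them with A=1 and the same U.

```python
import random

def f(u, a):
    """B as a deterministic function of its parent A and the exogenous node U."""
    return {"f0": 0, "f1": 1, "f2": a, "f3": 1 - a}[u]

def estimate_counterfactual(weights, n=200_000, seed=0):
    """Estimate P(B = 1 had A been 1 | observed A = 0, B = 0) by simulation."""
    rng = random.Random(seed)
    names = list(weights)
    probs = [weights[k] for k in names]
    kept = hits = 0
    for _ in range(n):
        u = rng.choices(names, probs)[0]  # exogenous: which function is in force
        a = rng.randint(0, 1)             # assumed prior P(A = 1) = 0.5
        if a == 0 and f(u, a) == 0:       # keep worlds matching the evidence
            kept += 1
            hits += f(u, 1)               # same U, but with A set to 1
    return hits / kept

print(estimate_counterfactual({"f2": 0.5, "f1": 0.25, "f0": 0.25}))  # close to 2/3
print(estimate_counterfactual({"f2": 0.75, "f3": 0.25}))             # exactly 1.0
```

The two calls use the two mixes from the post; they agree on P(B|A) but, once U is a node that persists across the counterfactual, they give different answers.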