Causal graphs and counterfactuals

Stuart_Armstrong

Problem solved: I found what I was looking for in "An Axiomatic Characterization of Causal Counterfactuals", thanks to Evan Lloyd.

Basically, the approach is to make every endogenous variable a deterministic function of the exogenous variables and of the other endogenous variables, pushing all the stochasticity into the exogenous variables.
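A minimal Python sketch of that construction, applied to the two-node example from the old post below. The names `FUNCS`, `U_DIST`, `b_of` and `p_b_given_a` are illustrative assumptions of mine, not notation from the paper, and the distribution over U is just one possible choice. The exogenous variable U selects which function is in force, and B is then a deterministic function of A and U.

```python
from fractions import Fraction

# The four deterministic functions from A to B (booleans written as 0/1).
FUNCS = {
    "f0": lambda a: 0,
    "f1": lambda a: 1,
    "f2": lambda a: a,
    "f3": lambda a: 1 - a,
}

# Exogenous variable U: all the randomness lives here.
# These weights are an assumed example, not the only valid choice.
U_DIST = {"f2": Fraction(3, 4), "f3": Fraction(1, 4)}


def b_of(a, u):
    """B is a deterministic function of its parent A and the exogenous U."""
    return FUNCS[u](a)


def p_b_given_a(b, a, u_dist=U_DIST):
    """Observational P(B=b | A=a), obtained by marginalising over U."""
    return sum(p for u, p in u_dist.items() if b_of(a, u) == b)


if __name__ == "__main__":
    print(p_b_given_a(0, 0), p_b_given_a(1, 1))
```

Once U is fixed, nothing about B is random, which is what lets a counterfactual query reuse the same U under a different value of A.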

Old post:

A problem that's come up with my definitions of stratification.

Consider a very simple causal graph, in which A is the only parent of B:

In this setting, A and B are both booleans, and A=B with 75% probability (independently of whether A=0 or A=1).

I now want to compute the counterfactual: suppose I assume that B=0 when A=0. What would happen if A=1 instead?

The problem is that P(B|A) seems insufficient to solve this. Let's imagine the process that outputs B as a probabilistic mix of functions, each of which takes the value of A and outputs that of B. There are four natural functions here:

f₀(x) = 0
f₁(x) = 1
f₂(x) = x
f₃(x) = 1-x

Then one way of modelling the causal graph is as a mix 0.75f₂ + 0.25f₃. In that case, knowing that B=0 when A=0 implies that P(f₂)=1, so if A=1, we know that B=1.

But we could instead model the causal graph as 0.5f₂ + 0.25f₁ + 0.25f₀. In that case, knowing that B=0 when A=0 rules out f₁ and implies that P(f₂)=2/3 and P(f₀)=1/3. So if A=1, B=1 with probability 2/3 and B=0 with probability 1/3.
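The two calculations above can be checked with a short Python sketch. The helper `counterfactual` and the names `mix_1` and `mix_2` are hypothetical, introduced only for illustration: the function conditions the mix on the observed (A, B) pair, then evaluates the surviving functions at the new value of A.

```python
from fractions import Fraction

# The four deterministic functions from A to B (booleans written as 0/1).
FUNCS = {
    "f0": lambda a: 0,
    "f1": lambda a: 1,
    "f2": lambda a: a,
    "f3": lambda a: 1 - a,
}


def counterfactual(mix, a_obs, b_obs, a_new):
    """Distribution of B under A=a_new, given that B=b_obs was seen at A=a_obs."""
    # Keep only the functions consistent with the observation.
    consistent = {n: p for n, p in mix.items() if FUNCS[n](a_obs) == b_obs}
    total = sum(consistent.values())
    if total == 0:
        raise ValueError("observation has probability zero under this mix")
    # Renormalise, then push each surviving function through the new input.
    result = {0: Fraction(0), 1: Fraction(0)}
    for name, p in consistent.items():
        result[FUNCS[name](a_new)] += p / total
    return result


mix_1 = {"f2": Fraction(3, 4), "f3": Fraction(1, 4)}
mix_2 = {"f2": Fraction(1, 2), "f1": Fraction(1, 4), "f0": Fraction(1, 4)}

if __name__ == "__main__":
    print(counterfactual(mix_1, 0, 0, 1))
    print(counterfactual(mix_2, 0, 0, 1))
```

Exact fractions are used so that the 2/3 and 1/3 come out without floating-point noise.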

And we can design the node B, physically, to be one or another of the two distributions over functions or anything in between (the general formula is (0.5+x)f₂ + xf₃ + (0.25-x)f₁ + (0.25-x)f₀ for 0 ≤ x ≤ 0.25). But it seems that the causal graph does not capture that.
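As a sanity check on the general formula, here is a small Python sketch (the names `mix`, `p_b_equals_a` and `p_cf_b1` are my own illustrative choices). For several values of x it computes P(B=A | A=a), which the causal graph does specify, and the counterfactual probability that B would be 1 under A=1 given that B=0 was seen at A=0, which it does not.

```python
from fractions import Fraction

# The four deterministic functions from A to B (booleans written as 0/1).
FUNCS = {
    "f0": lambda a: 0,
    "f1": lambda a: 1,
    "f2": lambda a: a,
    "f3": lambda a: 1 - a,
}


def mix(x):
    """The family (0.5+x)f2 + x f3 + (0.25-x)f1 + (0.25-x)f0, for 0 <= x <= 0.25."""
    x = Fraction(x)
    if not 0 <= x <= Fraction(1, 4):
        raise ValueError("x must lie in [0, 1/4]")
    quarter = Fraction(1, 4)
    return {"f2": Fraction(1, 2) + x, "f3": x, "f1": quarter - x, "f0": quarter - x}


def p_b_equals_a(m, a):
    """Observational P(B=A | A=a) under the mix m."""
    return sum(p for n, p in m.items() if FUNCS[n](a) == a)


def p_cf_b1(m):
    """P(B would be 1 under A=1), given that B=0 was observed when A=0."""
    consistent = {n: p for n, p in m.items() if FUNCS[n](0) == 0}
    hits = sum(p for n, p in consistent.items() if FUNCS[n](1) == 1)
    return hits / sum(consistent.values())


if __name__ == "__main__":
    for x in (Fraction(0), Fraction(1, 8), Fraction(1, 4)):
        m = mix(x)
        print(x, p_b_equals_a(m, 0), p_b_equals_a(m, 1), p_cf_b1(m))
```

The observational columns stay at 3/4 for every x, while the counterfactual column equals (0.5+x)/0.75, so it moves from 2/3 to 1 as x goes from 0 to 0.25.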

Owain Evans has said that Pearl has papers covering these kinds of situations, but I haven't been able to find them. Does anyone know any publications on the subject?

I still basically agree with my retracted comment; I'd just like to note that, taken at face value, your two equations for B given A really are the same.

The counterfactual difference comes from an implied random variable that decides which branch of the equation we're "using" (in the implied causal process that goes from A to B), and which can remember this information during counterfactual reasoning. But of course it is a simple thing to make this implied random variable an explicit node in your causal graph. This is probably the best resolution.

This sounds like a question of how you're choosing to define a causal node. Is it something that's a fixed function of its parents? In which case your hypotheses about the function from A to B are hypotheses over different causal graphs. Or should the function from parents to node be a parameter that you represent inside a causal graph? In which case you need some representation of this distribution.

Either way, I agree that you need more than what you started with to capture the counterfactuals you're thinking of here.
