paper-machine comments on Can Counterfactuals Be True? - Less Wrong
Comments (46)
No, we disagree. My calculations suggest that P[X = 0 | do(Yi = 1 for all i)] = P[X = 1 | do(Yi = 1 for all i)] = 0. The intervention falls outside the region where the original joint pdf has positive mass. The intervention do(X = 1) also annihilates the original joint pdf, because there is no region of positive mass in which X = 1.
I still don't understand why you don't think the problem is a language problem. Pearl's counterfactuals have a specific meaning, so of course they don't mean anything other than what they mean, even if the other meaning is a more plausible interpretation of the counterfactual (again, whatever that means -- I'm still not sure what "more plausible" is supposed to mean theoretically).
I think the problem is that when you intervene to make something impossible happen, the resulting system no longer makes sense.
Yes. (I assume you mean "If Gore was president during 9/11, he wouldn't have invaded Iraq.")
Why should I disagree with Pearl's treatment of counterfactuals that don't backtrack?
Isn't the decision of whether or not a given counterfactual backtracks in its most "natural" interpretation largely a linguistic problem?
I don't think that's correct. My understanding of the intervention do(Yi = 1 for all i) is that it creates a disconnected graph, in which the Yi all have the value 1 (as stipulated by the intervention) but X retains its original mass function P[X = 0] = 1. The causal links from X to the Yi are severed by the intervention, so it doesn't matter that the intervention has zero probability in the original graph, since the intervention creates a new graph. (Interventions into deterministic systems will often have zero probability in the original system, though not in the intervened one.) On the other hand, you claim to be following Pearl2012 whereas I've been reading Pearl2001, and there might have been some differences in his treatment of impossible interventions... I'll check this out.
For now, just suppose the original distribution over X was P[X = 0] = 1 - epsilon and P[X = 1] = epsilon for a very small epsilon. Would you agree that the intervention do(Yi = 1 for all i) now falls within the region where the original mass function is positive, but still doesn't change the distribution over X, so we still have P[X = 0 | do(Yi = 1 for all i)] = 1 - epsilon and P[X = 1 | do(Yi = 1 for all i)] = epsilon?
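The epsilon case can be checked numerically. The following is only a minimal sketch, assuming the graph is X -> Y1, ..., Yn with binary variables; the function names `intervened_joint` and `marginal_x` and the choice of three Yi are hypothetical, picked for illustration. Because every Yi is intervened on, the mechanism that originally produced the Yi from X never enters the calculation.

```python
from itertools import product


def intervened_joint(p_x, n_children, do_y):
    """Joint over (X, Y_1..Y_n) after do(Y_i = do_y[i]) in the graph X -> Y_i.

    Hypothetical helper: the factors P(y_i | x) are dropped for the
    intervened Y_i, so only the factor P(x) remains.
    """
    joint = {}
    for x in p_x:
        for ys in product((0, 1), repeat=n_children):
            # Zero mass unless every Y_i takes the value the intervention assigns.
            matches_intervention = all(y == v for y, v in zip(ys, do_y))
            joint[(x, ys)] = p_x[x] if matches_intervention else 0.0
    return joint


def marginal_x(joint):
    """Sum the intervened joint over the Y_i to get the distribution of X."""
    out = {}
    for (x, _), p in joint.items():
        out[x] = out.get(x, 0.0) + p
    return out


epsilon = 1e-6  # assumed small value, not from the thread

# Near-deterministic case: P[X = 0] = 1 - epsilon, P[X = 1] = epsilon.
p_x_eps = {0: 1 - epsilon, 1: epsilon}
post_eps = marginal_x(intervened_joint(p_x_eps, 3, (1, 1, 1)))

# Fully deterministic case: P[X = 0] = 1.
p_x_det = {0: 1.0, 1: 0.0}
post_det = marginal_x(intervened_joint(p_x_det, 3, (1, 1, 1)))

print(post_eps)
print(post_det)
```

Under these assumptions the marginal over X comes out identical to the original `p_x` in both cases, which is the "severed links" reading of the intervention.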
I still think it's a conceptual analysis problem rather than a linguistic problem. However, perhaps we should play the taboo game on "linguistic" and "conceptual", since it seems we mean different things by them (and possibly what you mean by "linguistic" is close to what I mean by "conceptual", at least where we are talking about concepts expressed in English).
Thanks anyway.
You seem to be done, so I won't belabor things further; I just want to point out that I didn't claim to have a more updated copy of Pearl (in fact, I said the opposite). I doubt there's been any change to his algorithm.
All this ASCII math is confusing the heck out of me, anyway.
EDIT: Oh, dear. I see how horribly wrong I was now. The version of the formula I was looking at said "(formula) for (un-intervened variables) consistent with (intervention), and zero otherwise" and because it was a deterministic system my mind conflated the two kinds of consistency. I'm really sorry to have blown a lot of your free time on my own incompetence.
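For reference, the formula described here appears to be Pearl's truncated factorization. A sketch follows, in notation that is an assumption rather than the thread's own: V_1, ..., V_n are all the variables in the graph, pa_i are the values of the parents of V_i, and Z is the set of intervened variables, set to the values z.

```latex
P(v_1, \dots, v_n \mid do(Z = z)) =
\begin{cases}
  \prod_{i \,:\, V_i \notin Z} P(v_i \mid pa_i)
    & \text{if } (v_1, \dots, v_n) \text{ is consistent with } Z = z, \\
  0 & \text{otherwise.}
\end{cases}
```

"Consistent" here only requires that the intervened variables take their assigned values. It does not require the un-intervened variables to agree with what the original deterministic mechanisms would have produced. With X -> Yi and do(Yi = 1 for all i), the only factor left is P(x), so X keeps its original distribution.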
Thanks for that... You just saved me a few hours of additional research on Pearl to find out whether I'd got it wrong (and misapplied the calculus for interventions that are impossible in the original system)!
Incidentally, I'm quite a fan of Pearl's work, and think there should be ways to adjust the calculus to allow reasonable backtracking counterfactuals as well as forward-tracking ones (i.e. ways to find a minimal intervention further back in the graph, one which then makes the antecedent come out true). But that's probably worth a separate post, and I'm not ready for it yet.