Frederik Hytting Jørgensen

Comments

Am I right that the line of argument here is not about the generalization properties, but a claim about the quality of explanation, even on the restricted distribution?

Yes, I think that is a good way to put it. But faithful mechanistic explanations are closely related to generalization.  

Like here, your causal model  should have the explicit condition "x_1=x_2".

That would be a sufficient condition for  to make the correct predictions. But that does not mean that  provides a good mechanistic explanation of  on those inputs. 

I'm a bit unsure about the way you formalize things, but I think I agree with your point, and it is a helpful one. I'll try to state a similar (perhaps the same) point.

Assume that all variables have the natural numbers as their domain. Assume WLOG that all models only have one input and one output node. Assume that  is an abstraction of  on relative to input support  and . Now there exists a model  such that  for all , but  is not a valid abstraction of relative to input support  . For example, you may define the structural assignment of the output node in  by

where  is an element in , which we assume to be non-empty.
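As a rough illustration of this kind of construction, here is a minimal Python sketch. The particular functions, the support `SUPPORT`, and the off-support input `C` are all hypothetical choices made for the example. It only shows two models that agree on a restricted input support and differ outside it; it does not check the abstraction condition itself.

```python
# Hypothetical illustration: two input-output models over the natural numbers
# that agree on a restricted input support but differ outside of it.

SUPPORT = set(range(10))  # assumed input support (illustrative choice)
C = 10                    # an input outside the support (illustrative choice)


def model(x: int) -> int:
    """Assumed base model: the successor function."""
    return x + 1


def model_prime(x: int) -> int:
    """Agrees with `model` everywhere except at the off-support input C."""
    return 0 if x == C else x + 1


# The two models are indistinguishable on the support ...
agree_on_support = all(model(x) == model_prime(x) for x in SUPPORT)
# ... but behave differently at the off-support input.
differ_off_support = model(C) != model_prime(C)

print(agree_on_support, differ_off_support)
```

Any check that only looks at inputs in `SUPPORT` cannot tell the two models apart, which is why extra assumptions are needed before anything can be said about behaviour off the support.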

There is nothing surprising about this. As you say, we need assumptions to rule out cases like this, and working out what those assumptions should be seems potentially interesting. People working on mechanistic interpretability should think more about which assumptions would make their methods reasonable.

The main point of the post is not that causal abstractions do not provide guarantees about generalization (this point is underappreciated, but really, why would they?).  My main point is that causal abstractions can misrepresent the mechanistic nature of the underlying model (this is of course related to generalizability).    

Finally got around to looking at this. I didn't read the paper carefully, so I may have missed something, but I could not find anything that makes me more at ease with this conclusion. 

Ben has already shown that it is perfectly possible that Y causes X. If this is somehow less likely than X causing Y, that is exactly what needs to be made precise. If faithfulness is the assumption that makes this work, then we need to show that faithfulness is a reasonable assumption in this example. It seems that this work has not been done?

If we can find precise and reasonable assumptions that rule out Y causing X, that would be super interesting.

For example, Theorem 3.2 in Causation, Prediction, and Search says that faithfulness holds with probability 1 if we have a linear model with coefficients drawn randomly from distributions with positive densities.
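To make the flavour of such a result concrete, here is a minimal Python sketch. It assumes a hypothetical linear model X -> Y -> Z with an additional edge X -> Z, independent noise terms, and Var(X) = 1; the coefficient names `a`, `b`, `c` are made up for the example. In this model Cov(X, Z) = (a*b + c) * Var(X), so X and Z are uncorrelated despite being adjacent only when the two paths cancel exactly. This illustrates the idea and is not the theorem's proof.

```python
import random


def cov_x_z(a: float, b: float, c: float, var_x: float = 1.0) -> float:
    """Cov(X, Z) in the assumed linear model
        Y = a*X + noise_Y
        Z = b*Y + c*X + noise_Z
    with independent noise terms: Cov(X, Z) = (a*b + c) * Var(X).
    """
    return (a * b + c) * var_x


rng = random.Random(0)

# Coefficients drawn from a continuous distribution (standard normal here).
draws = [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(10_000)]
n_cancel = sum(1 for a, b, c in draws if cov_x_z(a, b, c) == 0.0)

# Hand-tuned coefficients where the direct and indirect paths cancel exactly.
tuned = cov_x_z(2.0, 3.0, -6.0)

print(n_cancel, tuned)
```

Random draws essentially never land on the cancellation surface a*b + c = 0, whereas coefficients chosen with a purpose can sit on it exactly. That is why the random-coefficient assumption matters for the argument.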

It is not clear to me why we should expect faithfulness to hold in a situation like this, where Z is constructed from other variables with a particular purpose in mind.

Consider the graph Y<-X->Z. If I set Y:=X and Z:=X, we have that X⊥Y|Z even though X and Y are adjacent in the graph, which violates faithfulness. How can you be sure that you don't violate faithfulness by constructing Z?
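The deterministic example can be checked exactly. The sketch below assumes X is a fair coin on {0, 1} (an illustrative choice, any non-degenerate distribution would do) and computes the relevant probabilities with exact fractions.

```python
from fractions import Fraction
from itertools import product

# Joint distribution over (x, y, z) for X ~ uniform{0, 1}, Y := X, Z := X.
joint = {(x, x, x): Fraction(1, 2) for x in (0, 1)}


def prob(pred) -> Fraction:
    """Probability of the event described by pred(x, y, z)."""
    return sum((p for (x, y, z), p in joint.items() if pred(x, y, z)), Fraction(0))


def x_indep_y_given_z() -> bool:
    """Check P(x, y | z) == P(x | z) * P(y | z) for all values with P(z) > 0."""
    for z in (0, 1):
        pz = prob(lambda x_, y_, z_: z_ == z)
        if pz == 0:
            continue
        for x, y in product((0, 1), repeat=2):
            pxy = prob(lambda x_, y_, z_: (x_, y_, z_) == (x, y, z)) / pz
            px = prob(lambda x_, y_, z_: x_ == x and z_ == z) / pz
            py = prob(lambda x_, y_, z_: y_ == y and z_ == z) / pz
            if pxy != px * py:
                return False
    return True


def x_indep_y() -> bool:
    """Check marginal independence P(x, y) == P(x) * P(y)."""
    for x, y in product((0, 1), repeat=2):
        pxy = prob(lambda x_, y_, z_: x_ == x and y_ == y)
        px = prob(lambda x_, y_, z_: x_ == x)
        py = prob(lambda x_, y_, z_: y_ == y)
        if pxy != px * py:
            return False
    return True


print(x_indep_y_given_z(), x_indep_y())
```

Conditioning on Z pins down X completely, so the conditional independence holds even though X and Y are marginally dependent and adjacent in the graph.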

I'm not quite convinced by this response. Would it be possible to formalize the claim that the "set of probability distributions in which Y causes X is a null set, i.e. it has measure zero"?

It is true that if the graph were (Y->X, X->Z, Y->Z), then we would violate faithfulness. There are results showing that, under some assumptions, faithfulness is violated only with probability 0. But those assumptions do not seem to hold in this example.