The standard account of causality depends on the idea of intervention: the question of what follows if X occurs not just naturally, but is brought about artificially, independently of its usual causes. This doesn't sit well with embedded agency. If the agent is part of the world, then its own actions are always caused by the past state of the world, and so it couldn't know whether the apparent effect of its interventions isn't just due to some common cause. There is a potential way out of this if we limit the complexity of causal dependencies.

Classically, X is dependent on Y iff the conditional distribution of X given Y differs from the unconditional distribution of X. In a slightly different formulation: there is a program that takes Y as input and outputs adjustments to our unconditional distribution of X, and those adjustments improve prediction.
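
In Python, a minimal sketch of this classical notion (binary variables, empirical frequencies standing in for the true distributions, and an arbitrary threshold in place of a proper significance test; all three are simplifications of mine):

```python
import numpy as np

def is_dependent(samples_xy, threshold=0.01):
    """Classical dependence: does conditioning on Y shift the distribution of X?

    samples_xy: (x, y) pairs drawn from the joint distribution. Both
    variables are taken as binary here, purely to keep the sketch short.
    """
    xs = np.array([x for x, _ in samples_xy], dtype=float)
    ys = np.array([y for _, y in samples_xy])
    p_x = xs.mean()                # unconditional P(X = 1)
    p_x_y1 = xs[ys == 1].mean()    # conditional P(X = 1 | Y = 1)
    p_x_y0 = xs[ys == 0].mean()    # conditional P(X = 1 | Y = 0)
    # X depends on Y iff conditioning on Y moves the distribution of X.
    return max(abs(p_x_y1 - p_x), abs(p_x_y0 - p_x)) > threshold
```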

Now we can limit which programs we consider admissible, by bounding the computational complexity of the program with respect to the precision of Y. For example, I will say that X is polynomially dependent on Y iff there is a program running in time polynomial in the precision of Y that fulfills these conditions. (Note that dependence in this new sense needn't be symmetric anymore.)
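
The second, program-based formulation can be sketched like this (the log-loss scoring rule and all names are my choices; the polynomial-time bound is a condition on the program, stated in the docstring but not enforced by the code):

```python
import math
from typing import Callable, List, Tuple

def truncate(y: float, n_bits: int) -> float:
    """Read Y at finite precision: keep n_bits of its binary expansion."""
    scale = 2 ** n_bits
    return int(y * scale) / scale

def improves_prediction(program: Callable[[float], float],
                        samples: List[Tuple[int, float]],
                        n_bits: int, p_x: float) -> bool:
    """X is dependent on Y, relative to a class of programs, if some
    admissible program, seeing Y only at n_bits of precision, predicts
    X better than the unconditional probability p_x does. For polynomial
    dependence, `program` must additionally run in time polynomial in
    n_bits; that bound is part of the definition, not checked here."""
    def log_loss(p: float, x: int) -> float:
        p = min(max(p, 1e-9), 1 - 1e-9)  # clamp to avoid log(0)
        return -math.log(p if x else 1 - p)
    base = sum(log_loss(p_x, x) for x, _ in samples)
    cond = sum(log_loss(program(truncate(y, n_bits)), x) for x, y in samples)
    return cond < base
```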

Unlike with unlimited dependence, there is nothing in principle impossible about the agent's actions being polynomially independent of an entire past world-state. This can form a weakened sense of intervention, and the limited-causal consequences of such interventions can be determined from actually observed frequencies.
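
A sketch of that last step, under the stated assumption that the logged actions really were chosen independently, in the limited sense, of the prior world state:

```python
from collections import Counter

def limited_causal_consequences(log):
    """Estimate the consequences of each action from raw frequencies.
    This is only legitimate under the assumption above: the logged
    actions were (polynomially) independent of the past world state,
    so conditioning on an action behaves like intervening on it.
    `log` is a list of (action, outcome) pairs."""
    pair_counts = Counter(log)
    action_counts = Counter(action for action, _ in log)
    return {(a, o): n / action_counts[a] for (a, o), n in pair_counts.items()}

# e.g. limited_causal_consequences([("press", "light_on"),
#                                   ("press", "light_on"),
#                                   ("wait", "light_off")])
# -> {("press", "light_on"): 1.0, ("wait", "light_off"): 1.0}
```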

Now, if we look at a system where all dependencies fall within a certain complexity class, and we analyse it with resources stronger than that class, the result will look just like ordinary causality. This also explains the apparent failure of causality in Newcomb's problem: we now have a substantive account of what it is to act as an intervention (to be independent within a given complexity class). In general, this requires work by the agent: it needs to determine its actions in such a way that they don't exhibit the dependence. Omega is constructed so that the agent's computational resources are insufficient for that. So the agent fails to make itself independent of Omega's prediction. The agent would similarly fail for "future" events it causes where the dependence lies within the complexity class. In a sense, this is what puts those events into its subjective future: that it cannot act independently of them.

Comments:
Darmani

This post is a mixture of two questions: "interventions" from an agent which is part of the world, and restrictions on the allowed interventions.

The first is actually a problem, and is closely related to the problem of how to extract a single causal model which is executed repeatedly from a universe in which everything only happens once. Pearl's answer, from IIRC Chapter 7 of Causality, which I find 80% satisfying, is about using external knowledge about repeatability to consider a system in isolation. The same principle gets applied whenever a researcher tries to shield an experiment from outside interference.

The second is about limiting allowed interventions. This looks like a special case of normality conditions, which are described in Chapter 3 of Halpern's book. Halpern's treatment of normality conditions actually involves a normality ordering on worlds, though this can easily be massaged to imply a normality ordering on possible interventions. I don't see any special mileage here in making the normality ordering dependent on complexity, as opposed to any other arbitrary normality ordering, though someone may be able to find some interesting interaction between normality and complexity.

Speaking more broadly, this touches on the problem that our current definitions of actual causation are extremely model-sensitive, which I find serious. I don't see a mechanistic resolution, but I did find this essay, which posits considering interventions in all possible containing models, extremely thought-provoking: http://strevens.org/research/expln/MacRules.pdf

Bunthut

Pearl's answer, from IIRC Chapter 7 of Causality, which I find 80% satisfying, is about using external knowledge about repeatability to consider a system in isolation. The same principle gets applied whenever a researcher tries to shield an experiment from outside interference.

This is actually a good illustration of what I mean. You can't shield an experiment from outside influence entirely, not even in principle, because it's you doing the shielding, and your activity is caused by the rest of the world. If you decide to only look at a part of the world, one that doesn't contain you, that's not a problem, but that's just assuming that that route of influence doesn't matter. Similarly, "knowledge about repeatability" is causal knowledge. This answer just tells you how to gain causal knowledge of parts of the world, given that you already have some causal knowledge about the whole. So you can't apply it to the entire world. This is why I say it doesn't go well with embedded agency.

The second is about limiting allowed interventions.

No? What I'm limiting is which dependencies we're considering. And what you say after this seems to be about singular causality, which I'm not really concerned with. Having a causal web is sufficient for decision theory.

Darmani

Causal inference has long been about how to take small assumptions about causality and turn them into big inferences about causality. It's very bad at getting causal knowledge from nothing. This has long been known.

For the first: Well, yep, that's why I said I was only 80% satisfied.

For the second: I think you'll need to give a concrete example, with edges, probabilities, and functions. I'm not seeing how to apply thinking about complexity to a type-causality setting, where it's assumed you have actual probabilities of co-occurrences.

Maybe I'm confused or misinterpreting. The first sentence of your first paragraph appears to contradict the first sentence of your second paragraph.  The two claims seem incommensurable.

The first sentence of your first paragraph appears to appeal to experiment, while the first sentence of your second paragraph seems to boil down to "Classically, X causes Y if there is a significant statistical connection twixt X and Y."  

There are several problems with this view of causality as deriving from statistics. First, the nature of the statistical distribution makes a huge difference. Extreme outliers will occur much more frequently in a Poisson distribution, for instance, than in a Gaussian distribution. It can be very hard to determine, with any degree of confidence, the nature of the statistical distribution you're dealing with, particularly if your sample size is small.
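
A quick numerical check of the outlier claim (the parameters and the five-sigma cutoff are my own choices, and scipy is assumed available):

```python
from scipy.stats import norm, poisson

lam = 4.0               # Poisson mean (and variance)
sd = lam ** 0.5
cutoff = lam + 5 * sd   # a "five sigma" outlier, here 14

# Upper-tail probability of the Poisson vs a normal with matched
# mean and variance: the Poisson outlier is orders of magnitude likelier.
print(poisson.sf(cutoff - 1, lam))          # P(X >= 14), ~7.6e-05
print(norm.sf(cutoff, loc=lam, scale=sd))   # ~2.9e-07
```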

For instance, we can't really calculate or even estimate the likelihood of a large asteroid strike of the magnitude of the one that caused the Cretaceous-Tertiary boundary. These events occur too infrequently, so the statistical power of the data is too low to give us any confidence.

The second problem is that any statistical estimate of connection between events is always vulnerable to the four horsemen of irreproducibility: HARKing, low statistical power (AKA too few data points), P-hacking (AKA the garden of forking paths), and publication bias. 

https://www.mrc-cbu.cam.ac.uk/wp-content/uploads/2016/09/Bishop_CBUOpenScience_November2016.pdf

The first sentence of the first paragraph claims "The standard account of causality depends on the idea of intervention..." Do you have any evidence to support this? Classically, arguments about causation seem to be manifold, and many don't involve empirical evidence.

One classical criterion for causality, Occam's Razor, involves the simplicity of the reasoning and makes no reference to empirical evidence.

Another classical criterion for causality involves the beauty of the mathematics involved in the model. This second criterion has been championed by scientists like Frank Wilczek and Paul Dirac, who asserted "A physical law must possess mathematical beauty," but criticized by scientists like Albert Einstein, who said "Elegance is for tailors; don't believe a theory just because it's beautiful."

Yet another classical criterion for causality involves Bayesian reasoning and the re-evaluation of prior beliefs. Older models of causality boil down to religious and aesthetic considerations -- viz., Aristotle's claim that planetary orbits must be circular because the circle is the most perfect geometric figure.

None of these appear to involve intervention.

Bunthut

The first sentence of your first paragraph appears to appeal to experiment, while the first sentence of your second paragraph seems to boil down to "Classically, X causes Y if there is a significant statistical connection twixt X and Y."  

No. "Dependence" in that second sentence does not mean causation, just statistical dependence. The definition of dependence is important because an intervention must be statistically independent of things "before" the intervention.

None of these appear to involve intervention.

These are methods of causal inference. I'm talking about what causality is. As in, what is the difference between a mere correlation, and causation? The difference is that the second is robust to intervention: if X causes Y, then if I decide to do X, even in circumstances different from those where I've observed it before, Y will happen. If X only correlates with Y, it might not.
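
A toy simulation of that distinction (the setup, with a common cause Z of both X and Y, is my own example):

```python
import random

def observe():
    z = random.random() < 0.5   # common cause
    return z, z                 # X and Y both just copy Z

def intervene():
    z = random.random() < 0.5
    x = random.random() < 0.5   # do(X): set X by coin flip, independent of Z
    return x, z                 # Y still follows the common cause only

def p_y_given_x1(pairs):
    hits = [y for x, y in pairs if x]
    return sum(hits) / len(hits)

obs = [observe() for _ in range(100_000)]
doX = [intervene() for _ in range(100_000)]
print(p_y_given_x1(obs))   # ~1.0: X and Y are perfectly correlated
print(p_y_given_x1(doX))   # ~0.5: intervening on X has no effect on Y
```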

Measure

What do X and Y represent in this construction? What is the scaling parameter used to define the complexity class?

Bunthut

X and Y are variables for events. By complexity class I mean computational complexity; I'm not sure what scaling parameter is supposed to be there?

Measure

Computational complexity only makes sense in terms of varying sizes of inputs. Are some Y events "bigger" than others in some way so that you can look at how the program runtime depends on that "size"?

Bunthut

What I had in mind was increasing the precision of Y.

Measure

I guess that makes sense. Thanks for clarifying!