In a recent comment, I suggested that correlations between seemingly unrelated periodic time series share a common cause: time. However, the math disagrees... and suggests a surprising alternative.
Imagine that we took measurements from a thermometer on my window and a ridiculously large tuning fork over several years. The first set of data is temperature T over time t, so it looks like a list of data points [(t0, T0), (t1, T1), ...]. The second set of data is mechanical strain e in the tuning fork over time, so it looks like a list of data points [(t0, e0), (t1, e1), ...]. We line up the temperature and strain data according to time, yielding [(T0, e0), (T1, e1), ...] and find a significant correlation between the two, since they happen to have similar periodicity.
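To make the setup concrete, here is a minimal sketch of the alignment step using synthetic data. Everything numeric in it is an assumption made up for illustration: the one-year shared period, the amplitudes, the noise levels, and the phase offset are not from any real thermometer or tuning fork.

```python
import numpy as np

# Hypothetical setup: both signals share a one-year period. Amplitudes,
# noise levels, and the 0.3 rad phase offset are invented for illustration.
rng = np.random.default_rng(0)
t = np.arange(0.0, 3 * 365.0, 1.0)   # one sample per day for three years
omega = 2 * np.pi / 365.0            # shared angular frequency

# Temperature T and strain e, each a noisy sinusoid of the same period.
T = 10.0 + 15.0 * np.sin(omega * t) + rng.normal(0.0, 2.0, t.size)
e = 1e-3 * np.sin(omega * t + 0.3) + rng.normal(0.0, 1e-4, t.size)

# Pairing by index is the "line up by time" step: T[i] and e[i] share t[i].
r_aligned = np.corrcoef(T, e)[0, 1]
print(f"correlation of time-aligned pairs: {r_aligned:.3f}")
```

How large the correlation comes out depends on the phase offset between the two signals; with a quarter-period offset it would be close to zero even though the periods match.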
Recalling Judea Pearl, we suggest that there is almost certainly some causal relationship between the temperature outside the window and the strain in the ridiculously large tuning fork. Common sense suggests that neither causes the other, so perhaps they have some common cause? The only other variable in the problem is time, so perhaps time is the common cause. This sort of makes sense, since changes in time intuitively seem to cause the changes in temperature and strain.
Let's check that intuition with some math. First, imagine that we ignore the time data. Now we just have a bunch of temperature data points [T0, T1, ...] and strain data points [e0, e1, ...]. In fact, in order to truly ignore time data, we cannot even order the points according to time! But that means that we no longer have any way to line up the points T0 with e0, T1 with e1, etc. Without any way to match up temperature points to corresponding strain points, the temperature and strain data are randomly ordered, and the correlation disappears!
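The "ignore time" step can be sketched numerically as well. This assumes the same made-up synthetic signals as before (a shared one-year period with invented amplitudes and noise) and models "no ordering" by shuffling each list independently.

```python
import numpy as np

# Same hypothetical signals as in the setup: shared period, invented
# amplitudes, noise levels, and phase offset.
rng = np.random.default_rng(1)
t = np.arange(0.0, 3 * 365.0, 1.0)
omega = 2 * np.pi / 365.0
T = 10.0 + 15.0 * np.sin(omega * t) + rng.normal(0.0, 2.0, t.size)
e = 1e-3 * np.sin(omega * t + 0.3) + rng.normal(0.0, 1e-4, t.size)

# With the time ordering intact, index i pairs same-time measurements.
r_aligned = np.corrcoef(T, e)[0, 1]

# Discarding time: each list is shuffled on its own, so index i no longer
# pairs measurements that were taken at the same moment.
T_unordered = rng.permutation(T)
e_unordered = rng.permutation(e)
r_unordered = np.corrcoef(T_unordered, e_unordered)[0, 1]

print(f"aligned:   {r_aligned:.3f}")
print(f"unordered: {r_unordered:.3f}")
```

Note the modeling choice: the two lists are shuffled separately, which destroys the pairing between individual temperature and strain measurements and not just their order.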
We have just performed a d-separation. When time t was known (i.e., controlled for), the variables T and e were correlated. But when t was unknown, the variables were uncorrelated. Now, let's wave our hands a little and equate correlation with dependence. If time were a common cause of temperature and strain, then we should see that T and e are correlated without knowledge of time, but the correlation disappears when controlling for time. However, we see exactly the opposite structure: controlling for t induces the correlation. This pattern is called a "collider", and it implies that time is a common effect of temperature and strain. Rather than time causing the oscillations in our time series, the oscillations in our time series cause time.
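For reference, the two dependence patterns being contrasted here can be simulated directly. This is a generic sketch of a common cause (fork) versus a common effect (collider) on made-up Gaussian variables, not a model of the temperature and strain data; the helper names `corr` and `corr_given` and the narrow-band conditioning are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Fork (common cause): Z -> X and Z -> Y.
z_fork = rng.normal(size=n)
x_fork = z_fork + 0.5 * rng.normal(size=n)
y_fork = z_fork + 0.5 * rng.normal(size=n)

# Collider (common effect): X -> Z <- Y, with X and Y independent.
x_col = rng.normal(size=n)
y_col = rng.normal(size=n)
z_col = x_col + y_col + 0.5 * rng.normal(size=n)


def corr(a, b):
    """Pearson correlation of two 1-D arrays."""
    return np.corrcoef(a, b)[0, 1]


def corr_given(a, b, c, lo=-0.1, hi=0.1):
    """Crude conditioning: correlate a and b only where c lies in (lo, hi)."""
    keep = (c > lo) & (c < hi)
    return corr(a[keep], b[keep])


fork_marginal = corr(x_fork, y_fork)
fork_conditional = corr_given(x_fork, y_fork, z_fork)
collider_marginal = corr(x_col, y_col)
collider_conditional = corr_given(x_col, y_col, z_col)

print(f"fork:     marginal {fork_marginal:.2f}, given Z {fork_conditional:.2f}")
print(f"collider: marginal {collider_marginal:.2f}, given Z {collider_conditional:.2f}")
```

In the fork, the correlation is present marginally and largely vanishes within a narrow band of Z; in the collider, it is absent marginally and appears once Z is restricted. Restricting to a band is only an approximation to conditioning on an exact value.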
Whoa. Now that the math has given us the answer, let's step back and try to make sense of it. Imagine that everything in the universe stopped moving for some time, and then went back to moving exactly as before. How could we measure how much time passed while the universe was stopped? We couldn't. For all practical purposes, if nothing changes, then time has stopped. Time, then, is an effect of motion, not vice versa. This is an old idea from philosophy/physics (I think I originally read it in one of Stephen Hawking's books). We've just rederived it.
But we may still wonder: what caused the correlation between temperature and strain? A common effect cannot cause a correlation, so where did it come from? The answer is that there was never any correlation between temperature and strain to begin with. Given just the temperature and strain data, with no information about time (e.g. no ordering or correspondence between points), there was no correlation. The correlation was induced by controlling for time. So the correlation is only logical; there is no physical cause relating the two, at least within our model.
What? This makes no sense.
I guess you haven't seen this stated explicitly, but the framework of causal networks makes an iid assumption. The idea is that the causal network represents some process that occurs a lot, and we can watch it occur until we get a reasonably good understanding of the joint distribution of variables. Part of this is that it is the same process occurring, so there is no time dependence built into the framework.
For some purposes, we can model time by simply including it as an observed variable, which you do in this post. However, the different measurements of each variable are associated because they come from the same sample of the (iid) causal process, whether or not we are conditioning on time. The way you are trying to condition on time isn't correct, and the correlation does exist in both cases. (Really, we care about dependence rather than correlation, but it doesn't make a difference here.)
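One way to see the distinction being drawn is a small numerical sketch. It assumes made-up synthetic signals (a shared one-year period with invented amplitudes and noise) and treats each (T, e) pair as one sample: the time column is dropped and the row order is scrambled, but each temperature stays attached to the strain measured alongside it.

```python
import numpy as np

# Hypothetical signals: shared period, invented amplitudes and noise.
rng = np.random.default_rng(3)
t = np.arange(0.0, 3 * 365.0, 1.0)
omega = 2 * np.pi / 365.0
T = 10.0 + 15.0 * np.sin(omega * t) + rng.normal(0.0, 2.0, t.size)
e = 1e-3 * np.sin(omega * t + 0.3) + rng.normal(0.0, 1e-4, t.size)

# Each row is one sample (T_i, e_i); the time column is not stored at all.
samples = np.column_stack([T, e])
r_before = np.corrcoef(samples[:, 0], samples[:, 1])[0, 1]

# Scramble the order of the samples while keeping each pair intact.
idx = rng.permutation(t.size)
scrambled = samples[idx]
r_after = np.corrcoef(scrambled[:, 0], scrambled[:, 1])[0, 1]

print(f"pairs in time order:   {r_before:.3f}")
print(f"pairs in random order: {r_after:.3f}")
```

Under these assumptions the correlation does not depend on the row order, only on which temperature is paired with which strain.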
I do think that this is a useful general direction of analysis. If the question is meaningful at all, then the answer is probably that given by Armok_GoB in the original thread, but it would be useful to clarify what exactly the question means. There is probably a lot of work to be done before we really understand such things, but I would advise you to better understand the ideas behind causal networks before trying to contribute.
Causal networks do not make an iid assumption. Consider one of the simplest examples, in which we examine experimental data. Some of the variables are chosen by the experimenter. They can be chosen any way the experimenter pleases, so long as they vary. The process is the same, but that does not imply iid observations. It just means that time dependence must enter through the variables. As you say, it is not built into the framework.
The problem is to reduce the phrase "the different measurements of each variable are associated because they come from ...