Exactly! We want to incorporate the association information using Bayes' theorem. If you have zero information about the mapping, then your knowledge is invariant under permutations of the data sets (e.g., swapping T0 with T1). That implies that your prior over the associations is uniform over the possible permutations (note that a permutation uniquely specifies an association and vice versa). So, when calculating the correlation, you have to average over all permutations. The permutation-averaged covariance is exactly zero, and the variances do not depend on the ordering, so the correlation turns out to be identically zero for all possible data. No association means no correlation.
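As a quick sanity check of the averaging claim, here is a minimal sketch in plain Python. The five data points and the helper names (`pearson`, `permutation_averaged_correlation`) are made up for illustration; nothing here comes from real measurements. It enumerates every possible association between the two lists and averages the Pearson correlation over them with equal weight, which is what a uniform prior over permutations amounts to.

```python
from itertools import permutations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, paired by index."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def permutation_averaged_correlation(xs, ys):
    """Average the correlation over every possible pairing of xs with ys.

    Each permutation of ys is one candidate association; all get equal weight.
    """
    pairings = list(permutations(ys))
    return sum(pearson(xs, p) for p in pairings) / len(pairings)


# Hypothetical toy data: strongly related when paired in the given order.
temps = [1.0, 2.0, 3.0, 4.0, 5.0]
strains = [2.1, 3.9, 6.2, 8.1, 9.8]

aligned = pearson(temps, strains)
averaged = permutation_averaged_correlation(temps, strains)

print(f"correlation with known association:   {aligned:.4f}")
print(f"correlation averaged over all pairings: {averaged:.2e}")
```

With the association known, the toy data are almost perfectly correlated; averaged over all 120 pairings, the result is zero up to floating-point rounding. Full enumeration is only practical for tiny lists, since the number of permutations grows factorially.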
So in the zero information case, we get this weird behavior that isn't what we expect. If the zero information case doesn't work, then we can't expect to get correct answers with only partial information about the associations. We can expect similar strangeness when trying to deal with partial information based on priors about side-effects caused by our hypothetical drug.
If we don't have enough information to construct the model, then our analysis should yield inconclusive results, not weird or backward results. So the problem is to figure out the right way to handle association information.
Yes, but this is a completely different matter from your original post. Obviously this is how we should handle the weird state of information that you're constructing, but it doesn't have the causal interpretation you give it. You are doing something, but it isn't causal analysis. Also, in the scenario you describe, you have the association information, so you should be using it.
In a recent comment, I suggested that correlations between seemingly unrelated periodic time series share a common cause: time. However, the math disagrees... and suggests a surprising alternative.
Imagine that we took measurements from a thermometer on my window and a ridiculously large tuning fork over several years. The first set of data is temperature T over time t, so it looks like a list of data points [(t0, T0), (t1, T1), ...]. The second set of data is mechanical strain e in the tuning fork over time, so it looks like a list of data points [(t0, e0), (t1, e1), ...]. We line up the temperature and strain data according to time, yielding [(T0, e0), (T1, e1), ...] and find a significant correlation between the two, since they happen to have similar periodicity.
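To make the setup concrete, here is a small sketch with synthetic data. Everything specific in it is an assumption for illustration: daily samples over three years, a one-year period for both series, a phase offset of 0.5 radians, and no noise. The point is only to show the "line up by time, then correlate" step.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, paired by index."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Assumed parameters (not from the post): 3 years of daily samples,
# both signals with a 365-day period, strain lagging by PHASE radians.
DAYS = 3 * 365
OMEGA = 2 * math.pi / 365
PHASE = 0.5

# Two separate logs, each a list of (time, value) points.
temperature = [(t, 10.0 + 15.0 * math.sin(OMEGA * t)) for t in range(DAYS)]
strain = [(t, 1e-4 * math.sin(OMEGA * t + PHASE)) for t in range(DAYS)]

# Line the two logs up by their shared timestamps: [(T0, e0), (T1, e1), ...].
strain_by_time = dict(strain)
pairs = [(T, strain_by_time[t]) for t, T in temperature]

aligned_r = pearson([T for T, _ in pairs], [e for _, e in pairs])
print(f"correlation after aligning by time: {aligned_r:.4f}")
```

For two noiseless sinusoids with the same period sampled over whole periods, the correlation comes out as the cosine of the phase offset, here about 0.88. Real thermometer and strain data would be noisier and would not share a period exactly, so the number would be lower, but the matching step is the same.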
Recalling Judea Pearl, we suggest that there is almost certainly some causal relationship between the temperature outside the window and the strain in the ridiculously large tuning fork. Common sense suggests that neither causes the other, so perhaps they have some common cause? The only other variable in the problem is time, so perhaps time is the common cause. This sort of makes sense, since changes in time intuitively seem to cause the changes in temperature and strain.
Let's check that intuition with some math. First, imagine that we ignore the time data. Now we just have a bunch of temperature data points [T0, T1, ...] and strain data points [e0, e1, ...]. In fact, in order to truly ignore time data, we cannot even order the points according to time! But that means that we no longer have any way to line up the points T0 with e0, T1 with e1, etc. Without any way to match up temperature points to corresponding strain points, the temperature and strain data are randomly ordered, and the correlation disappears!
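The effect of throwing away the time ordering can be sketched numerically. The data below are synthetic and the parameters are assumptions for illustration (daily samples over three years, a shared one-year period, a 0.5 radian phase offset, a fixed random seed). Shuffling one list stands in for "no way to match up temperature points to strain points".

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, paired by index."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Assumed synthetic signals: same period, strain offset in phase.
N = 3 * 365
OMEGA = 2 * math.pi / 365
temps = [10.0 + 15.0 * math.sin(OMEGA * t) for t in range(N)]
strains = [math.sin(OMEGA * t + 0.5) for t in range(N)]

# With the time ordering intact, index i of each list is the same moment.
aligned_r = pearson(temps, strains)

# Without time, any pairing is as good as any other: simulate by shuffling.
rng = random.Random(0)  # fixed seed so the run is repeatable
shuffled_rs = []
for _ in range(200):
    scrambled = strains[:]
    rng.shuffle(scrambled)
    shuffled_rs.append(pearson(temps, scrambled))

mean_abs_shuffled_r = sum(abs(r) for r in shuffled_rs) / len(shuffled_rs)

print(f"aligned by time:        r = {aligned_r:.3f}")
print(f"random pairing (mean |r| over 200 shuffles): {mean_abs_shuffled_r:.3f}")
```

Any single shuffle gives a small nonzero correlation by chance, which is why the sketch reports the typical magnitude over many shuffles; it shrinks as the number of data points grows.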
We have just performed the kind of conditional-independence check that d-separation describes. When time t was known (i.e., controlled for), the variables T and e were correlated. But when t was unknown, the variables were uncorrelated. Now, let's wave our hands a little and equate correlation with dependence. If time were a common cause of temperature and strain, then we should see that T and e are correlated without knowledge of time, but the correlation disappears when controlling for time. However, we see exactly the opposite structure: controlling for t induces the correlation. This pattern is called a "collider", and it implies that time is a common effect of temperature and strain. Rather than time causing the oscillations in our time series, the oscillations in our time series cause time.
Whoa. Now that the math has given us the answer, let's step back and try to make sense of it. Imagine that everything in the universe stopped moving for some time, and then went back to moving exactly as before. How could we measure how much time passed while the universe was stopped? We couldn't. For all practical purposes, if nothing changes, then time has stopped. Time, then, is an effect of motion, not vice versa. This is an old idea from philosophy/physics (I think I originally read it in one of Stephen Hawking's books). We've just rederived it.
But we may still wonder: what caused the correlation between temperature and strain? A common effect cannot cause a correlation, so where did it come from? The answer is that there was never any correlation between temperature and strain to begin with. Given just the temperature and strain data, with no information about time (e.g. no ordering or correspondence between points), there was no correlation. The correlation was induced by controlling for time. So the correlation is only logical; there is no physical cause relating the two, at least within our model.