FrameBenignly comments on Open thread, Dec. 21 - Dec. 27, 2015 - Less Wrong Discussion

2 Post author: MrMind 21 December 2015 07:56AM

Comment author: FrameBenignly 22 December 2015 11:34:09PM *  0 points [-]

You're using correlation in what I would consider a weird way. Randomization is intended to control for selection effects to reduce confounds, but when somebody says "correlational study", I take them to mean an observational study in which no attempt was made to determine predictive causation. When an effect shows up in a nonrandomized study, it's not that you can't determine whether the effect was causative; it's that it's more difficult to determine whether the effect was caused by the independent variable or by an extraneous variable unrelated to the independent variable. It's not a question of whether the effect is due to correlation or causation, but whether the relationship between the independent and dependent variable even exists at all.

Comment author: Anders_H 23 December 2015 12:41:05AM *  2 points [-]

(1) Observational studies are almost always attempts to determine causation. Sometimes the investigators try to pretend that they aren't, but they aren't fooling anyone, least of all the general public. I know they are attempting to determine causation because nobody would be interested in the results of the study unless they were interested in causation. Moreover, I know they are attempting to determine causation because they do things like "control for confounding". This procedure is undefined unless the goal is to estimate a causal effect.

(2) What do you mean by the sentence "the study was causative"? Surely nobody is suggesting that the study itself had an effect on the dependent variable.

(3) Assuming that the statistics were done correctly and that the investigators have accounted for sampling variability, the relationship between the independent and dependent variable definitely exists. The correlation is real, even if it is due to confounding. It just doesn't represent a causal effect.
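To make point (3) concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration (the variable names, the Gaussian noise, the sample size); it is not taken from any study. A confounder Z drives both X and Y, and X has no effect on Y at all. In the observational data X and Y are still correlated (about 0.5 in expectation under these assumptions), while randomly assigning X makes the correlation vanish.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def confounded_vs_randomized(n=20000, seed=0):
    """Return (observational correlation, randomized correlation) of X and Y.

    Hypothetical data-generating process: Y depends only on the confounder Z.
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    # Y is Z plus noise; X never enters this equation.
    y = [zi + rng.gauss(0, 1) for zi in z]
    # Observational setting: X is also driven by Z, so it predicts Y.
    x_observed = [zi + rng.gauss(0, 1) for zi in z]
    # Randomized setting: X is assigned independently of Z.
    x_randomized = [rng.gauss(0, 1) for _ in range(n)]
    return pearson(x_observed, y), pearson(x_randomized, y)


if __name__ == "__main__":
    r_obs, r_rand = confounded_vs_randomized()
    print(f"observational r = {r_obs:.3f}, randomized r = {r_rand:.3f}")
```

The observational correlation here is a genuine feature of the simulated population, so X is a perfectly good predictor of Y. It just says nothing about what would happen to Y if X were changed.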

Comment author: Lumifer 23 December 2015 04:40:18PM *  2 points [-]

You are assuming a couple of things which are almost always true in your (medical) field, but are not necessarily true in general. For example,

"Observational studies are almost always attempts to determine causation"

Nope. Another very common reason is to create a predictive model without caring about actual causation. If you can't do interventions but would like to forecast the future, that's all you need.

"Assuming that the statistics were done correctly and that the investigators have accounted for sampling variability, the relationship between the independent and dependent variable definitely exists."

That further assumes your underlying process is stable and is not subject to drift, regime changes, etc. Sometimes you can make that assumption, sometimes you cannot.
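A rough sketch of the stability caveat, under invented assumptions: the slopes 2.0 and -1.0, the noise level, and the function names are all hypothetical. A least-squares line is fit while the process is in one regime, then scored on fresh data from the same regime and on data from after a regime change.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def mse(xs, ys, slope, intercept):
    """Mean squared prediction error of the fitted line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_regime(rng, true_slope, n):
    """Draw n points from a hypothetical process y = true_slope * x + noise."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [true_slope * x + rng.gauss(0, 0.5) for x in xs]
    return xs, ys


def regime_change_demo(n=5000, seed=0):
    """Return (error in the training regime, error after the regime change)."""
    rng = random.Random(seed)
    train = make_regime(rng, 2.0, n)
    same_regime = make_regime(rng, 2.0, n)
    new_regime = make_regime(rng, -1.0, n)
    slope, intercept = fit_line(*train)
    return mse(*same_regime, slope, intercept), mse(*new_regime, slope, intercept)


if __name__ == "__main__":
    err_same, err_new = regime_change_demo()
    print(f"MSE same regime = {err_same:.2f}, MSE new regime = {err_new:.2f}")
```

The relationship estimated in the first regime is real for that regime, but the forecast error grows sharply once the underlying process shifts.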

Comment author: Vaniver 23 December 2015 08:45:34PM *  1 point [-]

"Another very common reason is to create a predictive model without caring about actual causation. If you can't do interventions but would like to forecast the future, that's all you need."

You'd also like a guarantee that others can't do interventions, or else your measure could be gamed. (But if there's an actual causal relationship, then 'gaming' isn't really possible.)

Comment author: FrameBenignly 23 December 2015 01:03:11AM 0 points [-]

(1) I just think calling a nonrandomized study a correlational study is weird.

(2) I meant to say "effect", not "study"; fixed.

(3) If something is caused by a confounding variable, then the independent variable may have no relationship with the dependent variable. You seem to be using correlation to mean the result of an analysis, but I'm thinking of it as the actual real relationship which is distinct from causation. So y=x does not mean y causes x or that x causes y.

Comment author: Anders_H 23 December 2015 01:18:54AM 0 points [-]

I don't understand what you mean by "real relationship". I suggest tabooing the terms "real relationship" and "no relationship".

I am using the word "correlation" to discuss whether the observed variable X predicts the observed variable Y in the (hypothetical?) superpopulation from which the sample was drawn. Such a correlation can exist even if neither variable causes the other.

If X predicts Y in the superpopulation (regardless of causality), the correlation will indeed be real. The only possible definition I can think of for a "false" correlation is one that does not exist in the superpopulation, but which appears in your sample due to sampling variability. Statistical methodology is in general more than adequate to discuss whether the appearance of correlation in your sample is due to real correlation in the superpopulation. You do not need causal inference to reason about this question. Moreover, confounding is not relevant.
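As an illustration of a "false" correlation in this sense, the sketch below draws X and Y independently, so the correlation in the simulated superpopulation is zero by construction. The sample sizes, the 0.5 threshold, and the function names are assumptions chosen for the example. It counts how often a sample nevertheless shows a correlation stronger than the threshold.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spurious_fraction(sample_size, trials=2000, threshold=0.5, seed=0):
    """Fraction of samples with |r| > threshold when X and Y are independent."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.gauss(0, 1) for _ in range(sample_size)]
        ys = [rng.gauss(0, 1) for _ in range(sample_size)]
        if abs(pearson(xs, ys)) > threshold:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("n = 10:  ", spurious_fraction(10))
    print("n = 1000:", spurious_fraction(1000, trials=200))
```

With tiny samples a noticeable share of draws show a sizeable correlation that does not exist in the superpopulation; with large samples this essentially never happens. No causal reasoning is needed to see this, only sampling variability.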

Confounding and causal inference are only relevant if you want to know whether the correlation in the superpopulation is due to the causal effect of X on Y. You can certainly define the causal effect as the "actual real relationship", but then I don't understand how it is distinct from causation.

Comment author: FrameBenignly 23 December 2015 04:01:22AM 0 points [-]

I just realized the randomized-nonrandomized study was just an example and not what you were talking about.