gwern comments on Confound it! Correlation is (usually) not causation! But why not? - Less Wrong

44 Post author: gwern 09 July 2014 03:04AM

Comments (34)

Comment author: gwern 09 July 2014 04:01:52AM 5 points [-]

And we can’t explain away all of this low success rate as the result of illusory correlations being thrown up by the standard statistical problems with findings such as small n, sampling error (A & B just happened to sync together due to randomness), selection bias, publication bias, etc. I’ve read about those problems at length, and despite knowing about all that, there still seems to be a problem: correlation too often ≠ causation.

Comment author: Furslid 09 July 2014 04:21:17AM 3 points [-]

I'm pointing out that your list isn't complete, and not considering this possibility when we see a correlation is irresponsible. There are a lot of apparent correlations, and your three possibilities provide no means to reject false positives.

Comment author: gwern 09 July 2014 03:21:44PM 6 points [-]

You are fighting the hypothetical. In the least convenient possible world where no dataset is smaller than a petabyte and no one has ever heard of sampling error, would you magically be able to spin the straw of correlation into the gold of causation? No. Why not? That's what I am discussing here.
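
A minimal sketch of that hypothetical, under assumptions that are mine for illustration (a single hidden common cause `C`, Gaussian noise, and the names `simulate` and `pearson` are all made up here): with a large sample, sampling error is negligible, yet A and B stay correlated although A has no effect on B. The correlation only goes away when A is set at random, independent of `C`.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, standard library only."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n, intervene=False, seed=0):
    """Return corr(A, B) in a toy world where C causes both A and B.

    Assumption for illustration: B never depends on A.
    """
    rng = random.Random(seed)
    a_vals, b_vals = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)  # hidden common cause
        if intervene:
            a = rng.gauss(0, 1)  # A assigned at random, cut off from C
        else:
            a = c + rng.gauss(0, 1)  # A is driven by C
        b = c + rng.gauss(0, 1)  # B is driven by C only
        a_vals.append(a)
        b_vals.append(b)
    return pearson(a_vals, b_vals)


observed = simulate(200_000)
intervened = simulate(200_000, intervene=True)
print(f"observational r  = {observed:.3f}")
print(f"interventional r = {intervened:.3f}")
```

In this toy setup the observational correlation should come out close to 0.5 and the interventional one close to 0, and making n larger only tightens both estimates. More data does nothing to distinguish "A causes B" from "C causes both".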

Comment author: Cyan 09 July 2014 05:45:39PM *  5 points [-]

I suggest you move that point closer to the list of 3 possibilities -- I too read that list and immediately thought, "...and also coincidence."

The quote you posted above ("And we can't explain away...") is an unsupported assertion -- a correct one in my opinion, but it really doesn't do enough to direct attention away from false positive correlations. I suggest that you make it explicit in the OP that you're talking about a hypothetical in which random coincidences are excluded from the start. (Upvoted the OP FWIW.)

(Also, if I understand it correctly, Ramsey theory suggests that coincidences are inevitable even in the absence of sampling error.)

Comment author: IlyaShpitser 09 July 2014 06:46:54PM *  8 points [-]

I agree with gwern's decision to separate statistical issues from issues which arise even with infinite samples. Statistical issues are also extremely important and deserve careful study; however, we should divide and conquer complicated subjects.

Comment author: Cyan 09 July 2014 07:05:31PM 9 points [-]

I also agree -- I'm recommending that he make that split clearer to the reader by addressing it up front.

Comment author: gwern 10 July 2014 01:52:58AM 4 points [-]

I see. I really didn't expect this to be such an issue and come up in both the open thread & Main... I've tried rewriting the introduction a bit. If people still insist on getting snagged on that, I give up.

Comment author: [deleted] 10 July 2014 04:28:16PM 0 points [-]

I'm pointing out that your list isn't complete,

It ends with “etc.” for Pete's sake!

Comment author: Cyan 10 July 2014 04:31:01PM 1 point [-]

...no it doesn't?