bentarm comments on Simpson's Paradox - Less Wrong

68 Post author: bentarm 12 January 2011 11:01PM

Comment author: bentarm 14 January 2011 01:09:49AM 2 points

Let's say the only data we'd collected were gender and whether or not the patient's birthday was a Tuesday. Do you really think there is something to be gained from building four separate models now?

More seriously, if you collect enough variables, then purely by chance some partitioning of the data will point to the wrong conclusion.
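A rough way to see this is to simulate it. The sketch below is purely illustrative, and every number in it is an assumption rather than anything from the original post: 200 patients, a treatment that truly raises the recovery rate from 0.5 to 0.6, and 50 irrelevant yes/no variables of the "born on a Tuesday" kind (each true with probability 1/7). It counts how many of those irrelevant variables split the data so that at least one subgroup makes the treatment look harmful.

```python
import random


def simulate(n_patients=200, n_covariates=50, seed=0):
    """Count irrelevant binary splits under which some subgroup shows a reversed effect.

    All parameters are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)

    # Treatment is assigned at random and genuinely helps:
    # recovery probability is 0.6 if treated, 0.5 otherwise.
    treated = [rng.random() < 0.5 for _ in range(n_patients)]
    recovered = [rng.random() < (0.6 if t else 0.5) for t in treated]

    def effect(indices):
        """Treated recovery rate minus untreated recovery rate within a subgroup.

        Returns None if the subgroup has no treated or no untreated patients.
        """
        t = [recovered[i] for i in indices if treated[i]]
        c = [recovered[i] for i in indices if not treated[i]]
        if not t or not c:
            return None
        return sum(t) / len(t) - sum(c) / len(c)

    overall = effect(range(n_patients))

    misleading = 0
    for _ in range(n_covariates):
        # An irrelevant variable, unrelated to treatment or outcome.
        flag = [rng.random() < 1 / 7 for _ in range(n_patients)]
        groups = (
            [i for i in range(n_patients) if flag[i]],
            [i for i in range(n_patients) if not flag[i]],
        )
        effects = [effect(g) for g in groups]
        # The split is "misleading" if any subgroup shows a negative effect.
        if any(e is not None and e < 0 for e in effects):
            misleading += 1

    return overall, misleading


if __name__ == "__main__":
    overall, misleading = simulate()
    print(f"overall effect: {overall:+.3f}")
    print(f"irrelevant splits with a reversed subgroup: {misleading} of 50")
```

The exact count depends on the seed and the sample size, but the more irrelevant variables you try, the more chances there are for one of them to produce a subgroup that points the wrong way.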

I don't think we disagree on anything important here. The main point is that you need to be careful about which partitions of the data you use: arbitrarily partitioning along every available divide is not optimal.

PS - thanks for the typo correction, I really need to learn to proofread...