Less Wrong is a community blog devoted to refining the art of human rationality.

Comment author: tgb 19 September 2013 02:38:44AM 3 points [-]

Thanks, that discussion's examples were exactly what I was looking for!

Comment author: krzhang 19 September 2013 08:37:32AM 2 points [-]

I'm glad it was helpful. =)

Comment author: Vaniver 19 September 2013 07:23:50AM *  1 point [-]

Why don't you take his model, as described, and see if it really produces this effect?

For the first disagreement, that's a disagreement about his commentary on his second figure. I don't have the data to actually calculate the correlation there, but eyeballing it, the groups look like they don't have a positive relationship between education and income anywhere near as strong as that of the larger group.

The second disagreement is on interpretation. If you add noise in both dimensions to a multivariate Gaussian model with mean differences between groups, then that impacts any slice of the model (modified by the angle between the mean difference vector and the slice vector). If one subgroup is above and to the right of the other subgroup, that means it's above for every vertical slice and to the right for every horizontal slice. (On northwest-southeast slices, there's no mean difference between the distributions, just population size differences, and the mean difference is maximized on the northeast-southwest slice.)
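The dependence on slice direction can be checked with a short projection. This is only a sketch: it borrows the group means from the example later in this comment, and the function name and the angle convention are assumptions made for illustration.

```python
import math

# Group means as (education, income), taken from the example later in this comment.
BLUE_MEAN = (6.0, 6.0)
GREEN_MEAN = (4.0, 4.0)

def mean_gap_along(mean_a, mean_b, angle_deg):
    """Project the mean difference (a minus b) onto a unit vector at angle_deg."""
    theta = math.radians(angle_deg)
    ux, uy = math.cos(theta), math.sin(theta)
    return (mean_a[0] - mean_b[0]) * ux + (mean_a[1] - mean_b[1]) * uy

# Angles are measured counterclockwise from the education (horizontal) axis.
directions = {"horizontal": 0, "vertical": 90, "NE-SW": 45, "NW-SE": 135}
gaps = {name: mean_gap_along(BLUE_MEAN, GREEN_MEAN, a) for name, a in directions.items()}

for name, gap in gaps.items():
    print(f"{name:>10}: {gap:.3f}")
```

With these means the gap is 2 along either axis, 2*sqrt(2) along NE-SW, and zero (up to floating-point rounding) along NW-SE, which matches the description above.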

The particular slicing used in this effect (looking at each vertical slice individually, and each horizontal slice individually) seems reasonable, except that in the presence of mean differences it behaves as a filter that preserves the NW-SE noise!

The grandparent was wrong before I edited it, where I speculated that the noise had to be negatively correlated. That's the claim that the major axis of the covariance ellipse has to be oriented in a particular direction, but that was an overreach, as you see the reverse regression effect if there is any noise along the NW-SE axis. Take a look at Yan's first figure: it has noise in both blues and greens, but it's one-dimensional noise going NE-SW, and so we don't see reverse regression.

My original thought (when I thought you might need the major axis to be NW-SE, rather than just the NW-SE axis to be nonzero) was that this was just a reversal effect, with the noise providing the reversing factor. That's still true but I'm surprised at how benign the restrictions on the noise are.

That is, I disagree with Yan that this has a different origin than Simpson's Paradox, but I agree with Yan that this is an important example of how pernicious reversal effects are, and that noise generates them by default, in some sense. I would demonstrate it with a multivariate Gaussian where the blue mean is [6 6], the green mean is [4 4], and the covariance matrix is [1 .5; .5 1], so that it's obvious that the dominant relationship for each group is a positive relationship between education and income but the NW-SE relationship exists and these slices make it visible.
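The demonstration proposed above can be sketched in plain Python. This is a minimal simulation, not anyone's actual code from the thread: only the two means and the covariance matrix come from the comment, while the sample size, the random seed, the comparison point of 5, and all function names are assumptions made for illustration.

```python
import math
import random

# Parameters from the comment above; coordinates are (education, income).
BLUE_MEAN = (6.0, 6.0)
GREEN_MEAN = (4.0, 4.0)
VAR = 1.0   # variance of each coordinate
COV = 0.5   # covariance between education and income

def sample_group(mean, n, rng):
    """Draw n points from a bivariate Gaussian using a hand-written Cholesky factor."""
    l11 = math.sqrt(VAR)
    l21 = COV / l11
    l22 = math.sqrt(VAR - l21 ** 2)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        points.append((mean[0] + l11 * z1, mean[1] + l21 * z1 + l22 * z2))
    return points

def fit_line(xs, ys):
    """Ordinary least squares for ys ~ xs; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def gap_at(blue, green, given, target, value):
    """Predicted `target` for blue minus green when coordinate `given` equals `value`."""
    preds = []
    for pts in (blue, green):
        slope, intercept = fit_line([p[given] for p in pts], [p[target] for p in pts])
        preds.append(slope * value + intercept)
    return preds[0] - preds[1]

rng = random.Random(0)  # arbitrary seed, chosen only for reproducibility
blue = sample_group(BLUE_MEAN, 20000, rng)
green = sample_group(GREEN_MEAN, 20000, rng)

income_gap = gap_at(blue, green, given=0, target=1, value=5.0)     # same education
education_gap = gap_at(blue, green, given=1, target=0, value=5.0)  # same income
blue_slope, _ = fit_line([p[0] for p in blue], [p[1] for p in blue])

print(f"within-blue slope of income on education: {blue_slope:.2f}")
print(f"income gap at education 5 (blue - green): {income_gap:.2f}")
print(f"education gap at income 5 (blue - green): {education_gap:.2f}")
```

Under this model the within-group slope is 0.5 in expectation, and each gap is 2 - 0.5 * 2 = 1 in expectation. So both slicings put blue ahead of green: more income at equal education, and more education at equal income, even though each group on its own shows a positive relationship.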

Comment author: krzhang 19 September 2013 08:36:20AM *  2 points [-]

Hi Vaniver! =D

On the commentary: your eyeballing seems good, but I don't think I ever said anything about relative comparisons between correlation coefficients (only that the overall correlation is positive). As you observed, I could easily make all 3 correlations (blue-only, green-only, or blue+green) positive. I don't have anything interesting to say about their relative degrees.

I don't quite see the difference in interpretation from this writing. I agree with basically all the stuff you've written? The fact that the slicing "behaves as a filter", if I interpret it correctly, is exactly the problem here.

I don't know what "have a different origin than Simpson's paradox" means exactly, but here are a few ways they differ and why I say they are "different":

  • a fundamental assumption of Simpson's paradox is that there's some imbalance in the denominators; in your 2x2x2 matrix you can't scale the numbers arbitrarily; almost all the examples you can construct rely on (let's say we are using the familiar batting averages example) the fact that the denominators (row sums) are different.

  • the direct cause of the reversal effect is, as you said, the noise; I don't think Simpson's paradox has anything related to the noise.
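The point about denominators can be illustrated with a small batting-average sketch. The players and all hit and at-bat counts below are invented for illustration, and the helper names are assumptions; the only claim is that the reversal appears with unequal at-bats and disappears once every season gets the same denominator.

```python
from fractions import Fraction

# Hypothetical (hits, at_bats) per season; these numbers are made up.
seasons = {
    "A": [(4, 10), (25, 100)],
    "B": [(35, 100), (2, 10)],
}

def season_averages(records):
    """Batting average for each season separately."""
    return [Fraction(h, ab) for h, ab in records]

def overall_average(records):
    """Pooled batting average: total hits over total at-bats."""
    return Fraction(sum(h for h, _ in records), sum(ab for _, ab in records))

def rescale(records, at_bats):
    """Keep each season's average but give every season the same number of at-bats."""
    return [(Fraction(h, ab) * at_bats, at_bats) for h, ab in records]

a_wins_each_season = all(
    a > b for a, b in zip(season_averages(seasons["A"]), season_averages(seasons["B"]))
)
a_overall = overall_average(seasons["A"])
b_overall = overall_average(seasons["B"])

# Same per-season averages, but equal denominators (100 at-bats per season).
a_balanced = overall_average(rescale(seasons["A"], 100))
b_balanced = overall_average(rescale(seasons["B"], 100))

print("A ahead in every season:", a_wins_each_season)
print("pooled, unequal at-bats:", a_overall, "vs", b_overall)
print("pooled, equal at-bats:  ", a_balanced, "vs", b_balanced)
```

Here A leads in both seasons (4/10 vs 35/100, and 25/100 vs 2/10) yet trails when pooled (29/110 vs 37/110). With equal at-bats the pooled average is just the mean of the season averages, so A leads again (13/40 vs 11/40).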

Idea: my steel-man version of your argument is that reversal effects arise when you have inhomogeneous data, and this is definitely the more general problem common to both situations. In that case I agree. (This is how I teach this class at SPARC, at least.)

Comment author: Qiaochu_Yuan 19 February 2013 05:54:04AM 1 point [-]

Welcome! It's good to see you here.

Comment author: krzhang 19 February 2013 05:58:03AM 0 points [-]

Haha hey QC. Remind me sometime to learn the "get ridiculously high points in karma-based communities and learn a lot" metaskill from you... you seem to be off to a good start here too ;)

Comment author: krzhang 19 February 2013 05:33:23AM *  10 points [-]

I am Yan Zhang, a mathematics grad student specializing in combinatorics at MIT (and soon to work at UC Berkeley after graduation) and co-founder of Vivana.com. I was involved with building the first year of SPARC. There, I met many cool people at CFAR, for which I'm now a curriculum consultant.

I don't know much about LW but have liked some of the things I have read here; AnnaSalamon described me as a "street rationalist" because my own rationality principles are home-grown from a mix of other communities and hobbies. In that sense, I'm happy to set foot in this "mainstream dojo" and learn your language.

Recently Anna suggested I may want to cross-post something I wrote to LW and I've always wanted to get to know the community better, so this is the first step, I suppose. I look forward to learning from all of you.