ChristianKl comments on Open thread, December 7-13, 2015 - Less Wrong Discussion

3 Post author: polymathwannabe 07 December 2015 02:47PM


Comment author: ChristianKl 09 December 2015 11:49:39PM 2 points

The linked paper says:

"We fitted a linear mixed model y = µ + g + e, where y is the phenotype, µ is the mean term, g is the aggregate additive genetic effect of all the SNPs and e is the residual effect."

If you have 3511 individuals and 549692 SNPs, you won't find any nonlinear effects. A model fitted to 3511 observations of 549692 SNPs is already overfitted; one fitted to 3511 observations of 549692 * 549691 gene interactions is even more overfitted, and I wouldn't expect the four principal components they calculate to find an existing needle in that haystack.
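To put the numbers in this comment side by side, here is a minimal sketch of the counting argument. It only does arithmetic on the two figures quoted above; the variable names are made up for illustration. Note that 549692 * 549691 counts ordered pairs, so the number of distinct pairwise interactions is half of that.

```python
from math import comb

n_individuals = 3511   # sample size quoted in the comment
n_snps = 549692        # number of SNPs quoted in the comment

# Distinct (unordered) pairs of SNPs: n * (n - 1) / 2
unordered_pairs = comb(n_snps, 2)

# The comment's product 549692 * 549691 counts ordered pairs, twice as many
ordered_pairs = n_snps * (n_snps - 1)

# Candidate predictors available per observation in each setting
snps_per_individual = n_snps / n_individuals
pairs_per_individual = unordered_pairs / n_individuals

print(f"SNPs per individual:  {snps_per_individual:.1f}")
print(f"pairs per individual: {pairs_per_individual:.3e}")
```

With single SNPs there are already more than 150 candidate predictors per individual; with pairwise interactions there are tens of millions per individual.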

Apart from that it's worth noting that IQ is g fitted to a bell curve. You wouldn't expect a variable that you fit to a bell curve to behave fully linearly.
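As a rough illustration of what "fitted to a bell curve" can mean, the sketch below maps raw scores onto a normal scale by rank. This is an assumption made for illustration, not the norming procedure of any actual IQ test; the function name `to_iq_scale`, the mean of 100, the standard deviation of 15 and the sample scores are all hypothetical.

```python
from statistics import NormalDist

def to_iq_scale(raw_scores, mean=100.0, sd=15.0):
    """Map raw scores onto a normal scale by rank (illustrative only).

    Ties are not handled specially: tied scores get adjacent ranks.
    """
    n = len(raw_scores)
    # Indices of the scores, sorted from lowest to highest raw score
    order = sorted(range(n), key=lambda i: raw_scores[i])
    target = NormalDist(mean, sd)
    scaled = [0.0] * n
    for rank, i in enumerate(order):
        # (rank + 0.5) / n keeps the percentile strictly inside (0, 1)
        scaled[i] = target.inv_cdf((rank + 0.5) / n)
    return scaled

raw = [1, 2, 4, 8, 16, 32, 64]   # hypothetical, heavily skewed raw scores
iq = to_iq_scale(raw)
for r, s in zip(raw, iq):
    print(r, round(s, 1))
```

The mapping preserves order but not distances: the raw gaps double at every step, while the scaled scores are symmetric around the mean. Whether such a transform makes a trait behave more or less linearly depends on the distribution of the underlying quantity.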

Comment author: Douglas_Knight 10 December 2015 12:36:45AM -1 points

No, they didn't try to measure non-linear effects. Nor did they try to measure environment. That is all irrelevant to measuring linear effects, which was the main thing I wanted to convey. If you want to understand this, the key phrase is "narrow sense heritability." Try a textbook. Hell, try wikipedia.
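For reference, a sketch of what "narrow sense heritability" means under the model quoted upthread, y = µ + g + e. This is the standard textbook definition, with the added assumption that g and e are uncorrelated; the symbols \sigma_g^2 and \sigma_e^2 are introduced here and do not appear in the thread.

```latex
\[
\operatorname{Var}(y) = \sigma_g^2 + \sigma_e^2,
\qquad
h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}
\]
```

Here \sigma_g^2 is the variance of the additive genetic term only. Non-additive genetic variance (dominance, epistasis) is by definition not counted in h^2 and ends up in the residual.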

That it did well on held-back data should convince you that you don't understand overfitting.

Actually, I would expect a bell curve transformation to be the most linear.

Comment author: ChristianKl 10 December 2015 01:27:49AM 1 point

"That it did well on held-back data should convince you that you don't understand overfitting."

They didn't do well on the gene level: "Analyses of individual SNPs and genes did not result in any replicable genome-wide significant association".

"No, they didn't try to measure non-linear effects. Nor did they try to measure environment. That is all irrelevant to measuring linear effects, which was the main thing I wanted to convey."

No, the fact that you can fit a linear model whose h^2 estimate accounts for 0.4 or 0.5 of the variance doesn't mean that the underlying reality is structured in a way that genes have linear effects.

To make a causal statement that genes work in a linear way, a summary statistic is not enough.
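A toy calculation can make the last point concrete. The sketch below is not the paper's analysis: it assumes two hypothetical independent loci, each present with frequency 1/2, and a phenotype that is purely their interaction (y = x1 * x2, with no additive term and no noise). It then computes exactly how much of the phenotypic variance the best additive linear fit accounts for.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 2)  # assumed allele frequency at both hypothetical loci

# Enumerate the four genotype combinations with their probabilities.
# Toy phenotype: y = x1 * x2, purely an interaction, no additive term.
cells = []
for x1, x2 in product((0, 1), repeat=2):
    prob = (p if x1 else 1 - p) * (p if x2 else 1 - p)
    cells.append((prob, x1, x2, x1 * x2))

def expect(f):
    """Exact expectation of f(x1, x2, y) over the four cells."""
    return sum(prob * f(x1, x2, y) for prob, x1, x2, y in cells)

mean_y = expect(lambda x1, x2, y: y)
var_y = expect(lambda x1, x2, y: (y - mean_y) ** 2)
mean_x = expect(lambda x1, x2, y: x1)
var_x = expect(lambda x1, x2, y: (x1 - mean_x) ** 2)
cov_xy = expect(lambda x1, x2, y: (x1 - mean_x) * (y - mean_y))

# x1 and x2 are independent and symmetric, so the best linear fit uses
# the same slope for both and their explained variances simply add.
beta = cov_xy / var_x
additive_var = 2 * beta ** 2 * var_x
share_linear = additive_var / var_y

print(share_linear)
```

Under these assumptions the additive fit accounts for two thirds of the variance even though the generating mechanism contains no additive effect at all. The example only shows that a high share of linearly explained variance is compatible with a non-linear mechanism; it says nothing about which mechanism holds for the trait discussed in the thread.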