Douglas_Knight comments on Whole genome sequencing vs SNP genotyping - Less Wrong Discussion

5 Post author: harcisis 11 June 2015 10:09PM

Comment author: Douglas_Knight 12 June 2015 05:07:42PM * 3 points

I agree with your general point, but here is a technical comment: 23andMe tests the million most common SNPs, but that is not the same as the million most common variants, because not all variation takes the form of a SNP. SNP stands for "single nucleotide polymorphism": one letter is changed while the surrounding context is unchanged. That unchanged context makes SNPs easy to detect, and that ease of detection is why they are used.

Another kind of variation is an insertion or a deletion. These are harder to detect, which is why 23andMe only detects three of them, the ones in the BRCA1 and BRCA2 genes that are common among Ashkenazi Jews. It does not attempt to detect even the ones that are equally common among the Dutch. Insertions and deletions are easy to detect with whole genome sequencing, and they are valuable to detect because they are fairly easy to interpret: when one shifts the reading frame, every codon after it is misread and the whole protein is ruined. What the protein does and what you can do about it are harder problems, but it's not like finding a new SNP, where it probably means nothing.
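To make the "whole protein is ruined" point concrete, here is a minimal sketch comparing a single-letter substitution with a single-letter deletion. The sequence is a made-up toy, not BRCA or any real gene, and `translate` is a hypothetical helper written for this illustration using the standard genetic code.

```python
# Toy illustration only: the sequence below is invented, not a real gene.
BASES = "TCAG"
# Standard genetic code, listed in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


reference = "ATGGCTGAAAAGCTGTACTAA"
# SNP: one letter changes (G -> A at index 3), the context stays put.
snp = reference[:3] + "A" + reference[4:]
# Deletion: one letter removed (index 4), so every later codon shifts.
deletion = reference[:4] + reference[5:]

print(translate(reference))  # MAEKLY
print(translate(snp))        # MTEKLY  (one amino acid differs)
print(translate(deletion))   # MVKSCT  (everything after M differs)
```

The substitution changes one amino acid, while the deletion scrambles everything downstream of it. Insertions or deletions whose length is a multiple of three do not shift the frame, so this sketch covers only the frameshift case.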

A third kind of variation is copy number variation, where there is a repetitive section of the DNA and the number of repeats varies from person to person. But whole genome sequencing today is bad at such regions, at least if the number of repeats is large. A lot of people think that copy number variants are important, but the fact that they are hard to measure makes that hard to assess at this time.
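One way to see why large repeat counts are hard is a toy model of short-read sequencing, sketched below. Everything in it is an assumption made for illustration: the flanking sequences, the "CAG" repeat unit, the read lengths, and the simplification that we only look at which distinct reads appear. Real pipelines also use read depth and paired-end information, which this ignores.

```python
# Toy model: a "genome" is a left flank, a repeated unit, and a right flank.
# All sequences and lengths here are invented for illustration.
LEFT = "ACGTTGCATGCCTAGGATCCTTAAGGCTAT"
RIGHT = "TTGACCGGTAACGTATCGGATCCATGCAAT"
UNIT = "CAG"


def genome(n_repeats):
    """Build a toy genome with the given number of repeat units."""
    return LEFT + UNIT * n_repeats + RIGHT


def reads(sequence, read_len):
    """Return the set of distinct error-free reads of a fixed length."""
    return {
        sequence[i:i + read_len]
        for i in range(len(sequence) - read_len + 1)
    }


# Reads shorter than the repeat region never span it from flank to flank,
# so 10 and 12 repeats yield exactly the same set of distinct reads.
print(reads(genome(10), 20) == reads(genome(12), 20))  # True

# Reads long enough to span the whole region do tell the two apart.
print(reads(genome(10), 40) == reads(genome(12), 40))  # False
```

In this model the 20-letter reads cannot distinguish 10 repeats from 12, because no read reaches from one flank to the other. Once a read is long enough to cover the entire repeat plus some flank on both sides, the count becomes visible.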