
CellBioGuy comments on Bayesianism for Humans - Less Wrong

52 Post author: ChrisHallquist 29 October 2013 11:54PM




Comment author: CellBioGuy 01 November 2013 10:44:12AM * 12 points

A large gene is a large target for spontaneous mutation. Most people with the disease did not inherit it but instead had something inside the large gene go wrong between their parents and them. For the spontaneous mutations, you likely have never seen that particular difference before.

You also have no idea where in the gene the problem could be, so you just need to sequence the whole thing. With current sequencing technology you basically need to either throw the entire genome into an Illumina sequencer for many thousands of dollars, or run a number of small custom Sanger sequencing reactions, each of which reads out about 600-800 specific base pairs. Individually these are not that expensive or difficult, but they add up when you need to tile them over a large area. Seeing as the gene is 350 kilobases, in this case it adds up both in cost and in the amount of source DNA you need.
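As a rough illustration of how the tiling adds up, here is a back-of-the-envelope sketch. It assumes reads are laid end to end with an optional overlap between neighbours. The function name and the overlap parameter are made up for this example, and it ignores primer design and failed reactions, so treat the counts as lower bounds.

```python
import math

def sanger_reads_needed(region_bp, read_bp, overlap_bp=0):
    """Count reads of length read_bp needed to tile region_bp.

    Hypothetical helper: adjacent reads share overlap_bp bases, so each
    read after the first contributes (read_bp - overlap_bp) new bases.
    """
    if read_bp <= overlap_bp:
        raise ValueError("read length must exceed the overlap")
    if region_bp <= read_bp:
        return 1
    step = read_bp - overlap_bp
    return 1 + math.ceil((region_bp - read_bp) / step)

GENE_BP = 350_000  # the 350 kilobase gene mentioned above

# Read lengths at the two ends of the 600-800 bp range quoted above.
for read_bp in (600, 800):
    print(read_bp, sanger_reads_needed(GENE_BP, read_bp))
```

Even with no overlap at all, that comes to roughly 440 to 580 separate reactions for this one gene, and any realistic overlap pushes the count higher.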

SNPs are only useful when one or a few ancestral mutant alleles have spread through the population. Then you can look either for one known causative change or for a nearby unique SNP that gets dragged along for the ride with the disease allele because it is quite close to it.

EDIT to clear up some questions from a few layers up in the chain: These days looking for known, relatively common genetic variants is very easy, as the success of 23andMe illustrates. These tests use microarrays to look for SNPs. This process does not involve sequencing; it only tests the sequence similarity (via binding affinity) of a sample to a set of short reference strands. To identify a particular allele with this technique, though, it needs to have been detected in previous work. The only way to confidently figure out rare or unique variants is to outright sequence, and that gets expensive for regions larger than a few kilobases. And hilariously enough, because of the multiple forms of sequencing technology that exist, if you need to sequence an area larger than a megabase or two it becomes cheaper to just sequence the entire genome.
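A toy sketch of why an array can only find what it already knows about. Exact string matching stands in for binding affinity here, which is a big simplification, and the probe sequences and names are invented for illustration rather than taken from any real array.

```python
# Hypothetical probe panel: each previously catalogued allele is
# represented by one short reference strand. Sequences are made up.
KNOWN_PROBES = {
    "reference": "ACGTTGCA",
    "known_variant": "ACGTAGCA",
}

def array_call(sample_fragment):
    """Return the names of the probes that the fragment matches.

    Exact equality is a crude stand-in for hybridization to a probe.
    """
    return [name for name, probe in KNOWN_PROBES.items()
            if probe == sample_fragment]

print(array_call("ACGTAGCA"))  # a variant that is already on the panel
print(array_call("ACGGTGCA"))  # a novel change that no probe represents
```

For the novel fragment the result is empty: the panel can say that nothing matched, but not what the change actually is. Finding that out means reading the sequence itself.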