timtyler comments on Information theory and FOOM - Less Wrong
In the Levitt paper, 64% is the fraction of single-domain architecture (SDA) protein families that are found in at least two of the three groups: viruses, prokaryotes, and eukaryotes (figure 3). I use this as a (very close) approximation for the fraction of families in eukaryotes or prokaryotes that are found in both eukaryotes and prokaryotes, which isn't reported. 84% is computed from that information, plus the caption of figure 3 saying that prokaryotes contain 88% of SDA families. 73% is computed from all of that information.
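The comment doesn't spell out the arithmetic, so the following is only one reading that happens to reproduce both figures. It assumes viruses can be ignored, that 64% and 88% are both fractions of the union of prokaryote and eukaryote families, and that 84% and 73% are the shared fractions of eukaryote and prokaryote families respectively. The function name `shared_fractions` and its parameters are made up for illustration.

```python
def shared_fractions(shared_of_union: float, prok_of_union: float):
    """Return (shared fraction of eukaryote families,
               shared fraction of prokaryote families).

    Assumption: families are counted over the union of prokaryotes and
    eukaryotes, normalized to 1, so inclusion-exclusion gives
        prok + euk - shared = 1.
    """
    euk_of_union = 1.0 - prok_of_union + shared_of_union
    return shared_of_union / euk_of_union, shared_of_union / prok_of_union


# 0.64 = families in both groups, 0.88 = families in prokaryotes (fig. 3 caption)
euk_shared, prok_shared = shared_fractions(0.64, 0.88)
print(f"{euk_shared:.0%} of eukaryote families also occur in prokaryotes")
print(f"{prok_shared:.0%} of prokaryote families also occur in eukaryotes")
```

Under those assumptions eukaryotes hold 76% of the families, and the two ratios round to 84% and 73%.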
There is no bias towards discovering genes shared with eukaryotes in ordinary sequencing. We sequence complete genomes. Almost all of the bacterial genes known come from these whole-genome projects. We've sequenced many more bacteria than eukaryotes. Bacterial genomes don't contain much repetitive intergenic DNA, so you get nice complete genome assemblies.
Life starting 3.7 billion years ago - could be. Google's top ten results show claims ranging from 2.7 GY to 4.4 GY ago. Adding that 0.7 billion years could make the information-growth curve more linear, and remove one exponentiation from my analysis.
Let's just say I'm measuring the information in DNA. Information in "the diversity of life" is too vague. I don't want to measure any information that an organism or an ecosystem gains from the environment by expressing those genetic codes.
I too was talking about information in DNA. The number of species influences the quantity of information present in the DNA of an ecosystem - just as rolling a die 100 times supplies more information than rolling it once.
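To put a number on the die analogy, here is a minimal sketch assuming a fair six-sided die and independent rolls, so each roll carries log2(6) bits and the totals simply add. The function name `bits_from_rolls` is made up for illustration, and real genomes are of course not independent draws, since related species share most of their sequence.

```python
import math


def bits_from_rolls(n_rolls: int, sides: int = 6) -> float:
    """Shannon information, in bits, of n independent rolls of a fair die."""
    # Each fair roll has `sides` equally likely outcomes: log2(sides) bits.
    # Independence means the information of the rolls adds up.
    return n_rolls * math.log2(sides)


print(f"1 roll:    {bits_from_rolls(1):.2f} bits")
print(f"100 rolls: {bits_from_rolls(100):.2f} bits")
```

One roll gives about 2.58 bits and 100 rolls about 258.50 bits. Any correlation between rolls (or between genomes) would push the total below this upper bound.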