It looks like this was the straw that broke my back - my first post ever on this site after lurking occasionally for upwards of 1.5 years. The explicit plea combined with a number of cringe-inducing misunderstandings of genetics / molecular biology in a bunch of previous posts finally got to me (I'm a grad student studying basic eukaryotic cell biology).
Here's my thirty-thousand-foot overall take on the matter: you cannot in good conscience treat genetic information like a computer program, and this is where most misunderstandings and problematic logical leaps occur.
It is true that protein-coding gene expression is a process that kind of resembles an algorithm. You have pieces of the DNA, roughly analogous to long-term storage in this context, getting transcribed under particular circumstances into an RNA copy, roughly analogous to memory. Then you have a subset of that RNA that gets 'read' in three-base chunks with particular meanings: START-alanine-tryptophan-asparagine-glycine...arginine-STOP. The proteins made this way fold up and do whatever they do.
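To make the 'reading' step concrete, here is a toy sketch in Python. It covers only the part that really does look like an algorithm: find a start codon, then walk the RNA three bases at a time until a stop codon. The table is a small subset of the standard genetic code, and the message string and the function name translate are made up for illustration. It deliberately leaves out everything that decides whether the RNA exists and gets read at all.

```python
# Toy subset of the standard genetic code (RNA codons).
# Only the codons used in the example message are listed.
CODON_TABLE = {
    "AUG": "Met",   # methionine; also the usual START signal
    "GCU": "Ala",   # alanine
    "UGG": "Trp",   # tryptophan
    "AAU": "Asn",   # asparagine
    "GGU": "Gly",   # glycine
    "CGU": "Arg",   # arginine
    "UAA": "STOP",
}


def translate(mrna):
    """Read an RNA string in three-base chunks from the first AUG to a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []  # no start codon, so nothing is made
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE.get(mrna[i:i + 3], "???")  # "???" = codon not in this toy table
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide


# Hypothetical message: two leading bases, a short reading frame, two trailing bases.
message = "GGAUGGCUUGGAAUGGUCGUUAAGC"
print("-".join(translate(message)))
```

Note how little of the cell is in there: no polymerase, no ribosome, no folding, no half-life. That omission is the point of the four problems below.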
There are four big problems with thinking from this approach, though. The first is that coding for proteins is not all that DNA and other nucleic acids do, by a long shot. There are genes that never make a protein but make functional RNAs that tether things together. Others make regulatory RNAs that go on to affect gene expression. Other DNA that never gets 'read' by anything binds proteins and other complexes for all kinds of purposes I will get into below. Still other DNA consists of selfish replicating elements that exist in vast quantities.
The second is that DNA and RNA are not just information, a string of bits. They are physical objects that are moving around at dozens of meters per second, hitting other molecules, and these physical interactions are what drive their activity. It is not logical operations being performed on a bit string, it is chemical reactions and catalysis in actual three dimensional space. DNA is full of functional elements that have nothing to do with coding for a protein, from promoter elements that have the right charge structure (a function of sequence yes, but decidedly a physical attribute) to stick to the transcription factors and polymerases needed to pry apart the strands and synthesize RNA, to attachment points for fibers that pull freshly replicated DNA into daughter cells, to areas of loose base pairing needed to first pry the strands apart and begin replication, to extra binding sites for transcription factors away from genes which soak up extra molecules and keep them inactive. RNA molecules also have widely varying stability and half-lives before breaking down, again dependent upon their shape and sequence in ways that are often the opposite of straightforward. This is all very physical and depends on the interaction of the shape and charge of the nucleic acid molecules and the rest of the contents of the cell.
The third is that all this information content is completely context-dependent. Yes, almost everything on Earth has compatible genetic codes (animal mitochondria being a notable exception, incidentally - there is very nearly nothing in biology that doesn't have some exception somewhere in the slew of diversity that exists). But if you put an entire yeast genome straight into a human cell, it wouldn't be active and would make no protein* - not even the selfish parasitic elements mooching along inside it would turn on. The presence of a gene reading frame is useless without the correct functional DNA elements next to it that, when colliding with the correct proteins and complexes, are able to properly stick to all the pre-existing machinery that is required to catalyze the production of other molecules from that template. While in any one branch of life the DNA elements and catalytic machinery have evolved together, they drift over evolutionary time. To make matters even worse, it appears that the genetic code itself is pretty arbitrary. There's nothing chemical in the structure of a gene to tie it to a particular amino acid sequence, other than the code itself which is entirely mediated by proteins which tie together free amino acids and transfer-RNAs and that are themselves made by genes according to the code. The code does not exist without the proteins.
EDIT: I have to amend this. I went looking through the literature and while I could not find reference to normal human promoters working in yeast or vice versa I did find references to a very strong promoter from a human mononucleosis-causing virus that in bread yeast produces detectable but biologically insignificant amounts of protein, and works a bit better in a second 'fission yeast' species. Looks like it is sometimes possible for human and yeast promoters to cross-talk, but it is rare and insignificant compared to normal expression.
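Going back to the genetic code itself and the mitochondrial exception mentioned above, here is a small sketch of what 'the code does not exist without the proteins' means in practice. The same RNA string is read against two lookup tables: a few entries from the standard code, and the same entries as they are read in vertebrate mitochondria, where UGA means tryptophan instead of STOP, AUA means methionine instead of isoleucine, and AGA means STOP instead of arginine. The message and the function name read_frame are made up for illustration, and the tables are tiny subsets.

```python
# A few entries of the standard genetic code (RNA codons).
STANDARD = {
    "AUG": "Met",
    "AUA": "Ile",
    "UGA": "STOP",
    "AGA": "Arg",
    "GCU": "Ala",
}

# The same codons as read in vertebrate mitochondria: three meanings differ.
VERT_MITO = dict(STANDARD, AUA="Met", UGA="Trp", AGA="STOP")


def read_frame(mrna, table):
    """Read three-base chunks from position 0 until the table says STOP."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        residue = table[mrna[i:i + 3]]
        if residue == "STOP":
            break
        residues.append(residue)
    return residues


# Hypothetical message: AUG AUA UGA GCU AGA
message = "AUGAUAUGAGCUAGA"
print("standard:     ", "-".join(read_frame(message, STANDARD)))
print("mitochondrial:", "-".join(read_frame(message, VERT_MITO)))
```

Same string, different product, and the only thing that changed is the table. In a cell that table is not written down anywhere as a table. It is the set of proteins and transfer-RNAs that happen to be present.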
Fourth, there are features of organisms that have nothing to do with their genomes. Nothing in its genome tells a Gram-negative bacterium to have two nested cell membranes. Instead, its particular complement of proteins allows that second membrane to be perpetuated and split along with the rest of the cell when it divides. Nothing in a human cell (we think) tells it that its internal membrane system should have particular proteins in it - instead the functional internal membrane system pulls particular freshly-synthesized proteins with particular features into itself as they are made. The more widely-thought-of epigenetic state is another example of this.
All together, this makes me extremely wary of any attempt to even talk about the 'complexity' of an organism based only on its genome. The size of the genome puts an upper bound on some sorts of complexity - the number of proteins that an organism can make, for example. But physical interactions, the presence of non-DNA molecules, and the previous shape/state of the organism are integral, carry vast quantities of information, and are the context that makes the DNA represent information in the first place rather than just being an unstable polymer.
I would say the main difference is that computer systems work to embody the same bit string in widely varying substrates and perform the same logical operations on it. It doesn't matter if a program is stored on magnetic domains in a tape drive and executed in vacuum tubes, or if it is stored in electrons trapped in flash memory and executed in a 22 nanometer process CPU, the end result of a given set of logical operations is the same. In biology though there really isn't a message or program you can abstract away from the molecules bouncing around, there is only one level of abstraction. You cannot separate 'hardware' and 'software'.
Assuming "bit string" means "machine code", this isn't true. The same machine code will not result in the same logical operations being performed on all computers. It may not correspond to any logical operations at all on other computers. And what logical operations are carried out depends entirely on "the molecules bouncing around" in the computer. You aren't making DNA sound different from machine code at all.