Comment author: Scott_Aaronson 04 November 2007 09:04:19PM 0 points

OK, I came up with a concrete problem whose solution would (I think) tell us something about whether Eliezer's argument can actually work, assuming a certain stylized set of primitive operations available to DNA: insertion, deletion, copying, and reversal. See here if you're interested.
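For readers who want to play with the stylized model, here is a minimal sketch of the four primitives acting on a string over the alphabet A, C, G, T. The function names and the choice of plain Python strings are assumptions made for illustration only, and "reversal" here simply reverses the characters (it does not take the reverse complement, which a more biological model might).

```python
def insert_fragment(seq, pos, fragment):
    """Insert `fragment` into `seq` before index `pos`."""
    return seq[:pos] + fragment + seq[pos:]


def delete_segment(seq, start, end):
    """Delete the half-open range [start, end) from `seq`."""
    return seq[:start] + seq[end:]


def copy_segment(seq, start, end, dest):
    """Copy seq[start:end] and insert the copy before index `dest`."""
    return insert_fragment(seq, dest, seq[start:end])


def reverse_segment(seq, start, end):
    """Reverse the characters in the half-open range [start, end)."""
    return seq[:start] + seq[start:end][::-1] + seq[end:]


if __name__ == "__main__":
    genome = "ACGT"
    genome = copy_segment(genome, 0, 2, len(genome))   # duplicate the first two bases at the end
    genome = reverse_segment(genome, 1, 3)             # flip a short interior stretch
    print(genome)
```

Any sequence of such calls is a "program" that produces a genome, which is the sense in which the question is about how short such a program can be.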

Comment author: Scott_Aaronson 04 November 2007 07:53:30PM 2 points

Eliezer, so long as an organism's fitness depends on interactions between many different base pairs, the effect can be as if some of the base pairs are logically determined by others.

Also, unless I'm mistaken, there are some logical operations that the genome can perform: copying, transpositions, reversals...

To illustrate, suppose (as apparently happens) a particular DNA stretch occurs over and over with variations: sometimes forwards and sometimes backwards, sometimes with 10% of the base pairs changed, sometimes chopped in half and sometimes appended to another stretch, etc. Should we then count all but one of these occurrences as "junk"? Of course, we could measure the number of bits using our knowledge of how the sequence actually arose ("first stretch X was copied Y times, then the copies were modified as follows..."). But the more such knowledge we bring in, the further we get from the biologist's concept of information and the closer to the Platonic mathematical concept.
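To make the two ways of counting concrete, here is a rough sketch under loudly stated assumptions: each base costs 2 bits when written out naively, and a "history" description pays 2 bits per base for stretch X once, then 1 orientation bit per copy plus about log2(len(X)) + log2(3) bits per changed base. The names `make_copy`, `naive_bits`, and `history_bits` are hypothetical, and the accounting ignores bookkeeping such as delimiting the records, so treat the numbers as illustrative only.

```python
import math
import random

BASES = "ACGT"


def make_copy(stretch, reverse, mutation_rate, rng):
    """Return one variant of `stretch` and the number of bases that were changed."""
    seq = stretch[::-1] if reverse else stretch
    out = []
    changed = 0
    for base in seq:
        if rng.random() < mutation_rate:
            # Replace the base with one of the three other bases.
            out.append(rng.choice([b for b in BASES if b != base]))
            changed += 1
        else:
            out.append(base)
    return "".join(out), changed


def naive_bits(genome):
    """Count every base at face value: 2 bits each."""
    return 2 * len(genome)


def history_bits(stretch_len, changes_per_copy):
    """Count bits using knowledge of how the sequence arose (assumed encoding)."""
    per_change = math.log2(stretch_len) + math.log2(3)  # position + replacement base
    total = 2 * stretch_len                             # stretch X, written once
    for changed in changes_per_copy:
        total += 1 + changed * per_change               # orientation bit + edits
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    stretch = "".join(rng.choice(BASES) for _ in range(200))
    copies = [make_copy(stretch, i % 2 == 1, 0.10, rng) for i in range(20)]
    genome = "".join(seq for seq, _ in copies)
    print(naive_bits(genome), round(history_bits(len(stretch), [c for _, c in copies])))
```

The gap between the two printed numbers is the point at issue: the second count is only available to someone who already knows the copying history.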

Comment author: Scott_Aaronson 04 November 2007 06:51:05PM 4 points

Eliezer, your argument seems to confuse two different senses of information. You first define "bit" as "the ability to eliminate half the possibilities" -- in which case, yes, if every organism has O(1) children then the logical "speed limit on evolution" is O(1) bits per generation.
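As a small worked example of that first sense of "bit", here is one way to read the arithmetic, assuming (my simplification, not necessarily Eliezer's exact model) that selection keeps some number of survivors out of a brood and that the information gained is the base-2 log of the ratio. The function name is hypothetical.

```python
import math


def selection_bits(children, survivors):
    """Upper bound on bits gained by keeping `survivors` out of `children`."""
    if not 0 < survivors <= children:
        raise ValueError("need 0 < survivors <= children")
    return math.log2(children / survivors)


if __name__ == "__main__":
    # Keeping 1 of 2 eliminates half the possibilities: exactly one bit.
    print(selection_bits(2, 1))
    # A constant brood size gives a constant number of bits per generation.
    print(selection_bits(8, 1))
```

Under this reading, a bounded number of children per organism caps the bits per generation at a constant, which is the O(1) claim.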

But you then conclude that "the meaningful DNA specifying a human must fit into at most 25 megabytes" -- and more concretely, that "it is an excellent bet that nearly all the DNA which appears to be junk, really is junk." I don't think that follows at all.

The underlying question here seems to be this: suppose you're writing a software application, and as you proceed, many bits of code are generated at random, many bits are logically determined by previous bits (albeit in a more-or-less "mindless" way), and at most K times you have the chance to fix a bit as you wish. (Bits can also be deleted as you go.) Should we then say that whatever application you end up with can have at most K bits of "meaningful information"?
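Here is a toy simulation of that process, offered as a sketch rather than a faithful model. The assumptions are all mine: each new bit is equally likely to be random, derived, or chosen; a "derived" bit is the XOR of two earlier bits; a "chosen" bit is simply set to 1; and deletion is left out for brevity. The name `build_code` is hypothetical. The sketch tracks which bits depend, directly or transitively, on one of the at most K chosen bits.

```python
import random


def build_code(n_bits, k, rng):
    """Grow a bit string; return the bits, each bit's kind, and which bits
    depend (transitively) on a chosen bit."""
    bits, kinds, influenced = [], [], []
    chosen_left = k
    for i in range(n_bits):
        kind = rng.choice(["random", "derived", "chosen"])
        if kind == "chosen" and chosen_left == 0:
            kind = "random"          # the budget of K choices is used up
        if kind == "derived" and i < 2:
            kind = "random"          # need two earlier bits to derive from
        if kind == "random":
            bits.append(rng.randint(0, 1))
            influenced.append(False)
        elif kind == "chosen":
            bits.append(1)           # stand-in for "fix a bit as you wish"
            influenced.append(True)
            chosen_left -= 1
        else:
            a, b = rng.sample(range(i), 2)
            bits.append(bits[a] ^ bits[b])
            influenced.append(influenced[a] or influenced[b])
        kinds.append(kind)
    return bits, kinds, influenced


if __name__ == "__main__":
    bits, kinds, influenced = build_code(1000, 10, random.Random(0))
    print("chosen bits:", kinds.count("chosen"))
    print("bits depending on a chosen bit:", sum(influenced))
```

Comparing the two printed counts gives a feel for how far the footprint of K choices can spread through the rest of the code.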

Arguably yes, from some God's-eye view. But any mortal examining the code could see far more than K of the bits fulfilling a "functional role" -- indeed, possibly even all of them. The reason is that the web of logical dependencies, by which the K "chosen" bits interacted with the random bits to produce the code we see, could in general be too complicated ever to work out within the lifetime of the universe. And crucially, when biologists talk about how many base pairs are "coding" and how many are "non-coding", it's clearly the pragmatic sense of "meaningful information" they have in mind rather than the Platonic one.

Indeed, it's not even clear that God could produce a ~K-bit string from which the final application could be reliably reconstructed. The reason is that the application also depends on random bits, of which there are many more than K. Without assuming some conjecture about pseudorandom number generators, it seems the most God could do would be to give us a function mapping the random bits to K bits, such that by applying that function we'd end up most of the time with an application that did more-or-less the same thing. (This actually leads to some interesting CS questions, but I'll spare you for now! :) )

To say something more concrete, without knowing much more than I do about biology, I wouldn't venture a guess as to how much of the "junk DNA" is really junk. The analogy I prefer is the following: if I printed out the MS Word executable file, almost all of it would look like garbage to me, with only a few "coding regions" here and there ("It looks like you're writing a letter. Would you like help?"). But while the remaining bits might indeed be garbage in some sense, they're clearly not in the sense a biologist would mean.

Comment author: Scott_Aaronson 22 August 2007 01:03:37PM 0 points

Bravo.

Comment author: Scott_Aaronson 08 August 2007 05:14:09PM 5 points

Eliezer: Excellent post, but I wonder if what you're saying is related to impressionist painting or the French Revolution? :-)

Seriously, I constantly meet people who ask me questions like: "could quantum algorithms have implications for biomechanical systems?" Or "could neural nets provide insight to the P versus NP problem?" And I struggle to get across to them what you've articulated so clearly here: that part of being a successful researcher is figuring out what isn't related to what else.
