To build intuition about content vs architecture in AI (a distinction that comes up a lot in discussions of AI takeoff involving Robin Hanson), I've been wondering about the relative sizes of content and architecture, where size is measured in number of bits.
Here's how I'm operationalizing content and architecture size for ML systems:
- content size: The number of bits required to store the learned model of the ML system (e.g. all the floating point numbers in a neural network).
- architecture size: The number of bits of source code. I'm not sure if it makes sense to include the source code of supporting software (e.g. standard machine learning libraries). A rough sketch of both measurements follows this list.
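To make this concrete, here's a minimal sketch of the two measurements, assuming a model stored as fixed-width floating-point parameters and a project whose source lives in .py files (all names and numbers below are made up for illustration):

```python
import os

def content_size_bits(num_parameters: int, bits_per_parameter: int = 32) -> int:
    """Bits to store a learned model, assuming each parameter is a
    fixed-width float (32-bit here; many deployed models use 16 or 8)."""
    return num_parameters * bits_per_parameter

def architecture_size_bits(source_dir: str) -> int:
    """Bits of source code: total size of all .py files under a project
    directory. Whether to also count supporting libraries is the open
    question mentioned above; this version excludes them."""
    total_bytes = 0
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            if name.endswith(".py"):
                total_bytes += os.path.getsize(os.path.join(root, name))
    return total_bytes * 8

# Made-up example: a 10-million-parameter network stored as 32-bit floats
# is 320 million bits (40 MB) of "content".
print(content_size_bits(10_000_000))  # 320000000
```

Even this sketch runs into the ambiguity above: `architecture_size_bits` counts only the project's own files, so the ML framework itself contributes nothing.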
I tried looking at the AlphaGo paper to see if I could find this kind of information, but after about 30 minutes I was unable to find what I wanted. I can't tell if this is because I'm not acquainted enough with the ML field to locate this information or because that information just isn't in the paper.
Is this information easily available for various ML systems? What is the fastest way to gather this information?
I'm also wondering about this same content vs architecture size split in humans. For humans, one way I'm thinking of it is as "amount of information encoded in inheritance mechanisms" vs "amount of information encoded in a typical adult human brain". I know that Eliezer Yudkowsky has cited 750 megabytes as the amount of information in human DNA, while emphasizing that most of this information is junk. That was in 2011, and I don't know if there's a newer consensus or how to factor in epigenetic information. There is also content stored in the genome, and I'm not sure how to separate the content from the architecture there.
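For what it's worth, the 750 megabyte figure follows from simple arithmetic (this is my reconstruction, not something from the 2011 source): the haploid human genome is roughly 3 billion base pairs, and each base pair is one of four nucleotides, i.e. 2 bits:

```python
# Where the ~750 MB figure for the human genome comes from:
# ~3 billion base pairs, each one of 4 nucleotides (2 bits per base pair).
base_pairs = 3e9
bits = base_pairs * 2          # 6e9 bits
megabytes = bits / 8 / 1e6
print(megabytes)               # 750.0 MB, before compressing out repeats
                               # or discounting junk DNA
```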
I'm pretty uncertain about whether this is even a good way to think about this topic, so I would also appreciate any feedback on this question itself. For example, if this isn't an interesting question to ask, I would like to know why.
I'm not entirely sure what I'm trying to learn here (which is part of what I was trying to express with the final paragraph of my question); this just seemed like a natural question to ask as I started thinking more about AI takeoff.
In "I Heart CYC", Robin Hanson writes: "So we need to explicitly code knowledge by hand until we have enough to build systems effective at asking questions, reading, and learning for themselves. Prior AI researchers were too comfortable starting every project over from scratch; they needed to join to create larger integrated knowledge bases."
It sounds like he expects early AGI systems to have lots of hand-coded knowledge, i.e. the minimum number of bits needed to specify a seed AI is large compared to what Eliezer Yudkowsky expects. (I wish people gave numbers for this so it's clear whether there really is a disagreement.) It also sounds like Robin Hanson expects progress in AI capabilities to come from piling on more hand-coded content.
If ML source code is small and isn't growing in size, that seems like evidence against Hanson's view.
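One rough way to check the "isn't growing" part would be to measure a codebase's size at successive releases. A sketch, assuming a git repository of Python files (the repo path and release tags here are hypothetical):

```python
import subprocess

def loc_at_revision(repo_path: str, revision: str) -> int:
    """Total lines across all .py files at a given git revision."""
    files = subprocess.run(
        ["git", "-C", repo_path, "ls-tree", "-r", "--name-only", revision],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    total = 0
    for f in files:
        if f.endswith(".py"):
            blob = subprocess.run(
                ["git", "-C", repo_path, "show", f"{revision}:{f}"],
                capture_output=True, text=True, check=True,
            ).stdout
            total += blob.count("\n")
    return total

# Hypothetical release tags; a flat or shrinking trend would be the
# evidence against Hanson's view described above.
for tag in ["v1.0", "v2.0", "v3.0"]:
    print(tag, loc_at_revision("path/to/ml-repo", tag))
```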
If ML source code is much smaller than the human genome, I can do a better job of visualizing the kind of AI development trajectory that Robin Hanson expects, where we stick in a bunch of content and share content among AI systems. If ML source code is already quite large, then it's harder for me to visualize this (in this case, it seems like we don't know what we're doing, and progress will come from better understanding).
If the human genome is small, I think that makes a discontinuity in capabilities more likely. When I try to visualize where progress comes from in this case, it seems like it would come from a small number of insights. Consider some extreme cases: if we knew that the code for a seed AGI could fit in a 500-line Python program (I don't know if anybody expects this), a FOOM seems more likely, since there's just less surface area for making lots of small improvements. Whereas if I knew that the smallest program for a seed AGI required gigabytes of source code, I would expect progress to come in smaller pieces.
I'm not sure. The content/architecture split doesn't seem clean to me, and I haven't seen anyone give a clear definition. Specialized data structures seem like a good example of something in between.