Size of the smallest recursively self-improving AI?

4 Post author: alexflint 30 March 2011 11:31PM

For no reason in particular I'm wondering about the size of the smallest program that would constitute a starting point of a recursively self-improving AI.

The analysis of FOOM as a self-amplifying process would seem to indicate that in principle one could get it started from a relatively modest starting point -- perhaps just a few bytes of the right code could begin the process. Or could it? I wonder whether any other considerations give tighter lower-bounds.

One consideration is that FOOM hasn't already happened -- at least not here on Earth. If the smallest FOOM seed were very small (like a few hundred bytes) then we would expect evolution to have already bumped into it at some point. Although evolution is under no specific pressure to produce a FOOM, it has probably produced over the last few billion years all the interesting computations up to some minor level of complexity, and if there were a FOOM seed among those then we would see the results about us.

Then there is the more speculative analysis of what minimal expertise the algorithm constituting the FOOM seed would actually need.

Then there is the fact that any algorithm that naively enumerates some space of algorithms qualifies in some sense as a FOOM seed as it will eventually hit on some recursively self-improving AI. But that could take gigayears so is really not FOOM in the usual sense.
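To make the enumeration idea concrete, here is a minimal sketch in Python. It lists every finite bit string in order of increasing length, so any finite program, including a hypothetical seed, shows up eventually. The names `enumerate_programs` and `index_of` are made up for illustration, and treating a bit string as "a program" is an assumption; nothing here actually runs the candidates.

```python
from itertools import count, product


def enumerate_programs():
    """Yield every finite bit string, shortest first, then in lexicographic order."""
    for length in count(0):
        for bits in product("01", repeat=length):
            yield "".join(bits)


def index_of(program: str) -> int:
    """0-based position of `program` in the enumeration, computed in closed form."""
    n = len(program)
    # 2**n - 1 strings are shorter than n bits; then add the rank among n-bit strings.
    return (2 ** n - 1) + (int(program, 2) if program else 0)


if __name__ == "__main__":
    gen = enumerate_programs()
    print([next(gen) for _ in range(7)])
    # Position of one particular 800-bit (100-byte) candidate, shown as a bit count.
    print(index_of("1" * 800).bit_length())
```

The position of a candidate grows exponentially with its length, which is why the enumerator qualifies as a seed only "in some sense": it is guaranteed to get there, but not on any useful timescale.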

I wonder also whether the fact that mainstream AI hasn't yet produced FOOM could lower-bound the complexity of doing so.

Note that here I'm referring to recursively self-improving AI in general -- I'd be interested if the answers to these questions change substantially for the special case of friendly AIs.

Anyway, just idle thoughts, do add yours.

Comments (50)

Comment author: JoshuaZ 30 March 2011 11:51:33PM *  9 points [-]

This involves so many unknowns that it isn't clear where to start. First, fooming isn't well-defined to begin with. Second, the number of bits for something would change drastically depending on the substrate (default programming language and hardware). Third, we can't even give much in the way of non-trivial bounds on minimum program size for well-defined algorithms (among other issues, having a way of answering this sort of question in general starts to lead to Halting problem/Gödel issues). To even get an upper bound we'd probably need some form of strong AI so we could point to it and say "that's an upper bound."

Comment author: timtyler 31 March 2011 07:14:07AM 1 point [-]

Yudkowsky apparently defines the term "FOOM" here:

"FOOM" means way the hell smarter than anything else around, capable of delivering in short time periods technological advancements that would take humans decades, probably including full-scale molecular nanotechnology [...]

It's weird and doesn't seem to make much sense to me. How can the term "FOOM" be used to refer to a level of capability?

Comment author: alexflint 31 March 2011 05:40:26PM 1 point [-]

I agree, though I suppose it makes sense if we assume he was actually describing a product of FOOM rather than the process itself.

Comment author: timtyler 31 March 2011 06:19:26PM *  0 points [-]

We should probably scratch that definition - even though it is about the only one provided.

If the term "FOOM" has to be used, it should probably refer to actual rapid progress, not merely to a capability of producing technologies rapidly.

I suppose it makes sense if we assume he was actually describing a product of FOOM rather than the process itself.

Creating molecular nanotechnology may be given as homework in the 29th century - but that's quite a different idea to there being rapid technological progress between now and then. You can attain large capabilities by slow and gradual progress - as well as via a sudden rapid burst.

Comment author: alexflint 31 March 2011 07:24:27PM *  2 points [-]

Yeah it's a terrible definition. I think the AI-FOOM debate provides a reasonable grounding for the term "FOOM", though I agree that it's important to have a concise definition at hand.

In the post, I used FOOM to mean an optimization process optimizing itself in an open-ended way.[1] I assumed that this corresponded to other people's understanding of FOOM, but I'm happy to be corrected.

I would use the term "singularity" to refer more generally to periods of rapid progress, so e.g. I'd be comfortable saying that FOOM is one kind of process that could lead to a singularity, though not exclusively so. Does this match with the common understanding of these terms?

[1] Perhaps that last "open-ended" clause just re-captures all the mystery, but it seems necessary to exclude examples like a compiler making itself faster but then making no further improvements.

Comment author: Giles 02 April 2011 03:24:32PM *  0 points [-]

My understanding of the FOOM process:

  • An AI is developed to optimise some utility function or solve a particular problem.
  • It decides that the best way to go about this is to build another, better AI to solve the problem for it.
  • The nature of the problem is such that the best course of action for an agent of any conceivable level of intelligence is to first build a more intelligent AI.
  • The process continues until we reach an AI of an inconceivable level of intelligence.

Comment author: Risto_Saarelma 31 March 2011 08:02:09AM *  0 points [-]

To even get an upper bound we'd probably need some form of strong AI so we could point to it and say "that's an upper bound."

We got humans with general intelligence, built into a stage where they can start learning from an extremely noisy and chaotic physical environment by a genome that fits on a CD-ROM and can probably be compressed further.
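As a rough check on the CD-ROM figure, here is a back-of-the-envelope calculation. It assumes about 3.1 billion base pairs in a haploid human genome, 2 bits per base, and a 700 MiB disc; all three numbers are assumptions added for illustration, not figures stated in this thread.

```python
BASE_PAIRS = 3.1e9               # assumed haploid genome length, in base pairs
BITS_PER_BASE = 2                # four possible bases (A, C, G, T) -> 2 bits each
CD_ROM_BYTES = 700 * 1024 ** 2   # assumed disc capacity of 700 MiB


def genome_bytes(base_pairs: float = BASE_PAIRS, bits_per_base: int = BITS_PER_BASE) -> float:
    """Size of an uncompressed fixed-width encoding of the genome, in bytes."""
    return base_pairs * bits_per_base / 8


if __name__ == "__main__":
    size = genome_bytes()
    print(f"raw genome: {size / 1e6:.0f} MB")
    print(f"ratio to disc capacity: {size / CD_ROM_BYTES:.2f}")
```

Under these assumptions the raw encoding lands within a few percent of the disc's capacity, so whether it literally fits depends on the exact genome length and on even light compression.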

Comment author: JoshuaZ 31 March 2011 02:09:10PM 6 points [-]

We got humans with general intelligence, built into a stage where they can start learning from an extremely noisy and chaotic physical environment by a genome that fits on a CD-ROM and can probably be compressed further.

A human is specified by a lot more than its genome. You have ribosomes and mitochondria and other starting stuff. And you grow in a very specific womb environment. And if you don't have certain classes of interaction as a child you won't end up as a very good general intelligence (isolation or lack of nutrients at early stages can both lead to serious problems). This is directly analogous to my remark about substrates. So yes, you could use a human as some form of possible upper bound for general intelligence, but it isn't clear whether that meets the criteria for fooming, and defining how many bits that is is a lot tougher than just pointing to the genome.

Comment author: Risto_Saarelma 31 March 2011 02:26:34PM 0 points [-]

My intuition is that the cellular machinery and prenatal environment are required much more for meeting the biochemical needs of a human embryo than as providers of extra information. The hard part where you need to have a huge digital data string mostly exactly right is in the DNA, while the growth environment is more of a warm soup that has an intricate mixture of stuff but is far too noisy to actually carry anything close to the amount of actionable information the genome does.

Standard notions also sell short the massive amount of very clever work the newborn baby's brain is already doing when it starts to learn things, work that lets it bootstrap itself to full intelligence. It manages to do this from other people who mostly just give it food every now and then and make random attempts to engage it in conversation, instead of doing the sort of massively intricate and laborious cognitive engineering they'd have to pull off if the newborn baby's brain actually needed the sort of hard complexity that a programmable general-purpose computer, or an ovarian cell without DNA, needs before it can have a go at turning into an intelligent entity.

Comment author: David_Gerard 31 March 2011 04:23:52PM *  6 points [-]

I think you're underselling the developmental power of a culture. Bits of your brain literally don't grow properly if you're not raised in a human culture. Ignore a baby at the wrong points in its development and it'll fail to ever be able to learn any language, feel certain emotions or comprehend some social constraints. Etc.

That is, the hardware grows to meet the software and data, because (as usual) the data/software/hardware divides in the brain are very fuzzy indeed.

(This suggests Kurzweil was plausibly approximately correct about the genome having the information needed to make the brain of a fresh-out-the-womb newborn, but that the attention-catching claim he was implicitly making of emulating an interesting, adult-quality brain based on the amount of information in a genome is rather more questionable.)

(And, of course, it brings to mind all manner of horribly unethical experiments to work out the minimum quantity of culture needed to stimulate the brain to grow right, or what the achievable dimensions of "right" are. You just can't get the funding for the really mad science these days.)

Of course, the baby's brain goes actively looking for cultural data. I will always treasure the memory of my daughter meowing back at the cat and trying to have a conversation with it and learn its language. Made more fun by the fact that cats only meow like that in the first place as a way of getting humans to do things.

Comment author: alexflint 31 March 2011 05:53:29PM 3 points [-]

I've heard anecdotes about things like children spontaneously developing their own languages even when completely deprived of language in their environment, which would weakly indicate the contrary position. Unfortunately, I don't know whether to trust said anecdotes -- can anyone corroborate?

Comment author: David_Gerard 31 March 2011 07:27:38PM *  3 points [-]

There are reports of twins bootstrapping off each other, from the principle of noise->action->repeatnoise, called idioglossia. It seems not to be that great as language, actually. This NYT blog post suggests the words are more babble than language, which matches how my daughter spoke to the cat: English intonation and facial expressions, meowy babble as words. The Wikipedia article on cryptophasia says "While sources claim that twins and children from multiple births develop this ability perhaps because of more interpersonal communication between themselves than with the parents, there is inadequate scientific proof to verify these claims."

Comment author: JoshuaZ 31 March 2011 09:06:40PM 1 point [-]

I've heard anecdotes about things like children spontaneously developing their own languages even when completely deprived of language in their environment, which would weakly indicate the contrary position. Unfortunately, I don't know whether to trust said anecdotes -- can anyone corroborate?

There are examples of groups of deaf people developing languages together, but generally over a generation or two, and in large groups. The most prominent such case is Nicaraguan sign language.

Comment author: David_Gerard 31 March 2011 09:40:52PM 1 point [-]

That's not an example of "completely deprived of language in their environment" - the article says "by combining gestures and elements of their home-sign systems ..."

Comment author: JoshuaZ 31 March 2011 11:32:16PM 1 point [-]

Yes, you are correct. There were pre-existing primitive sign systems that it started from. It isn't an example of language developing completely spontaneously.

Comment author: Risto_Saarelma 31 March 2011 06:23:31PM *  2 points [-]

I think you're underselling the developmental power of a culture. Bits of your brain literally don't grow properly if you're not raised in a human culture. Ignore a baby at the wrong points in its development and it'll fail to ever be able to learn any language, feel certain emotions or comprehend some social constraints.

Not denying this at all. Just pointing out that the brain makes astonishingly good use of very noisy and arbitrary input when it does get exposed to other language-using humans, compared to what you'd expect any sort of machine learning AI to be capable of. I'm a lot more impressed at a thing made of atoms getting to be complex enough to be able to start the learning process than the further input it needs to actually learn the surrounding culture.

Think about it this way: Which is more impressive, designing and building a robot that can perceive the world and move around it and learn things as well as a human growing from infant to adulthood, or pointing things out to the physically finished but still-learning robot and repeating their names, and doing the rest of the regular teaching about stuff thing people already do with children?

(For anyone offended at the implied valuation, since Parenting Human Children Is The Most Important Thing, imagine that the robot looks like a big metal spider and therefore doesn't count as a Parented Child.)

My basic idea here is that the newborn baby crawling about is already a lot more analogous to an AI well on the way to going FOOM than a bunch of scattered clever pattern recognition algorithms and symbol representation models that just need the overall software architecture design to tie them together, since the things that stop humans from going FOOM might be a lot more related to physiological shortcomings than the lack of extremely clever further design. The baby has moved from being formed from the initial hard design information that went in it into discovering the new information it needs to grow from its surroundings. I'd be rather worried about an AI that reaches a similar stage.

Comment author: David_Gerard 31 March 2011 07:10:36PM *  1 point [-]

My basic idea here is that the newborn baby crawling about is already a lot more analogous to an AI well on the way to going FOOM than a bunch of scattered clever pattern recognition algorithms and symbol representation models that just need the overall software architecture design to tie them together

I'll credit that. A baby is a machine for going FOOM.

(Specifically, I'd guess, because so much has to be left out to produce a size of offspring that can be born without killing the mother too often. Hence the appalling, but really quite typical of evolution, hack of having the human memepool be essential to the organism expressed by the genes growing right.)

Comment author: TheOtherDave 31 March 2011 08:12:40PM 1 point [-]

How much larger do you estimate babies would be if they came pre-installed with the information they appallingly lack?

Comment author: David_Gerard 31 March 2011 09:13:14PM -1 points [-]

Presumably at least with a more fully-developed brain. It does quite a bit of growing in the first couple of years.

Comment author: timtyler 31 March 2011 05:30:44PM *  2 points [-]

We got humans with general intelligence, built into a stage where they can start learning from an extremely noisy and chaotic physical environment by a genome that fits on a CD-ROM [...]

Humans are not very good self-improving systems, except on geological timescales. They:

  • Hit a ceiling;
  • Die quickly.

Comment author: Larifari 31 March 2011 11:27:33AM 4 points [-]

Even a FOOM seed of only a few hundred bytes would not necessarily have been produced by evolution - there are 2^800 different possibilities for a 100-byte snippet. Only if there are intermediate steps of increasing complexity and fitness can evolution find a solution in such a large search space. If the shortest possible seed is completely isolated in the search space, there is no way it can be found, either by evolution or by deliberate optimization.
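For a sense of scale, the arithmetic can be checked directly. The sketch below assumes a blind search that tries a fixed number of candidates per second; the rate of 10^18 per second and the helper names are illustrative assumptions, not figures from this thread. It works in logarithms so that larger snippet sizes do not overflow a float.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def search_space_size(num_bytes: int) -> int:
    """Number of distinct byte strings of exactly `num_bytes` bytes."""
    return 2 ** (8 * num_bytes)


def log10_years_to_exhaust(num_bytes: int, candidates_per_second: float) -> float:
    """Base-10 logarithm of the years a blind search needs to try every string."""
    log10_size = 8 * num_bytes * math.log10(2)
    return log10_size - math.log10(candidates_per_second) - math.log10(SECONDS_PER_YEAR)


if __name__ == "__main__":
    size = search_space_size(100)
    print(size == 2 ** 800)       # 100 bytes = 800 bits
    print(len(str(size)))         # number of decimal digits in 2^800
    print(f"about 10^{log10_years_to_exhaust(100, 1e18):.0f} years at 1e18 tries per second")
```

The point of the exercise is only that exhaustive search is hopeless at this size; it says nothing about searches that can exploit intermediate steps.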

Comment author: timtyler 31 March 2011 05:32:44PM 1 point [-]

Be careful with saying something is impossible. Maybe that large search space can be cut down to size by proofs or clever algorithms.

Comment author: XiXiDu 31 March 2011 10:02:03AM 4 points [-]

The analysis of FOOM as a self-amplifying process would seem to indicate that in principle one could get it started from a relatively modest starting point -- perhaps just a few bytes of the right code could begin the process.

What is the Kolmogorov complexity of the theory of everything?

Comment author: XiXiDu 31 March 2011 10:12:47AM *  2 points [-]

If the smallest FOOM seed were very small (like a few hundred bytes) then we would expect evolution to have already bumped into it at some point.

Evolution: 24 myths and misconceptions -> Evolution myths: Evolution is limitlessly creative

[...] some features cannot evolve because a half-way stage really would be of no use. For example, two-way radio might be useful for many different animals, for making silent alarm calls or locating other members of your species. So why hasn't it evolved? The recent invention of nanoscale radio receivers suggests it is not physically impossible.

The answer might be that half a radio really is useless. Detecting natural radio waves - from lightning, for instance - would not tell animals anything useful about their environment. That means there will be no selection for mutations that allow organisms to detect radio waves. Conversely, without any means of detecting radio waves, emitting them would serve no useful purpose.

A few hundred bytes of code might be enough if you have a suitable substrate, but the substrate itself has a certain Kolmogorov complexity. Evolution does not differentiate between software and hardware.

Comment author: Tiiba 31 March 2011 05:26:02PM 3 points [-]

What's up with the word "foom", and why is it always in all caps? Can we come up with another name for this that doesn't sound like a sci-fi nerd in need of Ritalin?

Comment author: alexflint 01 April 2011 02:52:04PM 0 points [-]

Yeah I agree. "Intelligence explosion" is bandied about, but I guess that can also refer to Kurzweilian-style exponential growth phenomena.

"Hard take-off singularity" is close, too, but not exactly the same. Again, it refers to a certain magnitude of acceleration, whereas FOOM refers specifically to recursive self-improvement as the mechanism.

I'm open to suggestions.

Comment author: TheOtherDave 01 April 2011 04:30:10PM 1 point [-]

My $0.02: singularities brought about by recursive self-improvement are one concept, and singularities involving really-really-fast improvement are a different concept. (They are, of course, perfectly compatible.)

It may just not be all that useful to have a single word that denotes both.

If I want to talk about a "hard take-off" or a "step-function" scenario caused by recursively self-improving intelligence, I can say that.

But I estimate that 90% of what I will want to say about it will be true of many different step-function scenarios (e.g., those caused by the discovery of a cache of Ancient technology) or true of many different recursively self-improving intelligence scenarios.

So it may be worthwhile to actually have to stop and think about whether I want to include both clauses.

Comment author: alexflint 02 April 2011 10:42:33AM 1 point [-]

Completely agree with paras 1 and 2.

However, it does seem that we talk about "hard take-off scenario caused by recursively self-improving intelligence" often enough to warrant a convenience term to mean just that. Much of the discussion about cascades, cycles, insights, AI-boxes, resource overhangs, etc. is specific to the recursive self-improvement scenario, and not to, e.g., the cache of Ancient tech scenario.

Comment author: timtyler 31 March 2011 05:35:41PM *  0 points [-]

See http://lesswrong.com/lw/we/recursive_selfimprovement/ for an attempt at a definition.

Comment author: falenas108 31 March 2011 12:50:52AM 3 points [-]

In terms of natural selection, couldn't homo sapiens be considered a FOOM?

Our first period of FOOMing would be due to social competition, which resulted in those with higher intelligence reproducing more.

Our current style of FOOMing is from the scientific knowledge, and with this we will soon surpass nature (one could even argue that we already have).

If we view nature as our "programmer", we could even be called self-recursive, as with each passing generation our knowledge as a species increases.

Comment author: alexflint 31 March 2011 07:48:34AM *  1 point [-]

Yeah analogies with evolutionary events are interesting. In the first example it's natural selection doing the optimizing, which latches onto intelligence when that trait happens to be under selection pressure. This could certainly accelerate the growth of intelligence, but the big-brained parents are not actually using their brains to design their even-bigger-brained babies; that remains the purview of evolution no matter how big the brains get.

I agree the second example is closer to a FOOM: some scientific insights actually help us to do more and better science. I'm thinking of the cognitive sciences in particular, rather than the more mundane case of building discoveries on discoveries: in the latter case the discoveries aren't really feeding back into the optimization process, rather it's human reasoning playing that role no matter how many discoveries you add.

The really interesting part of FOOM is when the intelligence being produced is the optimization process, and I think we really have no prior analogy for this.

Comment author: atucker 31 March 2011 10:35:04AM *  1 point [-]

If we view nature as our "programmer", we could even be called self-recursive, as with each passing generation our knowledge as a species increases.

Kind of, but kind of not. I think self-recursing human intelligence would be parents modifying their babies to make them smarter.

The really interesting part of FOOM is when the intelligence being produced is the optimization process, and I think we really have no prior analogy for this.

Humans rapidly got smarter, but we were optimized by evolution. Computers got faster, but were optimized by humans.

When an optimization process improves itself, it makes itself even better at optimizing.

I think that's a pretty decent definition of FOOM: "When an optimization process optimizes itself, and rapidly becomes more powerful than anything else seen before it."

Comment author: FAWS 30 March 2011 11:55:51PM *  3 points [-]

I wonder also whether the fact that mainstream AI hasn't yet produced FOOM could lower-bound the complexity of doing so.

Dangerous. If mainstream AI had produced FOOM we wouldn't be here.

Comment author: DanielLC 31 March 2011 06:36:31AM 0 points [-]

But there's no reason for us to be in a universe where they failed as opposed to one where they would eventually succeed. In other words, why are we modern men in an unlikely universe rather than cave men in a likely one?

Comment author: atucker 31 March 2011 10:46:36AM *  1 point [-]

Because we're still around to observe it.

If every universe with unfriendly AI disassembles humans to use them for something else, then the fact that I have a body right now implies that I don't live in one of those universes.

Comment author: DanielLC 31 March 2011 07:46:16PM 1 point [-]

No, there are people with bodies in universes with unfriendly AI. They're just earlier. Since they have to be earlier, there are fewer of them, but not as few as there would be.

The fact that you're this late shows that you're not in a universe which developed unfriendly AI early. The number of late people is a much higher portion of people if unfriendly AI is hard to develop, so the fact that you're a late person suggests unfriendly AI is hard.
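The update described here can be written as a small Bayes calculation. This is only a sketch with two hypotheses ("unfriendly AI is hard" versus "easy") and made-up numbers; the function name `posterior_hard`, the 50/50 prior, and the two likelihoods are all assumptions chosen for illustration.

```python
def posterior_hard(prior_hard: float, p_late_given_hard: float, p_late_given_easy: float) -> float:
    """Bayes' rule: probability that unfriendly AI is hard, given that I am a late observer."""
    joint_hard = prior_hard * p_late_given_hard
    joint_easy = (1.0 - prior_hard) * p_late_given_easy
    return joint_hard / (joint_hard + joint_easy)


if __name__ == "__main__":
    # Illustrative numbers only: if unfriendly AI is hard, half of all people live "late";
    # if it is easy, civilizations end early and only 1% of people live "late".
    print(round(posterior_hard(0.5, 0.5, 0.01), 3))
```

With these made-up inputs the posterior for "hard" comes out at about 0.98. The direction of the update depends only on late observers being a larger share of people under the "hard" hypothesis; the size of the update depends entirely on the assumed numbers.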

I suspect you're taking the current time through the development of civilization as a given. There's no reason to do this.

Comment author: alexflint 31 March 2011 07:52:43AM *  0 points [-]

I think this suggests FOOM is technically difficult to start; or at least that there's a reasonable chance that civilizations of our stage won't have started one.

Comment author: timtyler 31 March 2011 07:22:12AM *  -1 points [-]

there's no reason for us to be in a universe where they failed as opposed to one where they would eventually succeed.

People might "eventually succeed" here - but after that, what if there's 7 billion of us - and only one of them?

Comment author: Normal_Anomaly 01 April 2011 02:43:16AM *  1 point [-]

I can make a self-improving AI in 3 lines of Python:

AI_goodness=1

while True:

    AI_goodness*=2; print(AI_goodness)

The AI doubles in goodness every time the loop iterates.

Co-developed by e of pi (not on this forum). For safety, add a line at the beginning, "Friendly=True".

Comment author: endoself 01 April 2011 08:44:19PM 1 point [-]

Presumably you mean AI_goodness=1

Comment author: Normal_Anomaly 02 April 2011 01:00:16AM *  0 points [-]

Oops, yes. Fixed.

Comment author: timtyler 31 March 2011 05:39:00PM *  1 point [-]

Then there is the fact that any algorithm that naively enumerates some space of algorithms qualifies in some sense as a FOOM seed as it will eventually hit on some recursively self-improving AI. But that could take gigayears so is really not FOOM in the usual sense.

If you link it up to actuators? That doesn't work - it bashes its brains in before it does anything interesting. Unless you have mastered spaceships and self-replication - but then you have already built an S.I.S.

Comment author: alexflint 01 April 2011 12:49:20PM 0 points [-]

Hmm good point.

I think we need an inverse AI-box -- which only lets AIs out. Something like "prove Fermat's last theorem and I'll let you out". An objection would be that we'll come across a non-AI that just happens to print out the proof before we come across an actual AI that does so, but actually the reverse should be true: an AI represents the intelligence to find that proof, which should be more compressible than a direct encoding of the entire proof (even if we allow the proof itself to be compressed). But it could be that encoding intelligence just requires more bits than encoding the proof of Fermat's last theorem, in which case we can just pick a more difficult problem, like "cure cancer in this faithful simulation of Earth". As we increase the difficulty of the problem, the size of the smallest non-AI that solves it should increase quickly, but the size of the smallest true-AI that solves it should increase slowly.

Or perhaps the original AI box would actually function as an inverse AI box too: the human just tries to keep the AI in, so only a sufficiently intelligent AI can escape.

Comment author: timtyler 31 March 2011 05:48:49PM *  0 points [-]

The analysis of FOOM as a self-amplifying process would seem to indicate that in principle one could get it started from a relatively modest starting point -- perhaps just a few bytes of the right code could begin the process. Or could it? I wonder whether any other considerations give tighter lower-bounds.

Something like a primitive bacterium ignited the current living explosion. However, that took billions of years for the explosion to grow to the current level. We don't want to recreate that. What we want is to have a system that builds on the developments that have taken place so far. That means a man-machine symbiosis. Going back to square one with a machine is not a realistic possibility - so the size of the smallest pure-machine system seems kind-of irrelevant. Big enough for it not to happen that way. Pure-machine systems just get their lunch eaten by the man-machine symbiosis at the moment. They can't compete with the combined strengths of hybrid systems. The way you get a pure-machine system at the end of it all is via automation within the existing symbiosis.

This is the same as with the problem of creating life. You don't do that by starting with a self-replicating machine. Rather you have a meme-gene symbiosis, to help get the new organisms off the ground, and then gradually automate.

Comment author: JeremyStamper 03 February 2015 08:28:26PM -1 points [-]

The Stamper Minimum

There exists at any given time a minimum number of bits of code needed to give rise to recursively self-improving artificial intelligence (“The Stamper Minimum”).

The Stamper Minimum is an actual value, it is discoverable, and with sufficient intelligence, it will be discovered.

Because the probability of determining The Stamper Minimum correlates positively with the degree of intelligence applied to determining it, it is extremely unlikely that the code represented by The Stamper Minimum (“The Stamper Minimum Code”) will be used to originate super-intelligence. Instead, it is likely to be discovered retrospectively by a sufficiently advanced super-intelligence.

The Stamper Minimum is an ever-changing value because the effectiveness of The Stamper Minimum Code is necessarily linked to available external tools and technologies, which are themselves dynamic.

Comment author: timtyler 31 March 2011 07:26:29AM *  -2 points [-]

if there were a FOOM seed among those then we would see the results about us.

We do see the results among us. Surely this is in the self-improving systems 101 by now.