Recursive Self-Improvement

Eliezer Yudkowsky

43 Recursive Self-Improvement

1st Dec 2008

15 min read

43

Followup to: Life's Story Continues, Surprised by Brains, Cascades, Cycles, Insight, Recursion, Magic, Engelbart: Insufficiently Recursive, Total Nano Domination

I think that at some point in the development of Artificial Intelligence, we are likely to see a fast, local increase in capability - "AI go FOOM". Just to be clear on the claim, "fast" means on a timescale of weeks or hours rather than years or decades; and "FOOM" means way the hell smarter than anything else around, capable of delivering in short time periods technological advancements that would take humans decades, probably including full-scale molecular nanotechnology (that it gets by e.g. ordering custom proteins over the Internet with 72-hour turnaround time). Not, "ooh, it's a little Einstein but it doesn't have any robot hands, how cute".

Most people who object to this scenario, object to the "fast" part. Robin Hanson objected to the "local" part. I'll try to handle both, though not all in one shot today.

We are setting forth to analyze the developmental velocity of an Artificial Intelligence. We'll break down this velocity into optimization slope, optimization resources, and optimization efficiency. We'll need to understand cascades, cycles, insight and recursion; and we'll stratify our recursive levels into the metacognitive, cognitive, metaknowledge, knowledge, and object level.

Quick review:

"Optimization slope" is the goodness and number of opportunities in the volume of solution space you're currently exploring, on whatever your problem is;
"Optimization resources" is how much computing power, sensory bandwidth, trials, etc. you have available to explore opportunities;
"Optimization efficiency" is how well you use your resources. This will be determined by the goodness of your current mind design - the point in mind design space that is your current self - along with its knowledge and metaknowledge (see below).

Optimizing yourself is a special case, but it's one we're about to spend a lot of time talking about.

By the time any mind solves some kind of actual problem, there's actually been a huge causal lattice of optimizations applied - for example, humans brain evolved, and then humans developed the idea of science, and then applied the idea of science to generate knowledge about gravity, and then you use this knowledge of gravity to finally design a damn bridge or something.

So I shall stratify this causality into levels - the boundaries being semi-arbitrary, but you've got to draw them somewhere:

"Metacognitive" is the optimization that builds the brain - in the case of a human, natural selection; in the case of an AI, either human programmers or, after some point, the AI itself.
"Cognitive", in humans, is the labor performed by your neural circuitry, algorithms that consume large amounts of computing power but are mostly opaque to you. You know what you're seeing, but you don't know how the visual cortex works. The Root of All Failure in AI is to underestimate those algorithms because you can't see them... In an AI, the lines between procedural and declarative knowledge are theoretically blurred, but in practice it's often possible to distinguish cognitive algorithms and cognitive content.
"Metaknowledge": Discoveries about how to discover, "Science" being an archetypal example, "Math" being another. You can think of these as reflective cognitive content (knowledge about how to think).
"Knowledge": Knowing how gravity works.
"Object level": Specific actual problems like building a bridge or something.

I am arguing that an AI's developmental velocity will not be smooth; the following are some classes of phenomena that might lead to non-smoothness. First, a couple of points that weren't raised earlier:

Roughness: A search space can be naturally rough - have unevenly distributed slope. With constant optimization pressure, you could go through a long phase where improvements are easy, then hit a new volume of the search space where improvements are tough. Or vice versa. Call this factor roughness.
Resource overhangs: Rather than resources growing incrementally by reinvestment, there's a big bucket o' resources behind a locked door, and once you unlock the door you can walk in and take them all.

And these other factors previously covered:

Cascades are when one development leads the way to another - for example, once you discover gravity, you might find it easier to understand a coiled spring.
Cycles are feedback loops where a process's output becomes its input on the next round. As the classic example of a fission chain reaction illustrates, a cycle whose underlying processes are continuous, may show qualitative changes of surface behavior - a threshold of criticality - the difference between each neutron leading to the emission of 0.9994 additional neutrons versus each neutron leading to the emission of 1.0006 additional neutrons. k is the effective neutron multiplication factor and I will use it metaphorically.
Insights are items of knowledge that tremendously decrease the cost of solving a wide range of problems - for example, once you have the calculus insight, a whole range of physics problems become a whole lot easier to solve. Insights let you fly through, or teleport through, the solution space, rather than searching it by hand - that is, "insight" represents knowledge about the structure of the search space itself.

and finally,

Recursion is the sort of thing that happens when you hand the AI the object-level problem of "redesign your own cognitive algorithms".

Suppose I go to an AI programmer and say, "Please write me a program that plays chess." The programmer will tackle this using their existing knowledge and insight in the domain of chess and search trees; they will apply any metaknowledge they have about how to solve programming problems or AI problems; they will process this knowledge using the deep algorithms of their neural circuitry; and this neutral circuitry will have been designed (or rather its wiring algorithm designed) by natural selection.

If you go to a sufficiently sophisticated AI - more sophisticated than any that currently exists - and say, "write me a chess-playing program", the same thing might happen: The AI would use its knowledge, metaknowledge, and existing cognitive algorithms. Only the AI's metacognitive level would be, not natural selection, but the object level of the programmer who wrote the AI, using their knowledge and insight etc.

Now suppose that instead you hand the AI the problem, "Write a better algorithm than X for storing, associating to, and retrieving memories". At first glance this may appear to be just another object-level problem that the AI solves using its current knowledge, metaknowledge, and cognitive algorithms. And indeed, in one sense it should be just another object-level problem. But it so happens that the AI itself uses algorithm X to store associative memories, so if the AI can improve on this algorithm, it can rewrite its code to use the new algorithm X+1.

This means that the AI's metacognitive level - the optimization process responsible for structuring the AI's cognitive algorithms in the first place - has now collapsed to identity with the AI's object level.

For some odd reason, I run into a lot of people who vigorously deny that this phenomenon is at all novel; they say, "Oh, humanity is already self-improving, humanity is already going through a FOOM, humanity is already in a Singularity" etc. etc.

Now to me, it seems clear that - at this point in the game, in advance of the observation - it is pragmatically worth drawing a distinction between inventing agriculture and using that to support more professionalized inventors, versus directly rewriting your own source code in RAM. Before you can even argue about whether the two phenomena are likely to be similar in practice, you need to accept that they are, in fact, two different things to be argued about.

And I do expect them to be very distinct in practice. Inventing science is not rewriting your neural circuitry. There is a tendency to completely overlook the power of brain algorithms, because they are invisible to introspection. It took a long time historically for people to realize that there was such a thing as a cognitive algorithm that could underlie thinking. And then, once you point out that cognitive algorithms exist, there is a tendency to tremendously underestimate them, because you don't know the specific details of how your hippocampus is storing memories well or poorly - you don't know how it could be improved, or what difference a slight degradation could make. You can't draw detailed causal links between the wiring of your neural circuitry, and your performance on real-world problems. All you can see is the knowledge and the metaknowledge, and that's where all your causal links go; that's all that's visibly important.

To see the brain circuitry vary, you've got to look at a chimpanzee, basically. Which is not something that most humans spend a lot of time doing, because chimpanzees can't play our games.

You can also see the tremendous overlooked power of the brain circuitry by observing what happens when people set out to program what looks like "knowledge" into Good-Old-Fashioned AIs, semantic nets and such. Roughly, nothing happens. Well, research papers happen. But no actual intelligence happens. Without those opaque, overlooked, invisible brain algorithms, there is no real knowledge - only a tape recorder playing back human words. If you have a small amount of fake knowledge, it doesn't do anything, and if you have a huge amount of fake knowledge programmed in at huge expense, it still doesn't do anything.

So the cognitive level - in humans, the level of neural circuitry and neural algorithms - is a level of tremendous but invisible power. The difficulty of penetrating this invisibility and creating a real cognitive level is what stops modern-day humans from creating AI. (Not that an AI's cognitive level would be made of neurons or anything equivalent to neurons; it would just do cognitive labor on the same level of organization. Planes don't flap their wings, but they have to produce lift somehow.)

Recursion that can rewrite the cognitive level is worth distinguishing.

But to some, having a term so narrow as to refer to an AI rewriting its own source code, and not to humans inventing farming, seems hardly open, hardly embracing, hardly communal; for we all know that to say two things are similar shows greater enlightenment than saying that they are different. Or maybe it's as simple as identifying "recursive self-improvement" as a term with positive affective valence, so you figure out a way to apply that term to humanity, and then you get a nice dose of warm fuzzies. Anyway.

So what happens when you start rewriting cognitive algorithms?

Well, we do have one well-known historical case of an optimization process writing cognitive algorithms to do further optimization; this is the case of natural selection, our alien god.

Natural selection seems to have produced a pretty smooth trajectory of more sophisticated brains over the course of hundreds of millions of years. That gives us our first data point, with these characteristics:

Natural selection on sexual multicellular eukaryotic life can probably be treated as, to first order, an optimizer of roughly constant efficiency and constant resources.
Natural selection does not have anything akin to insights. It does sometimes stumble over adaptations that prove to be surprisingly reusable outside the context for which they were adapted, but it doesn't fly through the search space like a human. Natural selection is just searching the immediate neighborhood of its present point in the solution space, over and over and over.
Natural selection does have cascades; adaptations open up the way for further adaptations.

So - if you're navigating the search space via the ridiculously stupid and inefficient method of looking at the neighbors of the current point, without insight - with constant optimization pressure - then...

Well, I've heard it claimed that the evolution of biological brains has accelerated over time, and I've also heard that claim challenged. If there's actually been an acceleration, I would tend to attribute that to the "adaptations open up the way for further adaptations" phenomenon - the more brain genes you have, the more chances for a mutation to produce a new brain gene. (Or, more complexly: the more organismal error-correcting mechanisms the brain has, the more likely a mutation is to produce something useful rather than fatal.) In the case of hominids in particular over the last few million years, we may also have been experiencing accelerated selection on brain proteins, per se - which I would attribute to sexual selection, or brain variance accounting for a greater proportion of total fitness variance.

Anyway, what we definitely do not see under these conditions is logarithmic or decelerating progress. It did not take ten times as long to go from H. erectus to H. sapiens as from H. habilis to H. erectus. Hominid evolution did not take eight hundred million years of additional time, after evolution immediately produced Australopithecus-level brains in just a few million years after the invention of neurons themselves.

And another, similar observation: human intelligence does not require a hundred times as much computing power as chimpanzee intelligence. Human brains are merely three times too large, and our prefrontal cortices six times too large, for a primate with our body size.

Or again: It does not seem to require 1000 times as many genes to build a human brain as to build a chimpanzee brain, even though human brains can build toys that are a thousand times as neat.

Why is this important? Because it shows that with constant optimization pressure from natural selection and no intelligent insight, there were no diminishing returns to a search for better brain designs up to at least the human level. There were probably accelerating returns (with a low acceleration factor). There are no visible speedbumps, so far as I know.

But all this is to say only of natural selection, which is not recursive.

If you have an investment whose output is not coupled to its input - say, you have a bond, and the bond pays you a certain amount of interest every year, and you spend the interest every year - then this will tend to return you a linear amount of money over time. After one year, you've received $10; after 2 years, $20; after 3 years, $30.

Now suppose you change the qualitative physics of the investment, by coupling the output pipe to the input pipe. Whenever you get an interest payment, you invest it in more bonds. Now your returns over time will follow the curve of compound interest, which is exponential. (Please note: Not all accelerating processes are smoothly exponential. But this one happens to be.)

The first process grows at a rate that is linear over time; the second process grows at a rate that is linear in its cumulative return so far.

The too-obvious mathematical idiom to describe the impact of recursion is replacing an equation

y = f(t)

with

dy/dt = f(y)

For example, in the case above, reinvesting our returns transformed the linearly growing

y = m*t

into

y' = m*y

whose solution is the exponentially growing

y = e^(m*t)

Now... I do not think you can really solve equations like this to get anything like a description of a self-improving AI.

But it's the obvious reason why I don't expect the future to be a continuation of past trends. The future contains a feedback loop that the past does not.

As a different Eliezer Yudkowsky wrote, very long ago:

"If computing power doubles every eighteen months, what happens when computers are doing the research?"

And this sounds horrifyingly naive to my present ears, because that's not really how it works at all - but still, it illustrates the idea of "the future contains a feedback loop that the past does not".

History up until this point was a long story about natural selection producing humans, and then, after humans hit a certain threshold, humans starting to rapidly produce knowledge and metaknowledge that could - among other things - feed more humans and support more of them in lives of professional specialization.

To a first approximation, natural selection held still during human cultural development. Even if Gregory Clark's crazy ideas are crazy enough to be true - i.e., some human populations evolved lower discount rates and more industrious work habits over the course of just a few hundred years from 1200 to 1800 - that's just tweaking a few relatively small parameters; it is not the same as developing new complex adaptations with lots of interdependent parts. It's not a chimp-human type gap.

So then, with human cognition remaining more or less constant, we found that knowledge feeds off knowledge with k > 1 - given a background of roughly constant cognitive algorithms at the human level. We discovered major chunks of metaknowledge, like Science and the notion of Professional Specialization, that changed the exponents of our progress; having lots more humans around, due to e.g. the object-level innovation of farming, may have have also played a role. Progress in any one area tended to be choppy, with large insights leaping forward, followed by a lot of slow incremental development.

With history to date, we've got a series of integrals looking something like this:

Metacognitive = natural selection, optimization efficiency/resources roughly constant

Cognitive = Human intelligence = integral of evolutionary optimization velocity over a few hundred million years, then roughly constant over the last ten thousand years

Metaknowledge = Professional Specialization, Science, etc. = integral over cognition we did about procedures to follow in thinking, where metaknowledge can also feed on itself, there were major insights and cascades, etc.

Knowledge = all that actual science, engineering, and general knowledge accumulation we did = integral of cognition+metaknowledge(current knowledge) over time, where knowledge feeds upon itself in what seems to be a roughly exponential process

Object level = stuff we actually went out and did = integral of cognition+metaknowledge+knowledge(current solutions); over a short timescale this tends to be smoothly exponential to the degree that the people involved understand the idea of investments competing on the basis of interest rate, but over medium-range timescales the exponent varies, and on a long range the exponent seems to increase

If you were to summarize that in one breath, it would be, "with constant natural selection pushing on brains, progress was linear or mildly accelerating; with constant brains pushing on metaknowledge and knowledge and object-level progress feeding back to metaknowledge and optimization resources, progress was exponential or mildly superexponential".

Now fold back the object level so that it becomes the metacognitive level.

And note that we're doing this through a chain of differential equations, not just one; it's the final output at the object level, after all those integrals, that becomes the velocity of metacognition.

You should get...

...very fast progress? Well, no, not necessarily. You can also get nearly zero progress.

If you're a recursified optimizing compiler, you rewrite yourself just once, get a single boost in speed (like 50% or something), and then never improve yourself any further, ever again.

If you're EURISKO, you manage to modify some of your metaheuristics, and the metaheuristics work noticeably better, and they even manage to make a few further modifications to themselves, but then the whole process runs out of steam and flatlines.

It was human intelligence that produced these artifacts to begin with. Their own optimization power is far short of human - so incredibly weak that, after they push themselves along a little, they can't push any further. Worse, their optimization at any given level is characterized by a limited number of opportunities, which once used up are gone - extremely sharp diminishing returns.

When you fold a complicated, choppy, cascade-y chain of differential equations in on itself via recursion, it should either flatline or blow up. You would need exactly the right law of diminishing returns to fly through the extremely narrow soft takeoff keyhole.

The observed history of optimization to date makes this even more unlikely. I don't see any reasonable way that you can have constant evolution produce human intelligence on the observed historical trajectory (linear or accelerating), and constant human intelligence produce science and technology on the observed historical trajectory (exponential or superexponential), and fold that in on itself, and get out something whose rate of progress is in any sense anthropomorphic. From our perspective it should either flatline or FOOM.

When you first build an AI, it's a baby - if it had to improve itself, it would almost immediately flatline. So you push it along using your own cognition, metaknowledge, and knowledge - not getting any benefit of recursion in doing so, just the usual human idiom of knowledge feeding upon itself and insights cascading into insights. Eventually the AI becomes sophisticated enough to start improving itself, not just small improvements, but improvements large enough to cascade into other improvements. (Though right now, due to lack of human insight, what happens when modern researchers push on their AGI design is mainly nothing.) And then you get what I. J. Good called an "intelligence explosion".

I even want to say that the functions and curves being such as to allow hitting the soft takeoff keyhole, is ruled out by observed history to date. But there are small conceivable loopholes, like "maybe all the curves change drastically and completely as soon as we get past the part we know about in order to give us exactly the right anthropomorphic final outcome", or "maybe the trajectory for insightful optimization of intelligence has a law of diminishing returns where blind evolution gets accelerating returns".

There's other factors contributing to hard takeoff, like the existence of hardware overhang in the form of the poorly defended Internet and fast serial computers. There's more than one possible species of AI we could see, given this whole analysis. I haven't yet touched on the issue of localization (though the basic issue is obvious: the initial recursive cascade of an intelligence explosion can't race through human brains because human brains are not modifiable until the AI is already superintelligent).

But today's post is already too long, so I'd best continue tomorrow.

Post scriptum: It occurred to me just after writing this that I'd been victim of a cached Kurzweil thought in speaking of the knowledge level as "exponential". Object-level resources are exponential in human history because of physical cycles of reinvestment. If you try defining knowledge as productivity per worker, I expect that's exponential too (or productivity growth would be unnoticeable by now as a component in economic progress). I wouldn't be surprised to find that published journal articles are growing exponentially. But I'm not quite sure that it makes sense to say humanity has learned as much since 1938 as in all earlier human history... though I'm quite willing to believe we produced more goods... then again we surely learned more since 1500 than in all the time before. Anyway, human knowledge being "exponential" is a more complicated issue than I made it out to be. But human object level is more clearly exponential or superexponential.

Recursive Self-Improvement

Personal Blog

43

New Comment

Rendering 0/54 comments, sorted by

oldest

(show more) Click to highlight new comments since: Today at 7:46 PM

Moderation Log

43 Recursive Self-Improvement

by Eliezer Yudkowsky

1st Dec 2008

15 min read

43

Followup to: Life's Story Continues, Surprised by Brains, Cascades, Cycles, Insight, Recursion, Magic, Engelbart: Insufficiently Recursive, Total Nano Domination

Most people who object to this scenario, object to the "fast" part. Robin Hanson objected to the "local" part. I'll try to handle both, though not all in one shot today.

Quick review:

"Optimization slope" is the goodness and number of opportunities in the volume of solution space you're currently exploring, on whatever your problem is;
"Optimization resources" is how much computing power, sensory bandwidth, trials, etc. you have available to explore opportunities;
"Optimization efficiency" is how well you use your resources. This will be determined by the goodness of your current mind design - the point in mind design space that is your current self - along with its knowledge and metaknowledge (see below).

Optimizing yourself is a special case, but it's one we're about to spend a lot of time talking about.

So I shall stratify this causality into levels - the boundaries being semi-arbitrary, but you've got to draw them somewhere:

"Metacognitive" is the optimization that builds the brain - in the case of a human, natural selection; in the case of an AI, either human programmers or, after some point, the AI itself.
"Cognitive", in humans, is the labor performed by your neural circuitry, algorithms that consume large amounts of computing power but are mostly opaque to you. You know what you're seeing, but you don't know how the visual cortex works. The Root of All Failure in AI is to underestimate those algorithms because you can't see them... In an AI, the lines between procedural and declarative knowledge are theoretically blurred, but in practice it's often possible to distinguish cognitive algorithms and cognitive content.
"Metaknowledge": Discoveries about how to discover, "Science" being an archetypal example, "Math" being another. You can think of these as reflective cognitive content (knowledge about how to think).
"Knowledge": Knowing how gravity works.
"Object level": Specific actual problems like building a bridge or something.

Roughness: A search space can be naturally rough - have unevenly distributed slope. With constant optimization pressure, you could go through a long phase where improvements are easy, then hit a new volume of the search space where improvements are tough. Or vice versa. Call this factor roughness.
Resource overhangs: Rather than resources growing incrementally by reinvestment, there's a big bucket o' resources behind a locked door, and once you unlock the door you can walk in and take them all.

And these other factors previously covered:

Cascades are when one development leads the way to another - for example, once you discover gravity, you might find it easier to understand a coiled spring.
Cycles are feedback loops where a process's output becomes its input on the next round. As the classic example of a fission chain reaction illustrates, a cycle whose underlying processes are continuous, may show qualitative changes of surface behavior - a threshold of criticality - the difference between each neutron leading to the emission of 0.9994 additional neutrons versus each neutron leading to the emission of 1.0006 additional neutrons. k is the effective neutron multiplication factor and I will use it metaphorically.
Insights are items of knowledge that tremendously decrease the cost of solving a wide range of problems - for example, once you have the calculus insight, a whole range of physics problems become a whole lot easier to solve. Insights let you fly through, or teleport through, the solution space, rather than searching it by hand - that is, "insight" represents knowledge about the structure of the search space itself.

and finally,

Recursion is the sort of thing that happens when you hand the AI the object-level problem of "redesign your own cognitive algorithms".

To see the brain circuitry vary, you've got to look at a chimpanzee, basically. Which is not something that most humans spend a lot of time doing, because chimpanzees can't play our games.

Recursion that can rewrite the cognitive level is worth distinguishing.

So what happens when you start rewriting cognitive algorithms?

Well, we do have one well-known historical case of an optimization process writing cognitive algorithms to do further optimization; this is the case of natural selection, our alien god.

Natural selection on sexual multicellular eukaryotic life can probably be treated as, to first order, an optimizer of roughly constant efficiency and constant resources.
Natural selection does not have anything akin to insights. It does sometimes stumble over adaptations that prove to be surprisingly reusable outside the context for which they were adapted, but it doesn't fly through the search space like a human. Natural selection is just searching the immediate neighborhood of its present point in the solution space, over and over and over.
Natural selection does have cascades; adaptations open up the way for further adaptations.

Or again: It does not seem to require 1000 times as many genes to build a human brain as to build a chimpanzee brain, even though human brains can build toys that are a thousand times as neat.

But all this is to say only of natural selection, which is not recursive.

The first process grows at a rate that is linear over time; the second process grows at a rate that is linear in its cumulative return so far.

The too-obvious mathematical idiom to describe the impact of recursion is replacing an equation

y = f(t)

with

dy/dt = f(y)

For example, in the case above, reinvesting our returns transformed the linearly growing

y = m*t

into

y' = m*y

whose solution is the exponentially growing

y = e^(m*t)

Now... I do not think you can really solve equations like this to get anything like a description of a self-improving AI.

But it's the obvious reason why I don't expect the future to be a continuation of past trends. The future contains a feedback loop that the past does not.

As a different Eliezer Yudkowsky wrote, very long ago:

"If computing power doubles every eighteen months, what happens when computers are doing the research?"

With history to date, we've got a series of integrals looking something like this:

Metacognitive = natural selection, optimization efficiency/resources roughly constant

Cognitive = Human intelligence = integral of evolutionary optimization velocity over a few hundred million years, then roughly constant over the last ten thousand years

Metaknowledge = Professional Specialization, Science, etc. = integral over cognition we did about procedures to follow in thinking, where metaknowledge can also feed on itself, there were major insights and cascades, etc.

Knowledge = all that actual science, engineering, and general knowledge accumulation we did = integral of cognition+metaknowledge(current knowledge) over time, where knowledge feeds upon itself in what seems to be a roughly exponential process

Object level = stuff we actually went out and did = integral of cognition+metaknowledge+knowledge(current solutions); over a short timescale this tends to be smoothly exponential to the degree that the people involved understand the idea of investments competing on the basis of interest rate, but over medium-range timescales the exponent varies, and on a long range the exponent seems to increase

Now fold back the object level so that it becomes the metacognitive level.

You should get...

...very fast progress? Well, no, not necessarily. You can also get nearly zero progress.

If you're a recursified optimizing compiler, you rewrite yourself just once, get a single boost in speed (like 50% or something), and then never improve yourself any further, ever again.

But today's post is already too long, so I'd best continue tomorrow.

Recursive Self-Improvement

Personal Blog

43

Mentioned in

146My thoughts on the social response to AI risk

94"Outside View!" as Conversation-Halter

79Distinguishing definitions of takeoff

38Against easy superintelligence: the unforeseen friction argument

36Hard Takeoff

Load More (5/14)

New Comment

Rendering 0/54 comments, sorted by

oldest

(show more) Click to highlight new comments since: Today at 7:46 PM

Moderation Log

More from Eliezer Yudkowsky

Curated and popular this week

54Comments

Recursive Self-Improvement — LessWrong

Comment Permalink

Eliezer Yudkowsky18y30

(Compound reply from Eliezer.)

Eliezer: When you fold a complicated, choppy, cascade-y chain of differential equations in on itself via recursion, it should either flatline or blow up. You would need exactly the right law of diminishing returns to fly through the extremely narrow soft takeoff keyhole.

Goetz: This is the most important and controversial claim, so I'd like to see it better-supported. I understand the intuition; but it is convincing as an intuition only if you suppose there are no negative feedback mechanisms anywhere in the whole process, which seems unlikely.

Can you give a plausible example of a negative feedback mechanism as such, apart from a law of diminishing returns that would be (nearly) ruled out by historical evidence already available?

I suspect that human economic growth would naturally tend to be faster and somewhat more superexponential, if it were not for the negative feedback mechanism of governments and bureaucracies with poor incentives, that both expand and hinder whenever times are sufficiently good that no one is objecting strongly enough to stop it; when "economic growth" is not the issue of top concern to everyone, all sorts of actions will be taken to hinder economic growth; when the company is not in immediate danger of collapsing, the bureaucracies will add on paperwork; and universities just go on adding paperwork indefinitely. So there are negative feedback mechanisms built into the human economic growth curve, but an AI wouldn't have them because they basically derive from us being stupid and having conflicting incentives.

What would be a plausible negative feedback mechanism - as apart from a law of diminishing returns? Why wouldn't the AI just stomp on the mechanism?

Hanson: Depending on which abstractions you emphasize, you can describe a new thing as something completely new under the sun, or as yet another example of something familiar. So the issue is which abstractions make the most sense to use. We have seen cases before where when one growth via some growth channel opened up more growth channels, to further enable growth. So the question is how similar those situations are to this situation, where an AI getting smarter allows an AI to change its architecture in more and better ways. Which is another way of asking which abstractions are most relevant.

Well, the whole post above is just putting specific details on that old claim, "Natural selection producing humans and humans producing technology can't be extrapolated to an AI insightfully modifying its low-level brain algorithms, because the latter case contains a feedback loop of an importantly different type; it's like trying to extrapolate a bird flying outside the atmosphere or extrapolating the temperature/compression law of a gas past the point where the gas becomes a black hole."

If you just pick an abstraction that isn't detailed enough to talk about the putative feedback loop, and then insist on extrapolating out the old trends from the absence of the feedback loop, I would consider this a weak response.

Pearson: I think you have a tendency to overlook our lack of knowledge of how the brain works. You talk of constant brain circuitry, when people add new hippocampal cells through their life. We also expand the brain areas devoted to fingers if we are born blind and use braille.

Pearson, "constant brains" means "brains with constant adaptation-algorithms, such as an adaptation-algorithm for rewiring via reinforcement" not "brains with constant synaptic networks". I think a bit of interpretive charity would have been in order here.

Finney: I'd like to focus on the example offered: "Write a better algorithm than X for storing, associating to, and retrieving memories." Is this a well defined task? Wouldn't we want to ask, better by what measure? Is there some well defined metric for this task?

Hal, if this is taking place inside a reasonably sophisticated Friendly AI, then I'd expect there to be something akin to an internal economy of the AI with expected utilons as the common unit of currency. So if the memory system is getting any computer time at all, the AI has beliefs about why it is good to remember things and what other cognitive tasks memory can contribute to. It's not just starting with an inscrutable piece of code that has no known purpose, and trying to "improve" it; it has an idea of what kind of labor the code is performing, and which other cognitive tasks that labor contributes to, and why. In the absence of such insight, it would indeed be more difficult for the AI to rewrite itself, and its development at that time would probably be dominated by human programmers pushing it along.

Ian C.: Eliezer, would a human that modifies the genes that control how his brain is built qualify as the same class of recursion (but with a longer cycle-time), or is it not quite the same?

Owing to our tremendous lack of insight into how genes affect brains, and owing to the messiness of the brain itself as a starting point, we would get relatively slow returns out of this kind of recursion even before taking into account the 18-year cycle time for the kids to grow up.

However, on a scale of returns from ordinary investment, the effect on society of the next generation being born with an average IQ of 140 (on the current scale) might be well-nigh inconceivable. It wouldn't be an intelligence explosion; it wouldn't be the kind of feedback loop I'm talking about - but as humans measure hugeness, it would be huge.

Reid: I'm sure you're aware of Schmidhuber's forays into this area with his Gödel Machine. Doesn't this blur the boundaries between the meta-cognitive and cognitive?

Schmidhuber's "Gödel Machine" is talking about a genuine recursion from object-level to metacognitive level, of the sort I described. However, this problem is somewhat more difficult than Schmidhuber seems to think it is, to put it mildly - but that would be part of the AIXI sequence, which I don't think I'll end up writing. Also, I think some of Schmidhuber's suggestions potentially hamper the system with a protected level.

Vassar: OTOH, it bizarrely appears to be the case that over a large range of chess ranks, human players seem to gain effective chess skill measured by chess rank with roughly linear training while chess programs gain it via exponential speed-up.

I expect that what you're looking at is a navigable search space that the humans are navigating and the AI is grasping through brute-force techniques - yes, Deep Blue wasn't literally brute force, but it was still navigating raw Chess rather than Regularity in Chess. If you're searching the raw tree, returns are logarithmic; the human process of grokking regularities seems to deliver linear returns over practice with a brain in good condition. However, with Moore's Law in play (exponential improvements delivered by human engineers) the AIs outran the brains.

Humans getting linear returns where dumb algorithms get logarithmic returns, seems to be a fairly standard phenomenon in my view - consider natural selection trying to go over a hump of required simultaneous changes, for example.

Tim Tyler: Brainpower went into making new brains historically - via sexual selection. Feedback from the previous generation of brains into the next generation has taken place historically.

If no one besides me thinks this claim is credible, I'll just go ahead and hold it up as an example of the kind of silliness I'm talking about, so that no one accuses me of attacking a strawman.

(Quick reductio: Imagine Jane Cavewoman falling in love with Johnny Caveman on the basis of a foresightful extrapolation of how Johnny's slightly mutated visual cortex, though not useful in its own right, will open up the way for further useful mutations, thus averting the unforesightful basis of natural selection... Sexual selection just applies greater selection pressure to particular characteristics; it doesn't change the stupid parts of evolution at all - in fact, it often makes evolution even more stupid by decoupling fitness from characteristics we would ordinarily think of as "fit" - and this is true even though brains are involved. Missing this and saying triumphantly, "See? We're recursive!" is an example of the overeager rush to apply nice labels that I was talking about earlier.)

Drucker: The problem, as I see it, is that you can't take bits out of a running piece of software and replace them with other bits, and have them still work, unless said piece of software is trivial... The human brain is a mass of interconnecting systems, all tied together in a mish-mash of complexity. You couldn't upgrade any one part of it by finding a faster replacement for any one section of it. Attempting to perform brain surgery on yourself is going to be a slow, painstaking process, leaving you with far more dead AIs than live ones.

As other commenters pointed out, plenty of software is written to enable modular upgrades. An AI with insight into its own algorithms and thought processes is not making changes by random testing like it was bloody evolution or something. A Friendly AI uses deterministic abstract reasoning in this case - I guess I'd have to write a post about how that works to make the point, though.

A poorly written AI might start out as the kind of mess you're describing, and of course, also lack the insight to make changes better than random; and in that case, would get much less mileage out of self-improvement, and probably stay inert.

See in context