Eliezer_Yudkowsky comments on Recursive Self-Improvement - Less Wrong
(Compound reply from Eliezer.)
Goetz: This is the most important and controversial claim, so I'd like to see it better-supported. I understand the intuition; but it is convincing as an intuition only if you suppose there are no negative feedback mechanisms anywhere in the whole process, which seems unlikely.
Can you give a plausible example of a negative feedback mechanism as such, apart from a law of diminishing returns that would be (nearly) ruled out by historical evidence already available?
I suspect that human economic growth would naturally tend to be faster and somewhat more superexponential, if it were not for the negative feedback mechanism of governments and bureaucracies with poor incentives, which both expand and hinder whenever times are sufficiently good that no one is objecting strongly enough to stop them. When "economic growth" is not the issue of top concern to everyone, all sorts of actions will be taken to hinder economic growth; when the company is not in immediate danger of collapsing, the bureaucracies will add on paperwork; and universities just go on adding paperwork indefinitely. So there are negative feedback mechanisms built into the human economic growth curve, but an AI wouldn't have them, because they basically derive from us being stupid and having conflicting incentives.
What would be a plausible negative feedback mechanism - as apart from a law of diminishing returns? Why wouldn't the AI just stomp on the mechanism?
Well, the whole post above is just putting specific details on that old claim, "Natural selection producing humans and humans producing technology can't be extrapolated to an AI insightfully modifying its low-level brain algorithms, because the latter case contains a feedback loop of an importantly different type; it's like trying to extrapolate a bird flying outside the atmosphere or extrapolating the temperature/compression law of a gas past the point where the gas becomes a black hole."
If you just pick an abstraction that isn't detailed enough to talk about the putative feedback loop, and then insist on extrapolating out the old trends from the absence of the feedback loop, I would consider this a weak response.
Pearson, "constant brains" means "brains with constant adaptation-algorithms, such as an adaptation-algorithm for rewiring via reinforcement" not "brains with constant synaptic networks". I think a bit of interpretive charity would have been in order here.
Hal, if this is taking place inside a reasonably sophisticated Friendly AI, then I'd expect there to be something akin to an internal economy of the AI with expected utilons as the common unit of currency. So if the memory system is getting any computer time at all, the AI has beliefs about why it is good to remember things and what other cognitive tasks memory can contribute to. It's not just starting with an inscrutable piece of code that has no known purpose, and trying to "improve" it; it has an idea of what kind of labor the code is performing, and which other cognitive tasks that labor contributes to, and why. In the absence of such insight, it would indeed be more difficult for the AI to rewrite itself, and its development at that time would probably be dominated by human programmers pushing it along.
Owing to our tremendous lack of insight into how genes affect brains, and owing to the messiness of the brain itself as a starting point, we would get relatively slow returns out of this kind of recursion even before taking into account the 18-year cycle time for the kids to grow up.
However, on a scale of returns from ordinary investment, the effect on society of the next generation being born with an average IQ of 140 (on the current scale) might be well-nigh inconceivable. It wouldn't be an intelligence explosion; it wouldn't be the kind of feedback loop I'm talking about - but as humans measure hugeness, it would be huge.
Schmidhuber's "Gödel Machine" is talking about a genuine recursion from object-level to metacognitive level, of the sort I described. However, this problem is somewhat more difficult than Schmidhuber seems to think it is, to put it mildly - but that would be part of the AIXI sequence, which I don't think I'll end up writing. Also, I think some of Schmidhuber's suggestions potentially hamper the system with a protected level.
I expect that what you're looking at is a navigable search space that the humans are navigating and the AI is grasping through brute-force techniques - yes, Deep Blue wasn't literally brute force, but it was still navigating raw Chess rather than Regularity in Chess. If you're searching the raw tree, returns are logarithmic; the human process of grokking regularities seems to deliver linear returns over practice with a brain in good condition. However, with Moore's Law in play (exponential improvements delivered by human engineers) the AIs outran the brains.
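To make the arithmetic behind "searching the raw tree gives logarithmic returns" concrete, here is a minimal sketch, not a model of Deep Blue or of any real chess engine. It assumes an idealized uniform game tree, so that roughly b^d positions must be examined to search d plies deep. The branching factor, the starting node budget, and the two-year doubling time are all illustrative assumptions, and the function names are hypothetical.

```python
import math

# All three constants are illustrative assumptions, not measured figures.
BRANCHING = 35.0       # assumed branching factor of the idealized tree
BASE_NODES = 1e6       # assumed node budget per move at year 0
DOUBLING_YEARS = 2.0   # assumed doubling time for the node budget


def search_depth(nodes: float, branching: float = BRANCHING) -> float:
    """Approximate depth d reachable with `nodes` evaluations, from nodes ~ branching**d."""
    return math.log(nodes) / math.log(branching)


def nodes_after(years: float) -> float:
    """Node budget after `years` of exponential hardware improvement."""
    return BASE_NODES * 2.0 ** (years / DOUBLING_YEARS)


def depth_after(years: float) -> float:
    """Reachable depth once the exponentially growing budget is spent on raw search."""
    return search_depth(nodes_after(years))


if __name__ == "__main__":
    for years in range(0, 21, 5):
        print(f"year {years:2d}: nodes = {nodes_after(years):.3g}, "
              f"depth = {depth_after(years):.2f} plies")
```

Under these assumptions, each doubling of the node budget buys the same fixed increment of depth (logarithmic returns on compute), and an exponentially growing budget therefore yields depth that grows only linearly in calendar time.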
Humans getting linear returns where dumb algorithms get logarithmic returns seems to be a fairly standard phenomenon in my view - consider natural selection trying to go over a hump of required simultaneous changes, for example.
If no one besides me thinks this claim is credible, I'll just go ahead and hold it up as an example of the kind of silliness I'm talking about, so that no one accuses me of attacking a strawman.
(Quick reductio: Imagine Jane Cavewoman falling in love with Johnny Caveman on the basis of a foresightful extrapolation of how Johnny's slightly mutated visual cortex, though not useful in its own right, will open up the way for further useful mutations, thus averting the unforesightful basis of natural selection... Sexual selection just applies greater selection pressure to particular characteristics; it doesn't change the stupid parts of evolution at all - in fact, it often makes evolution even more stupid by decoupling fitness from characteristics we would ordinarily think of as "fit" - and this is true even though brains are involved. Missing this and saying triumphantly, "See? We're recursive!" is an example of the overeager rush to apply nice labels that I was talking about earlier.)
As other commenters pointed out, plenty of software is written to enable modular upgrades. An AI with insight into its own algorithms and thought processes is not making changes by random testing like it was bloody evolution or something. A Friendly AI uses deterministic abstract reasoning in this case - I guess I'd have to write a post about how that works to make the point, though.
A poorly written AI might start out as the kind of mess you're describing, and of course would also lack the insight to make changes better than random; in that case, it would get much less mileage out of self-improvement, and would probably stay inert.