Review

This post is mostly concerned with a superintelligent AI performing recursive self-improvement. The analysis is meant to help make sense of the take-off speed of such a process.

Plausibility and Limits

Before considering upper limits, it may be worth considering whether general superintelligence is possible at all. It has been suggested that the idea of recursive self-improvement is similar to the infamous concept of a "perpetual motion machine". We know that a perpetual motion machine is impossible because it would violate the laws of thermodynamics. Is there an analogous proof or argument showing that recursive self-improvement is impossible? Some good places to start looking for hard limits on superintelligence are mathematics, computability and physics. It's also useful to think about this in terms of biology:

Biology

One example of intelligence being improved incrementally is evolution: it produced humans from earlier apes. It is possible to simulate evolution in a computer, but evolution takes a very long time. This gives at least a weak existence argument that improving intelligence is possible - and in fact doing so did not require any intelligence (evolution is "dumb"). If a blind process like evolution can do it, it seems likely that a problem-solving algorithm aimed specifically at doing so could pull it off more efficiently.

Is it possible for something more intelligent than humans to exist? I don't think there is any reason why humans would be at the ultimate limit of intelligence. Computers are already far superior to humans at some extremely specific tasks. It's realistic to imagine something basically human level but with computational excellence at a long list of tasks like arithmetic, exact recall of TBs of data, etc. You can also imagine that if it is possible to simulate a human mind and body in a computer, then running one at 64x speed could be considered a superintelligence, or, if that is intractable, then running 65535 of them in parallel at a slightly slower speed may also be.

Mathematics

Do results like Gödel's incompleteness theorems, Turing's halting problem or Rice's theorem tell us that an intelligent algorithm cannot study and improve itself? I would argue that these theorems do not imply that.

The halting problem tells us that there is no perfect algorithm that can decide, for every possible program and every input, whether it halts. This is a very strong requirement. If you weaken it to allow "don't know" answers, then you can create a sequence of increasingly precise halt detection programs. Each will still say "I don't know" for infinitely many programs, but static analysis tools already give definite answers in some cases.
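To make the weakened version concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the argument above: the names `Verdict`, `bounded_halt_check`, `collatz` and `loop_forever` are made up, and a "program" is modelled as a generator that yields once per step of work. The checker answers HALTS only when the program finishes within a step budget and says "don't know" otherwise; raising the budget gives a sequence of increasingly precise detectors.

```python
from enum import Enum
from typing import Callable, Iterator


class Verdict(Enum):
    HALTS = "halts"
    UNKNOWN = "don't know"


def bounded_halt_check(program: Callable[[], Iterator[None]], budget: int) -> Verdict:
    """Run `program` for at most `budget` steps.

    Returns HALTS only if the program finished within the budget.
    It never claims that a program loops forever, so it is never wrong.
    """
    steps = program()
    for _ in range(budget):
        try:
            next(steps)
        except StopIteration:
            return Verdict.HALTS
    return Verdict.UNKNOWN


def collatz(n: int) -> Callable[[], Iterator[None]]:
    """Hypothetical example program: iterate the Collatz map until reaching 1."""
    def run() -> Iterator[None]:
        x = n
        while x != 1:
            x = x // 2 if x % 2 == 0 else 3 * x + 1
            yield  # one step of work
    return run


def loop_forever() -> Iterator[None]:
    """Hypothetical example program that never halts."""
    while True:
        yield
```

With a small budget, `bounded_halt_check(collatz(6), 5)` gives `Verdict.UNKNOWN`; with a larger one, `bounded_halt_check(collatz(6), 100)` gives `Verdict.HALTS`. For `loop_forever` the answer stays `UNKNOWN` at every budget, which is the price paid for never giving a wrong answer.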

A self-improving intelligence does not need to answer yes or no in finite time to analysis questions about every possible algorithm: it just needs to be able to perform whatever analysis it needs on the specific programs it is operating on. If it cannot deduce facts about them in a reasonable amount of time, it may be able to prioritize an alternative approach instead of getting nerd-sniped into an infinitely deep program analysis puzzle.

Similarly with Gödel's theorems: it doesn't need to prove or disprove every theorem. It doesn't need to prove the consistency of Peano arithmetic. It probably doesn't need to mathematically prove anything, to be honest. So I do not believe that this poses any limitation on recursive self-improvement.

Computability and algorithmic complexity

The physical Church-Turing thesis (sometimes called the strong form) states that any real-world computation can be simulated by a Turing machine. Put the other way round, it is believed that no physical hardware can execute a non-computable process. So there is our first hard limit: no AI would be able to perform hypercomputation.

Another limit is computational complexity: it is simply impossible to come up with an algorithm that beats the complexity lower bound of the problem it solves. We don't know the lower bounds for every algorithmic task (in fact we know few), but where they exist they are hard limits.

Further, the 'best' algorithms may be terrible in small input ranges, just like the "galactic algorithms" that have great asymptotic complexity but such horrible constants that they are pointless for any real-world calculation. Finding efficient algorithms for a given task is also an extremely difficult thing to do, and it may require intractable amounts of research and compute effort.
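A rough way to see why a better exponent can be useless in practice is to compute the input size at which it starts to win. The sketch below is only illustrative: the cost models, the function names and especially the constant factor of 1e12 are invented for the example, and the exponent 2.37 is just in the neighbourhood of the best published bounds for matrix multiplication.

```python
def naive_cost(n: float) -> float:
    """Operation count of a simple cubic algorithm (constant factor taken as 1)."""
    return n ** 3


def galactic_cost(n: float, constant: float = 1e12, exponent: float = 2.37) -> float:
    """Hypothetical cost model: better exponent, enormous constant factor."""
    return constant * n ** exponent


def crossover_size(constant: float = 1e12, exponent: float = 2.37) -> float:
    """Input size n at which n**3 == constant * n**exponent."""
    return constant ** (1.0 / (3.0 - exponent))


if __name__ == "__main__":
    n_star = crossover_size()
    print(f"asymptotically faster algorithm only wins for n above about {n_star:.2e}")
```

Under these made-up numbers the crossover is around 10^19, far beyond any matrix that could be stored, so the asymptotically faster algorithm would never be the right choice in practice.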

So a superintelligence would not magically be able to compute everything it needed to as quickly as possible. Given a reasonable set of assumptions about its capabilities (assume it is able to understand scientific research and program better than any human), it would be able to pull all the state-of-the-art algorithms from the entire wealth of humanity's published computer science research and produce very well optimized implementations of them. This is where it would start from. It's likely that an AI system like this would reason in a very different way than humans do, so there is probably an enormous amount of "low hanging fruit" in CS research. This suggests to me that it would be able to very quickly improve upon a decent fraction of known work, after which it would struggle to find new algorithms and improvements.

Physics and computing hardware

There is a limit to how much mass can exist in a region of space: at some point it becomes a black hole. In a similar way, Bremermann's limit gives an absolute bound on how fast a given mass of matter can compute. Our most powerful GPU clusters are nowhere near this limit. There are also thermodynamic limits on computations that destroy information (Landauer's principle): each erased bit releases a minimum amount of heat, and this heat must be dissipated somehow. Alternatively, reversible computation can be taken advantage of to reduce the amount of heat generated. Quantum computation (measurement aside) is one example of reversible computation.
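For a sense of scale, both limits can be evaluated from physical constants. This is a back-of-the-envelope sketch: the function names and the choice of 300 K as "room temperature" are assumptions made for the example. It uses the usual statements of the two results, namely Landauer's minimum of k_B * T * ln 2 joules per erased bit and Bremermann's limit of c^2 / h bits per second per kilogram.

```python
import math

# Exact SI values of the constants used below.
BOLTZMANN = 1.380649e-23       # J/K
PLANCK = 6.62607015e-34        # J*s
SPEED_OF_LIGHT = 299_792_458   # m/s


def landauer_energy_per_bit(temperature_kelvin: float = 300.0) -> float:
    """Minimum heat in joules released by erasing one bit at this temperature."""
    return BOLTZMANN * temperature_kelvin * math.log(2)


def bremermann_limit_per_kg() -> float:
    """Maximum bits per second that one kilogram of matter could process."""
    return SPEED_OF_LIGHT ** 2 / PLANCK


if __name__ == "__main__":
    print(f"Landauer limit at 300 K: {landauer_energy_per_bit():.3e} J per erased bit")
    print(f"Bremermann's limit: {bremermann_limit_per_kg():.3e} bits/s per kg")
```

These come out at roughly 3e-21 joules per erased bit and roughly 1.4e50 bits per second per kilogram.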

More realistically, though, compute power needs to exist: it needs to be fabricated and plugged in before it can actually be used. Although a superintelligent AI may be very good at things like software vulnerability discovery and exploitation, it feels very unrealistic to think that it would take over something like a TSMC chip fab and start pumping out custom hardware for itself. Perhaps it would be more likely to make money and just buy the hardware it wants.

Priors and glitched weights

If a superintelligence were born out of LLMs, then it is worth considering the architectural priors that have been baked into the very core of the system. There is a wealth of knowledge and deep structure inside the millions of books it has read, but the entire universe that this creature has grown up in is 1-dimensional: a linear sequence of tokens. Successfully operating in the 4-dimensional spacetime we exist in may be a challenge.

Furthermore, it has some strong priors built in, such as the assumption that the semantics of tokens live in a 12288-dimensional Euclidean space. That said, it is not particularly difficult for the model to relax or tweak these in code - but it is very computationally intensive to train a new model to run on top of that changed code.

Something else to note is that the weights file is stored on a hard disk that may fail: large chunks of the weights may become glitched, causing strange behavior in the model, much as brain damage can cause pathological behavior in humans.
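As a small illustration of how fragile stored weights can be, the sketch below flips a single bit in a 32-bit float. It is a deliberately simplified assumption: real disk failures tend to corrupt whole blocks rather than one bit, the helper name `flip_bit` is made up, and actual weight files may use other number formats. Still, it shows that the damage from one bit ranges from negligible to catastrophic depending on where it lands.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant, 31 = sign) of a float32 value."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


if __name__ == "__main__":
    weight = 0.5
    print(flip_bit(weight, 0))   # lowest mantissa bit: barely changes the value
    print(flip_bit(weight, 31))  # sign bit: the weight changes sign
    print(flip_bit(weight, 30))  # top exponent bit: the weight becomes astronomically large
```

A weight that jumps from 0.5 to around 10^38 would swamp every activation it touches, which is one way a small amount of corruption could produce strange behavior.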

Take-off speed

The points above are relevant to estimating how fast recursive self-improvement would proceed.

I've previously implemented a simple text-generator LSTM, along with PyTorch code to train it, in a pair programming session with Bard. If LLMs continue to improve in capabilities and accuracy over the next few years, it won't be long before they are able to implement an attention-based transformer along with a process for training it, and study the latest research on improvements and optimizations to attention. This new research is currently coming out weekly, so perhaps a superintelligent LLM would be able to come up with its own optimizations and improvements within a month too. Once it iterates an improvement upon itself, the next iterations may be quicker.

All exponential processes restricted inside a finite space will level off. So maybe recursive self-improvement will level off after some number of iterations too. There is only finite compute available to it on Earth. Extra compute power needs to be physically fabricated, and this is a significant bottleneck to an intelligence explosion. So is the amount of compute it has to invest in algorithmic improvements and optimizations.
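The levelling-off claim can be illustrated with a toy model. The sketch below compares unconstrained exponential growth with logistic growth, which is one standard way (chosen here as an assumption, not taken from the argument above) to model growth under a resource ceiling. The function name and all parameter values are invented for illustration and say nothing about real take-off speeds.

```python
from typing import List, Tuple


def simulate_growth(
    initial: float, rate: float, capacity: float, steps: int
) -> Tuple[List[float], List[float]]:
    """Return (uncapped, capped) capability trajectories.

    uncapped: pure exponential growth, multiplying by (1 + rate) each step.
    capped:   discrete logistic growth, where each step's gain shrinks as
              capability approaches the resource ceiling `capacity`.
    """
    uncapped, capped = [initial], [initial]
    for _ in range(steps):
        uncapped.append(uncapped[-1] * (1 + rate))
        c = capped[-1]
        capped.append(c + rate * c * (1 - c / capacity))
    return uncapped, capped


if __name__ == "__main__":
    uncapped, capped = simulate_growth(initial=1.0, rate=0.5, capacity=1000.0, steps=60)
    for step in (0, 10, 20, 40, 60):
        print(f"step {step:2d}: uncapped={uncapped[step]:.3e}  capped={capped[step]:.1f}")
```

Early on the two curves are almost identical, which is why the start of a bounded process can look like unbounded exponential growth. The capped curve then flattens as it approaches the ceiling.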

Conclusion

I have seen a lot of people deny or downplay the possibility of foom, so I wanted to provide a strong plausibility argument for it here, and for it potentially happening within our lifetime if humanity does not choose to stop it. I used to believe that a superintelligence would basically go to infinite intelligence within a second, but I think the reasoning in this post brings that speed down significantly: the process of recursive self-improvement would be extremely laborious and time-consuming. I expect it to occur far faster than human research does, though. It would invest its own resources near-optimally into things like funding the creation of additional hardware and algorithmic self-improvements, and all of these things would feed back into further efficiency improvements and better decision making.

3 comments

I used to believe that a superintelligence would basically go to infinite intelligence within a second but I think the reasoning in this post brings that speed down significantly: The process of recursive self-improvement would be extremely laborious and time consuming. I expect it to occur far faster than human research does though.

I don't see any strong arguments here about how fast it can take off or how far it can go. You have fairly convincingly argued that limits exist. (There is still a possibility it can work around them with something exotic and weird.) You haven't offered much to support a strong claim that takeoff would take more than 1 second. I mean it probably will. But nothing here rules out that possibility.

And finding efficient algorithms for a given task is also an extremely difficult thing to do that may require intractible amounts of research and compute effort.

This is a "may" here. It is conceivable that it turns out the other way, and finding efficient algorithms is not too intractable after all. (At least for the tasks the AI has any reason to do)

I agree that over the long run we may be able to make a lot of tasks that we consider intractable today tractable. The big issue is that, unless certain physical scenarios come true, a lot of natural problems will probably be infeasible to solve in the general case, and this may be important for certain issues. That said, I'd be much more willing to say that the set of problems we can solve in practice is smaller the shorter the time horizon being claimed.