Eliezer Yudkowsky wrote a post on Facebook on Oct 17, where I replied at the time. Yesterday he reposted that here (link), minus my responses. So I’ve composed the following response to put here:
I have agreed that an AI-based economy could grow faster than does our economy today. The issue is how fast the abilities of one AI system might plausibly grow, relative to the abilities of the entire rest of the world at that time, across a range of tasks roughly as broad as the world economy. Could one small system really “foom” to beat the whole rest of the world?
As many have noted, while AI has often made impressive and rapid progress in specific narrow domains, it is much less clear how fast we are progressing toward human level AGI systems with scopes of expertise as broad as those of the world economy. Averaged over all domains, progress has been slow. And at past rates of progress, I have estimated that it might take centuries.
Over the history of computer science, we have developed many general tools with simple architectures, built from other general tools, that allow superhuman performance on many specific tasks scattered across a wide range of problem domains. For example, we have superhuman ways to sort lists, and linear regression, built from simple general tools like matrix inversion, allows superhuman prediction.
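For concreteness, here is a minimal sketch of that last example: least-squares prediction built directly from matrix inversion. The data and numbers are made up purely for illustration.

```python
import numpy as np

# Toy data: noisy observations of y = 3*x1 - 2*x2 + 5
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = X @ np.array([3.0, -2.0]) + 5.0 + rng.normal(scale=0.1, size=1000)

# Ordinary least squares via the normal equations: beta = (X'X)^-1 X'y
X1 = np.hstack([X, np.ones((len(X), 1))])     # add an intercept column
beta = np.linalg.inv(X1.T @ X1) @ (X1.T @ y)  # the "simple general tool": matrix inversion
print(beta)                                   # close to [3, -2, 5]
```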
Yet the existence of a limited number of such tools has so far been far from sufficient to enable anything remotely close to human level AGI. AlphaGo Zero is (or is built from) a new tool in this family, and its developers deserve our praise and gratitude. And we can expect more such tools to be found in the future. But I am skeptical that it is the last such tool we will need, or even remotely close to the last such tool.
For specific simple tools with simple architectures, architecture can matter a lot. But our robust experience with software has been that even when we have access to many simple and powerful tools, we solve most problems via complex combinations of simple tools. Combinations so complex, in fact, that our main issue is usually managing the complexity, rather than including the right few tools. In those complex systems, architecture matters a lot less than does lots of complex detail. That is what I meant by suggesting that architecture isn’t the key to AGI.
You might claim that once we have enough good simple tools, complexity will no longer be required. With enough simple tools (and some data to crunch), a few simple and relatively obvious combinations of those tools will be sufficient to perform most all tasks in the world economy at a human level. And thus the first team to find the last simple general tool needed might “foom” via having an enormous advantage over the entire rest of the world put together. At least if that one last tool were powerful enough. I disagree with this claim, but I agree that neither view can be easily and clearly proven wrong.
Even so, I don’t see how finding one more simple general tool can be much evidence one way or another. I never meant to imply that we had found all the simple general tools we would ever find. I instead suggest that simple general tools just won’t be enough, and thus finding the “last” tool required also won’t let its team foom.
The best evidence regarding the need for complexity in strong broad systems is the actual complexity observed in such systems. The human brain is arguably such a system, and when we have artificial systems of this sort they will also offer more evidence. Until then one might try to collect evidence about the distribution of complexity across our strongest broadest systems, even when such systems are far below the AGI level. But pointing out that one particular capable system happens to use mainly one simple tool, well that by itself can’t offer much evidence one way or another.
Honest question - is there some specific technical sense in which you are using "complex"? Colloquially, complex just means a thing consisting of many parts. Any neural network is "a thing consisting of many parts", and I can generally add arbitrarily many "parts" by changing the number of layers or neurons-per-layer or whatever at initialization time.
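To make that colloquial sense concrete, a toy "part count" for a plain fully connected network, where the number of parts is fixed arbitrarily at initialization time (the layer sizes below are made up):

```python
def mlp_param_count(layer_sizes):
    """Count weights plus biases in a plain fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(mlp_param_count([784, 100, 10]))         # 79,510 "parts"
print(mlp_param_count([784, 1000, 1000, 10]))  # 1,796,010 "parts", by fiat
```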
I don't think this is what you mean, though. You mean something like architectural complexity, though I think the word "architectural" is a weasel word here that lets you avoid explaining what exactly is missing from, e.g., AlphaGo Zero. I think by "complex" you mean something like "a thing consisting of many distinct sub-things, with the critical functional details spread across many levels of components and their combinations". Or perhaps "the design instructions for the thing cannot be efficiently compressed". This is the sense in which the brain is more "complex" than the kidneys.
(Although, the design instructions for the brain can be efficiently compressed, and indeed brains are made from surprisingly simple instructions. A developed, adult brain can't be efficiently compressed, true, but that's not a fair comparison. A blank-slate initialized AlphaGo network, and that same network after training on ten million games of Go, are not the same artifact.)
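(A sketch of that distinction, with sizes made up rather than AlphaGo Zero's real ones: the blank-slate network is reproducible from a few dozen bytes of architecture spec plus a random seed, while reproducing the trained network requires storing the learned weights themselves, or rerunning the whole training process.)

```python
import numpy as np

# The blank-slate artifact: fully determined by a tiny description.
spec = {"layers": [6137, 256, 362], "seed": 42}   # a few dozen bytes
rng = np.random.default_rng(spec["seed"])
init_weights = [rng.normal(size=(a, b))
                for a, b in zip(spec["layers"], spec["layers"][1:])]

# The trained artifact: to reproduce it you must store the learned weights
# (or redo the entire training run on all its data).
weight_bytes = sum(w.nbytes for w in init_weights)
print(len(repr(spec)), "bytes of spec vs", weight_bytes, "bytes of weights")  # ~13 MB of weights
```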
Other words aside from "complex" and "architecture" that I think you could afford to taboo for the sake of clarity are "simple" and "general". Is the idea of a neural network "simple"? Is a convnet "general"? Is MCTS a "simple, general" algorithm or a "complex, narrow" one? These are bad questions because all those words must be defined relative to some standard that is not provided. What problem are you trying to solve, what class of techniques are you regarding as relevant for comparison? A convnet is definitely "complex and narrow" compared to linear regression, as a mathematical technique. AlphaGo Zero is highly "complex and narrow" relative to a vanilla convnet.
If your answer to "how complex and how specific a technique do you think we're missing?" is always "more complex and more specific than whatever Deepmind just implemented", then we should definitely stop using those words.
I think that's a good way of framing it. Imagine it's the far future, long after AI is a completely solved problem. Just for fun, somebody writes the smallest possible fully general seed AI in binary code. How big is that program? I'm going to guess it's not bigger than 1 GB. The human genome is ~770 MB. Yes, it runs on "chemistry", but the laws of physics underpinning that chemistry don't actually take that many bytes to specify. Certainly not hundreds of megabytes.
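(A quick back-of-the-envelope check of that genome figure, assuming roughly 3.1 billion base pairs at 2 bits each:)

```python
base_pairs = 3.1e9        # approximate length of the human genome
bits_per_base = 2         # four bases (A, C, G, T) -> 2 bits each
megabytes = base_pairs * bits_per_base / 8 / 1e6
print(round(megabytes))   # ~775, consistent with the ~770 MB figure
```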
Maybe a clearer question would be, how many bytes do