Moreover, granting neural networks, trading cognitive content has turned out to be not particularly hard. It does not require superintelligence to share representations between different neural networks; a language model can be adapted to handle visual data without enormous difficulty. Encodings from BERT or an ImageNet model can be applied to a variety of downstream tasks, and this is by now a standard element in toolkits and workflows. When you share architectures and training data, as for two differently fine-tuned diffusion models, you can get semantically meaningful merges between networks simply by taking the actual averages of their weights. Thoughts are not remotely "written in a different language."
Huh, I am very surprised by this section. When I read the description I thought you would obviously call this prediction the other way around.
I would also call this one for Eliezer. I think we mostly just retrain AI systems without reusing anything. I think that's what you'd guess on Eliezer's model, and very surprising on Robin's model. The extent to which we throw things away is surprising even to a very simple common-sense observer.
I would have called "Human content is unimportant" for Robin---it seems like the existing ML systems that are driving current excitement (and are closest to being useful) lean extremely heavily on imitation of human experts and mostly don't make new knowledge themselves. So far game-playing AI has been an exception rather than the rule (and this special case was already mostly established by the time of the debate).
That said, I think it would be reasonable to postpone judgment on most of these questions since we're not yet in the end of days (Robin thinks it's still fairly far, and Eliezer thinks it's close but things will change a lot by the intelligence explosion). The main ones I'd be prepared to call unambiguously already are:
These aren't literally settled, but market odds have moved really far since the debate, and they both seem like defining features of the current world. In each case I'd say that one of the two participants was clearly super wrong and the other was basically right.
Coming in late, but the surprising thing on Yudkowsky's models is that compute was way more important than he realized: even on the models most favorable to Yudkowsky, the split between compute and algorithms is usually about 50/50, which means compute increases are not negligible and algorithms aren't totally dominant.
Even granting the assumption that algorithms will increasingly be a bottleneck, and compute being less important, Yudkowsky way overrated the power of algorithms/thinking hard compared to just getting more resources/scaling.
The part where you can average weights is unique to diffusion models, as far as I can tell, which makes sense because the 2-d structure of the images is very local, and so this establishes a strong preferred basis for the representations of different networks.
Exchanging knowledge between two language models currently seems approximately impossible? Like, you can train on the outputs, but I don't think there is really any way for two language models to learn from each other by exchanging any kind of cognitive content, or to improve the internal representations of a language model by giving it access to the internal representations of another language model.
There's a pretty rich literature on this stuff, transferring representational/functional content between neural networks.
Averaging weights to transfer knowledge is not unique to diffusion models. It works on image models trained with non-diffusion setups (https://arxiv.org/abs/2203.05482, https://arxiv.org/abs/2304.03094) as well as on non-image tasks such as language modeling (https://arxiv.org/abs/2208.03306, https://arxiv.org/abs/2212.04089). Exchanging knowledge between language models via weight averaging is possible pr...
I think requiring a "common initialization + early training trajectory" is a pretty huge obstacle to knowledge sharing, and would de-facto make knowledge sharing among the vast majority of large language models infeasible.
Agreed. That part of my comment was aimed only at the claim about weight averaging only working for diffusion/image models, not about knowledge sharing more generally.
I do think stuff like stitching via cross-attention is kind of interesting, but it feels like a non-scalable way of knowledge sharing, unless I am misunderstanding how it works.
Not sure I see any particular argument against the scalability of knowledge exchange between LLMs in general or via cross-attention, though. Especially if we're comparing the cost of transfer to the cost of re-running the original training. That's why people are exploring this, especially smaller/independent researchers. There's a bunch of concurrent recent efforts to take frozen unimodal models and stitch them into multimodal ones (example from a few days ago https://arxiv.org/abs/2305.17216). Heck, the dominant approach in the community of LLM hobbyists seems to be transferring behaviors and knowledge from GPT-4 into LLaMa variants via targeted synthetic data generation. What kind of scalability are you thinking of?
In addition to what cfoster0 said, I'm kinda excited about the next ~2-3 years of cross LLM knowledge transfer, so this seems a differing prediction about the future, which is fun.
My model for why it hasn't happened already is in part just that most models know the same stuff, because they're trained on extremely similar enormous swathes of text, so there's no gain to be had by sticking them together. That would be why more effort goes into LLM / images / video glue than LLM / LLM glue.
But abstractly, a world where LLMs can meaningfully be connected to vision models but not to other LLMs would be surprising to me. I expect something like training a model on code, and another model on non-code text, and then sticking them together to be possible.
Note that Hanson currently thinks the chances of AI doom are < 1%, while Yudkowsky thinks that they are > 99%.
It is good to note that the optimistic version of Hanson would be considered doom by many (including Yudkowsky). Yudkowsky's definition of doom/utopia is not the same as Hanson's.
This is important in many discussions. Many non-doomers have definitions of utopia that many consider to be dystopian. E.g., AI will replace humans to create a very interesting future where the AIs conquer the stars; some think this is positive, while others think this is doom because there are no humans.
I skimmed this, but I get the sense that you're interpreting Hanson's predictions in ways that he would not have agreed with. My cached thoughts suggest that Hanson's model predicts deep learning couldn't possibly work, because creating "intelligence" will require lots of custom engineering for different skills instead of "GPU go brr". Hence his admiration of Cyc: it is focusing on implementing a whole host of skills with lots of integrated knowledge.
See his post "I heart CYC". Here's a quote from it, which I think highlights Hanson's own interpretation of "architecture is overrated":
The lesson Lenat took from EURISKO is that architecture is overrated; AIs learn slowly now mainly because they know so little. So we need to explicitly code knowledge by hand until we have enough to build systems effective at asking questions, reading, and learning for themselves. Prior AI researchers were too comfortable starting every project over from scratch; they needed to join to create larger integrated knowledge bases. This still seems to me a reasonable view, and anyone who thinks Lenat created the best AI system ever should consider seriously the lesson he thinks he learned.
That sure doesn't l...
If we take this as the disagreement -- will AI progress come from a handful of big insights, or many small ones -- I think the world right now looks a great deal more like Hanson's view than Yudkowsky's. In his interview with Lex Fridman, Sam Altman characterizes GPT-4 as improving on GPT-3 in a hundred little things rather than a few big things, and that's... by far... my impression of current ML progress. So when I interpret their disagreement in terms of the kind of work you need to do before attaining AGI, I tend to agree that Hanson is right.
This also feels confused to me. Of course the key insight of the Transformer architecture was super simple, and as far as I can tell the primary difference between GPT-4 and GPT-3 is throwing a lot more compute at it, combined with a lot of engineering work to get it to work at larger scales and more GPUs (in a way that doesn't substantially improve performance).
We don't know how GPT-4 works, but I would currently bet that within 2-3 years we will see a system that gets GPT-4 performance and compute-efficiency whose source-code is extremely simple and does not require a lot of clever hacks, but whose difference from GPT-3 will be best characterized by "0 to 2 concrete insights that improved things", since that is exactly what we've seen with GPT-2 and GPT-3. The first system to reach a capability threshold often has a bunch of hacks, usually stemming from a lack of polish or understanding or just bugs, which then iteratively get pared down as progress continues.
I agree I'm confused here. But it's hard to come down to clear interpretations. I kinda think Hanson and Yudkowsky are also confused.
Like, here are some possible interpretations on this issue, and how I'd position Hanson and Yudkowsky on them based on my recollection and on vibes.
Or even 4. People know when they're making a significant improvement to AI -- the difference between "clever hack" and "deep insight" is something you see from beforehand just as much as afterwards. (Hanson vibes con, Yudkowsky vibes pro, gotta read 1000 pages of philosophy of progress before you call it, maybe depends on the technology, I tend to think people often don't know)
Which is why this overall section is in the "hard to call" area.
But -- regardless of Yudkowsky's current position -- it still remains that you'd have been extremely surprised by the last decade's use of compute if you had believed him, and much less surprised if you had believed Hanson.
I think you are pointing towards something real here, but also, algorithmic progress is currently outpacing compute growth by quite a bit, at least according to the Epoch AI estimates I remember. I also expect algorithmic progress to increase in importance.
I do think that some of the deep learning revolution turned out to be kind of compute bottlenecked, but I don't believe this is currently that true anymore, though I think it's kind of messy (since it's unclear what fraction of compute-optimizations themselves were bottlenecked on making it cheaper to experiment by having cheaper compute).
I do think that some of the deep learning revolution turned out to be kind of compute bottlenecked, but I don't believe this is currently that true anymore
I had kind of the exact opposite impression of compute bottlenecks (that deep learning was not meaningfully compute bottlenecked until very recently). OpenAI apparently has a bunch of products and probably also experiments that are literally just waiting for H100s to arrive. Probably this is mainly due to the massive demand for inference, but still, this seems like a kind of actual hardware bottleneck that is pretty new for the field of DL. It kind of has a parallel to Bitcoin mining technology, where the ability to get the latest-gen ASICs first was (still is?) a big factor in miner profitability.
Note that Hanson currently thinks the chances of AI doom are < 1%
I think this is a common misconception of Hanson's views. If you define "doom" as human extinction, he's put it at about 30% within one year after human-level AI (I don't have a more recent link on hand but I've seen him talk about it on Twitter a few times, and I don't think he's changed his views substantially).
I think this is a pretty good and fair roundup, but I want to add a very lazy bit of personal context short of actually explaining my takes:
Both when I read the FOOM debate, and skimming over it again now, in my personal opinion Yudkowsky largely comes off better. Yudkowsky makes a few major mistakes that are clearly visible now, like being dismissive of dumb, scaled, connectionist architectures, but the arguments seem otherwise repairable. Contra, I do not know how to defend Hanson's position well.
I don't state this to claim a winner, and for sure there are people who read the arguments the other way, but only to suggest to the reader, if you have the time, consider taking a look and forming your own opinion.
Yudkowsky seems quite wrong here, and Hanson right, about one of the central trends -- and maybe the central trend -- of the last dozen years of AI. Implementing old methods more vigorously is more or less exactly what got modern deep learning started; algorithms in absence of huge compute have achieved approximately nothing.
Really? If you sent a bunch of H100 GPUs (and infrastructure needed to run them) back in time to 2008, people might have been able to invent transformers, GPTs, and all the little quirks that actually make them work a little faster, and a little more cheaply.
OTOH, if you sent back Attention is all you need (and some other papers or documentation on ML from the last decade), without the accompanying hardware, people likely would have gotten pretty far, pretty quickly, just using 2008-level hardware (or buying / building more and faster hardware, once they knew the right algorithms to run on them). People didn't necessarily have a use for all the extra compute, until they invented the algorithms which could actually make use of it.
Even today, just scaling up GPTs even further is one obvious thing to try that is currently somewhat bottlenecked on super...
An actual improvement to say, how Transformers work, would help with speech recognition, language modelling, image recognition, image segmentation, and so on and so forth. Improvements to AI-relevant hardware are a trillion-dollar business. Work compounds so easily on other work that many alignment-concerned people want to conduct all AI research in secret.
This section feels like it misunderstands what Yudkowsky is trying to say here, though I am not confident. I expected this point to not be about "what happens if you find an improvement to transformers i...
I think an important point missing from the discussion on compute is training vs inference: you can totally get a state-of-the-art language model performing inference on a laptop.
This is a slight point in favor of Yudkowsky: thinking is cheap, finding the right algorithm (including weights) is expensive. Right now we're brute-forcing the discovery of this algorithm using a LOT of data, and maybe it's impossible to do any better than brute-forcing. (Well, the human brain can do it, but I'll ignore that.)
Could you run a LLM on a desktop from 2008? No. But, o...
So, like, I remain pretty strongly pro Hanson on this point:
I think LLaMA 7b is very cool, but it's really stretching it to call it a state-of-the-art language model. It's much worse than LLaMA 65b, which is much worse than GPT-4, which most people think is > 100b as far as I know. I'm using a 12b model right now while working on an interpretability project... and it is just much, much dumber than these big ones.
Not being able to train isn't a small deal, I think. Learning in a long-term way is a big part of intelligence.
Overall, and not to be too glib, I don't see why fitting a static and subhuman mind into consumer hardware from 2023 means that Yudkowsky doesn't lose points for saying you can fit a learning (implied) and human-level mind into consumer hardware from 2008.
which is indifferent to the simplicify of the architecture the insight lets you find.
The bolded should be "simplicity".
I really disagree with this article. It's basically just saying that you drank the LLM Kool-Aid. LLMs are massively overhyped. GPT-x is not the way to AGI.
This article could have been written a dozen years ago. A dozen years ago, people were saying the same thing: "we've given up on the Good Old-Fashioned AI / Douglas Hofstadter approach of writing algorithms and trying to find insights! it doesn't give us commercial products, whereas the statistical / neural network stuff does!"
And our response was the same as it is today. GOFAI is hard. No one expected t...
It does not require superintelligence to share representations between different neural networks
I don’t think you can train one transformer on a dataset that doesn’t contain any mentions of the fact X but mentions fact Y, then train the second transformer on a dataset that contains Y but not X, and then easily share the knowledge of X and Y between them
By 2011, Hanson concedes at least somewhat to Yudkowsky's position and states that Cyc might not have enough information or be in the wrong format (FOOM, 496).
I looked for it on that page, but instead it's on 497 (second-to-last numbered paragraph), where he says:
4. The AI system Eliezer most respects for its promising architecture is eurisko. Its author, Doug Lenat, concluded from it that our main obstacle is not architecture but mental content—the more one knows, the faster one can learn. Lenat’s new Cyc system has much content, though it still doesn’t learn fast. Cyc might not have enough content yet, or perhaps Lenat sought the wrong content or format.
Thank you, this has many interesting points. The takeoff question is the heart of predicting x-risk. With soft takeoff, catastrophe seems unlikely; with hard takeoff, it seems likely.
One point though. "Foom" was intended to be a synonym for "intelligence explosion" and "hard takeoff". But not for "recursive self-improvement", although EY perceived the latter to be the main argument for the former, though not the only one. He wrote:
...[Recursive self-improvement] is the biggest, most interesting, hardest-to-analyze, sharpest break-with-the-past contributing to the
TLDR
Starting in 2008, Robin Hanson and Eliezer Yudkowsky debated the likelihood of FOOM: a rapid and localized increase in some AI's intelligence that occurs because an AI recursively improves itself.
As Yudkowsky summarizes his position:
Over the course of this debate, both Hanson and Yudkowsky made a number of incidental predictions about things which could occur before the advent of artificial superintelligence -- or for which we could at the very least receive strong evidence before artificial superintelligence.
On the object level, my conclusion is that when you examine these predictions, Hanson probably does a little better than Yudkowsky. Although depending on how you weigh different topics, I could see arguments from "they do about the same" to "Hanson does much better."
On one meta level, my conclusion is that Hanson's view -- that we should try to use abstractions that have proven prior predictive power -- looks like a pretty good policy.
On another meta level, my conclusion -- springing to a great degree from how painful seeking clear predictions in 700 pages of words has been -- is that if anyone says "I have a great track record" without pointing to specific predictions that they made, you should probably ignore them, or maybe point out their lack of epistemic virtue if you have the energy to spare for doing that kind of criticism productively.
Intro
There are a number of difficulties involved in evaluating some public figure's track record. We want to avoid cherry-picking sets of particularly good or bad predictions. And we want to have some baseline to compare them to.
We can mitigate both of these difficulties -- although not, alas, eliminate them -- by choosing one document to evaluate: "The Hanson-Yudkowsky Foom Debate". (All future page numbers refer to this PDF.) Note that the PDF includes (1) the debate-via-blogposts which took place on OvercomingBias, (2) an actual in-person debate that took place at Jane Street in 2011, and (3) further summary materials from Hanson (further blogposts) and Yudkowsky ("Intelligence Explosion Microeconomics"). This spans a period from 2008 to 2013.
I do not intend this to be a complete review of everything in these arguments.
The discussion spans the time from the big bang until hypothetical far future galactic civilizations. My review is a little more constrained: I am only going to look at predictions for which I think we've received strong evidence in the 15 or so years since the debate started.
Note also that the context of this debate was quite different than it would be if it happened today.
At the time of the debate, both Hanson and Yudkowsky believed that machine intelligence would be extremely important, but that the time of its arrival was uncertain. They thought that it would probably arrive this century, but neither had the very certain, short timelines which are common today.
At this point Yudkowsky was interested in actually creating a recursively self-improving artificial intelligence, a "seed AI." For instance, in 2006 the Singularity Institute -- what MIRI was before it was renamed -- had a website explicitly stating that they sought funding to create recursively self-improving AI. During the Jane Street debate Yudkowsky humorously describes the Singularity Institute as the "Institute for Carefully Programmed Intelligence Explosion."
So this context is quite different than today.
I think that if I make a major mistake in the below, it's probably that I missed some major statements from Hanson or Yudkowsky, rather than drastically mis-calling the items that I did include. Such a mistake could happen in part because I have tried to be conservative, and mostly included predictions which seem to have multiple affirmations in the text. But I definitely did skim-read parts of the debate that seemed irrelevant to predictions, such as the parts about the origin of life or about the morality of taking over the world with a superhuman AI. So I could very well have missed something.
Feel free to mention such missed predictions in the comments, although please quote and cite page numbers. Rereading this has confirmed my belief that the recollected mythology of positions advanced during this debate is... somewhat different than what people's actual positions were.
Predictions -- Relatively Easy To Call
In this section, I'm going to include predictions which appear to me relatively straightforward.
I don't think many people who read the FOOM debate for the first time now would dispute them. Although they are nevertheless disputable if you try really hard, like everything.
I'll phrase each prediction so that Yudkowsky takes the positive, and Hanson the negative.
"Cyc is not a Promising Approach to Machine Intelligence"
Cyc was (and is) an effort to build an artificial intelligence by building a vast database of logically-connected facts about the world by hand. So a belief like "Bob works as an engineer" is represented by relating the entity <Bob> to <engineer> with <works-as>, in this database. These facts would then be entered into an "inference engine," which can reason about them in long chains of valid proofs. Right now, CycCorp claims Cyc has a knowledge base with 25 million axioms, 40,000 predicates, and so on. Its creator, Douglas Lenat, moved on to Cyc from Eurisko because he decided AI needed a large base of knowledge to work correctly.
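To make the representation concrete, here is a minimal sketch in Python of facts stored as (subject, predicate, object) triples, with hand-written rules applied until nothing new can be derived. This is a toy illustration only: it is not CycL or Cyc's actual inference engine, and the function names and both rules are invented for the example.

```python
# Toy illustration only: not CycL and not Cyc's real inference engine.
# Facts are (subject, predicate, object) triples, mirroring the
# <Bob> <works-as> <engineer> example in the text.

def forward_chain(facts, rules):
    """Apply every rule repeatedly until no rule derives a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new_facts = rule(known) - known
            if new_facts:
                known |= new_facts
                changed = True
    return known


def workers_are_employed(known):
    """Hypothetical rule: anyone who works as something is an employed person."""
    return {(s, "is-a", "employed-person") for (s, p, o) in known if p == "works-as"}


def employed_people_earn_income(known):
    """Hypothetical rule: employed people earn income."""
    return {(s, "earns", "income") for (s, p, o) in known
            if p == "is-a" and o == "employed-person"}


# One hand-entered fact, as a human knowledge engineer might add it.
facts = {("Bob", "works-as", "engineer")}
rules = [employed_people_earn_income, workers_are_employed]

# Two-step chain: works-as -> employed-person -> earns income.
derived = forward_chain(facts, rules)
```

Even in this toy form, every fact and every rule has to be written by a person, which is the scaling cost that the disagreement over Cyc turns on.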
Hanson thinks that this is a promising approach, stating:
On the other hand, Yudkowsky thinks Cyc has approximately zero chance of working well:
Yudkowsky seems obviously right from where we stand now; Cyc was not promising.
The most advanced modern AI systems have zero need to import knowledge from Cyc, and Cyc's abilities pale beside modern LLMs. By 2011, Hanson concedes at least somewhat to Yudkowsky's position and states that Cyc might not have enough information or be in the wrong format (FOOM, 496).
Aside: What counts as being right
I think that Yudkowsky is obviously right here.
But if you wished, you could say that Hanson's position that "Cyc is promising" has not been entirely falsified. CycCorp still appears to have customers. Their product functions. They advertise the conclusions of Cyc as auditable, in a way that the conclusions of DL are not, and this is true. Functionally, Cyc is surpassed by machine learning for basically everything -- but you could say that in the future the approach might possibly turn things around. It's a logically coherent thing to say.
Nevertheless -- I'm comfortable saying that Cyc is the wrong approach, and that Yudkowsky clearly had the better predictions about this. As Yudkowsky said even in 2011, Cyc being promising has "been incrementally more and more falsified" each year (FOOM 476), and each year since 2011 has been further incremental falsification.
My basic criterion for judgement is that, if you had believed Hanson's view, you'd have been waaaaaaaaay more surprised by the future than if you had believed Yudkowsky's view. This will be the approach I'm taking for all the other predictions as well.
"AI Comes Before Whole-Brain Emulations"
Intelligence-on-computers could come in at least two ways.
It could come through AI and machine-learning algorithms manually coded by humans, perhaps inspired by the human brain but ultimately only loosely connected to it. Or it could come from some kind of high-resolution scan of a human brain, which is then virtualized and run on a computer: a whole brain emulation (WBE or "em").
Hanson literally wrote the book on ems (albeit after this debate) and thinks that ems are marginally more likely to occur before hand-coded AI (FOOM, 26).
Yudkowsky also had -- as of the Hanson-Yudkowsky debate, not now -- very broad intervals for the arrival of machine intelligence, which he summarizes as "I don’t know which decade and you don’t know either" (FOOM, 682). Nevertheless, he thinks AI is likely to occur before ems.
AI seems well on its way, and ems as distant as they did in 2008, so I'm comfortable saying that Yudkowsky's position looks far more accurate right now.
Nevertheless, both Yudkowsky and Hanson explicitly call attention to the very broad distribution of their own timelines, so it is a small update towards Yudkowsky over Hanson.
"AI Won't Be Able to Exchange Cognitive Content Easily"
A central part of the dispute between Yudkowsky and Hanson is how localized future growth rates will be.
They both think that an economy with machine intelligences in it -- either em or AI -- will grow very quickly compared to our current economy.
But Hanson sees a world where "these AIs, and their human owners, and the economy that surrounds them, undergo a collective FOOM of self-improvement. No local agent is capable of doing all this work, only the collective system" (FOOM, 276, Yudkowsky summarizing Hanson). Yudkowsky, on the other hand, sees a world where an individual AI undergoes a rapid spike in self-improvement relative to the world; where a brain in a box in a basement can grow quickly to come to out-think all of humanity.
One thing that could influence whether growth is more local or global is whether AIs can trade cognitive content. If trading such cognitive content with your neighbors is more advantageous -- or trading in general is advantageous -- then growth will probably be more global; if trading is less advantageous, growth will probably be more local.
Yudkowsky thinks trading or simply exchanging cognitive content between AIs is quite unlikely. Part of this is because of the current state of AI in 2008, where no one AI architecture has grown to dominate the others:
And a little earlier:
But not all of this is simply projecting the present into the future. He further thinks that even if different AIs were to have the same architecture, trading cognitive content between them would be quite difficult:
By default, he also expects more sophisticated, advanced AIs to have representations that are more opaque to each other. This effect he thinks will be so significant that pre-FOOM AIs might be incapable of doing it: "AI would have to get very sophisticated before it got over the “hump” of increased sophistication making sharing harder instead of easier. I’m not sure this is pre-takeoff sophistication we’re talking about, here" (FOOM, 280).
Hanson, on the other hand, thinks that the current diverse state of AI architectures is simply an artifact of the early state of AI development. As AI research finds solutions that work, we should expect that architectures become more standardized. And as architectures become more standardized, this will make sharing between AIs easier:
This makes him think that sharing between AIs is likely to occur relatively easily, because AI progress will make architectures more similar, which makes it easier to share cognitive content between AIs.
Hanson is the clear winner here. We don't have AIs that are exchanging cognitive content, because we don't have AIs that are sufficiently agent-like to do this. But humans now exchange cognitive AI content all the time.
Per Hanson's prediction, AI architectures have standardized around one thing -- neural networks, and even around a single neural network architecture (Transformers) to a very great degree. The diversity Yudkowsky observed in architectures has shrunk enormously, comparatively speaking.
Moreover, granting neural networks, trading cognitive content has turned out to be not particularly hard. It does not require superintelligence to share representations between different neural networks; a language model can be adapted to handle visual data without enormous difficulty. Encodings from BERT or an ImageNet model can be applied to a variety of downstream tasks, and this is by now a standard element in toolkits and workflows. When you share architectures and training data, as for two differently fine-tuned diffusion models, you can get semantically meaningful merges between networks simply by taking the actual averages of their weights. Thoughts are not remotely "written in a different language."
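As a rough illustration of what "taking the actual averages of their weights" means mechanically, here is a minimal sketch in Python. It is an assumption-laden toy: real merges operate on tensors loaded from model checkpoints, whereas here each "model" is just a dict from parameter name to a list of floats, and the names `average_weights`, `fine_tune_a`, and `fine_tune_b` are invented for the example. The code only shows the arithmetic; whether the result is semantically meaningful depends on the condition stated above, that the two networks share an architecture and training data.

```python
# Minimal sketch of weight averaging between two models with identical
# architectures. Each "model" is a dict: parameter name -> list of floats.
# Real checkpoints hold tensors; plain lists keep the example dependency-free.

def average_weights(model_a, model_b, alpha=0.5):
    """Return the element-wise interpolation alpha * a + (1 - alpha) * b."""
    if model_a.keys() != model_b.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name in model_a:
        a, b = model_a[name], model_b[name]
        if len(a) != len(b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [alpha * x + (1.0 - alpha) * y for x, y in zip(a, b)]
    return merged


# Two hypothetical fine-tunes of the same base network (made-up numbers).
fine_tune_a = {"layer1.weight": [0.0, 2.0], "layer1.bias": [1.0]}
fine_tune_b = {"layer1.weight": [4.0, 2.0], "layer1.bias": [3.0]}

# With alpha=0.5 this is the plain average of each parameter.
merged = average_weights(fine_tune_a, fine_tune_b)
```

The explicit checks on parameter names and lengths reflect the same precondition: the operation is only defined when the two networks line up parameter for parameter.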
So generally, cognitive content looks to be relatively easy to swap between different systems. It remains easy to swap as systems get smarter, and workflows that involve such swapping are becoming increasingly common. Hanson's view looks more accurate.
"Improvements in One AI Project Generally Won't Improve Another Much"
This issue mirrors the one above.
Just as the question of whether cognitive content can be easily shared between AIs is relevant for local vs. global takeoff, so is the question of whether cognitive algorithms can be easily shared between AIs. That is, whether the improvements you make to one AI could be relatively easily transferred to another.
Yudkowsky states:
Yudkowsky does relax his confidence about sharing cognitive algorithms by the time of the 2011 debate, noting that chess algorithms have benefitted from sharing techniques, but still maintains his overall position (FOOM, 663).
Similarly to the above, Hanson thinks as progress occurs, improvements will begin to be shared.
Yudkowsky is again pretty clearly wrong here.
An actual improvement to say, how Transformers work, would help with speech recognition, language modelling, image recognition, image segmentation, and so on and so forth. Improvements to AI-relevant hardware are a trillion-dollar business. Work compounds so easily on other work that many alignment-concerned people want to conduct all AI research in secret.
Hanson's position looks entirely correct.
"Algorithms are Much More Important Than Compute for AI Progress"
Different views about the nature of AI imply different things about how quickly AIs could FOOM.
If most of the space between the sub-human AIs of 2008 and potentially superhuman AIs of the future is algorithmic, then growth could be very fast and localized as AI discovers these algorithms. The "brain in a box in a basement" frequently mentioned in the Jane Street debate could discover algorithms that let it move from merely human to godlike intelligence overnight.
On the other hand, if a lot of the space between AIs of 2008 and superhuman AIs of the future is in size of compute needed -- or if greater compute is at least a prerequisite for having superhuman AI -- then growth is likely to be slower because AIs need to obtain new hardware or even build new hardware. A computer in a basement somewhere would need to purchase time in the cloud, hack GPUs, or purchase hardware to massively increase its intelligence, which could take more time and is at least more visible.
Yudkowsky uniformly insists that qualitative algorithmic differences are more important than compute, and moreover that great quantities of compute are not a prerequisite.
For instance, he says that "quantity [of minds] < (size, speed) [of minds] < quality [of minds]" (FOOM, 601). He expects "returns on algorithms to dominate" during an intelligence explosion (627). He consistently extends this belief into the past, noting that although human brains are four times bigger than chimpanzee brains "this tells us very little because most of the differences between humans and chimps are almost certainly algorithmic" (FOOM, 613).
When he mentions that compute could contribute to AI progress, he always makes clear that algorithms will be more important:
For another view of how he believes that limited compute is in no way an obstacle to FOOM: he gives a "rough estimate" that you could probably run a mind about as smart as a human's mind on a 2008 desktop, "or maybe even a desktop computer from 1996" (FOOM, 257).
But a desktop from 1996 isn't even the lower limit. If a superintelligence were doing the design for a mind, he continues, "you could probably have [mind of] roughly human formidability on something substantially smaller" (FOOM, 257).
This view about the non-necessity of compute is thoroughly and deliberately integrated into Yudkowsky's view, without particular prodding from Hanson -- he has several asides in FOOM where he explains how Moravec or Kurzweil's reasoning about needing human-equivalent compute for AI is entirely wrong (FOOM, 19, 256).
Hanson does not cover the topic of compute as much.
To the degree he does, he is extremely dubious that there is any small handful of algorithmic insights in intelligence-space that will grant intelligence; he also emphasizes hardware much more.
For instance, he approvingly states that "the usual lore among older artificial intelligence researchers is that new proposed architectural concepts are almost always some sort of rearranging of older architectural concepts." He continues:
Yudkowsky seems quite wrong here, and Hanson right, about one of the central trends -- and maybe the central trend -- of the last dozen years of AI. Implementing old methods more vigorously is more or less exactly what got modern deep learning started; algorithms in absence of huge compute have achieved approximately nothing.
The Deep Learning revolution is generally dated from 2012's AlexNet. The most important thing about AlexNet isn't any particular algorithm; the most important thing is that the authors wrote their code with CUDA to run on GPUs, which let them make the neural network far bigger than it could otherwise have been while training in a mere week. Pretty much all subsequent progress in DL has hinged on the continuing explosion of compute resources since then. Someone who believed Yudkowsky would have been extremely surprised by 2012-2020, when compute spent on ML runs doubled every 6 months and when that doubling was nearly always key for the improved performance.
Algorithms do matter. I think finding the right algorithms and data, rather than getting enough compute, is probably the biggest current obstacle for extremely compute-rich organizations like OpenAI or Google. But it is nevertheless indisputable that algorithms have not had the primary importance Yudkowsky attributed to them, in the absence of vastly increased compute. Put it this way: there still exist comparatively compute-frugal AI startups like Keen Technologies -- but even these still need to buy things like a DGX station that would have been, by a wide margin, the most powerful supercomputer in the world had it existed in 2008. So a comparatively compute-frugal program now is still compute-rich beyond anything Yudkowsky points to over the course of the debate.
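To get a feel for what that doubling rate implies, here is a small arithmetic sketch. It takes the figure stated above (a doubling every 6 months from 2012 to 2020) at face value. The function name `growth_factor` is hypothetical, and the 24-month comparison is only an illustrative, Moore's-law-style baseline that I am assuming, not a number from the debate.

```python
def growth_factor(years: float, doubling_time_months: float) -> float:
    """Total multiplicative growth after `years` at a fixed doubling time."""
    doublings = years * 12.0 / doubling_time_months
    return 2.0 ** doublings


# 2012 to 2020 is 8 years. At one doubling every 6 months that is 16 doublings.
ml_compute_growth = growth_factor(years=8, doubling_time_months=6)

# Assumed baseline for comparison: one doubling every 24 months (4 doublings).
baseline_growth = growth_factor(years=8, doubling_time_months=24)

print(f"6-month doubling over 8 years:  {ml_compute_growth:,.0f}x")
print(f"24-month doubling over 8 years: {baseline_growth:,.0f}x")
```

Under these assumptions the training compute used by frontier runs grows by a factor of 2^16, or roughly 65,000, over the period, against a factor of 16 for the slower baseline.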
(If you're further interested in the topic you should of course read Gwern on the scaling hypothesis.)
Yudkowsky himself sometimes appears to have changed his mind at least somewhat -- if he still thought that algorithms were the key to AGI, he wouldn't have advocated for banning huge GPU clusters with international law, because that's the kind of thing which would predictably focus more attention on improved algorithms, no?
On the other hand -- he seems (?) to still think that if only AI researchers were smart enough, progress would not involve huge compute? From his discussion with Ngo:
"The last decade of progress has depended on compute because everyone is too stupid to program human-level AI on a 2008 computer," could be the most Yudkowskan possible response to the evidence of the past ten years.
But -- regardless of Yudkowsky's current position -- it still remains that you'd have been extremely surprised by the last decade's use of compute if you had believed him, and much less surprised if you had believed Hanson.
Predictions -- Harder to Call
The above cases seem to me relatively clear.
The ones below, I think, are pretty sensitive to what kind of predictions you take Hanson and Yudkowsky to be making, and how favorably or unfavorably you read them. There are greater interpretive degrees of freedom.
Nevertheless I include this section, mostly because I've seen various claims that evidence supports one person or another.
"Human Content is Unimportant Compared to the Right Architecture"
A topic that comes up over and over again over the course of the debate -- particularly later, though -- is how important the prior "content" of all prior human civilization might be.
That is, consider all the explicit knowledge encoded in all the books humans have written. Consider also all the implicit knowledge encoded in human praxis and tradition: how to swing an axe to cut down a tree, how to run a large team of AI science researchers, how to navigate different desired levels of kitchen cleanliness among roommates, how to use an arc-welder, how to calm a crying baby, and so on forever. Consider also all the content encoded not even in anyone's brains, but in the economic and social relationships without which society does not function.
How important is this kind of "content"?
It could be that this content, built up over the course of human civilization, is actually something AI would likely need. After all, humans take the first two decades or so of their life trying to absorb a big chunk of it. So it might be difficult for an AI to rederive all human scientific knowledge without this content.
Alternately, the vast edifice of prior human civilization and knowledge might fall before a more elegant AI architecture. The AI might find that it could easily recreate most of this knowledge without much difficulty, then quickly vault past it.
Hanson generally thinks that this content is extremely important.
And again, Hanson:
Yudkowsky on the other hand, thinks that with the right architecture you can just skip over a lot of human content:
Notably, Yudkowsky has also claimed, some years after the debate, that the evidence supports him in this domain.
In 2017 AlphaGoZero was released, which was able to learn Go at a superhuman level without learning from any human games at all. Yudkowsky then explained how this was evidence for his position:
So Yudkowsky says.
If we round off Hanson's position to "content from humans is likely to matter a lot" and Yudkowsky's to "human content is crap," then I think that AlphaGoZero is some level of evidence in support of Yudkowsky's view. (Although Hanson responded by saying it was a very small piece of evidence, because his view always permitted narrow tools to make quick progress without content, and AGZ is certainly a narrow tool.)
On the other hand, is it the only piece of evidence reality gives us on this matter? Is it the most important?
One additional piece of data is that some subsequent developments of more complex game-playing AI have not been able to discard human data. Neither DeepMind's StarCraft II nor OpenAI's Dota2 playing agents -- both of which came after the Go-playing AIs -- were able to train without being jumpstarted by human data. StarCraft II and Dota2 are far more like the world than Go -- they involve partial information, randomness, and much more complex ontologies. So this might be an iota of evidence for something like a Hansonian view.
But far more importantly, and even further in the same direction -- non-narrow tools like GPT-4 are generally trained by dumping a significant fraction of all written human content into them. Training them well currently relies in part on mildly druidical knowledge about the right percent of the different parts of human content to dump into them -- should you have 5% code or 15% code? Multilingual or not? More arXiv or more Stack Overflow? There is reasonable speculation that we will run out of sufficient high-quality human content to feed these systems. The recent PaLM-2 paper has 18 authors for the data section -- more than it has for the architecture section! (Although both have fewer than the infrastructure section gets, of course -- how to employ compute still remains big.) So content is hugely important for LLMs.
Given that GPT-4 and similar programs look to be by far the most generally intelligent AI entities in the real world rather than a game world yet made, it's hard for me to see this as anything other than some evidence that content in Hanson's sense might matter a lot. If LLMs matter more for future general intelligence than AlphaGoZero -- which is a genuinely uncertain "if" for me -- then Hanson probably gets some fractional number of Bayes points over Yudkowsky. If not, maybe the reverse?
I don't think the predictions are remotely clear enough for either person to claim reality as on their side.
"Simple AI architectures will generalize very well" (Claim probably not made)
Different AI architectures can be more simple or more complex.
AlphaGo, which combines Monte-Carlo Tree Search, a policy network and a value network, is probably more architecturally complex than GPT-3, which is mostly a single giant transformer. Something like DreamerV3 is probably more complex than either, although you very quickly get into discussion of "what counts as complexity?" But there is in any event a spectrum of architectural complexity out there -- a system of one giant neural network trained end-to-end is relatively less complex, and a system of multiple neural networks trained with different objective functions is relatively more complex.
Yudkowsky has claimed (since the FOOM debate) that he predicted (in the FOOM debate) something akin to "simple architectures will generalize very well over broad domains." Thus, during his discussion with Ngo last year:
In general, I think right now it does look like you can get a pretty architecturally simple network doing a lot of cool cross-domain things. So if Yudkowsky had predicted it and Hanson had denied it, it would be some level of evidence for Yudkowsky's view over Hanson's.
The problem is that Yudkowsky mostly.... just doesn't seem to predict this unambiguously? I have ctrl-f'd for "car," "automobile," "cortex" through the PDF, and just not found that particular claim.
He does make some similar claims. For instance, Yudkowsky does claim that human level AI will be universally cross-domain.
Unfortunately, this is a claim that an architecture will have breadth, but not a claim about the simplicity of the architecture. It is also -- granting that we don't have AIs that can do long-range planning -- one for which we haven't received good information.
Here's a claim Yudkowsky and Hanson disagree about that could be interpreted as "simple architectures will generalize far" -- Yudkowsky says that only a few insights separate AI from being human-level.
On one hand, you'd think that saying a "few insights" separate AI from human-level-ness sort-of implies that the AI would have a simple architecture. But on the other hand, you could truthfully say only a few insights let you steer rockets around, fundamentally... but rockets nevertheless have pretty complex architectures. I'm not sure that the notion of "few insights" really corresponds to "simple architecture." In the dialog, it more seems to correspond to.... FOOM-ability, to the idea that you can find an insight while thinking in a basement that lets your thinking improve 2x, which is indifferent to the simplicity of the architecture the insight lets you find.
Let me return to what Yudkowsky and Hanson actually say, to show why.
Yudkowsky claims that a small handful of insights will likely propel an ML model from infrahumanity to superhumanity. He characterises the number as "about ten" (FOOM, 445) but also says it might be just one or two important ones (FOOM, 450). He affirms that "intelligence is about architecture" and that "architecture is mostly about deep insights" (FOOM, 406, emphasis his) and thus that the people who make an AI FOOM will have done so because of new deep insights (FOOM, 436).
Hanson, by contrast, thinks "powerful architectural insights are quite rare" (FOOM, 496). He believes that "most tools require lots more than a few key insights to be effective—they also require thousands of small insights that usually accumulate from a large community of tool builders and users" (FOOM, 10). He does think that there are some large insights in general, but insights "are probably distributed something like a power law, with many small-scope insights and a few large-scope" (FOOM, 144).
So Hanson in general thinks AI will look like most technology -- see the progress of planes, cars, guns, and so on -- in that progress comes from hundreds of tiny refinements and improvements. There's no moment in the history of planes where they suddenly become useful -- there are 100s of small and big improvements all gradually moving planes from "mostly useless, with rare exceptions" to "incredibly useful."
Yudkowsky, on the other hand, thinks that AI will look more like a handful of "eureka!" moments, followed up by some coding and subsequent world-transformation. This is witnessed, of course, by the plan of MIRI (then the Singularity Institute) to build a seed AGI entirely on their own.
If we take this as the disagreement -- will AI progress come from a handful of big insights, or many small ones -- I think the world right now looks a great deal more like Hanson's view than Yudkowsky's. In his interview with Lex Fridman, Sam Altman characterizes GPT-4 as improving on GPT-3 in a hundred little things rather than a few big things, and that's... by far... my impression of current ML progress. So when I interpret their disagreement in terms of the kind of work you need to do before attaining AGI, I tend to agree that Hanson is right.
On the other hand, we could return to saying that "few insights" implies "simple architecture." I don't think this is... exactly... implied by the text? I'll admit that the vibes are for sure more on Yudkowsky's side. So if we interpret the text that way, then I'd tend to agree that Yudkowsky is right.
Either way, though, I don't think Yudkowsky and Hanson were really clear about what was going on and about what kind of anticipations they were making.
Misc
I was going to have a whole section of things that didn't quite make the cut vis-a-vis predictions, but were super suggestive, but that could be seen as trying to influence the results on my part. So I'm just going to bail instead, mostly.
Conclusion
Who was more right?
When I look at the above claims, Hanson's record looks a little better than Yudkowsky's, albeit with a small sample size. If you weight the Cyc prediction a ton, maybe you could get them to parity. I think it would be weird not to see the compute prediction as a little more important than the Cyc prediction, though.
Note that Hanson currently thinks the chances of AI doom are < 1%, while Yudkowsky thinks that they are > 99%. (Hanson thinks the chances of doom are... maybe somewhat lower than Yudkowsky, but they seem to have different ontologies of what qualifies as "doom" as the comments point out.)
What Actual Lessons Can We Learn, Other Than Some Stuff About Deferral to Authority That Everyone Will Ignore Because We Like to Pretend We Do Not Defer to Authority, Even Though We All Fucking Do?
I was mildly surprised by how well some economic abstractions hold up.
The big part of the meta-debate in FOOM -- which they return to over and over again -- is whether you should try to use mostly only mental tools whose results have proven useful in the past.
Hanson's view is that if you use rules which you think retrodict data well but which haven't been vetted by actual predictions, you are almost certain to make mistakes because humans psychologically cannot distinguish actual retrodictions from post-hoc fitting. To avoid this post-hoc fitting, you should only use tools which have proven useful for actual predictions. Thus, he prefers to use economic abstractions which have been thus vetted over novel abstractions invented for the purpose.
I think this holds up pretty well. Yudkowsky makes predictions about future use of compute in AI, based on his attempted retrodictions about human evolution, human skull size, and so on. These predictions mostly failed. On the other hand, Hanson makes some predictions about AI converging to more similar systems, about advances in these systems mutually improving competing systems, and so on, based only on economic theory. These predictions succeeded.
Overall, I think "don't lean heavily on abstractions you haven't yet gotten actual good predictions from" comes out pretty well from the debate, and I continue to heavily endorse research evaluation proposals related to it.