Summary of the Below: I think there are classes of tasks where machine intelligence will have extremely high-quality information to use as a target to regress towards, leading to superintelligent performance. The reason robotics tasks provide high-quality information is that each robotic manipulation is a natural experiment, in which the robot can compare the direct consequences of taking action [A] vs. action [B] (taking no action is itself an action) and so determine the causal relationships of the world.
There are a couple of aspects of this problem that I feel are completely unexamined.
There is a large number of real-life problems today that fall into the following class:
They involve manipulations in the physical world, where all the major elements, including the direct consequences of a robotic system's actions, can be simulated and scored for immediate feedback.
Problems that fall in this class:
[all autonomous vehicles, all autonomous logistics, all manufacturing, all mining, all agriculture and similar resource gathering, most cleaning tasks, most construction tasks]
This means the above problems are all solvable with sufficient financial investment. It also means that shared platforms and general-purpose agents can be developed.
Many of these tasks share common subtasks, including object manipulation, vehicle dynamics, working memory, and behavior prediction, all of which can be modeled.
And for many of these tasks, there are common framework elements that can be shared between robotic systems.
What does this mean? It means that one route to a kind of transformative AI is as follows:
a. Autonomous cars eventually become reliable enough to deploy in large numbers. Several companies succeed.
b. Several of these companies begin porting and generalizing their vehicle autonomy stacks to automate other domains, such as general-purpose robotics.
c. Revenue from autonomous cars and general robotics floods into these companies. It will easily hit hundreds of billions and then exceed a trillion dollars per year in licensing fees. Cars alone, if you can garner $0.25 per mile covered and half of the ~3 trillion miles driven annually in the USA are autonomous, yield $375 billion per year in revenue. Roughly double that if you include Europe.
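The same back-of-the-envelope arithmetic in code, where the per-mile fee and the autonomous share are just the illustrative assumptions stated above:

```python
# Back-of-the-envelope licensing revenue estimate, using the assumptions above.
US_MILES_PER_YEAR = 3e12   # ~3 trillion vehicle-miles driven per year in the US
AUTONOMOUS_SHARE = 0.5     # assume half of those miles become autonomous
FEE_PER_MILE = 0.25        # assume a $0.25 licensing fee per autonomous mile

us_revenue = US_MILES_PER_YEAR * AUTONOMOUS_SHARE * FEE_PER_MILE
print(f"US revenue: ${us_revenue / 1e9:.0f}B per year")             # $375B per year
print(f"US + Europe (~2x): ${2 * us_revenue / 1e9:.0f}B per year")  # $750B per year
```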
d. The money stimulates development of ever-larger silicon arrays, more sophisticated algorithms, frameworks that are less labor-intensive to deploy, and common shared subcomponents that learn collectively from every robotic system in the world that uses them.
e. Robotic systems become easy to define: at a certain stage there will probably be cloud-hosted editors, some high-level scripting language, and ready-made realtime software/hardware autonomy stacks. All you would need to do to define a new system is import some data showing a human completing the task successfully and license a few hundred ready-made submodules, with automated tools holding your hand as you do so.
Eventually it would reach the point where automating a task takes a matter of hours (and yes, this would unemploy every human on Earth currently doing that task). This would lead to self-replicating machinery.
That point is reached when there is an automated system to manufacture every part used in state-of-the-art robotics, including mining the materials, trucking them, building the infrastructure, and so on. This might be transformative.
f. If (e) isn't transformative, these "dumb agents" should be able to scale to designing other robotic systems, including ones for environments and scales unexplored by humans (the lunar environment, the nanoscale). This could be done using techniques similar to OpenAI's hide-and-seek paper.
g. If (f) isn't transformative, the infrastructure needed to support "dumb"* agents as described will provide the pieces for future work to develop true AGI.
*Dumb agents: agents that cast the world into a series of state spaces, using pretrained neural networks for most of the transforms, with the final state space used to make control decisions. Example:
[input camera space] -> [object identity space] -> [collision risk space] -> [potential path space] -> MAX([dollar value of each path space]) -> [control system output]
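To make that pipeline concrete, here is a minimal sketch of what such a staged agent could look like in code. The names and interfaces (DumbAgent, detect_objects, and so on) are hypothetical placeholders rather than any real autonomy stack; the point is only that each stage is a pretrained transform from one state space to the next, and the control decision is made in the final space.

```python
from dataclasses import dataclass
from typing import Callable

import numpy as np

# Each stage maps one state space to the next. In a real system these would be
# pretrained neural networks; here they are stand-in callables.
StateTransform = Callable[[np.ndarray], np.ndarray]

@dataclass
class DumbAgent:
    """Chains pretrained state-space transforms; only the last space drives control."""
    detect_objects: StateTransform           # camera space -> object identity space
    estimate_collision_risk: StateTransform  # object space -> collision risk space
    propose_paths: StateTransform            # risk space -> candidate path space
    value_paths: StateTransform              # path space -> dollar value of each path

    def act(self, camera_frame: np.ndarray) -> int:
        objects = self.detect_objects(camera_frame)
        risks = self.estimate_collision_risk(objects)
        paths = self.propose_paths(risks)
        values = self.value_paths(paths)
        # Control output: index of the highest-dollar-value path.
        return int(np.argmax(values))
```

Swapping in a different task mostly means swapping the pretrained transforms, which is what makes the license-a-few-hundred-submodules picture in (e) plausible.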
Epistemic status: mild confidence that this provides interesting discussion and debate.
Credits to (in no particular order) Mark Xu, Sydney Von Arx, Jack Ryan, Sidney Hough, Kuhan Jeyapragasan, and Pranay Mittal for resources and feedback. Credits to Ajeya (obviously), Daniel Kokotajlo, Gwern, Robin Hanson, and many others for perspectives on timeline cruxes. This post was written as part of a 10-week AI Safety Fellowship run by Mark. All errors my own.
Summary
Most of this post is unoriginal. It is intended primarily to summarize and rephrase the core distinctions between three plausible scenarios for AI development, which Ajeya lays out in her draft report on AI timelines. It also contains summaries and links to other related content.
As a secondary goal, it attempts to lay out concrete and hopefully plausible predictions for what would occur in each of these three worlds.
Glossary (from Ajeya's report)
This post will assume familiarity with basic terminology regarding neural networks and supervised learning.
Before reading this post, it's probably good to read at least a summary of Ajeya's report (e.g. Rohin Shah's). Some timeline-specific terminology that is helpful to know is also listed below:
Scenario One
If you believe...
Short-horizon NN has a fair chance of succeeding (e.g. 40%)[2]
You might believe this if you think there's a good chance that fine-tuning a large language model like GPT-N is sufficient for TAI, with no need to train models directly on tasks that take a long time to judge (e.g. "writing a twist ending to a story", as opposed to "predicting the next word"). Concrete things you might expect to happen soon would thus be:
For more reading, see Ajeya on downstream skills.
Algorithms halve compute requirements every ~2 years for short-horizon NNs, or every ~1 year for a medium/long-horizon NN.
You might expect this if you think there is a lot of "low-hanging fruit" in algorithms, such as if you think relatively little work has gone into optimizing training regimes or architectures for large NNs. (For context, OpenAI's AI & Efficiency suggests a halving time of ~16 months for ImageNet models.) Consequences you might expect are:
Moore's Law returns, resulting in ~1.5-year doubling times for FLOPs/$ to train a model.
For context, Moore's Law described transistor progress well until the mid-2000s, when the regime shifted to a doubling time of ~3-4 years[4]. One possible story for a return to ~1.5-year doubling times is that the number of chip producers increases, with players perhaps aided by AI-assisted chip manufacturing. Moore's Law is rather hard to predict, but concrete things that might allow this are:
AI companies rapidly become willing to spend 2% of GDP to train a transformative model.
This looks like AI companies quickly realizing the immense value of training bigger models and scaling up until they are limited by capital on hand around 2030. From that point on, they are willing to spend up to 2% of GDP on training a single model. You might expect this if:
Given the above, you should expect a median timeline of 2036, shaded up by Ajeya to 2040.
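One way to see how these cruxes combine into a median date is a toy affordability model: the compute required to train a transformative model falls with algorithmic progress, the compute you can afford rises with hardware progress and spending growth, and TAI arrives roughly when the curves cross. The sketch below uses Scenario One's growth rates, but the absolute starting values (required FLOP, FLOPs/$, initial budget) are illustrative placeholders rather than figures from Ajeya's report, and her actual model is probabilistic rather than a point estimate like this.

```python
# Toy crossover model: TAI becomes "affordable" in the first year where the
# compute you can buy exceeds the compute a transformative model requires.
# Growth/halving rates follow Scenario One; absolute starting values are
# illustrative placeholders only.

def first_affordable_year(
    required_flop_2025: float,           # placeholder: FLOP needed for TAI with 2025 algorithms
    alg_halving_years: float = 2.0,      # compute requirement halves this often
    flop_per_dollar_2025: float = 1e17,  # placeholder hardware price-performance
    hw_doubling_years: float = 1.5,      # FLOPs/$ doubling time
    spend_2025: float = 1e9,             # placeholder training budget in 2025
    max_spend: float = 4e11,             # rough 2%-of-GDP cap on a single training run
    spend_doubling_years: float = 2.0,
) -> int:
    for year in range(2025, 2101):
        t = year - 2025
        required = required_flop_2025 * 0.5 ** (t / alg_halving_years)
        budget = min(spend_2025 * 2 ** (t / spend_doubling_years), max_spend)
        affordable = budget * flop_per_dollar_2025 * 2 ** (t / hw_doubling_years)
        if affordable >= required:
            return year
    return 2101  # not affordable this century under these assumptions

print(first_affordable_year(required_flop_2025=1e32))
```

Scenarios Two and Three correspond to re-running the same function with slower halving and doubling times and a lower spending cap, which pushes the crossover date decades later.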
Scenario Two
NNs likely require some medium- to long-horizon training.
You might believe this if you think scaling up GPT-3 doesn't quite lead to TAI. It gets you a good bit of the way there, but it turns out you need a more complex environment to learn how to learn new tasks. This might look like:
Fine-tuning hits a wall: By 2025, it's clear that massive fine-tuned language models underperform compared to smaller models trained with supervised learning directly on the question at hand (e.g. large-scale code generation). Concretely, a possible scenario is that GPT-5-CodeCompletionXL performs worse than a new TransformerCodeM model trained directly on sets of code-completion questions rather than via unsupervised pretraining like GPT.
This implies that new advances will be required to reach scalable TAI.
(Ajeya mentions some reasons why you might expect a from-scratch supervised model to outperform a fine-tuned language model, but whether it would do so in reality is an open question.)
Training on long horizons becomes popular: By 2030, there exist 2+ models achieving state-of-the-art performance in a specific field that are trained on long horizons (greater than a few minutes). For example, a novel-writing AI that receives feedback only after finishing a novel, or an RL agent that plays long and complex games.
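To illustrate why long-horizon training is expensive in a way short-horizon training is not: if the model gets a single piece of feedback only after an entire novel or game, every update costs a full episode of subjective experience. A minimal sketch, where the environment and policy interfaces are hypothetical placeholders:

```python
def train_long_horizon(env, policy, episodes: int, steps_per_episode: int) -> None:
    """Sketch of long-horizon training: one scalar reward per completed episode."""
    for _ in range(episodes):
        obs = env.reset()
        trajectory = []
        for _ in range(steps_per_episode):   # e.g. every sentence of a novel, every move of a game
            action = policy.act(obs)
            obs = env.step(action)
            trajectory.append((obs, action))
        reward = env.final_score()           # feedback arrives only here, at the very end
        policy.update(trajectory, reward)    # one learning signal per full episode
```

Compared to next-word prediction, where every token provides a learning signal, the amount of experience per unit of feedback grows with the episode length, which is roughly what Ajeya's "effective horizon length" multiplier captures.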
Algorithmic progress is a bit slower than it was in the past, halving compute every 3 years for short-horizon NNs and every 2 years for medium- and long-horizon NNs.
You might believe this if you think most of the low-hanging fruit has been picked. Architectural advancements slow down, with each one representing mostly incremental progress.
(For further reading and intuitions, AI Impacts has a page with many examples of past algorithmic progress. See also the posts mentioned in Ajeya's report: Measuring the Algorithmic Efficiency of Neural Networks (2020) and Algorithmic Progress in Six Domains (2013).)
Things you may expect are:
Moore's Law slows a bit. FLOPs/$ doubles every 2.5 years.
By 2025, the general consensus is that Moore's Law is dead. TSMC, Intel, and Samsung hit manufacturing delays on new nodes, and it is projected that doubling times will increase. There is a solid path for further growth, but it is hard, and chip designers focus more on optimizing preexisting nodes, much as Intel does today.
Things you might expect are:
AI companies are willing to spend $1bn in 2025, with that figure doubling every 2 years.
Ajeya considers this a plausible level of spending on a "business as usual" trajectory, given the current market cap and cash on hand of major tech companies. See here for more details.
Given the above, you should expect a median timeline of ~2052.
Scenario Three
Transformative NNs likely depend heavily on long-horizon training, perhaps requiring FLOPs on the order of the compute performed by evolution.
This looks like GPT and supervised learning hitting a dead end for meta-learning (learning new, complex tasks). No matter how hard we try, we can't get neural networks to learn complex skills over short training timeframes. One of the options left to us is something like RL plus transparency tools, or supervised learning with feedback given only after a long subjective time has elapsed, both of which are highly compute-intensive ways of training an agent.
Things you might expect are:
Algorithms halve compute every 4 years for short-horizon NNs, and every 3 years for medium- and long-horizon NNs.
This looks a lot like Scenario Two, but is quantitatively a bit slower. Realistically, the biggest sign of this is probably just a slowing trend in 2025's "algorithmic progress" chart, but other things you might expect are:
Moore's Law slows significantly, doubling FLOPs/$ every 3.5 years.
This might occur if silicon process costs keep rising, eventually becoming uneconomical even for large players. There are incremental further advancements (e.g. optimizing 3nm+++), but overall stagnation continues.
Things you might expect are:
AI companies will have to wait for the entire economy to grow sufficiently to finance TAI.[7]
This follows from the slowing of Moore's Law and the need for expensive, long-horizon training. This world seems like one where advanced AI is not terribly profitable and requires resources on the scale of a 2090s megaproject.
Things you might expect are:
Given the above, you should expect a median timeline of 2100, shaded down to 2090 by Ajeya.
Epistemic Notes
The above predictions are obviously rough. Even in a world that satisfies a particular timeline (e.g. AI by 2052), I expect the specifics of more than half of them to be wrong. However, the hope is that these predictions can be used as a sort of barometer, so that five years down the line, we can look back and ask, "how many of these came true?" The answer may help us refine our predictions of when TAI will eventually arrive.
I also hope these predictions can be used today to clarify researchers' own timelines. If you believe most of the predictions in Scenario One are plausible, for example, you may want to update toward shorter timelines; likewise, if you think Scenario Three is plausible, you should probably update toward longer timelines.
Other Scenarios and Open Questions
In the process of writing this and further understanding Ajeya's assumptions, I had a few ideas for other scenarios that I would enjoy seeing fleshed out.
Further Reading & Related Work
Here are some of the documents I found useful while writing this post.
See https://docs.google.com/document/d/1IJ6Sr-gPeXdSJugFulwIpvavc0atjHGM82QjIfUSBGQ/edit#heading=h.6t4rel10jbcj ↩︎
This crux simplifies Ajeya's range of estimates for hypothesis probabilities considerably. For each scenario, I've used the most probable anchor as the "central crux", as it seems to me that if you believe short-horizon NN is 40% likely to work, then the other hypotheses are also fairly sensible. ↩︎
This is reminiscent of Gwern's earlier post from "Are we in an AI overhang?" ↩︎
Sourced from https://docs.google.com/document/d/1cCJjzZaJ7ATbq8N2fvhmsDOUWdm7t3uSSXv6bD0E_GM/edit#heading=h.96w8mskhfp5l ↩︎
Credits to Robin Hanson's bet here: https://twitter.com/robinhanson/status/1297325331158913025 ↩︎
The US has recently prevented Chinese companies such as SMIC from acquiring high-end manufacturing equipment. It has also worked with the Dutch government to prevent ASML, a lithography equipment manufacturer, from delivering EUV machines to China. Quick research suggests that building a manufacturing plant without US equipment is extremely hard, at least for now. ↩︎
This is derived from Ajeya's assumption that spending will start at $300 million in 2025 and grow every 3 years, reaching a maximum of $100bn by 2055, at which point it is bounded at 0.5% of GDP. However, since TAI isn't projected until 2100 in this scenario anyway, in practice only the upper bound really matters. ↩︎