Earlier this year I offered to bet people who had short AI timelines.
While it wasn't my intention to be known as "a long AI timelines guy", I began to feel that this was how people perceived me. Nonetheless, in the last few months, I've modified my views substantially. Thus, I offer this short post, which can hopefully make my current position clearer.
There are several reasons for my update towards shorter AI timelines, though each reason is relatively straightforward and uncomplicated. In the spirit of writing something short rather than not writing something at all, my explanations here will be brief, although I may be willing to elaborate in a comment below.
In order, these reasons include, but are not limited to:
- I became convinced that the barriers to language models adopting human-level reasoning were much weaker than I had believed. Previously, I had imagined that it would be difficult to get a language model to perform reliable reasoning over long sequences, in which each step in the sequence requires making a non-trivial inference, and one mistake in understanding the sequence can make the difference between a coherent and incoherent response.
  Yet my personal experience with language models, including but not limited to ChatGPT, has persuaded me that this type of problem is not a strong barrier, and is closer in difficulty to other challenges like "summarizing a document" or "understanding what's going on in a plot", which I had already thought language models were making good progress on. In hindsight, I should perhaps have trusted the model I had constructed myself, which forecasted human-level language models by 2030. Note: I don't think this update reflects new major capabilities found in GPT-3.5, but rather my own prior state of ignorance.
- I built a TAI timelines model, and after fitting the model, it came out with a median timeline of 2037. While I don't put a high degree of confidence in my model, or the parameters that I used, I believe it's still more reliable than my own intuition, which suggested much later dates were more plausible.
- I reflected more on the possibility that near-term AI progress will itself accelerate further AI progress.
- I noticed that I had been underestimating the returns to scaling, and the possibility of large companies scaling their training budgets quickly to the $10B-$100B level. I am still unsure that this will happen within the next 10 years, but it no longer seems like something I should dismiss.
- I saw almost everyone else updating towards shorter timelines, except for people who already had 5-15 year timelines, and a few other people like Robin Hanson. Even after adjusting for the bandwagon effect, I think it's now appropriate to update substantially as well.
I still feel like my arguments for expecting delays from regulation are being underrated. Yet, after reflection, I've become less confident about how long we should expect these delays to last. Instead of imagining a 20-year delay, a 3 to 10 year delay from regulation now seems more reasonable to me.
If you want me to get specific, my unconditional median TAI timeline is now something like 2047, with a mode around 2035, defined by the first year we get >30% yearly GWP growth as measured from a prior peak, or an event of comparable significance. Note that I think AI can likely be highly competent, general, and dangerous well before it has the potential to accelerate GWP growth to >30%, meaning that my AGI timelines may be quite a lot shorter than this, depending on one's definition of AGI.
Overall, this timeline may still appear too long to many people, yet my explanation is that it's what I get when I account for potential coordinated delays, unrelated catastrophes, and a 15% chance that we're fundamentally wrong about all of this stuff. Conditional on nothing like that happening, I'd be inclined to weakly bet on TAI before 2039.
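To make the gap between the conditional and unconditional numbers concrete, here is a minimal Monte Carlo sketch of how a later unconditional median can fall out of an earlier conditional one. It is not the timelines model mentioned above. Only the 15% figure, the 3 to 10 year delay range, and the 2039 conditional date come from this post. The lognormal shape, `SIGMA`, `P_DELAY`, and `START_YEAR` are illustrative assumptions, so the output should not be expected to reproduce 2047.

```python
import math
import random
import statistics

START_YEAR = 2023           # assumption: "now" for this sketch
CONDITIONAL_MEDIAN = 2039   # from the post: weak bet on TAI before 2039 if nothing intervenes
P_WRONG = 0.15              # from the post: chance we're fundamentally wrong
DELAY_RANGE = (3.0, 10.0)   # from the post: plausible regulatory delay, in years
P_DELAY = 0.5               # assumption: chance that a coordinated delay happens at all
SIGMA = 0.6                 # assumption: spread of the lognormal over years-from-now


def sample_years(n, seed=0):
    """Return (conditional, unconditional) lists of hypothetical TAI arrival years."""
    rng = random.Random(seed)
    mu = math.log(CONDITIONAL_MEDIAN - START_YEAR)  # lognormal median = exp(mu)
    conditional, unconditional = [], []
    for _ in range(n):
        base = START_YEAR + rng.lognormvariate(mu, SIGMA)
        conditional.append(base)
        if rng.random() < P_WRONG:
            # "Fundamentally wrong" world: treated here as TAI never arriving.
            unconditional.append(math.inf)
        elif rng.random() < P_DELAY:
            unconditional.append(base + rng.uniform(*DELAY_RANGE))
        else:
            unconditional.append(base)
    return conditional, unconditional


def medians(n=20000, seed=0):
    """Median arrival year with and without delays and the 'wrong' scenario."""
    conditional, unconditional = sample_years(n, seed)
    return statistics.median(conditional), statistics.median(unconditional)


if __name__ == "__main__":
    cond, uncond = medians()
    print(f"conditional median:   {cond:.1f}")
    print(f"unconditional median: {uncond:.1f}")
```

The point of the sketch is qualitative: putting probability mass on "never" pushes the unconditional median to a higher percentile of the remaining scenarios, and delays shift those scenarios later, so the unconditional median lands after the conditional one.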
I also tend to find myself arguing against short timelines by default, even though I feel like I take AI safety way more seriously than most people.
At this point, how many people with long timelines are there still around here? I haven't explicitly modeled mine, but it seems clear that they're much, much longer (with significant weight on "never") than those of the average LessWronger. The next few years will for sure be interesting as we see the "median LessWrong timeline" clash with reality.
A distinction which makes no difference. Copilot-like models are already being used in autonomous code-writing ways, such as AlphaCode which executes generated code to check against test cases, or evolving code, or LaMDA calling out to a calculator to run expressions, or ChatGPT writing and then 'executing' its own code (or writing code like SVG which can be interpreted by the browser as an image), or Adept running large Transformers which generate & execute code in response to user commands, or the dozens of people hooking...