Earlier this year I offered to bet people who had short AI timelines.
While it wasn't my intention to become known as "a long AI timelines guy", I began to feel that was how people perceived me. Nonetheless, in the last few months I've modified my views substantially, so I'm offering this short post in the hope of making my current position clearer.
There are several reasons for my update towards shorter AI timelines, though each is relatively straightforward. In the spirit of writing something short rather than nothing at all, my explanations here will be brief, although I may elaborate in a comment below.
In order, these reasons included, but were not limited to, the following:
- I became convinced that the barriers to language models attaining human-level reasoning were much weaker than I had believed. Previously, I had imagined it would be difficult to get a language model to perform reliable reasoning over long sequences, where each step requires a non-trivial inference and a single misunderstanding can make the difference between a coherent and an incoherent response.
Yet my personal experience with language models, including but not limited to ChatGPT, has persuaded me that this type of problem is not a strong barrier, and is closer in difficulty to other challenges like "summarize a document" or "understand what's going on in a plot", which I had already thought language models were making good progress on. In hindsight, I should perhaps have trusted the model I had constructed myself, which forecasted human-level language models by 2030. Note: I don't think this update reflects major new capabilities in GPT-3.5, but rather my own prior state of ignorance.
- I built a TAI timelines model, and after fitting it, it came out with a median timeline of 2037 (see the toy sketch after this list for the general shape of such a calculation). While I don't put a high degree of confidence in the model, or in the parameters I used, I believe it's still more reliable than my own intuition, which suggested that much later dates were more plausible.
- I reflected more on the possibility that near-term AI progress will itself accelerate subsequent AI progress.
- I noticed that I had been underestimating the returns to scaling, and the possibility of large companies scaling their training budgets quickly to the $10B-$100B level. I am still unsure that this will happen within the next 10 years, but it no longer seems like something I should dismiss.
- I saw almost everyone else updating towards shorter timelines, except for people who already had 5-15 year timelines, and a few other people like Robin Hanson. Even after adjusting for the bandwagon effect, I think it's now appropriate to update substantially as well.
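To illustrate what I mean by "fitting a timelines model", here is a toy Monte Carlo sketch. It is not my actual model: it assumes, purely for illustration, a normally distributed requirement for effective training compute, a fixed starting level, and an uncertain growth rate, then reads the median arrival year off the resulting distribution. Every number below is made up.

```python
# Toy sketch of a compute-based TAI timelines model.
# NOT the model referenced above; every parameter here is illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical assumptions:
log10_flop_needed = rng.normal(loc=36.0, scale=3.0, size=n)  # effective compute required for TAI (log10 FLOP)
log10_flop_now = 25.0                                        # assumed effective compute of today's largest runs
ooms_per_year = rng.lognormal(mean=np.log(0.5), sigma=0.4, size=n)  # orders of magnitude added per year

years_needed = np.maximum(log10_flop_needed - log10_flop_now, 0.0) / ooms_per_year
arrival_year = 2023 + years_needed

print("median arrival year:", int(np.median(arrival_year)))
print("10th/50th/90th percentiles:", np.percentile(arrival_year, [10, 50, 90]).round())
```

Fitting a real model mostly amounts to arguing about which distributions and parameters to plug in; the median then falls out mechanically.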
I still feel like my arguments for expecting delays from regulation are being underrated. Yet, after reflection, I've become less confident about how long we should expect these delays to last. Instead of imagining a 20-year delay, a 3-to-10-year delay from regulation now seems more reasonable to me.
If you want me to get specific, my unconditional median TAI timeline is now something like 2047, with a mode around 2035, where TAI is defined by the first year we get >30% yearly GWP growth as measured from a prior peak, or by an event of comparable significance. Note that I think AI can likely be highly competent, general, and dangerous well before it has the potential to accelerate GWP growth to >30%, meaning that my AGI timelines may be quite a lot shorter than this, depending on one's definition of AGI.
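To make the ">30% yearly GWP growth as measured from a prior peak" criterion concrete, here is a minimal sketch of how one might check it; the function name and all GWP figures are invented for illustration.

```python
# Sketch of the ">30% growth over the prior peak" check; GWP figures are fabricated.
def first_tai_year(gwp_by_year, threshold=0.30):
    """Return the first year whose GWP exceeds the running prior peak by more than `threshold`."""
    years = sorted(gwp_by_year)
    peak = gwp_by_year[years[0]]
    for year in years[1:]:
        if gwp_by_year[year] > peak * (1 + threshold):
            return year
        peak = max(peak, gwp_by_year[year])
    return None

# Example with invented numbers (trillions of dollars):
gwp = {2045: 180, 2046: 200, 2047: 270, 2048: 400}
print(first_tai_year(gwp))  # 2047, since 270 > 200 * 1.3
```

Measuring growth against the prior peak (rather than the prior year) means a crash followed by a rebound doesn't count as transformative growth.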
Overall, this timeline may still appear too long to many people, yet my explanation is that it's what I get when I account for potential coordinated delays, unrelated catastrophes, and a 15% chance that we're fundamentally wrong about all of this stuff. Conditional on nothing like that happening, I'd be inclined to weakly bet on TAI before 2039.
Excellent, thanks for this detailed critique! I think this might be the best critique that post has gotten thus far; I'll probably link to it in the future.
Point by point reply, in case you are interested:
2022-2023: Agree. Note that I didn't forecast that an AI could buy you a USB stick by 2022; I said people were dreaming of such things but that they didn't actually work yet.
2024: We definitely have a real disagreement about AI capabilities here; I do expect fine-tuned bureaucracies to be useful for some fairly autonomous things by 2024, not just for babbling, fooling humans, and forcing people to interact with a company through them. (For example, I expect the USB stick thing to work fine by 2024.)
Re propaganda/persuasion: I am not sure we disagree here, but insofar as we disagree I think you are correct. We agree about what various political actors will be doing with their models--propaganda, censorship, etc. We disagree about how big an effect this will have on the populace. Or at least, 2021-me disagrees with 2022-you. I think 2022-me has probably come around to your position as well; like you say, it just takes time for these sorts of things to influence the public + there'll probably be a backlash / immunity effect. Idk.
2025: I admit I overestimated how hard Diplomacy would turn out to be. In my defense, Cicero only won because the humans didn't know they were up against a bot. Moreover, it's a hyper-specialized architecture trained extensively on Diplomacy, so it indeed doesn't have general reasoning skills at all.
We continue to disagree about the potential effectiveness of fine-tuned bureaucracies. To be clear I'm not confident, but it's my median prediction.
I projected a 10x decrease in hardware costs, and also a 10x improvement in algorithms/software, from 2020 to 2025. I stand by that prediction.
2026:
We disagree about whether understanding is (or will be) there. I think yes, you think no. I don't think that these AIs will be "merely symbol manipulators" etc. I don't think the data-poisoning effect will be strong enough to prevent this.
As mentioned above, I do take the point that society takes a long time to change and probably I shouldn't expect the propaganda etc. to make that much of a difference in just a few years. Idk.
I'm not conflating those things; I know they are different. I am and was asserting that the chatbots would actually have understanding, at least in all the behaviorally relevant senses (though I'd argue in the philosophical senses as well). You are correct that I didn't argue for this in the text -- but that wasn't the point of the text; the text was stating my predictions, not attempting to argue for them.
ETA: I almost forgot, it sounds like you mostly agree with my predictions, but think AGI still won't be nigh even in my 2026 world? Or do you instead think that the various capabilities demonstrated in the story won't occur in real life by 2026? This is important because if 2026 comes around and things look more or less like I said they would, I will be saying that AGI is very near. Your original claim was that in the next few years the median LW timeline would start visibly clashing with reality; so you must think that things in real-life 2026 won't look very much like my story at all. I'm guessing the main way it'll be visibly different, according to you, is that AI still won't be able to do autonomous things like go buy USB sticks? Also they won't have true understanding -- but what will that look like? Anything else?