I left some comments on an earlier version of AI 2027; the most relevant is the following:
June 2027: Self-improving AI
OpenBrain now has a “country of geniuses in a datacenter.”
Most of the humans at OpenBrain can’t usefully contribute anymore. Some don’t realize this and harmfully micromanage their AI teams. Others sit at their computer screens, watching performance crawl up, and up, and up.
This is the point where I start significantly disagreeing with the scenario. My expectation is that by this point humans are still better at tasks that take a week or longer. Also, it starts getting really tricky to improve on these, because you get limited by a number of factors: it takes a long time to get real-world feedback, it takes a lot of compute to experiment on week-long tasks, etc.
I expect these dynamics to be particularly notable when it comes to coordinating Agent-4 copies. Like, probably a lot of p-hacking, then other agents knowing that p-hacking is happening and covering it up, and so on. I expect a lot of the human time will involve trying to detect clusters of Agent-4 copies that are spiralling off into doing wacky stuff. Also at this point the metrics of performance won't be robust enough to avoid agents goodharting hard on them.
Hi Zvi, you misspelled my name as "Dei". This is a somewhat common error, which I usually don't bother to point out, but now think I should because it might affect LLMs' training data and hence their understanding of my views (e.g., when I ask AI to analyze something from Wei Dai's perspective). This search result contains a few other places where you've made the same misspelling.
If we remap the main actors in AI 2027 like this:
Human civ (in the paper) -> evolution (on earth)
then it strikes me how
Agent-4 (in the paper) -> Human civ (or the part of it that is involved in AI dev)
fits perfectly. I hear the clink thinking about it.
Kevin Roose in The New York Times
Kevin Roose covered Scenario 2027 in The New York Times. The final conclusion is supportive of this kind of work, and Kevin points out that expectations at the major labs are compatible with the scenario.

I was disappointed that the tone here seems to treat the scenario and the viewpoint behind it as ‘extreme’ or ‘fantastical.’ Yes, this scenario involves things that don’t yet exist and haven’t happened. It’s a scenario of the future. One can of course disagree with much of it. And you probably should.

As we’ll see later with David Shapiro, we also have someone quoted as saying ‘oh they just made all this up without any grounding’ despite the hundreds of pages of grounding and evidence. It’s easier to simply pretend it isn’t there.

And we have a classic Robin Hanson edit; here’s his full quote while linking:

I think it’s totally reasonable to be wary of predictions of continued smooth exponentials. I am indeed also wary of them. I am, however, confident that if you did get ‘superhuman A.I. coders’ in a fully broad sense, the other necessary skills for any reasonable definition of (artificial) general intelligence would not be far behind.

Eli Lifland Offers Takeaways
Eli Lifland, who worked closely on the project, offers his takeaways here.

Scott Alexander Offers Takeaways
He offers us a post outlining them. The list is:

Others’ Takes on Scenario 2027
Having a Concrete Scenario is Helpful
The central points here seem spot on. If you want to know what a recursive self-improvement or AI R&D acceleration scenario looks like in a way that helps you picture one, and that lets you dive into details and considerations, this is the best resource available yet and it isn’t close.

My one disagreement with Nevin (other than my standard objection to use of the word ‘doomer’) is that I don’t expect ‘even more polarized fighting.’ What I expect is for those who are worried to continue to attempt to find solutions that might possibly work, and for the ‘yolo’ crowd to continue to be maximally polarized against anything that might reduce existential risk, on principle, with a mix of anarchists and those who want government support for their project. Remarkably often, it will continue to be the same people.

Writing It Down Is Valuable Even If It Is Wrong
I very much appreciate those who say ‘I strongly disagree with these predictions but appreciate that you wrote them down with detailed explanations.’

I strongly agree with Davidad that the speed at which things play out starting in 2028 matters very little. The destination remains the same.

Saffron Huang Worries About Self-Fulfilling Prophecy
This is a reasonable thing to worry about. Is this a self-fulfilling or self-preventing style of prophecy? My take is that it is more self-preventing than self-fulfilling, especially since I expect the actions we want to avoid to be the baseline scenario.

Directionally the criticism highlights a fair worry. One always faces a tradeoff between creating something engaging versus emphasizing the particular most important messages and framings. I think there are places Scenario 2027 could and should have gone harder there, but it’s tough to strike the right balance, including that you often have to ship what you can now and not let the perfect be the enemy of the good.

Daniel also notes on the Win-Win Podcast that he is worried about the self-fulfilling risks and plans to release additional things that have better endings, whereas Leopold Aschenbrenner in Situational Awareness was intentionally trying to do hyperstition. By default, Daniel says, it’s wise to say what’s actually likely to happen.

I certainly don’t think this is presented as ‘everything until October 2027 here is inevitable.’ It’s a scenario. A potential path. You could yell that louder, I guess? It’s remarkable how often there is a natural way for people to misinterpret something as a stronger fact or claim than it is, and:

Philip Tetlock Calibrates His Skepticism
It is a well-known hard problem how much to update based on past predictions. In this case, I think quite a bit. Definitely enough to give the predictions a read. You should still be mostly making up your own mind, as always.

I do think Jan’s right about that. Predictions until now were the easy part. That has a lot to do with why a lot of people are so worried. However, one must always also ask how predictions were made, and are being made. Grading only on track record of being right (or ‘winning’), let alone evaluating forward-looking predictions that way, invites disaster.

Jan Kulveit Wants to Bet
Being at least as right as ‘What 2026 Looks Like’ is a super high bar. If these odds are fair at 8:1, then that’s a great set of predictions. As always, kudos to everyone involved for public wagering.

Matthew Barnett Debates How To Evaluate the Results
This is an illustration of why setting up a bet like the above in a robust way is hard. It’s definitely true that there will be a lot of disagreement over how accurate Scenario 2027 was, regardless of its level of accuracy, so long as it isn’t completely off base.

Teortaxes for China and Open Models and My Response
Teortaxes claims the scenario is underestimating China, and also challenges its lack of interest in human talent and the sidelining of open models; see his thread for the relevant highlights from the OP. Here I pull together his key statements from the thread.

I see this as making a number of distinct criticisms, and also this is exactly the kind of thing that writing all this down gets you – Teortaxes gets to point to exactly where their predictions and model differ from Daniel’s.

Others Wonder About PRC Passivity
Julian Bradshaw asks if the scenario implies the PRC should at least blockade Taiwan. The answer is that if the PRC fully believed this scenario then maybe, but a blockade crashes the economy and risks war, so it’s a hell of a play to make if you’re not sure.

There’s a difference between ‘feel the AGI’ and both ‘feel the ASI’ and ‘be confident enough you actually act quickly at terrible cost.’ I think it’s correct to presume that it takes a lot to force the second reaction, and indeed so far we’ve seen basically no interest in even slightly costly action, and a backlash in many cases to free actions.

In terms of Congress, I think them doing little is the baseline scenario. I mean, have you met them? Do you really think there wouldn’t be 35 senators who defer to the president, even if for whatever reason that wasn’t Trump?

Timothy Lee Remains Skeptical
This seems to be based on basic long-standing disagreements. I think they all amount to, essentially, not feeling the ASI, and not thinking that superintelligence is A Thing. In which case, yes, you’re not going to think any of this is going to happen.

David Shapiro for the Accelerationists and Scott’s Response
Shapiro wants to accelerate AI and calls himself an ‘AI maximalist.’ I am including this for completeness. If you already know where this is going and don’t need to read this section, you are encouraged to skip it.

This was the most widely viewed version of this type of response I saw (227k views). I am including the full response, so you can judge it for yourself.

I will note that I found everything about this typical of such advocates. This is not meant to indicate that David Shapiro is being unusual, in any way, in his response, given the reference classes in question. Quite the contrary.

If you do read his response, ask yourself whether you think these criticisms, and accusations that Scenario 2027 is not grounded in any evidence or any justifications, are accurate, before reading Scott Alexander’s reply. Then read Scott’s reply.

And here is Scott Alexander’s response, pointing out that, well…

So in summary I agree with this response:

LessWrong Weighs In
Vladimir Nesov challenges the FLOP counts here, arguing they seem modestly too high based on anticipated GPU production schedules. This is a great example of ‘post the wrong answer on the internet to get the right one,’ and why detailed scenarios are therefore so great. Best case you’re posting the right answer. Worst case you’re posting the wrong one. Then someone corrects you. Victory either way. (A toy sketch of this kind of compute arithmetic appears at the end of this section.)

Wei Dai points out that when Agent-4 is caught, it’s odd that it sits back and lets the humans consider entering slowdown. Daniel agrees this is a good objection, and proposes a few ways it could make sense. That players of the AI in the wargame never took the kinds of precautions Wei Dai mentions is an example of how this scenario, and the wargame in general, are in many ways extremely optimistic.

Knight Lee asks if they could write a second good ending based on the actions the authors actually would recommend, and Thomas Larsen responds that they couldn’t make it feel realistic. That’s fair, and also a really bad sign. Doing actually reasonable things is not currently in the Overton window enough to feel realistic.

Yitz offers An Optimistic 2027 Timeline, which opens with a massive trade war and warnings of a global depression. In late 2026 China invades Taiwan and TSMC is destroyed. The ending is basically ‘things don’t move as fast.’ Yay, optimism?
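For concreteness, here is a minimal back-of-envelope sketch of that kind of compute consistency check. Every number in it is an illustrative placeholder rather than a figure from AI 2027, from Nesov’s comment, or from any real production schedule; the point is only that fleet size, per-chip throughput, utilization, and run length jointly bound how much training compute a scenario can assume.

```python
# A minimal back-of-envelope sketch of a compute consistency check.
# All numbers below are illustrative placeholders chosen for exposition,
# not figures from AI 2027, from Nesov's comment, or from real GPU data.

def deliverable_training_flop(num_gpus: float,
                              peak_flop_per_second: float,
                              utilization: float,
                              run_length_days: float) -> float:
    """Rough upper bound on FLOP a GPU fleet can deliver over one training run."""
    seconds = run_length_days * 24 * 60 * 60
    return num_gpus * peak_flop_per_second * utilization * seconds

if __name__ == "__main__":
    # Hypothetical inputs (assumptions, not claims):
    fleet_size = 1_000_000   # accelerators available to one lab
    peak_flops = 2e15        # peak FLOP/s per accelerator
    utilization = 0.35       # realized fraction of peak during training
    run_days = 120           # length of the training run in days

    budget = deliverable_training_flop(fleet_size, peak_flops, utilization, run_days)
    print(f"Deliverable compute: {budget:.2e} FLOP")
    # If a scenario's stated training compute exceeds a bound like this, then
    # its implied fleet size, chip performance, utilization, or run length
    # must be correspondingly larger; that is the kind of consistency check
    # against GPU production schedules that Nesov's comment points at.
```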
Other Reactions

Greg Colbourn has a reaction thread, from the perspective of someone much more skeptical about our chances in a scenario like this. It’s got some good questions in it, but due to how it’s structured it’s impossible to quote most of it here. I definitely consider this scenario to be making rather optimistic assumptions on the alignment front and related topics.

Patrick McKenzie focuses on the format.

As usual I had a roundup thread. I included some of those reactions throughout, but note that there are others I didn’t, if you want bonus content or completeness.

Joan Velja challenges the 30% growth assumption for 2026, but this is 30% growth in stock prices, not in GDP. That’s a very different thing, and highly realistic. The 30% GDP growth, if it happened, would come later.

Mena Fleschman doesn’t feel this successfully covered ‘crap-out’ scenarios, but that’s the nature of a modal scenario. There are things that could happen that aren’t in the scenario. Mena thinks it’s likely we will have local ‘crap-outs’ in particular places, but I don’t think that changes the scenario much if they’re not permanent, except insofar as it reflects much slower overall progress.

Joern Stoehler thinks the slowdown ending’s alignment solution won’t scale to these capability levels. I mostly agree; as I say several times, I consider this part very optimistic, although the specific alignment solution isn’t important here for scenario purposes. And having said that, watch some people get the takeaway that we should totally go for this particular alignment strategy. No, please don’t conclude that, that’s not what they are trying to say.

Gabriel Weil, in addition to his reactions on China, notes that the ‘slowdown’ scenario in AI 2027 seems less plausible to him than other ‘lucky’ ways to avoid doom. I definitely wouldn’t consider this style of slowdown to be the majority of the win probability, versus a variety of other technical and political (in various combinations) ways out.

Next Steps
There’s a fellowship at Pivotal in Q3 that will include several AI 2027 collaborators (Eli Lifland and Thomas Larsen). It will run from June 30 to August 29 in London, but the deadline is in two days: you have to apply by April 9, so act fast.

Here’s what they’re going to do next, in addition to writing additional content:

The Lighter Side
Predicting things that are known is still impressive, because most people don’t know.