I'm happy that this was done before release. However ... I'm still left wondering "how many prompts did they try?" In practice, the first AI self-replicating escape is not likely to be a model working alone on a server, but a model carefully and iteratively prompted, with overall strategy provided by a malicious human programmer. Also, one wonders what will happen once the base architecture is in the training set. One need only recognize that there is a lot of profit to be made (and more cheaply) by having the AI identify and exploit zero-days to generate and spread malware (say, while shorting the stock of a target company). Perhaps GPT-4 is not yet capable enough to find or exploit zero-days. I suppose we will find out soon enough.
Note that this creates a strong argument for never open-sourcing the model once a certain level of capability is reached: a GPT-N with enough hints about its own structure will be able to capably write itself.
(Spoilers for Interstellar)
I sat next to a person on a flight a few weeks ago who, when we got to talking about physics, said she thought the movie Interstellar was "amazing" and "scientific". I agreed with her, thinking she was talking about the realistic black hole simulations. No, she was talking about scenes where the main character reaches back in time as a ghost to influence his daughter.
This person was a first-year Ph.D. student in medicine.
So yes, even when science fiction is done relatively carefully, some people will take as "scientific" the parts which have been stretched for better storytelling.
There are a few examples in history of civilizations running out of critical resources, usually accompanied by other conflict and calamity. In the most extreme historical cases, these civilizations went extinct in ways that left us with poor documentation, so there may be an anthropic / selection bias which obscures the exact causes of civilizational collapse:
There are several instances of social destabilization following closely on the heels of famine and rising food prices. However, famine is almost always a result of political or economic mismanagement, rather than pure constraints on non-food resources. The Arab Spring is sort of the canonical example of food scarcity resulting in higher food prices and subsequent political instability.
However, political instability is not an automatic result of food shortages. The 20th century saw many famines with relatively limited political consequences: the [Russian Famine of 1921](https://en.wikipedia.org/wiki/Russian_famine_of_1921%E2%80%931922), the [Soviet famine of 1932](https://en.wikipedia.org/wiki/Soviet_famine_of_1932%E2%80%931933), the [Great Leap Forward/Great Chinese Famine](https://en.wikipedia.org/wiki/Great_Chinese_Famine), and the [1996 famine in North Korea](https://en.wikipedia.org/wiki/North_Korean_famine) are prominent examples. It may be noted that none of these countries were technologically advanced or democratic: perhaps free communications are a prerequisite to shortage-induced political instability?
Caveat: history is really complicated, and I am only parroting popular views. I am not a historian, nor do I have any detailed knowledge of these events.
What's the practical difference between "text" and one-hots of said "text"? One-hots are the standard way of inputting text into models. Only recently have we expected models to learn their own preferred encodings of raw text (cf. transformers). By taking a small shortcut, the authors of this paper get to show off their agent work without loss of generality: one could still give one-hot instructions to an agent that is learning to act in the real world.
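For concreteness, here is a minimal sketch (my own, not from the paper) of how a text instruction reduces to one-hot vectors over a small, hypothetical vocabulary; up to a fixed lookup, feeding an agent these vectors carries the same information as feeding it the raw tokens.

```python
import numpy as np

# Hypothetical vocabulary for illustration only.
vocab = ["go", "to", "the", "red", "blue", "door", "key"]
index = {word: i for i, word in enumerate(vocab)}

def one_hot_encode(instruction: str) -> np.ndarray:
    """Return a (num_tokens, vocab_size) array, one one-hot row per token."""
    tokens = instruction.lower().split()
    encoded = np.zeros((len(tokens), len(vocab)), dtype=np.float32)
    for row, token in enumerate(tokens):
        encoded[row, index[token]] = 1.0
    return encoded

# Each row contains a single 1 marking the token's position in the vocabulary.
print(one_hot_encode("go to the red door"))
```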
The last time CM was mentioned on here I looked up his old videos about Fauci. In a video from Sept 2020 he made a very confident claim that NYC had already reached a 70% infection rate and thereby herd immunity. That turned out to be untrue:
https://www.lesswrong.com/posts/xEFfbEMFHhtgseKz3/covid-6-10-somebody-else-s-problem
So my prior for him is now skewed toward "pundit" rather than "honest inquirer."
> it means contact tracing becomes easier and lockdowns more effective.
I read that oppositely. If the serial interval is shorter, contact tracers need to work faster to inform those exposed, lest those exposed become infectious and begin transmitting themselves. Likewise, lockdowns would only become more effective if the time each person is contagious is reduced. IIRC the delta variant, according to Indian accounts, remains significantly contagious for three weeks from the date of infection, as opposed to the usual two.
Re: quarantines for fully vaccinated travellers
The problem a national government has in setting quarantine standards for vaccinated people is that there are so many inconsistencies among vaccines, and the topic is wrapped up in geopolitics.
The administrative cost of such an exemption program is high; the social, political, and medical perils are many; and there is a chance of quarantine escape in >5% of cases. Vaccine-based exemption from national quarantine is unlikely.
This is a good critique of the details of AI 2027, but whether the prediction should have been for autonomous AI research by 2026 or by 2033, nothing substantive seems to change about the policy concerns that AI 2027 raises.
I think Nikola's threshold for superhuman AI is conservative enough. If we reach a point where an AI agent (or super-agent) can perform tasks equivalent to 10 human-years of programmer time with 80% accuracy, then it is likely that AI research can be divided between several agents and completely automated. In my opinion, humanity will have lost control of AI by this point: much like the PI of a research lab never knows all the technical details of how their experiments are actually performed, by this point even the leading human researchers will likely understand the research they are overseeing only at a surface level of abstraction. From well before the point at which humans can no longer understand AI self-improvement, all of AI 2027's warnings about social and organizational dynamics are relevant: the incentives push companies to ignore the initial warning signs of autonomous and misaligned (evil) behavior, opening the door to potential catastrophe.
Your graph ("six stories") shows that METR's plain-old-exponential prediction would put us at this point before 2032, and the "new normal" METR curve based on the most recent model releases would put us there before 2028. So under the current paradigm, super-exponential growth is not even needed to enter dystopia before 2032, and the uncertainties are such that entering dystopia before 2028 is still a possibility.
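For intuition, here is a hedged back-of-the-envelope sketch of that kind of extrapolation. The starting horizon, doubling time, and working-hours figure below are hypothetical placeholders, not METR's actual fit or the numbers behind your graph.

```python
import math

# Placeholder assumptions (hypothetical, for illustration only):
horizon_hours_now = 8.0      # assumed current task horizon, in hours
doubling_time_years = 0.6    # assumed doubling time of the horizon, in years
target_hours = 10 * 2000.0   # ~10 human-years at ~2000 working hours per year

# Under a plain exponential, the horizon doubles every doubling_time_years,
# so the time to reach the target is (number of doublings) * (doubling time).
doublings_needed = math.log2(target_hours / horizon_hours_now)
years_needed = doublings_needed * doubling_time_years
print(f"{doublings_needed:.1f} doublings, about {years_needed:.1f} years from now")
```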
Getting the details right matters, but this critique reinforces my impression that AI 2027 is important. I only hope that AI 2027 skeptics don't start pointing at the headline ("bad") to argue against making meaningful policy and regulatory changes.