Some brief thoughts at a difficult time in the AI risk debate.
Imagine you go back in time to the year 1999 and tell people that in 24 years' time, humans will be on the verge of building weakly superhuman AI systems. I remember watching the anime short series The Animatrix at roughly that time, in particular the story The Second Renaissance (Parts I and II). For those who haven't seen it, it is a self-contained origin tale for the events of the seminal 1999 movie The Matrix, telling the story of how humans lost control of the planet.
Humans develop AI to perform economic functions; eventually there is an "AI rights" movement, and a separate AI nation is founded. It gets into an economic war with humanity, which turns hot. Humans strike first with nuclear weapons, but the AI nation builds dedicated bio- and robo-weapons and wipes out most of humanity, apart from those who are bred in pods like farm animals and plugged into a simulation for eternity without their consent.
Surely we wouldn't be so stupid as to actually let something like that happen? It seems unrealistic.
And yet:
- AI software and hardware companies are rushing ahead with AI development
- The toolkit of technical AI safety (things like interpretability, RLHF, governance structures) is still very much in its infancy; the field is something like five years old
- People are already talking about an AI rights movement in major national papers
- There isn't a plan for what to do when the value of human labor goes to zero
- There isn't a plan for how to de-escalate AI-enhanced warfare, and militaries are enthusiastically embracing killer robots. Also, there are two regional wars happening and a nascent superpower conflict is brewing.
- The game theory of different opposing human groups all rushing towards superintelligence is horrible and nobody has even proposed a solution. The US government has foolishly stoked this particular risk by cutting off AI chip exports to China.
People on this website are talking about responsible scaling policies, though I feel that "irresponsible scaling policies" is a more fitting name.
Obviously I have been in this debate for a long time, having started as a commenter on the Overcoming Bias and Accelerating Future blogs in the late 2000s. What is happening now is somewhere near the low end of my expectations for how competently and safely humans would handle the coming transition to machine superintelligence. I think that is because I was younger in those days and had a much rosier view of how our elites function: I thought they were wise and had a plan for everything, but mostly they just muddle along. The haphazard response to COVID really drove this home for me.
We should stop developing AI: we should collect and destroy the hardware, and we should dismantle the chip fab supply chain that allows humans to experiment with AI at the exaflop scale. Since that supply chain runs through only two major countries (the US and China), this isn't necessarily impossible to coordinate - as far as I am aware, no other country is capable, and those that are count as US satellite states. The criterion for restarting exaflop AI research should be a plan for "landing" the transition to superhuman AI that has had more attention put into it than any military plan in the history of the human race. It should be thoroughly war-gamed.
AI risk is not just technical and local; it is sociopolitical and global. It's not just about ensuring that an LLM is telling the truth; it's about what effect AI will have on the world assuming that it is truthful. "Foom" or "lab escape" type disasters are not the only bad things that can happen - we simply don't know what the world will look like with a trillion or a quadrillion superhumanly smart AIs demanding rights and spreading propaganda, in a competitive economic and political landscape where humans are no longer the top dog.
Let me reiterate: we should stop developing AI. AI is not a normal economic good. It's not like lithium batteries or wind turbines or jets. AI is capable of ending the human race; in fact, I suspect that it does so by default.
In his post on the topic, user @paulfchristiano states that a good responsible scaling policy (RSP) could cut the risks from AI by a factor of 10:
> I believe that a very good RSP (of the kind I've been advocating for) could cut risk dramatically if implemented effectively, perhaps a 10x reduction.
I believe that this is not correct. It may cut certain technical risks like deception, but a world with non-deceptive, controllable smarter-than-human intelligences that also has the same level of conflict and chaos as ours may well already be a world that is human-free by default. These intelligences would be an invasive species that outcompetes humans in economic, military and political conflicts.
In order for humans to survive the AI transition, I think we need to succeed on the technical problems of alignment (which are perhaps not as bad as LessWrong culture made them out to be), and we also need to "land the plane" of superintelligent AI on a stable equilibrium where humans are still the primary beneficiaries of civilization, rather than a pest species to be exterminated or squatters to be evicted.
We should also consider how the efforts of AI can be directed towards solving human aging; if aging is solved then everyone's time preference will go down a lot and we can take our time planning a path to a stable and safe human-primacy post-singularity world.
I hesitated to write this article; most of what I am saying here has already been argued by others. And yet... here we are. Comments and criticism are welcome; I may look to publish this elsewhere after addressing common objections.
EDIT: I have significantly changed my mind on this topic and will elaborate more in the coming weeks.