What I find incredible is how contributing to the development of existentially dangerous systems is viewed as a morally acceptable course of action within communities that on paper accept that AGI is a threat.
Both OpenAI and Anthropic are incredibly influential among AI safety researchers, despite being key players in bringing the advent of TAI ever closer.
Both organisations benefit from lexical confusion over the word "safety".
The average person concerned with existential risk from AGI might assume "safety" means working to reduce the likelihood that we all die. They would be disheartened to learn that many "AI Safety" researchers are instead focused on making sure contemporary LLMs behave appropriately. Such "safety" research simply makes the contemporary technology more viable and profitable, driving investment and reducing timelines. There is to my knowledge no published research that proves these techniques will extend to controlling AGI in a useful way.*
OpenAI's "Superalignment" plan is a more ambitious safety play.Their plan to "solve" alignment involves building a human level general intelligence within 4 years and then using this to automate alignment re...
In order for humans to survive the AI transition I think we need to succeed on the technical problems of alignment (which are perhaps not as bad as Less Wrong culture made them out to be), and we also need to "land the plane" of superintelligent AI on a stable equilibrium where humans are still the primary beneficiaries of civilization, rather than a pest species to be exterminated or squatters to be evicted.
Do we really need both? It seems like either a technical solution OR competent global governance would mostly suffice.
Actually-competent global governance should be able to coordinate around just not building AGI (and preventing anyone else from building it) indefinitely. If we could solve a coordination problem on that scale, we could also probably solve a bunch of other mundane coordination problems, governance issues, unrelated x-risks, etc., resulting in a massive boost to global prosperity and happiness through non-AI technological progress and good policy.
Conversely, if we had a complete technical solution, I don't see why we necessarily need that much governance competence. Even if takeoff turns out to be relatively slow, the people initially building and controlling AGI...
Without governance you're stuck trusting that the lead researcher (or whoever is in control) turns down near-infinite power and instead acts selflessly. That seems like quite the gamble.
Yeah. I think a key point that is often overlooked is that even if powerful AI is technically controllable, i.e. we solve inner alignment, that doesn't mean society will handle it safely. I think by default it looks like every company and military is forced to start using a ton of AI agents (or they will be outcompeted by someone else who does). Competition between a bunch of superhuman AIs that are trying to maximize profits or military tech seems really bad for us. We might not lose control all at once, but rather just be gradually outcompeted by machines, where "gradually" might actually be pretty quick. Basically, we die by Moloch.
The scenario I am most concerned about is a strongly multipolar Malthusian one. There is some chance (maybe even a fair one) that a singleton ASI decides, or an oligopoly of ASIs rigorously coordinates, to preserve the biosphere - including humans - at an adequate or superlative level of comfort or fulfillment, or to help humans ascend themselves, due to ethical considerations, for research purposes, or for simulation/karma type considerations.
In a multipolar scenario of gazillions of AIs at Malthusian subsistence levels, none of that matters in the default case. Individual AIs can be as ethical or empathic as they come, even much more so than any human. But keeping the biosphere around would be a luxury, and any AIs that try to do so will be outcompeted by more unsentimental, economical ones. A farm that can feed a dozen people, or an acre of rainforest that can support x species, can, if converted to high-efficiency solar panels, support a trillion AIs.
The second scenario is near-certain doom, so at a bare minimum we should get a good inkling of whether the AI world is more likely to be unipolar or oligopolistic, or massively multipolar, before proceeding. So a pause is indeed needed, and the most credible way of effecting it is a hardware cap and subsequent back-pedaling on compute power. (Roko has good ideas on how to go about that and should expand on them here and at his Substack.) Granted, if anthropic reasoning is valid, geopolitics might well soon do the job for us. 🚀💥
The field is something like 5 years old.
I'm not sure what you are imagining as 'the field', but isn't it closer to twenty years old? (Both numbers are, of course, much less than the age of the AI field, or of computer science more broadly.)
Much of the source of my worry is that in the first ten to twenty years of work on safety, we mostly got impossibility and difficulty results, and so "let's just try and maybe it'll be easy" seems inconsistent with our experience so far.
In order for humans to survive the AI transition I think we need to succeed on the technical problems of alignment (which are perhaps not as bad as Less Wrong culture made them out to be), and we also need to "land the plane" of superintelligent AI on a stable equilibrium where humans are still the primary beneficiaries of civilization, rather than a pest species to be exterminated or squatters to be evicted.
I agree completely with this.
I want to take the opportunity to elaborate a little on what a "stable equilibrium" civilisation should have, in my mind:...
Agreed.
However, there is no collective "we" to whom this message can be effectively directed. The readers of LW are not the ones who can influence the overarching policies of the US and China. That said, leaders at OpenAI and Anthropic might come across this.
This leads to the question of how to halt AI development on a global scale. Several propositions have been put forth:
1. A worldwide political agreement. Given the current state of wars and conflicts, this seems improbable.
2. A global nuclear war. As the likelihood of a political agreement diminishes, t...
I am someone who is at present unsure how to think about AI risk. As a complete layperson with a strong interest in science, technology, futurism and so on, there are - seemingly - some very smart people in the field who appear to be saying that the risk is basically zero (e.g. Andrew Ng, Yann LeCun). Then there are others who are very worried indeed - as represented by the post I am responding to.
This is confusing.
To get people at my level to support a shut down of the type described above, there needs to be some kind of explanation as to why there is s...
The presumption here is that civilisation is run by governments that are chaotic and of low competence. If this is true, there is clearly a problem with implementing an AI lockdown policy. It would be great to identify the sort of political or economic steps needed to execute the shutdown.
Title changed from
"Architects of Our Own Demise: We Should Stop Developing AI"
to
"Architects of Our Own Demise: We Should Stop Developing AI Carelessly"
Global compliance is the sine qua non of regulatory approaches, and there is no evidence that the political will to make that happen lies within our possible futures, unless some catastrophic but survivable casus belli happens to wake the population up - as with Frank Herbert's Butlerian Jihad. (Irrelevant aside: Samuel Butler, who wrote of the dangers of machine evolution and supremacy in the 19th century, lived at what later became the film location for Edoras in the Lord of the Rings films.)
Is it insane to think that a limited nuclear conflict (as seems to be an increasingly ...
Some brief thoughts at a difficult time in the AI risk debate.
Imagine you go back in time to the year 1999 and tell people that in 24 years' time, humans will be on the verge of building weakly superhuman AI systems. I remember watching the anime short series The Animatrix at roughly this time, in particular a story called The Second Renaissance (Parts I and II). For those who haven't seen it, it is a self-contained origin tale for the events in the seminal 1999 movie The Matrix, telling the story of how humans lost control of the planet.
Humans develop AI to perform economic functions, eventually there is an "AI rights" movement and a separate AI nation is founded. It gets into an economic war with humanity, which turns hot. Humans strike first with nuclear weapons, but the AI nation builds dedicated bio- and robo-weapons and wipes out most of humanity, apart from those who are bred in pods like farm animals and plugged into a simulation for eternity without their consent.
Surely we wouldn't be so stupid as to actually let something like that happen? It seems unrealistic.
And yet:
People on this website are talking about responsible scaling policies, though I feel that "irresponsible scaling policies" is a more fitting name.
Obviously I have been in this debate for a long time, having started as a commenter on Overcoming Bias and Accelerating Future blogs in the late 2000s. What is happening now is somewhere near the low end of my expectations for how competently and safely humans would handle the coming transition to machine superintelligence. I think that is because I was younger in those days and had a much rosier view of how our elites function. I thought they were wise and had a plan for everything, but mostly they just muddle along; the haphazard response to covid really drove this home for me.
We should stop developing AI, we should collect and destroy the hardware, and we should destroy the chip fab supply chain that allows humans to experiment with AI at the exaflop scale. Since that supply chain runs through only two major countries (the US and China), this isn't necessarily impossible to coordinate - as far as I am aware no other country is capable (and those that are count as US satellite states). The criterion for restarting exaflop AI research should be a plan for "landing" the transition to superhuman AI that has had more attention put into it than any military plan in the history of the human race. It should be thoroughly war-gamed.

AI risk is not just technical and local, it is sociopolitical and global. It's not just about ensuring that an LLM is telling the truth. It's about what effect AI will have on the world assuming that it is truthful. "Foom" or "lab escape" type disasters are not the only bad things that can happen - we simply don't know how the world will look if there are a trillion or a quadrillion superhumanly smart AIs demanding rights and spreading propaganda, and a competitive economic and political landscape where humans are no longer the top dog.
Let me reiterate: We should stop developing AI. AI is not a normal economic item. It's not like lithium batteries or wind turbines or jets. AI is capable of ending the human race; in fact I suspect that it does so by default. In his post on the topic, user @paulfchristiano states that a good responsible scaling policy could cut the risks from AI by a factor of 10:
I believe that this is not correct. It may cut certain technical risks like deception, but a world with non-deceptive, controllable smarter-than-human intelligences that also has the same level of conflict and chaos that our world has may well already be a world that is human-free by default. These intelligences would be an invasive species that would outcompete humans in economic, military and political conflicts.
In order for humans to survive the AI transition I think we need to succeed on the technical problems of alignment (which are perhaps not as bad as Less Wrong culture made them out to be), and we also need to "land the plane" of superintelligent AI on a stable equilibrium where humans are still the primary beneficiaries of civilization, rather than a pest species to be exterminated or squatters to be evicted.

We should also consider how the efforts of AI can be directed towards solving human aging; if aging is solved then everyone's time preference will go down a lot and we can take our time planning a path to a stable and safe human-primacy post-singularity world.
I hesitated to write this article; most of what I am saying here has already been argued by others. And yet... here we are. Comments and criticism are welcome, I may look to publish this elsewhere after addressing common objections.
EDIT: I have significantly changed my mind on this topic and will elaborate more in the coming weeks.