[Caveat lector: I know roughly nothing about policy!]

Suppose that there were political support to really halt research that might lead to an unstoppable, unsteerable transfer of control over the lightcone from humans to AGIs. What government policy could exert that political value?

[That does sound relaxing.]

Banning AGI research specifically

This question is NOT ASKING ABOUT GENERALLY SLOWING DOWN AI-RELATED ACTIVITY. The question is specifically about what it could look like to ban (or rather, impose an indefinite moratorium on) research that is aimed at creating artifacts that are more capable in general than humanity.

So "restrict chip exports to China" or "require large vector processing clusters to submit to inspections" or "require evals for commercialized systems" don't answer the question.

The question is NOT LIMITED to policies that would, by their letter, be practically enforceable. Making AGI research illegal would slow it down, even if the ban is physically evadable; researchers generally want to think publishable thoughts, and generally want to plausibly be doing something good or neutral by their society's judgement. If the FBI felt they had a mandate to investigate AGI attempts, even if they would have to figure out some only-sorta-related crime to actually charge, maybe that would also chill AGI research. The question is about making the societal value of "let's not build this for now" be exerted in the most forceful and explicit form that's feasible.

Some sorts of things that would more address the question (in the following, replace "AGI" with "computer programs that learn, perform tasks, or answer questions in full generality", or something else that could go in a government policy):

  1. Make it illegal to write AGIs.
  2. Make it illegal to pay someone if the job description explicitly talks about making AGIs.
  3. Make it illegal to conspire to write AGIs.

Why ask this?

I've asked this question of several (5-10) people, some of whom know something about policy and have thought about policies that would decrease AGI X-risk. All of them said they had not thought about this question. I think they mostly viewed it as not a very salient question because there isn't political support for such a ban. Maybe the possibility has been analyzed somewhere that I haven't seen; links?

But I'm still curious because:

  1. I just am. Curious, I mean.
  2. Maybe there will be support later, at which point it would be good to have already mostly figured out a policy that would actually delay AGI for decades.
  3. Maybe having a clearer proposal would crystallize more political support, for example by having something more concrete to rally around, and by having something for AGI researchers "locked in races" to coordinate on as an escape from the race.
  4. Maybe having a clearer proposal would allow people who want to do non-AGI AI research to build social niches for non-AGI AI research, and thereby be less bluntly opposed to regulation on AGI specifically.
  5. [other benefits of clarity]

Has anyone really been far even as decided to use?

There are a lot of problems with an "AGI ban" policy like this. I'm wondering, though, which problems, if any, are really dealbreakers.

For example, one problem is: How do you even define what "AGI" or "trying to write an AGI" is? I'm wondering how much this is actually a problem, though. As a layman, for all I know there could be existing government policies that are somewhat comparably difficult to evaluate. Many judicial decisions related to crimes, as I vaguely understand it, depend on intentionality and belief - e.g. for a killing to be a murder, the killer must have intended to kill and must not have believed, on reasonable grounds, that zer life was imminently and unjustifiedly threatened by the victim. So it's not like not-directly-observable mental states are out of bounds. What are some crimes that are defined by mental states that are even more difficult to evaluate? Insider trading? (The problem is still very hairy, because e.g. you have to define "AGI" broadly enough that it includes "generalist scientist tool-AI", even though that phrase gives some plausible deniability like "we're trying to make a thing which is bad at agentic stuff, and only good at thinky stuff". Can you ban "unbounded algorithmic search"?)

Some other comparisons:

  • Bans on computer programs. E.g. bans on hacking private computer systems. How much do these bans work? Presumably fewer people hack their school's grades database than would without whatever laws there are; on the other hand, there's tons of piracy.
  • Bans on research. E.g. recombinant DNA, cloning, gain-of-function.
  • Bans on conspiracies with illegal long-term goals. E.g. hopefully-presumably you can't in real life create the Let's Build A Nuclear Bomb, Inc. company and hire a bunch of nuclear scientists and engineers with the express goal of blowing up a city. And hopefully-presumably your nuke company gets shut down well before you actually try to smuggle some uranium, even though "you were just doing theoretical math research on a whiteboard". How specifically is this regulated? Could the same mechanism apply to AGI research?

Is that good to do?

Yeah, probably, though we couldn't know whether a policy would be good without knowing what the policy would look like. There are some world-destroying things that we have to ban, for now; for everything else, there's Mastercard libertarian techno-optimism.


3 Answers

faul_sname

I think you get very different answers depending on whether your question is "what is an example of a policy that makes it illegal in the United States to do research with the explicit intent of creating AGI" or whether it is "what is an example of a policy that results in nobody, including intelligence agencies, doing AI research that could lead to AGI, anywhere in the world".

For the former, something like updates to export administration regulations could maybe make it de facto illegal to develop AI aimed at the international market. Historically, that approach succeeded for a while at making it illegal to intentionally export software which implemented strong encryption. It didn't actually prevent the export, but it did arguably make that export unlawful. I'd recommend reading that article in full, actually, to give you an idea of how "what the law says" and "what ends up happening" can diverge.

State monopoly:

  • The Song Dynasty (960-1279) established a state monopoly over saltpeter production, a critical ingredient in gunpowder. The government appointed officials to oversee the collection and refinement of saltpeter.
  • During the Ming Dynasty (1368-1644), the government further tightened its control over saltpeter production, with the "Saltpeter Censorate" responsible for managing the state monopoly.

Limiting knowledge:

  • Chinese officials kept the recipe for gunpowder a closely guarded secret. The exact proportions of saltpeter, sulfur, and charcoal were not shared widely.
  • Technical manuals on gunpowder production, such as the "Wujing Zongyao" (Collection of the Most Important Military Techniques) from the Song Dynasty, were restricted and not freely circulated.

Strict regulations:

  • The Ming Dynasty implemented strict laws and regulations on the private manufacture and use of gunpowder weapons. Violators faced severe punishments, including execution.
  • In the 14th century, Ming Emperor Hongwu issued an edict prohibiting the private production and sale of gunpowder and firearms, limiting their use to the military.

Emphasis on traditional weapons:

  • Confucian scholars and officials promoted the idea that traditional weapons, such as bows and crossbows, were more suitable for maintaining social order and harmony.
  • Military examinations during the Ming Dynasty focused on proficiency in traditional weapons rather than gunpowder weapons, reinforcing the importance of traditional warfare techniques.

What's the saltpeter?

[Image: ASML's next-generation, $380 million High-NA EUV lithography machine (ExtremeTech)]


You make the manufacture of programmable parallel processing hardware, above a certain baseline, a controlled item, and track the location of the chipmaking tooling at all times.

Every IC fab, under international regulations, would be inspected, and only low-power ICs would be permitted without a licence. All our phones, game consoles, and computers would work as dumb terminals: very little compute is local, and as much as possible is in licensed and inspected data centers.

Any internal source code, documents, etc. belonging to AI companies must be seized and classified under the same schema as nuclear secrets.

All software is also a restricted item: being a SWE is a licensed profession, you must have permission from the government (like planning approval) to author anything, and only government-licensed software can run on a computer above a certain threshold. The "government software inspectors" will be checking to make sure the implementation isn't just a lazy call to a neural network.

We need to have state-controlled media promote the idea that intellectual labor, such as art and computer programming, is more suitable for maintaining social order and harmony. Propaganda must emphasize the danger of trusting any generated assets from AI as soulless cheap copies that will never work.

Schools need to remove any training on using AI from the curriculum - even discussing "prompt engineering" should be a crime.

It wasn't until the Opium Wars (1839-1842 and 1856-1860) and the Sino-Japanese War (1894-1895) that this ban caused substantial losses to China.

880 years! Getting 20 years out of an AGI ban, before falling to attack from AGI-driven weapons from foreign rivals, sounds optimistic.

Just keep in mind the ultimate consequences.  Imagine the Chinese soldiers facing machine guns and advanced 19th century warships.

[Image prompt: Create a split battlefield scene to clearly differentiate the two sides. On the left, depict aged soldiers, their faces and uniforms showing the wear of time, standing in a defensive posture with outdated weaponry. Their side of the battlefield is marked by trenches and makeshift barricades, emphasizing a low-tech, human resilience against overwhelming odds. On the right, illustrate a futuristic force comprising hyper-advanced drones swarming in the sky and humanoid figures with feline features, indicating a blend of high technology and speculative bioengineering. These cat-like humanoids are equipped with cutting-edge weaponry, standing alongside the drones, ready to launch an assault. This clear left-to-right division visually contrasts the stark technological disparity between the two factions, highlighting the imminent clash of eras.]

Or:
Prompt: draw a nation that banned AGI facing attack by foreign rivals. Show a battle scene, where all the soldiers in the foreground are old from a lack of regenerative medicine, and the enemy is mostly a sky blotting swarm of drones

This is actually a dynamic I've read a lot about. The risk of ending up militarily/technologically behind is already well on the minds of the people who make up all of the major powers today, and all diplomacy and negotiations are already built on top of that ground truth and mitigating the harm/distrust that stems from it. 

Weakness at mitigating distrust = just being bad at diplomacy. Finding galaxy-brained solutions to coordination problems is necessary for being above par in this space.

I'm imagining the cat masks are some sort of adversarial attack on possible enemy image classifiers.

MiguelDev

It might be possible to ban training environments that remain relatively untested? For example, combinations of learning rates or epochs that haven't been documented as safe for achieving an ethically aligned objective. Certainly, implementing such a ban would require a robust global governance mechanism to review which training environments are consistent with achieving an ethically aligned objective, but this is how I envision the enforcement of such a ban working.
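
As a sketch of what checking a "training environment" against such a review mechanism might mechanically look like (all field names and allowed ranges below are invented for illustration, not taken from any real registry):

```python
# Hypothetical illustration of the idea above: a registry of reviewed training
# configurations, and a check that flags anything outside the documented ranges.
# All field names and ranges here are invented for the example.

REVIEWED_SAFE_RANGES = {
    "learning_rate": (1e-5, 3e-4),   # hypothetical reviewed range
    "epochs": (1, 10),               # hypothetical reviewed range
}

def is_reviewed_environment(config: dict) -> bool:
    """Return True only if every listed hyperparameter falls inside a reviewed range."""
    for key, (lo, hi) in REVIEWED_SAFE_RANGES.items():
        if key not in config or not (lo <= config[key] <= hi):
            return False
    return True

print(is_reviewed_environment({"learning_rate": 2e-4, "epochs": 3}))   # True
print(is_reviewed_environment({"learning_rate": 5e-3, "epochs": 50}))  # False: outside reviewed ranges
```

Of course, the hard part would be the review process itself, not the check.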

18 comments
trevor

[Caveat lector: I know roughly nothing about policy!]

For AI people new to international affairs, I generally recommend skimming these well-respected texts that are pretty well-known to have directly inspired many of the people making foreign policy decisions:

  • Chapters 1 and 2 of Mearsheimer's Tragedy of Great Power Politics (2010). The model (offensive realism) is not enough by itself, but it helps to start with a flawed model because the space is full of them: this model has been predictive, it's popular among policymakers in DC, and it gives a great perspective on how impoverished foreign policy culture is, because nobody ever reads stuff like the Sequences.
  • Chapters 1 and 4 of Nye's Soft Power (2004) (skim ch. 1 extra fast and ch. 4 slower). Basically a modern history of propaganda and influence operations, except cutting off at 2004. Describes how the world is more complicated than Tragedy of Great Power Politics suggests.
  • Chapters 1 and 2 of Schelling's Arms and Influence (1966). Yes, it's that Schelling, this was when he started influencing the world's thinking about how decision theory drives nuclear standoffs, and diplomacy in general, in the wake of the Cuban Missile Crisis. You can be extremely confident that this was a big part of the cultural foundation of foreign policy establishments around the world, plus for a MIRI employee it should be an incredibly light read applying decision theory to international politics and nuclear war. 

I'm going to read some more stuff soon and possibly overhaul these standard recommendations.

Akash also recommended Devil's Chessboard to understand intelligence agencies, and Master of the Senate and Act of Congress to understand Congress. I haven't gotten around to reading them yet, and I can't tell how successful his org has been in Congress itself (which is the main measure of success), but the Final Takes section of his post on Congress is fantastic and makes me confident enough to try them out.

I think all AI research makes AGI easier, so "non-AGI AI research" might not be a thing. And even if I'm wrong about that, it also seems to me that most harms of AGI could come from tool AI + humans just as well. So I'm not sure the question is right. Tbh I'd just stop most AI work.

I agree that tool AI + humans can create a lot of large-magnitude harms. I think it's probably still quite a bit less bad than directly having a high-intelligence, faster-than-human, self-duplicating, anti-human AGI on the loose. The trouble, though, is that with sufficient available compute, sufficient broad scientific knowledge about the brain and learning algorithms, and sufficiently powerful tool AI... it becomes trivially fast and easy for a single well-resourced human to make the unwise, irreversible decision to create and unleash a powerful unaligned AGI.

If anyone on Earth had the option to anonymously purchase a nuclear bomb for $10k at any time, I don't expect a law against owning or using nuclear weapons would prevent all use of nuclear weapons. Sometimes people do bad things.

AI + humans would just eventually give rise to AGI anyway, so I don't see the distinction people try to make here.

Can we make it illegal to make superhuman AGI, and then create rules regarding criminal conspiracy to create AGI, which would cover unexecuted plans? https://en.wikipedia.org/wiki/Criminal_conspiracy

Have there been serious (e.g. large fines, jail time, corporate dissolution) penalties (e.g. judicial or statutory) for large bodies (companies, contractors, government orgs) due to extreme negligence about some harm prospectively (without the harm having happened yet) and speculatively (where the harm has not actually ever happened)?

As a hypothetical example, suppose that nuclear regulation is informed by Scenario X, in which 100k people die. Scenario X is believed to happen if conditions A,B,C are met, so nuclear power companies are required to meet conditions ¬A,¬B,¬C. But then an inspector finds that ¬A and ¬B are not firmly met. So then the company is dissolved and the CEO is thrown in jail.

What are some extreme examples of this? (E.g. an extreme penalty, or where the negligence is unclear (prospective, speculative).)

There is some discussion here and here on similar topics.

There are some world-destroying things that we have to ban, for now; for everything else, there's Mastercard libertarian techno-optimism.

This seems to suggest that gradually banning and phasing out almost all computers is not on the table (which is probably a good thing, as I am "not sure" we would want to live in a computer-less society). I can imagine something that radical actually working (or not, depending on the structure of global society, and also noting that the sacrifice would be pretty bad).

But if this kind of radical measure is not on the table, I think the main effects of a Prohibition will be similar to the effects of our drug war: the availability is lower, the quality is often pretty bad, and the potency is occasionally very high; there might indeed be an additional lag before this kind of extra-high potency emerges.

So, translating to advanced AI: "the quality is often pretty bad" translates to all kinds of safety measures often being non-existent, "the potency is occasionally very high" translates to completely unregulated and uncontrolled spikes of capability (possibly including "true foom"), "there might indeed be an additional lag before this kind of extra high potency emerges" translates to indeed buying some time in exchange for the consequences down the road.

I am not sure that the eventual P(doom) is lower, if we do try to go down this road, and I would not be surprised if this would make the eventual P(doom) higher.

"the quality is often pretty bad" translates to all kinds of safety measures often being non-existent, "the potency is occasionally very high" translates to completely unregulated and uncontrolled spikes of capability (possibly including "true foom")

Both of these points precisely reflect our current circumstances. It may not even be possible to accidentally make these two things worse with regulation.

What has historically made things worse for AI Safety is rushing ahead "because we are the good guys."

mishka

Both of these points precisely reflect our current circumstances.

No, there is plenty of room between the current circumstances and the bottom. We might be back to Eliezer's "an unknown random team creates a fooming AI in a garage" old threat model, if we curtail the current high-end too much.

Just like there is plenty of room between legal pharmacy and black market for pain relievers (even when the name of the drug is the same).

It's very easy to make things worse.

It may not even be possible to accidentally make these two things worse with regulation.

It's probably possible. But regulation is often good, and we do need more regulation for AI.

In this post we are not talking about regulation, we are talking about prohibition, which is a different story.

O O

Replace "unknown random team" with the US military and a garage with a "military base" and you would be correct. There is no incentive for militaries to stop building autonomous drones/AGI. 

mishka

Militaries are certainly doing that, I agree.

However, I am not sure they are creative enough and not-control-freaks enough to try to build seriously self-modifying systems. They also don't mind spending tons of money and allocating large teams, so they might not be aiming for artificial AI researchers all that much. And they are afraid to lose control (they know how to control people, but artificial self-modifying systems are something else).

Whereas a team in a garage is creative, is short on resources, and is quite interested in creating a team of artificial co-workers to help them (a success in that leads to a serious recursive self-improvement situation automatically), and might not hesitate to try other recursive self-improvement schemas (we are seeing more and more descriptions of novel recursive self-improvement schemas in recent publications), so they might end up with a foom even before they build more conventional artificial AI researchers (a sufficiently powerful self-referential metalearning schema might result in that; a typical experience is that all those recursive self-improvement schemas saturate disappointingly early, so the teams will be pushing harder at them, trying to prevent premature saturation, and someone might succeed too well).

Basically, having "true AGI" means being able to create competent artificial AI researchers, which are sufficient for very serious recursive self-improvement capabilities, but one might also obtain drastic recursive self-improvement capabilities way before achieving anything like "true AGI". "True AGI" is sufficient to start a far reaching recursive self-improvement, but there is no reason to think that "true AGI" is necessary for that (being more persistent at hacking the currently crippled self-improvement schemas and at studying ways to improve them might be enough).

Haiku

I expect AGI within 5 years. I give it a 95% chance that if an AGI is built, it will self-improve and wipe out humanity. In my view, the remaining 5% depends very little on who builds it. Someone who builds AGI while actively trying to end the world has almost exactly as much chance of doing so as someone who builds AGI for any other reason.

There is no "good guy with an AGI" or "marginally safer frontier lab." There is only "oops, all entity smarter than us that we never figured out how to align or control."

If just the State of California suddenly made training runs above 10^26 FLOP illegal, that would be a massive improvement over our current situation on multiple fronts: it would significantly inconvenience most frontier labs for at least a few months, and it would send a strong message around the world that it is long past time to actually start taking this issue seriously.
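
For a rough sense of what a 10^26 FLOP threshold means in hardware terms, here is a back-of-the-envelope sketch; the per-accelerator throughput and cluster size are illustrative assumptions, not figures from this thread:

```python
# Back-of-the-envelope: what a 1e26 FLOP training-run threshold means in hardware terms.
# Assumptions (illustrative only): ~1e15 FLOP/s sustained per accelerator,
# and a hypothetical cluster of 20,000 such accelerators.

THRESHOLD_FLOP = 1e26
FLOP_PER_SEC_PER_DEVICE = 1e15     # assumed sustained throughput of one high-end accelerator
CLUSTER_SIZE = 20_000              # assumed accelerator count for a frontier-scale cluster

SECONDS_PER_YEAR = 365 * 24 * 3600

device_years = THRESHOLD_FLOP / FLOP_PER_SEC_PER_DEVICE / SECONDS_PER_YEAR
cluster_days = THRESHOLD_FLOP / (FLOP_PER_SEC_PER_DEVICE * CLUSTER_SIZE) / (24 * 3600)

print(f"~{device_years:,.0f} accelerator-years to reach 1e26 FLOP")      # ~3,171
print(f"~{cluster_days:,.0f} days on a 20,000-accelerator cluster")      # ~58
```

Under those assumptions, only actors who can run tens of thousands of top-end accelerators for weeks at a time would hit the cap, which is why it reads as a frontier-lab measure.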

Being extremely careful about our initial policy proposals doesn't buy us nearly as much utility as being extremely loud about not wanting to die.

I expect AGI within 5 years. I give it a 95% chance that if an AGI is built, it will self-improve and wipe out humanity. In my view, the remaining 5% depends very little on who builds it. Someone who builds AGI while actively trying to end the world has almost exactly as much chance of doing so as someone who builds AGI for any other reason.

There is no "good guy with an AGI" or "marginally safer frontier lab." There is only "oops, all entity smarter than us that we never figured out how to align or control."

So what do you allocate the remaining 5% to? No matter who builds the AGI, there's 5% chance that it doesn't wipe out humanity because... what? (Or is it just model uncertainty?)

Yes, that's my model uncertainty.

mishka

I do expect a foom (via AGI or via other route), and my timelines are much shorter than 5 years.

But algorithms for AI are improving faster than hardware (people seem to quote doubling in compute efficiency approximately every 8 months), so if one simply bans training runs above fixed compute thresholds, one trades off a bit of extra time before a particular achievement against an increase in the number of active players achieving it a bit later (basically, this delays the most well-equipped companies a bit and democratizes the race, which is not necessarily better).
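
To put rough numbers on that erosion, here is a sketch that takes the roughly-8-month doubling figure at face value (it is a commonly quoted estimate, not a measured constant, and the 10^26 FLOP cap is just an example threshold):

```python
# Sketch of the erosion argument: if algorithmic efficiency doubles every ~8 months
# (a commonly quoted estimate, not a measured constant), a fixed compute cap
# corresponds to ever more effective capability over time.

DOUBLING_MONTHS = 8        # assumed doubling time for algorithmic compute-efficiency
CAP_FLOP = 1e26            # example fixed training-run threshold

for years in (1, 2, 4, 8):
    multiplier = 2 ** (12 * years / DOUBLING_MONTHS)
    print(f"after {years} yr, the cap behaves like ~{multiplier:,.0f}x today's effective compute"
          f" (~{CAP_FLOP * multiplier:.1e} 'today-equivalent' FLOP)")
# after 1 yr: ~3x; 2 yr: 8x; 4 yr: 64x; 8 yr: 4,096x
```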

We can make bans progressively tighter, so we can buy some time, but as the algorithms progress further, it is not unlikely that we might at some point face the choice of banning computers altogether or facing a foom. So eventually we are likely going to face huge risks anyway.

I do think it's time to focus not on "aligning" or "controlling" self-modifying ecosystems of self-modifying super-intelligences, but on figuring out how to increase the chances that a possible foom goes well for us instead of killing us. I believe that thinking only in terms of "aligning" or "controlling" limits the space of possible approaches to AI existential safety, and that approaches not based on notions of "alignment" and "control" might be more fruitful.

And, yes, I think our chances are better if the most thoughtful of practitioners achieve that first. For example, Ilya Sutskever's thinking on the subject has been very promising (which is why I tend to favor OpenAI if he continues to lead the AI safety effort there, but I would be much more skeptical of them otherwise).

It doesn't matter how promising anyone's thinking has been on the subject. This isn't a game. If we are in a position such that continuing to accelerate toward the cliff and hoping it works out is truly our best bet, then I strongly expect that we are dead people walking. Nearly 100% of the utility is in not doing the outrageously stupid dangerous thing. I don't want a singularity and I absolutely do not buy the fatalistic ideologies that say it is inevitable, while actively shoveling coal into Moloch's furnace.

I physically get out into the world to hand out flyers and tell everyone I can that the experts say the world might end soon because of the obscene recklessness of a handful of companies. I am absolutely not the best person to do so, but no one else in my entire city will, and I really, seriously, actually don't want everyone to die soon. If we are not crying out and demanding that the frontier labs be forced to stop what they are doing, then we are passively committing suicide. Anyone who has a P(doom) above 1% and debates the minutiae of policy but hasn't so much as emailed a politician is not serious about wanting the world to continue to exist.

I am confident that this comment represents what the billions of normal, average people of the world would actually think and want if they heard, understood, and absorbed the basic facts of our current situation with regard to AI and doom. I'm with the reasonable majority who say when polled that they don't want AGI. How dare we risk murdering every last one of them by throwing dice at the craps table to fulfill some childish sci-fi fantasy.

mishka

Like I said, if we try to apply forceful measures we might delay it for some time (at the price of people aging and dying from old age and illnesses to the tune of dozens of millions per year due to the continuing delay; but we might think that the risks are so high that the price is worth it, and we might think that the price of everyone who is alive today eventually dying of old age is worth it, although some of us might disagree with that and might say that taking the risk of foom is better than the guaranteed eventual death from old age or other illness; there is room for disagreement on which of these risks it is preferable to choose).

But if we are talking about avoiding foom indefinitely, we should start with asking ourselves, how easy or how difficult is it to achieve. How long before a small group of people equipped with home computers can create foom?

And the results of this analysis are not pretty. Computers are the ultimate self-modifying devices; they can produce code which programs them. Methods to produce computer code much better than we do it today are not all that difficult; they are just in the collective cognitive blindspot, like backpropagation was for a long time, like ReLU activations were for decades, like residual connectivity in neural machines was for an unreasonably long time. But this state of those enhanced methods of producing new computer code being relatively unknown would not last forever.

And just like backpropagation, ReLU, or residual connections, these methods are not all that difficult; it's not as if they would remain unknown just because a single "genius" who discovered them refrained from sharing them. People keep rediscovering and rediscovering those methods; they are not that tricky (backpropagation was independently discovered more than 10 times between 1970 and 1986, before people stopped ignoring it and started to use it).

It's just the case that the memetic fitness of those methods is currently low, just like the memetic fitness of backpropagation, ReLU, and residual connections used to be low in the strangest possible ways. But this would not last; the understanding of how to have competent self-improvement in small-scale software on ordinary laptops will gradually form and proliferate. At the end of the day, we'll end up having to either phase out universal computers (at least those which are as powerful as our home computers today) or find ways to control them very tightly, so that people are no longer free to program their own computers as they see fit.

Perhaps humans will choose to do that, I don't know. Nor do I know whether they would succeed in a Butlerian jihad of this kind, or whether some of the side effects of trying to get rid of computers would become X-risks themselves. In any case, in order to avoid foom indefinitely, people will have to take drastic, radical measures which would make everyone miserable, would kill a lot of people, and might create other existential risks.

I think it's safer if the most competent leading group tries to go ahead, that our eventual chances of survival are higher along this path, compared to the likely alternatives.

I do think that the risks on the alternative paths are very high too; the great powers are continuing to inch towards major nuclear confrontation; we are enabling more and more people to create diverse super-covid-like artificial pandemics with 30% mortality or more; things are pretty bad in terms of major risks this civilization is facing; instead of asking "what's your P(doom)" we should be asking people, "what's your P(doom) conditional on foom and what's your P(doom) conditional on no foom happening". My P(doom) is not small, but is it higher conditional on foom, than conditional on no foom? I don't know...