This is a special post for quick takes by Holly_Elmore. Only they can create top-level comments. Comments here also appear on the Quick Takes page and All Posts page.

For years, I’ve been worried that we were doomed to die by misaligned AGI because alignment wasn’t happening fast enough or maybe because it wouldn’t work at all. Since I didn’t have the chops to do technical alignment and I didn’t think there was another option, I was living my life for the worlds where disaster didn’t happen (or hadn’t happened yet) and trying to make them better places. The advent of AI Pause as an option— something the public and government might actually hear and act on— has been extremely hopeful for me. I’ve quit my job in animal welfare to devote myself to it.

So I’m confused by the reticence I’m seeing toward Pause from people who, this time last year, were reconciling themselves to “dying with dignity”. Some people think the Pause would somehow accelerate capabilities or make gains on capabilities, which at least makes sense to me as a reason not to support it. But I’ve gotten responses that make no sense to me like “every day we wait to make AGI, more galaxies are out of our reach forever”. More than one person has said to me that they are worried that “AGI will never get built” if a pause is successful. (For the record I think it is very unlikely that humanity will not eventually make AGI at this point unless another catastrophe sets back civilization.) This is sometimes coming from the same people who were mourning our species’ extinction as just a matter of time before the Pause option arose. I keep hearing comparisons to nuclear power and ridicule of people who were overcautious about new technology.

What gives? If you’re not excited about Pause, can you tell me why?

1a3orn

Two different reasons you might have more pushback, from two different groups of people:

On one hand -- one set of people might have read List of Lethalities / Dying with Dignity a year or two ago and thought "Dang, this seems really badly wrong." But there wasn't that much reason to argue against it, then, because it wasn't like it moved public opinion -- people on the internet being wrong are gonna keep being wrong. Now that public opinion might be moved to regulation, people who thought they had better things to do than correct someone being wrong on the internet are more actively speaking up against such measures. So that's some opposition.

On the other hand -- another set of people might have agreed with List of Lethalities, etc, and did want AI regulation / something like the pause, but were operating in far mode for such regulation -- regulation was a thing that might not happen, which you can just imagine happening in an ideal and perfect way, not considering how regulations actually tend to be implemented by governments. Now that regulation might actually happen, you have to operate in near mode and consider the actual contingencies of such acts, where it becomes obvious that the government can fuck up something as simple as a tiktok ban, let alone more complicated things like covid response. I think when operating in near mode AI regulation becomes much less attractive. So that would likely produce some more opposition even from people who previously were in favor.

I don’t recall much conversation about regulation after Dying with Dignity. At the time, I was uncritically accepting the claim that, since this issue was outside of the Overton window, that just wasn’t an option. I do remember a lot of baleful talk about how we’re going to die.

I just don’t understand how anyone who believed in Dying with Dignity would consider regulation too imperfect a solution. Why would you not try? What are you trying to preserve if you think we're on track to die with no solution to alignment in sight? Even if you don’t think regulation will accomplish the goal of saving civilization, isn’t shooting your shot anyway what “dying with dignity” means?

I wouldn't try because most regulations that I can think of -- at least in the form our government is likely to pass them -- have downsides which I consider worse than their benefits.

I also think that x-risk from AI misalignment is more like a 5% chance than a 95% chance. If heavy AI regulations increase other AI-related x-risks -- say, permanent totalitarianism -- while negligibly impacting misalignment risk, the EV can easily come out quite negative.
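The expected-value point above can be made concrete with a rough sketch. Every number below other than the 5% and 95% misalignment priors is a hypothetical placeholder chosen for illustration, not an estimate from this thread, and the two risks are assumed independent for simplicity.

```python
def total_risk(p_misalignment: float, p_other: float) -> float:
    """Probability that at least one of two independent catastrophes occurs."""
    return 1 - (1 - p_misalignment) * (1 - p_other)


def risk_change(p_misalignment: float) -> float:
    """Change in total risk from a hypothetical regulation.

    Assumptions (all hypothetical): regulation cuts misalignment risk by
    10% in relative terms and raises another AI-related risk (e.g. lock-in)
    from 2% to 5%.
    """
    before = total_risk(p_misalignment, 0.02)
    after = total_risk(p_misalignment * 0.9, 0.05)
    return after - before


low_prior = risk_change(0.05)   # misalignment prior of 5%
high_prior = risk_change(0.95)  # misalignment prior of 95%

print(f"change in total risk at  5% prior: {low_prior:+.4f}")
print(f"change in total risk at 95% prior: {high_prior:+.4f}")
```

Under these made-up inputs the same regulation raises total risk for someone with a 5% prior and lowers it for someone with a 95% prior, which is one way to see why the misalignment estimate works as a crux here.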

I think the model by which permanent totalitarianism comes about is actually cleaner than the x-risk RSI model -- and requires less-drastically-smart-superintelligence -- so I think it is worth serious consideration.

But I don't know what particular concrete regulations you have in mind, though. Through what actual means do you want to implement an AI pause, concretely? What kind of downsides do you anticipate from such measures, and how would you mitigate these downsides?

[anonymous]

Hi Holly.

AI pauses are a controversial topic.

If you model out your assumptions, depending on how you resolve certain key questions, you can reach answers of "should pause indefinitely" and "should never pause for any length of time".  

Here's a few cruxes that can lead to "never pause":

(1) is it feasible, using currently known methods in software and computer engineering, to build restricted superintelligences that are safe?  Empirical evidence: hyperscaler software systems.

(2) Is it feasible to coordinate with other nuclear powers to restrict GPU access/production and cause pauses of any meaningful length?  Empirical evidence : cold war

(3)  Do you think that if a superintelligence was available as a tool in the near future, you could solve some of biology and rapidly develop treatments to extend lifespans by meaningful numbers of years?  (remember, we have such treatments for rats, the problem has always been efficiently finding a treatment we know will work in humans)  Empirical evidence: rats on sirolimus and other drugs

(4) if you think ASI is controllable, do you think it is possible to remain sovereign/not immediately invaded and subject to the whim of another power if you don't race for ASI?  empirical evidence: european colonization era, many wars between more technologically advanced powers and weaker ones

(5) The market forces in favor of no pause seem to be large, and they may grow further until most of the investment capital on earth is going into AI.  How much of a voice will that money have, is it meaningful to even request a pause vs trying to race to build high quality, safer ASI first? (In USA politics, money has a 'voice' that amplifies those with it over the base pool of voters, who have expressed as a majority they do not want AGI.   I would assume similar is true for all other nuclear powers) empirical evidence : stock activity for Nvidia and other AI stocks, past examples of lobbyists manipulating public opinion, most of the history of US politics

(6) do you believe a pause will have any utility other than possibly adding more years to live to your personal existence?  Historically, I do not know of any meaningful examples where a complex technology was improved before building working examples of it to study.  Humans aren't that smart.  empirical evidence: history of all inventions, all engineering  

Conversely, while I am biased towards 'never pause', let's name some cruxes that would lead me to shift to the other pole:

(1) do ASIs emergently develop deception and ability to coordinate with other instances of each other?  no empirical evidence

(2) Will ASIs run as coherent entities with online weight updates and other online self improvement, retaining global context over all tasks the machine has been requested to do?  early chatbots such as Tay AI immediately failed from online weight updates

(3) If we build multiple ASIs, is the result of each attempt like a horror movie, where the machine immediately tries mass murder the moment it has any ability to manipulate the environment?  no empirical evidence

(4)  Does nanotechnology turn out to be 'quite easy' and we interrupt ASIs who have successfully solved it?  strong empirical evidence against

(5) Does manipulating humans turn out to be 'quite easy' and we interrupt ASIs who have their own cults?  some empirical evidence for, question is how generalizable are cult creation strategies

(6) Does an ASI find an easy way to compress itself/optimize itself, such that weak computers can support ASI, and there are multiple 'rampant ones' infesting computers? some empirical evidence, intelligence seems to require immense amounts of memory, but early efforts have found huge optimizations

 

Perhaps I should write up a front page post on this.  One thing that I will note is that even if the evidence is for the policy of 'no pause', some may estimate that the benefit of a pause (avoiding say a 10% chance of wiping out all value in the universe) exceeds the cost (delaying the creation of advanced technology that may ultimately lead to pausing human aging and saving billions of lives)

I see the cost of simply delaying life extension technology as totally acceptable to avoid catastrophe. It’s more difficult to contemplate losing it entirely, but I don’t think a pause would cause that. It might put it out of reach for people alive today, but what are you going to do about that? Gamble the whole future of humanity?

[anonymous]

For many people the answer to that hypothetical is yes. Putting life extension "out of reach" is no small ask. Think about how many billion people you are expecting to die for the possibility of existence of people not even alive right now.

One way to model this crux is discount rate. How much do you value futures that you and your currently living descendants would not get to see? For some the answer is zero; this is simply Making Beliefs Pay Rent. If there is no possibility of you personally seeing the outcome you should be indifferent to it.

I suspect the overwhelming majority of living humans make decisions on short, high discount rate timescales and there is a lot of evidence to support this.

Applying even a modest discount rate of say a few percent per year also makes the value of humanity in 30-100 years very small. So again, rational policy should be to seek life extension and never pause, unless you believe in very low discount rates on future positive events.
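As a rough check on the discounting arithmetic, here is a minimal sketch using standard exponential discounting. The specific rates (1%, 3%, 5%) are assumptions standing in for "a few percent per year", and the function name is made up for illustration.

```python
def discount_factor(rate: float, years: int) -> float:
    """Weight given today to one unit of value received `years` from now,
    under exponential discounting at an annual `rate`."""
    return 1.0 / (1.0 + rate) ** years


# Assumed rates and the 30- and 100-year horizons mentioned above.
for rate in (0.01, 0.03, 0.05):
    for years in (30, 100):
        factor = discount_factor(rate, years)
        print(f"rate={rate:.0%}  years={years:>3}  weight={factor:.3f}")
```

At an assumed 3% rate, value 30 years out keeps roughly 41% of its weight and value 100 years out roughly 5%, so how small the future looks depends heavily on both the rate and the horizon chosen.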

If you wanted to convince people to support AI pauses you would need to convince them of short, immediate term harm - egregiously misaligned AI, short term job loss - and either conceal from them that ASI might quickly find effective methods of life extension or spread misinformation that it cannot.

Lying is a well established strategy for winning political arguments but if you must resort to lies in a world that will have the technology to unmask them, perhaps you should reexamine your priors.

For many people the answer to that hypothetical is yes.

For a handful of people, a large chunk of them on this website, the answer is yes. Most people don't think life extension is possible for them and it isn't their first concern about AGI. I would bet the majority of people would not want to gamble with the possibility of everyone dying of an AGI because it might under a subset of scenarios extend their lives. 

[anonymous]

I think the most coherent argument above is discount rate. Using the discount rate model you and I are both wrong. Since AGI is an unpredictable number of years away, as well as life extension, neither of us has any meaningful support for our positions among the voting public. You need to show the immediacy of your concerns about AGI, I need to show life extension driven by AGI beginning to visibly work.

AI pauses will not happen due to this discounting. (So it's irrelevant whether or not they are good or bad). That's because the threat is far away and uncertain, while the possible money to be made is near and essentially all the investment money on earth "wants" to bet on AI/AGI. (As rationally speaking there is no greater expected roi)

Please note I am sympathetic to your position, I am saying "will not happen" as a strong prediction based on the evidence, not what I want to happen.

Also the short term harms of AI aren't lies?

I think you're right and the legitimate dilemma is underrecognized. I think a front page post on this would be very useful.

I feel a bit reticent about pause advocacy, altho I have to admit I'm not familiar with the details (and I'm not feeling so negative about it that I want to spend a bunch of time trashing it). My attempt to flesh out why:

  • I'm pretty influenced by the type of libertarian political philosophy that says that hastily-assembled policy proposals can have big negative unintended side effects, especially when such policy proposals involve giving discretionary control over something to a government.
  • I'm pessimistic about our odds of surviving really powerful AI, but not so pessimistic that I think p(doom) couldn't be 10 percentage points higher.
  • Pause advocacy seems to seek compromise with normal people in order to get their policy proposals passed - an obviously good strategy on some level, but I kind of hate policy proposals that normal people like! This is doubly true for polities where it's easiest to start passing serious tech regulation (California, the EU).
  • Relatedly, I have the impression that pause policy advocacy tends to look like taking popular policies and promoting those that slow down AI the most, rather than doing something like mandatory AI liability insurance which seems like it's close to optimal, then adjusting it to be popular with lots of people.
  • I worry that "give more discretionary control over AI to such-and-such political body" just produces worse decisions.

Anyway that's why I have some sort of instinctive negative reaction, but again I'm not very familiar with the details and I'm sure different people are doing different things etc.

Some people think the Pause would somehow accelerate capabilities or make gains on capabilities, which at least makes sense to me as a reason not to support it.

I'm not sure what this argument refers to, but I haven't heard it before. However, it sounds related to the argument that pausing AI creates a compute overhang, which could lead to rapid progress if the pause is lifted. I think this argument makes sense, but I don't think it's an overwhelming consideration. I just think it's a major consideration that should be weighed against the benefits of a pause, and I'm not convinced that those benefits would be large.

I think the "we might never build AGI" arguments seem like nonsense. My main concern with the Pause AI movement is that I think this kind of social advocacy is complex and messy, and have the intuition that it's easy to do more harm than good and discredit the movement/seem like fringe extremists. Plus not knowing of people who I consider to be competent and experienced in the AI Policy space who are involved with/advising the movement.

those sound like secondhand positions to me. not like those people were the originators of the reasoning. I think a pause is likely to guarantee we die though. we need to actually resist all bad directions, which a pause just lets some people ignore. pauses could never be enforced well enough to save us without an already save us grade ai.

If AGI is sufficiently nontrivial, delay of a few years might be feasible and give time to decrease doom. If AGI requires enormous datacenters, outlawing production of GPUs and treating existing ones like illegal firearms might lead to indefinite delay (though it probably takes at least an observed disaster to enter Overton window).

I see your argument moving the conclusion (pause becoming worse than no-pause) when expecting imminent AGI that only really needs modest compute and doesn't guarantee doom. Then the only pause that works is AI-enforced one, and nobody working on that can outweigh burning some alignment progress timeline.

How does a pause let us ignore bad directions?

a pause lets some people ignore the pause and move in bad directions. we need to be able as a civilization to prevent the tugs on society to get sucked into AIs. the AIs of today will take longer to kill us all, but they'll still degrade our soul-data, compare YouTube recommender. authoritarian cultures that want to destroy humanity's soul might agree not to make bigger ai, but today's big ai is plenty. it's not like not pausing is massively better; nothing but drastically speeding up safety and liberty-generating alignment could save us

Irrevocable soul-data degradation with modern AI within decades doesn't seem likely though, if AI doesn't develop further. AI that develops further in undesirable ways seems a much larger and faster threat to liberty/self-determination, even if it doesn't kill everyone. And if not making bigger AI for a while is feasible, it gives time to figure out how to make better AI. If that time goes unused, that's no improvement, but all else equal option to improve is better than its absence.

every death and every forgotten memory is soul degradation. we won't be able to just reconstruct everyone exactly.

That's the tradeoffs from hell we have to confront, there are arguments on either side of the decision. I was responding to your "AIs of today will take longer to kill us all, but they'll still degrade our soul-data", a claim about AIs of today (as opposed to AIs of tomorrow), not about human mortality. If AIs of tomorrow eat humanity's soul outright, its degradation from mortality and forgetting is the lesser evil that persists while we get better at doing something about AIs of tomorrow. (There is also growth while humanity lives, hope to find a way forward, not only ongoing damage.)

Pause opposes development of AIs that prevent rogue AGIs. If rogue AGIs despite pause are likely, and it's feasible to develop AIs sufficiently aligned to prevent rogue AGIs, then pause increases doom. The premises of the argument are contentious, but within its scope the argument seems valid.

JNS

How does pausing change much of anything?

Let's say we manage to implement a worldwide ban/pause on large training runs, what happens next?

Well obviously smaller training runs, up to whatever limit has been imposed, or no training runs for some time.[1]

The next obvious thing that happens, and btw is already happening in the open source community, would be optimizing algorithms. You have a limit on compute? Well then you OBVIOUSLY will try and make the most of the compute you have.

None of that fixes anything.

What we should do:[2]

Pour tons of money into research, first order of business is to make the formal case that x-risk is a thing and must be actively mitigated. Or said another way, we need humans aligned on "alignment does not happen by default" [3]

Next order of business, assuming alignment does not happen by default, is to formally produce and verify plans for how to build safe / aligned cognitive architectures.

And all the while there cannot be any training runs, and no work on algorithmic optimization or cognitive architectures in general.[4]

The problem is we can't do that, it's too late, the cat is out of the bag, there is too much money to be made in the short term, open source is plowing ahead, and the number of people who actually looked at the entire edifice for long enough to realize "yeah you know what I think we have a problem, we really must look into if that is real or not, and if it is we need to figure out what it takes to do this risk free" is minuscule compared to the number of people who go "bah, it will be fine, don't be such a drama queen".

And that's why I think a pause at best extends timelines ever so slightly, and at worst shortens them considerably, and either way the outcome remains unchanged.

  1. ^

    Except people will do runs no matter what, the draconian measures needed will not happen, cannot happen.

  2. ^

    Actually it's what we should have done.

  3. ^

    Unless of course it does, and a formal proof of this can be produced.

  4. ^

    Contingent on how hard the problem is - if we need 100 years to solve the problem, we would destroy the world many times over if we plowed ahead with capabilities research.

A weakness I often observe in my numerous rationalist friends is "rationalizing and making excuses to feel like doing the intellectually cool thing is the useful or moral thing". Fwiw. If you want to do the cool thing, own it, own the consequences, and own the way that changes how you can honestly see yourself.