tl;dr: If you think humanity is on a dangerous path, and needs to "pivot" toward a different future in order to achieve safety, consider how such a pivot could be achieved by multiple acts across multiple persons and institutions, rather than a single act. Engaging more actors in the process is more costly in terms of coordination, but may in the end be a more practicable social process, involving less extreme risk-taking, than a single "pivotal act".
Preceded by: “Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments
[This post is also available on the EA Forum.]
In the preceding post, I argued for the negative consequences of the intention to carry out a pivotal act, i.e., a single, large world-changing act sufficient to 'pivot' humanity off of a dangerous path onto a safer one. In short, there are negative side effects of being the sort of institution aiming or willing to carry out a pivotal act, and those negative side effects alone might outweigh the benefit of the act, or prevent the act from even happening.
In this post, I argue that it's still a good idea for humanity-as-a-whole to make a large / pivotal change in its developmental trajectory in order to become safer. In other words, my main concern is not with the "pivot", but with trying to get the whole "pivot" from a single "act", i.e., from a single agent-like entity, such as a single human person, institution, or AI system.
Pivotal outcomes and processes
To contrast with pivotal acts, here's a simplified example of a pivotal outcome that one could imagine making a big positive difference to humanity's future, which in principle could be brought about by a multiplicity of actors:
- (the "AI immune system") The whole internet — including space satellites and the internet-of-things — becomes way more secure, and includes a distributed network of non-nuclear electromagnetic pulse emitters that will physically shut down any tech infrastructure appearing to be running rogue AI agents.
(For now, let's set aside debate about whether this outcome on its own would be pivotal, in the sense of pivoting humanity onto a safe developmental trajectory... it needs a lot more details and improvements to be adequate for that! My goal in this post is to focus on how the outcome comes about, so for the sake of argument I'm asking you to take the "pivotality" of the outcome for granted.)
If a single institution imposed the construction of such an AI immune system on its own, that would constitute a pivotal act. But if a distributed network of several states and companies separately instituted different parts of the change — say, designing and building the EMP emitters, installing them in various jurisdictions, etc. — then I'd call that a pivotal distributed process, or pivotal process for short.
In summary, a pivotal outcome can be achieved through a pivotal (distributed) process without a single pivotal act being carried out by any one institution. Of course, the "can" there is very difficult, and involves solving a ton of coordination problems that I'm not saying humanity will succeed in solving. However, aiming for a pivotal outcome via a pivotal distributed process definitely seems safer to me, in terms of the dynamics it would create between labs and militaries, compared to a single lab planning to do it all on its own.
Revisiting the consequences of pivotal act intentions
In AGI Ruin, Eliezer writes the following, I believe correctly:
- The reason why nobody in this community has successfully named a 'pivotal weak act' where you do something weak enough with an AGI to be passively safe, but powerful enough to prevent any other AGI from destroying the world a year later - and yet also we can't just go do that right now and need to wait on AI - is that nothing like that exists. There's no reason why it should exist. There is not some elaborate clever reason why it exists but nobody can see it. It takes a lot of power to do something to the current world that prevents any other AGI from coming into existence; nothing which can do that is passively safe in virtue of its weakness.
I think the above realization is important. The un-safety of trying to get a single locus of action to bring about a pivotal outcome all on its own pretty much covers my rationale for why we (humanity) shouldn't advocate for unilateral actors doing that sort of thing.
Less convincingly-to-me, Eliezer then goes on to (seemingly) advocate for using AI to carry out a pivotal act, which he acknowledges would be quite a forceful intervention on the world:
- If you can't solve the problem right now (which you can't, because you're opposed to other actors who don't want [it] to be solved and those actors are on roughly the same level as you) then you are resorting to some cognitive system that can do things you could not figure out how to do yourself, that you were not close to figuring out because you are not close to being able to, for example, burn all GPUs. Burning all GPUs would actually stop Facebook AI Research from destroying the world six months later; weaksauce Overton-abiding stuff about 'improving public epistemology by setting GPT-4 loose on Twitter to provide scientifically literate arguments about everything' will be cool but will not actually prevent Facebook AI Research from destroying the world six months later, or some eager open-source collaborative from destroying the world a year later if you manage to stop FAIR specifically. There are no pivotal weak acts.
I'm not entirely sure if the above is meant to advocate for AGI development teams planning to use their future AGI to burn other people's GPUs, but it could certainly be read that way, and my counterargument to that reading has already been written, in “Pivotal Act” Intentions: Negative Consequences and Fallacious Arguments. Basically, a lab X with the intention to burn all the world's GPUs will create a lot of fear that lab X is going to do something drastic that ends up destroying the world by mistake, which in turn drives up the fear and desperation of other AI labs to "get there first" and pull off their own version of a pivotal act. Plus, it requires populating the AGI lab with people willing to do some pretty drastically invasive things to other companies, in particular violating private property laws and state boundaries. From the perspective of a tech CEO, it's quite unnerving to employ and empower AGI developers who are willing to do that sort of thing. You'd have to wonder if they're going to slip out with a thumb drive to try deploying an AGI against you, because they have their own notion of the greater good that they're willing to violate your boundaries to achieve.
So, thankfully-according-to-me, no currently-successful AGI labs are oriented on carrying out pivotal acts, at least not all on their own.
Back to pivotal outcomes
Again, my critique of pivotal acts is not meant to imply that humanity has to give up on pivotal outcomes. Granted, it's usually harder to get an outcome through a distributed process spanning many actors, but in the case of a pivotal outcome for humanity, I argue that:
- it's safer to aim for a pivotal outcome to be carried out by a distributed process spanning multiple institutions and states, because the process can happen in a piecemeal fashion that doesn't change the whole world at once, and
- it's easier as well, because
- you won't be constantly setting off alarm bells of the form "Those people are going to try to unilaterally change the whole world in a drastic way", and
- you won't be trying to populate a lab with AGI developers who, in John Wentworth's terms, think like "villains" (source).
I'm not arguing that we (humanity) are going to succeed in achieving a pivotal outcome through a distributed process; only that it's a safer and more practical endeavor than aiming for a single pivotal act from a single institution.
Let's say someone makes a comment on LW that shifts the discourse in a way that eventually ramifies into a successful navigation of the alignment problem.
Was there a pivotal outcome?
If so, was there a corresponding pivotal act? What was it?