One thing I imagine might happen, conditional on an existential catastrophe not occurring, is a Manhattan project for aligned AGI. I don’t want to argue that this is particularly likely or desirable. The point of this post is to sketch the scenario and briefly discuss some implications for what is needed from current research.
Imagine the following scenario: Top AI scientists come to take the existential risk of AGI seriously only late, and until then there has been no significant change in the effort put into AI safety relative to our current trajectory. At some point, AI scientists and relevant decision-makers recognize that AGI will be developed soon by one AI lab or another (within a few months or years), and that without explicit effort there is a large probability of catastrophic results. A project is started to develop AGI:
- It has an XX B$ or XXX B$ budget.
- Dozens of the top AI scientists are part of the project, and many more assistants. People you might recognize or know from top papers and AI labs join the project.
- A fairly constrained set of concepts, theories and tools is available that gives a broad roadmap for building aligned AGI.
- There is a consensus understanding among management and the research team that without this project, AGI will plausibly be developed relatively soon, and that without explicitly understanding how to build the system safely it will pose an existential risk.
It seems to me that it is useful to backchain from this scenario to see what is needed, assuming that this kind of alignment Manhattan project is indeed what should happen.
Firstly, my view is that if this Manhattan project were to start in intellectual conditions similar to today’s, there wouldn't be very many top AI scientists significantly motivated to work on the problem, and it would not be taken seriously. Even very large sums of money would not suffice, since there wouldn't be enough of a common understanding about what the problem is for it to work.
Secondly, it seems to me that there isn't enough of a roadmap for building aligned AGI for such a project to succeed in a short time-frame of months to years. I expect some people to disagree with this, but looking at current rates of progress in our understanding of AI safety, and my model of the practical parallelizability of conceptual progress, I am skeptical that the problem can be solved in a few years even by a group of 40 highly motivated and financed top AI scientists. It is plausible that this will look different closer to the finish line, but I am skeptical.
On this model, I have in mind basically two kinds of work that contribute to good outcomes. This is not a significant change relative to my prior view, but in my mind it constrains the motivation behind such work to some degree:
- Research that makes the case for AGI x-risk clearer, and constrains how we believe the problem occurs, in order to make it eventually easier to convince top AI scientists that working in such an alignment Manhattan project is reasonable, and to make sure there is a team that's on the same page as to what the problem is.
- Research that constrains the roadmap for building aligned AGI. I'm thinking mostly of conceptual/theoretical/empirical work that helps us converge to an approach that can then be developed/refined and scaled by a large effort over a short time period.
I suspect this mostly shouldn't change my general picture of what needs to be done, but it does shift my emphasis somewhat.
As I understand it, the theory of the atomic bomb was considerably more advanced at the beginning of the Manhattan project than our current theory of aligned AGI is.
To simplify somewhat, there were two unknown parameters: the critical mass of uranium-235, and the rate of uranium isotope separation. Given these two parameters, you could calculate how long it would take by simple division. Remember that Little Boy was not tested at all: the theory was that solid. Success was basically guaranteed if you had enough time, although success in 100 years would rightfully have been considered failure.
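The "simple division" can be written out as a minimal sketch. The function name and the numbers below are illustrative placeholders chosen for round results, not historical figures; the only point is that the timeline scales linearly with the critical mass, so any uncertainty in that estimate carries straight through to the schedule.

```python
def years_to_first_bomb(critical_mass_kg: float,
                        separation_rate_kg_per_year: float) -> float:
    """Time needed to accumulate one critical mass at a constant separation rate."""
    if critical_mass_kg <= 0 or separation_rate_kg_per_year <= 0:
        raise ValueError("both parameters must be positive")
    return critical_mass_kg / separation_rate_kg_per_year


# Placeholder values for illustration only (NOT historical data).
ASSUMED_RATE_KG_PER_YEAR = 20.0

for assumed_mass_kg in (20.0, 200.0):
    years = years_to_first_bomb(assumed_mass_kg, ASSUMED_RATE_KG_PER_YEAR)
    print(f"critical mass {assumed_mass_kg:.0f} kg -> {years:.0f} year(s)")
```

With a fixed separation rate, a tenfold uncertainty in the critical mass becomes a tenfold uncertainty in the timeline.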
What about the nuclear reactor, plutonium, and the implosion device? Those were gambles to speed things up, because they thought the basic route would take too long. (They were right: the war in Europe ended first.) But the Manhattan project would have succeeded without them, in the sense of producing fission weapons.
Another thing they tried to speed things up was better isotope separation. Electromagnetic separation was well understood and basically worked as designed. They gambled on developing gaseous diffusion, and it ended up more efficient, but development took too long so it didn't shorten the timeline at all.
In retrospect, they should have gambled on centrifuges, which are the currently preferred method. What was missing was a clever innovation, not an advanced material or anything of that nature. The Manhattan project could have been finished a lot faster if only they had known about the Zippe centrifuge.
In fact there is an alternate history novel based on this, The Berlin Project by Gregory Benford (recommended). The author's estimate, which seemed reasonable to me, is that centrifuges would have shortened the timeline by one year, finishing in 1944. As a result, as the title suggests, the atomic bomb is dropped on Berlin.
So, let me answer the question. I will define success as producing fission weapons before the end of the war in Europe. (This is a reasonable interpretation of statements by scientists who worked on the Manhattan project.) By that definition, the real-world Manhattan project failed.
No one could predict anything before the experiments needed to determine the critical mass were done. Rough estimates varied by an order of magnitude, implying one to ten years. Once the critical mass was determined, electromagnetic separation implied three years (1942 to 1945), which was felt to give about a 50% chance of success based on guesses about how the war would progress. They tried hard to shorten the timeline, but they failed. Choosing centrifuges would have led to success in 1944, but there was no reasonable way to know that, and an unlucky choice was made.