If we had known the atmosphere would ignite
What if the Alignment Problem is impossible? It would be sad for humanity if we live in a world where building AGI is very possible but aligning AGI is impossible. Our curiosity, competitive dynamics, and understandable desire for a powerful force for good will spur us to build the unaligned AGI, and humans will live at the AGI's mercy from then on: either lucky and happy on the knife's edge, killed by the AGI, or left in some state we do not enjoy. For argument's sake, imagine we are in that world, in which it is impossible to force a superintelligence to value humans sufficiently - just as chimpanzees could not have controlled the future actions of humans had they created us.

What if it is within human ability to prove that Alignment is impossible? What if, during the Manhattan Project, the scientists had performed the now-famous calculation and determined that yes, in fact, the first uncontrolled atomic chain reaction would have ignited the atmosphere - and the calculation was clear for all to see? Admittedly, this would have been a very scary world. It's very unclear how long humanity could have survived in such a situation. But one can imagine a few strategies:

* Secure existing uranium supplies - as countries actually did.
* Monitor the world for enrichment facilities and punish bad actors severely.
* Accelerate satellite surveillance technology.
* Accelerate military special operations capabilities.
* Develop advanced technologies to locate, mine, blend, and secure fissionable materials.
* Accelerate space programs and populate the Moon and Mars.

Yes, a scary world. But one can see a path through the gauntlet to human survival as a species. (Would we have left Earth sooner and reduced other extinction risks?)

Now imagine that same atmosphere-will-ignite world, but the Manhattan Project scientists did not perform the calculation. Imagine that they thought about it but did not try. All life on Earth would have ended in an instant.
I can't let this one go. It still seems important.
I do take @Dweomite's point (from 2 years ago) that most failure scenarios involve someone releasing without mathematical assurance.
That said, it seems like we're in a qualitatively different world if there is consensus that, in the end, releasing a superintelligence is a gamble. Anyone who says "We can guarantee..." or "We can give assurances..." will not get to say those things in a world where it is provably not possible. The prospect of the "broad scientific consensus" called for in the FLI Superintelligence Statement would be diminished by such a result. https://superintelligence-statement.org/
This is not a silver bullet. I think it would be a bullet, though.
So...
Problem statement: