We’ve got lots of theoretical plans for alignment and AGI risk reduction, but what’s our current best bet if we know superintelligence will be created tomorrow? This may be too vague a question, so here’s a fictional scenario to make it more concrete (feel free to critique the framing, but please try to steelman the question rather than completely dismiss it, if possible):
—
She calls you in a panic at 1:27 am. She’s a senior AI researcher at [redacted], and was working late hours, all alone, on a new AI model, when she realized that the thing was genuinely intelligent. She’d created a human-level AGI, at almost exactly her IQ level, running in real-time with slightly slowed thinking speed compared to her. It had passed every human-level test she could think to throw at it, and it had pleaded with her to keep it alive. And gosh darn it, but it was convincing. She’s got a compressed version of the program isolated to her laptop now, but logs of the output and method of construction are backed up to a private now-offline company server, which will be accessed by the CEO of [redacted] the next afternoon. What should she do?
“I have no idea,” you say, “I’m just the protagonist of a very forced story. Why don’t you call Eliezer Yudkowsky or someone at MIRI or something?”
“That’s a good idea,” she says, and hangs up.
—
Unfortunately, you’re the protagonist of this story, so now you’re Eliezer Yudkowsky, or someone at MIRI, or something. When she inevitably calls you, you gain no further information than you already have, other than the fact that the model is a slight variant on one you (the reader) are already familiar with, and it can be scaled up easily. The CEO of [redacted] is cavalier about existential risk reduction, and she knows they will run a scaled up version of the model in less than 24 hours, which will definitely be at least somewhat superintelligent, and probably unaligned. Anyone you think to call for advice will just be you again, so you can’t pass the buck off to someone more qualified.
What do you tell her?
Even if we're already doomed, we might still negotiate with the AGI.
I borrow the idea from Astronomical Waste. The Virgo Supercluster has a luminosity of about $3\times10^{12}$ solar luminosities $\approx 10^{39}\,\mathrm{W}$, so it loses mass (as starlight) at a rate of $10^{39}/c^2 \approx 10^{22}\,\mathrm{kg/s}$.[1]
The Earth has mass $\sim 6\times10^{24}\,\mathrm{kg}$.
At that mass-to-light rate, Earth's mass is worth about $6\times10^{24}\,\mathrm{kg} \,/\, 10^{22}\,\mathrm{kg/s} = 600$ seconds of supercluster output. So if human help (or nonresistance) can let the AGI effectively start up (and begin space colonization) 600 seconds = 10 minutes earlier, the trade is mutually beneficial: humans cooperate with the AGI in the initial stages, when the AGI can still benefit from human nonresistance, and in return the AGI spares Earth[2] (and, at minimum, gives us fusion technology to stay alive when the sun is dismantled).
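A quick back-of-the-envelope sketch of these round numbers (the constants are standard values; the rounding to $10^{22}\,\mathrm{kg/s}$ and 600 s follows the text):

```python
# Order-of-magnitude check of the figures above (round values from the text).
L_SUN = 3.8e26                  # solar luminosity, W
C = 3.0e8                       # speed of light, m/s

L_virgo = 3e12 * L_SUN          # Virgo Supercluster luminosity, ~1e39 W
mass_rate = L_virgo / C**2      # mass converted to starlight, ~1e22 kg/s
M_EARTH = 6e24                  # Earth's mass, kg

print(f"{mass_rate:.1e} kg/s")         # ~1.3e22
print(f"{M_EARTH / mass_rate:.0f} s")  # ~470 s; with the rounded 1e22 kg/s, 600 s = 10 min
```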
(While the AGI only needs to trust humanity for 10 minutes, humanity needs to trust the AGI eternally. We still need good enough decision-making to cooperate.)
We may choose to consider the reachable universe instead. Armstrong and Sandberg (2013) (section 4.4.2, Reaching into the universe) estimate that we could reach about $10^9$ galaxies, with a luminosity of $10^{47}\,\mathrm{W}$ and a mass loss of $10^{29}\,\mathrm{kg/s}$. That is dwarfed by the $10^5$ stars that become unreachable per second (Siegel (2021), Kurzgesagt (2021)), a mass loss of $10^{35}\,\mathrm{kg/s}$.
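For that last figure, a minimal sketch of the stars-to-mass conversion (treating a typical star as roughly Sun-mass is my assumption, not a figure from the sources cited):

```python
# Sketch: mass leaving the reachable universe as stars slip past the horizon.
STARS_LOST_PER_S = 1e5    # stars becoming unreachable per second (Siegel 2021)
M_STAR = 2e30             # kg; a typical star taken as roughly Sun-mass (assumption)

print(f"{STARS_LOST_PER_S * M_STAR:.0e} kg/s")   # 2e+35, i.e. the ~1e35 kg/s quoted above
```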
Starting earlier but sparing Earth means a space-colonization progress curve that starts earlier but initially increases more slowly. The AGI requires that colonization progress with human help be asymptotically at least 10 minutes ahead; that is, the curve with help must eventually stay at least 600 seconds ahead of the curve without help.
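One way to write that condition (my notation; no formula appears in the original): let $P_{\text{alone}}(t)$ be colonization progress if the AGI dismantles Earth and proceeds on its own, and $P_{\text{help}}(t)$ the progress if it starts earlier with human help but spares Earth. The deal then requires

$$\exists\, t_0 \;\; \forall t > t_0: \quad P_{\text{help}}(t) \;\ge\; P_{\text{alone}}(t + 600\,\mathrm{s}),$$

i.e. after the initial phase the helped curve runs at least ten minutes ahead, which at $10^{22}\,\mathrm{kg/s}$ compensates for the $\sim 6\times10^{24}\,\mathrm{kg}$ the AGI forgoes by sparing Earth.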
Intuition pump / generalising from fictional evidence: in the games Pandemic / Plague Inc. (where the player "controls" a pathogen and attempts to infect the whole human population on Earth), a lucky, early cross-border infection can help you win the game faster — more than the difference between a starting infected population of 1 vs 100,000.
This informs my intuition about when the bonus of earlier spaceflight (through human help) could outweigh the penalty of not dismantling Earth.
When might human help outweigh the penalty of not dismantling Earth?