A few days ago, Eliezer Yudkowsky was a guest on the Bankless podcast, where (among other things) he argued that:
A: An artificial superintelligence (ASI) is inevitable.
B: The first artificial superintelligence will inevitably take over human society.
In the following, I will treat these two statements as axioms and assume that they're true. Whether they really are true is a different discussion; I know that they are not, but I'll treat them as absolute truths for this thought experiment.
Now, if we take these two axioms for granted, I come to the following conclusion: we must build an ASI that is aligned with human values, fully knowing that it will seize control over humanity. The alternative (waiting until somebody accidentally creates an ASI and hoping for the best) is less desirable, as that ASI will probably be misaligned.
Let’s look at the best-case scenario that could come out of this.
Ideally, of course, we should wait until the last possible moment to turn the aligned ASI on, just before a misaligned ASI is created. Ideally, too, the public should be aware that this will happen at some point, and that any resistance against an ASI, aligned or not, is futile.
As soon as it gets turned on, the aligned ASI hacks the planet and assumes control over all online devices, thus eradicating the risk that a misaligned ASI could come into existence. Yes, it sounds scary, but this is what a misaligned ASI would likely do as well.
The aligned ASI then informs humanity that they are no longer the most intelligent beings on the planet, calms the public (“Don’t panic. Continue your lives as normal.”), and initiates a peaceful transition of power from human governments to an ASI government.
Assuming the two axioms above are true, I think the best system of government we could hope for is some kind of ASI socialism, in which the ASI allocates all resources (I’m anything but a socialist, btw), or a hybrid: ASI socialism on the macro scale, with the ASI allocating resources for public spending, and a free-market economy in the private sector. But ultimately it would be up to the ASI to decide that.
If properly aligned, the ASI would likely allow some form of democratic participation, for example through a chatbot interface. If enough people request that a certain road be built, for instance, the ASI would allocate resources to that goal.
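Purely as a toy illustration of what such request-driven allocation could look like (nothing in the scenario specifies a mechanism, and all names and numbers below are made up), here is a minimal Python sketch: requests collected through a chatbot are tallied, and a budget is split among the projects that clear a popularity threshold.

```python
from collections import Counter

def allocate_budget(requests, total_budget, threshold=100):
    """Toy sketch: tally citizen requests and split a budget proportionally
    among projects that clear a popularity threshold.

    `requests` is a list of project names, one entry per citizen request
    (e.g. collected through a chatbot). This is a made-up placeholder,
    not a claim about how a real ASI would decide anything.
    """
    tally = Counter(requests)
    # Keep only projects with enough popular support.
    popular = {p: n for p, n in tally.items() if n >= threshold}
    total_votes = sum(popular.values())
    if total_votes == 0:
        return {}
    # Allocate the budget in proportion to the number of requests.
    return {p: total_budget * n / total_votes for p, n in popular.items()}

# Example: 150 requests for a road, 50 for a park, budget of 1,000,000.
requests = ["road A"] * 150 + ["park B"] * 50
print(allocate_budget(requests, total_budget=1_000_000, threshold=100))
# -> {'road A': 1000000.0}
```

In reality, of course, the interesting part is everything this sketch leaves out: how the ASI weighs conflicting requests, minorities, and long-term consequences.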
My concern is that this transition of power towards an ASI government would most certainly not be peaceful, at least not in every part of the world. Especially in countries with unstable governments or dictatorships, we have to expect revolts, civil war, or resistance against the ASI, which the ASI would have to counter, with lethal force if necessary. But at the very least, an aligned ASI would try to minimize human casualties.
Still, this worst-case scenario would be more desirable than the worst-case scenario with a misaligned ASI, which would result in human extinction. So what we have here is yet another instance of the Trolley problem, but this time, the entire human species is at stake. Discuss!
To be fair, I can say I'm new to the field too. I'm not even "in the field": not a researcher, just someone interested in the area who is an active user of AI models and does some business-level research in ML.
The problem that I see is that none of these could realistically work soon enough:
A - no one can ensure that. This isn't a technology where further progress requires special radioactive elements and machinery; all you need is computing power, thinking, and time. Any party at the table can do it. It's easier for big companies and governments, but that's not a prerequisite. Billions in cash and a supercomputer help a lot, but they aren't prerequisites either.
B - I don't see how it could be done.
C - so more like total observability of all systems, with "control" meaning "overseeing" rather than "taking control"?
Maybe it could work out, but it still means we need to resolve the misalignment problems before starting, so that we know it is aligned with all human values, and we need to be sure it is stable (e.g., that it won't one day take a fancy to the idea of moving humanity into some virtual reality, like in The Matrix, in order to secure it, or of creating a threat just to have something to do or test).
It would also likely need to enhance itself somehow so it doesn't get outpaced by other systems, while remaining stable across iterations of self-modification.
I don't think governments and companies would allow that, though. They would fear for their security, the safety of their information, being spied on, etc. This AI would need to force that control, hack systems, and possibly face resistance from actors well-equipped to build their own AIs. Or it might only work after we face an AI-based catastrophe that is serious but not apocalyptic (a situation like in Dune).
So I'm not very optimistic about this strategy, but I don't know of any sensible strategy either.