AI is here, and AGI is coming. It's quite possible that any work being done now on anything else will prove futile compared to work on reducing AI risk.
That's an unsettling thought for me as someone who did a Ph.D. in a non-AI area of computer science.
But one of the main vectors by which a bootstrapping AGI would gain power is by hacking into other systems. And that's something I can do something about.
Not many appreciate this, but unhackable systems are very possible. Security vulnerabilities occur when there is some broken assumption or coding mistake. They are not omnipresent: someone has to put them there. Software has in general gotten more secure over the last few decades, and technologies that provide extremely high security guarantees have matured. Consider the verified hypervisor coming out of Bedrock Systems; RockSalt, an unbreakable sandbox; or seL4, the verified kernel now being used in real safety-critical systems.
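To make "someone has to put them there" concrete, here is a minimal, hypothetical C sketch (not taken from any of the systems named above; the function names are invented for illustration) of the kind of broken assumption that becomes a vulnerability, and of what making the assumption explicit looks like:

```c
#include <stdio.h>
#include <string.h>

#define BUF_LEN 16

/* Broken: silently assumes the caller's input fits in the destination.
 * Any input longer than the destination buffer writes past its end --
 * exactly the kind of broken assumption that becomes an exploitable
 * vulnerability. */
void copy_name_unsafe(char *dst, const char *src) {
    strcpy(dst, src);              /* no bounds check */
}

/* Fixed: the assumption is stated and enforced, so no input can write
 * outside the buffer. A verified system proves properties like this for
 * every memory access, not just the ones a reviewer happens to notice. */
int copy_name_safe(char *dst, size_t dst_len, const char *src) {
    size_t n = strlen(src);
    if (n >= dst_len)
        return -1;                 /* reject inputs that would overflow */
    memcpy(dst, src, n + 1);       /* copy including the terminating NUL */
    return 0;
}

int main(void) {
    char buf[BUF_LEN];
    if (copy_name_safe(buf, sizeof buf, "hello") == 0)
        printf("%s\n", buf);
    return 0;
}
```

The point is that the vulnerable version is not inevitable; it exists only because an unstated assumption was left unchecked, and tools like the ones above rule out whole classes of such mistakes by construction.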
Suppose we "solve" security by bringing the vulnerabilities in important applications to near zero. Suppose we also "solve" the legacy problem, and are able to upgrade a super-majority of old software, included embedded devices, to be similarly secure. How much will this reduce AI risk?
To be clear: I personally am mainly interested in assuming this will be solved and then asking what the impact on AI safety would be. If you want to talk about how hard it is to get there, I won't have much interest in that discussion, since I've given many lectures on closely related topics, though others here may benefit from it.
(When I call something verified or unbreakable, there are a number of technicalities about what exactly has been proven and what the assumptions are. For example, nothing I've mentioned provides guarantees against hardware attacks such as Rowhammer or instruction skipping. I'm happy to explain these to anyone in great detail, but I'm more interested in discussion that assumes they will all be solved.)
While I can't quantify the effect, I think secure computer systems would help a lot by limiting the options of an AI attempting malicious actions.
Imagine a near-AGI system with uneven capabilities compared to humans. Maybe its GPT-like (natural language interaction) and Copilot-like (code understanding and generation) capabilities surpass humans, but its robotics lags behind. More generally, it's superior in virtual domains, especially those involving strings of characters, but inferior elsewhere. This is all easy to imagine because it just assumes the relative balance of capabilities remains similar to what it is today.
Such a near-AGI system would presumably be superhuman at cyber-attacking. After all, that plays to its strengths. It'd be great at both finding new vulnerabilities and exploiting known ones. Having impenetrable cyber-defenses would neutralize this advantage.
Could the near-AGI system improve its robotics capabilities to gain an advantage in the physical world too? Probably, but that might take a significant amount of time. Doing things in the physical world is hard. No matter how smart you are, your mental model of the world is a simplification of true physical reality, so you will need to run experiments, which takes time and resources. That's unlike AlphaZero, for example, which can exceed human capabilities quickly because its experiments (self-play games) take place in a perfectly accurate simulation.
One last thing to consider is that provable security has the nice property that you can make progress on it without knowing the nature of the AI you'll be up against. Having robust cyber-defense will help whether AIs turn out to be deep-learning-based or something else entirely. That makes it in some sense a safe bet, even though it obviously can't solve AGI risk on its own.