Strawman?
"... idea for an indirect strategy to increase the likelihood of society acquiring robustly safe and beneficial AI." is what you said. I said preventing the creation of an unfriendly AI.
OK, valid point. Not the same.
I would say the items described will do nothing whatsoever to "increase the likelihood of society acquiring robustly safe and beneficial AI."
They are certainly of value in normal software development. But as time passes without a proper general AI actually being created, it seems increasingly likely that the task is far, far more difficult than anyone expected, and that if one does come into being, it will not be through the typical software development process as we do things today. My guess is that it will be an incremental process of change and refinement seeking a goal. Starting from a great starting point might reduce the iterations a bit, but other than a head start toward the finish line, I cannot imagine it would affect the course much.
If we drop single-cell organisms on a terraformed planet and come back a hundred million years or so later, we might well expect to find higher life forms evolved from them, but finding human beings is basically not gonna happen. If we repeat that - same general outcome (higher life forms), but wildly differing specifics. The initial state of the system ends up being largely unimportant - what matters is evolution: the ability to reproduce, mutate and adapt. Direction during that process could well guide it - but the exact configuration of the initial state (the exact type of organisms we used as a seed) is largely irrelevant.
Re: computer security - I actually do that for a living. Small security rant - my apologies:
You do not actually try to get every layer "as right and secure as possible." The whole point of defense in depth is that any given security measure can fail, so to ensure protection, you use multiple layers of different technologies so that when (not if) one layer fails, the other layers are there to "take up the slack", so to speak.
The goal on each layer is not "as secure as possible", but simply "as secure as reasonable" (you seek a "sweet spot" that balances security and other factors like cost), and you rely on the whole to achieve the goal. Considerations include cost to implement and maintain, the value of what you are protecting, the damage caused should security fail, who your likely attackers will be and their technical capabilities, performance impact, customer impact, and many other factors.
Additionally, security costs at a given layer do not increase linearly, so making a given layer more secure, while often possible, quickly becomes inefficient. Example - most websites use a 2k (2048-bit) key for SSL; 4k is more secure, and 8k even more so. Except - 8k doesn't work everywhere, the bigger keys come with a performance impact that matters at scale, and key size is usually not the reason a key is compromised. So - the entire world (for the most part) does not use the most secure option, simply because it's not worth it - the additional security is swamped by the drawbacks. (Similar issues occur regarding cipher choice, fwiw.)
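To put rough numbers on the layering argument, here is a minimal sketch. It assumes each layer fails independently, which is often untrue in practice since one mistake can take out several layers at once, and the failure probabilities are made up for illustration, not measured. The function name `breach_probability` is likewise just an illustrative choice.

```python
from math import prod


def breach_probability(layer_failure_probs):
    """Chance that an attacker gets past every layer.

    Assumption: layers fail independently, so the chance that all of
    them fail is the product of the individual failure probabilities.
    """
    return prod(layer_failure_probs)


# Hypothetical figures, for illustration only.
# One layer hardened at great expense: fails 1 time in 100.
one_hardened_layer = breach_probability([0.01])

# Three "as secure as reasonable" layers: each fails 1 time in 10.
three_reasonable_layers = breach_probability([0.1, 0.1, 0.1])

print(f"one hardened layer:      {one_hardened_layer:.4f}")
print(f"three reasonable layers: {three_reasonable_layers:.4f}")
```

Under these made-up numbers, three mediocre layers do better than one layer that is ten times stronger, which is why pushing a single layer toward "as secure as possible" is usually the wrong place to spend. The independence assumption is the part to distrust.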
In reality - in nearly all situations, human beings are the weak link. You can have awesome security, and all it takes is one bozo and it all comes down. SSL is great, until someone manages to get a key signed fraudulently and bypasses it entirely. Packet filtering is dandy, except that Fred in accounting wanted to play Minecraft and opened up an SSH tunnel, incorrectly. MFA is fine, except the secretary who logged into the VPN using MFA just plugged the thumb drive they found in the parking lot into their PC and actually ran "Elf Bowling", and now your AD is owned and the attacker is escalating privilege from inside. So your hard candy shell doesn't matter that much - the attacker is already in the soft, chewy center.
THIS, by the way, is where things like education are of the most value - not in making the very skilled more skilled, but in making the clueless somewhat more clueful. If you want to make a friendly AI - remove human beings from the loop as much as possible...
Ok, done with rant. Again, sorry - I live this 40-60 hours a week.
These are quick notes on an idea for an indirect strategy to increase the likelihood of society acquiring robustly safe and beneficial AI.
Motivation:
Most challenges we can approach with trial-and-error, so many of our habits and social structures are set up to encourage this. There are some challenges where we may not get this opportunity, and it could be very helpful to know what methods help you to tackle a complex challenge that you need to get right first time.
Giving an artificial intelligence good values may be a particularly important challenge, and one where we need to be correct first time. (Distinct from creating systems that act intelligently at all, which can be done by trial and error.)
Building stronger societal knowledge about how to approach such problems may make us more robustly prepared for such challenges. Having more programmers in the AI field familiar with the techniques is likely to be particularly important.
Idea: Develop methods for training people to write code without bugs.
Trying to teach the skill of getting things right first time.
Writing or editing code that has to be bug-free without any testing is a fairly easy challenge to set up, and has several of the right kind of properties. There are some parallels between value specification and programming.
Set-up puts people in scenarios where they only get one chance -- no opportunity to test part/all of the code, just analyse closely before submitting.
Interested in personal habits as well as social norms or procedures that help this.
Daniel Dewey points to standards for code on the space shuttle as a good example of getting high reliability code edits.
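The one-chance set-up could be prototyped with very little machinery. Below is a minimal sketch in Python, assuming a hypothetical `OneShotChallenge` class (not an existing tool) that holds hidden test cases and accepts exactly one submission. The class name, method names and example task are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class OneShotChallenge:
    """Hypothetical exercise harness: one submission, no test runs."""

    # Each case is (arguments, expected result); contestants never see these.
    hidden_cases: List[Tuple[tuple, object]]
    _used: bool = field(default=False, init=False)

    def submit(self, solution: Callable) -> bool:
        """Score the single allowed submission against the hidden cases."""
        if self._used:
            raise RuntimeError("only one submission is allowed")
        self._used = True
        for args, expected in self.hidden_cases:
            try:
                if solution(*args) != expected:
                    return False
            except Exception:
                # A crash counts as a failed attempt, not a retry.
                return False
        return True


# Example task (illustrative): implement addition, unseen by any test run.
challenge = OneShotChallenge(hidden_cases=[((2, 3), 5), ((-1, 1), 0)])


def add(a, b):
    return a + b


passed = challenge.submit(add)
print("passed" if passed else "failed")
```

Returning only pass or fail, with no detail on which case broke, keeps contestants from using the harness as a debugger; whether that is the right amount of feedback is an open design question.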
How to implement:
Ideal: Offer this training to staff at software companies, for profit.
Although it’s teaching a skill under artificial hardship, it seems plausible that it could teach enough good habits and lines of thinking to noticeably increase productivity, so people would be willing to pay for this.
Because such training could create social value in the short run, this might give a good opportunity to launch as a business that is simultaneously doing valuable direct work.
Similarly, there might be a market for a consultancy that helped organisations to get general tasks right the first time, if we knew how to teach that skill.
More funding-intensive, less labour-intensive: run competitions with cash prizes.
Try to establish it as something like a competitive sport for teams.
Outsource the work of determining good methods to the contestants.
This is all quite preliminary and I’d love to get more thoughts on it. I offer up this idea because I think it would be valuable but not my comparative advantage. If anyone is interested in a project in this direction, I’m very happy to talk about it.