This is the second in a three-part sequence examining the scenario in which software development organisations might knowingly take a significant risk of unintentionally launching an AI program into inadequately supervised or controlled self-improvement, by comparing the scenario to the 'press your luck' game mechanic.
This second part lays out the problem caused for humanity by the risk inherent in having software development carried out by multiple organisations with competitive motivations.
Milestones
We can conceptualise the progress towards the launch of a general-purpose AI that can out-think humans as a series of milestones, analogous to the alert states used by the US military.
DEFCON 5 : A computer program designed to become an expert in a single pre-determined domain, such as chess, which can only learn and improve from examples within that domain because its learning algorithm has domain-specific assumptions built in.
DEFCON 4 : A computer program with a generic learning algorithm that can improve at identifying tanks or riding a unicycle, depending on the examples it is trained with and the dependencies it has already learned.
DEFCON 3 : A computer program that has learned sufficient dependencies that it can learn to program computers, at a level where it can comprehend an existing computer program and then produce an improved next generation of it by devising, implementing and testing multiple new algorithms or approaches.
DEFCON 2 : A computer program that looks ahead more than one generation, and tries to produce an improved chess-playing program by first designing an improved version of a chess-program-designing program.
DEFCON 1 : A computer program which considers the full environment in which a program runs (including hardware, networking and human-dictated programming or monitoring constraints) as factors which affect the capability of a program to run efficiently, and treats altering that environment as a problem to solve.
Somewhere between DEFCON 2 and DEFCON 1, humanity enters a game of "Press Your Luck".
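Why does the game only begin around DEFCON 2? Because that is the point at which improvements start to compound. The toy sketch below (a minimal illustration with made-up numbers, not a model of any real system) treats 'capability' as a single number: a DEFCON 3 style program improves the object program each generation, while a DEFCON 2 style program also improves the improver, so each generation's gain is larger than the last.

```python
import random

random.seed(0)

def run(generations: int, improve_the_improver: bool) -> float:
    """Toy model: 'capability' is one number, and each generation the
    improver adds a noisy increment to it. A DEFCON 2 style system also
    turns the improver on itself, so the increment itself grows.
    All constants here are illustrative assumptions, not measurements."""
    capability = 1.0
    step = 0.1              # gain per generation from the current improver
    for _ in range(generations):
        capability += step * random.uniform(0.5, 1.5)
        if improve_the_improver:
            step *= 1.1     # the improver improves itself: gains compound
    return capability

print("DEFCON 3 (improve the program):  %.1f" % run(50, False))
print("DEFCON 2 (improve the improver): %.1f" % run(50, True))
```

With these illustrative numbers the DEFCON 3 run grows roughly linearly while the DEFCON 2 run grows geometrically, which is why the step from 'improves programs' to 'improves the improver' is the step onto the game board.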
Example
Suppose there are two companies, MarchCorp and AprilCorp, that are competing for market share in an area of business where the use of efficient programs is a significant factor. If MarchCorp steps closer to the edge and starts taking AprilCorp's market share because MarchCorp took some risks and ended up with more efficient programs, then AprilCorp has to choose between going bust or also pressing its luck and taking a step closer to the edge of the precipice.
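To see why AprilCorp's choice is forced rather than foolish, here is a minimal simulation sketch (the payoff constants are invented for illustration, not estimates of anything real): market share drifts towards whichever firm runs the riskier, more efficient programs, while every risky round is another pull on the press-your-luck lever.

```python
import random

random.seed(1)

# Toy model of the MarchCorp / AprilCorp dynamic. Each round both firms run
# their programs at a chosen risk level in [0, 1]: more risk means more
# efficient programs and a drift in market share, but every unit of risk
# also adds a small chance of catastrophe. All numbers are invented.

P_CATASTROPHE_PER_UNIT_RISK = 0.01

def simulate(march_risk: float, april_risk: float, rounds: int = 100):
    """Return MarchCorp's final market share and whether catastrophe hit."""
    march_share = 0.5
    for _ in range(rounds):
        # Share drifts towards the riskier (more efficient) firm.
        march_share += 0.01 * (march_risk - april_risk)
        march_share = min(max(march_share, 0.0), 1.0)
        # Each firm's risky step is an independent pull on the lever.
        if random.random() < P_CATASTROPHE_PER_UNIT_RISK * (march_risk + april_risk):
            return march_share, True
    return march_share, False

for april_risk in (0.0, 0.8):
    share, boom = simulate(march_risk=0.8, april_risk=april_risk)
    print(f"AprilCorp risk {april_risk}: share left {1 - share:.2f}, catastrophe: {boom}")
```

Under these assumptions a cautious AprilCorp is slowly competed out of existence, and matching MarchCorp's risk level merely converts slow commercial death into a shared, growing chance of catastrophe.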
Solutions
What options does humanity have in this situation?
The Legal Solution
Require all companies with sufficient resources to support a scary computer program to submit all proposed changes to a central authority for verification before they get implemented. Back it up with severe legal penalties, mandatory snoopware and random physical inspections.
PROBLEMS : Expensive, and it would do no good unless implemented effectively in all countries, no matter how nationalistic or corrupt.
Paul Christiano wrote:

If we don’t know how to build competitive benign AI, then users/designers of AI systems have to compromise efficiency in order to maintain reliable control over those systems. The most efficient systems will by default be built by whoever is willing to accept the largest risk of catastrophe (or perhaps by actors who consider unaligned AI a desirable outcome).

It may be possible to avert this kind of race to the bottom by effective coordination, e.g. by enforcing regulations which mandate adequate investments in alignment or restrict what kinds of AI are deployed. Enforcing such controls domestically is already a huge headache. But internationally things are even worse: a country that handicapped its AI industry in order to proceed cautiously would face the risk of being overtaken by a less prudent competitor, and avoiding that race would require effective international coordination.
The Genetic Solution
Eliminate capitalism, or change human nature, so there are no organisations or people willing to take risks in order to beat others to a goal they are both racing towards.
PROBLEMS : Many
The Religious Solution
Confiscate all the computers, and prevent more being made. Or shoot all people capable of writing such programs. Make it a taboo.
PROBLEMS : A Butlerian Jihad won't get launched unless humanity has already had a near miss that frightened it sufficiently.
All such solutions are like trying to prevent the tide from coming in. You might delay it, at increasing expense, but if it is possible at all, sooner or later it will happen. Such a delay might be useful, but in the end we will most likely be faced with one of just three approaches:
The Ostrich Solution
Deny that it is inevitable, and take pot luck with whatever finally falls off the edge first.
The Manhattan Solution
The 'good guys' (i.e. the ones more concerned about humanity getting wiped out than about profits) cooperate in making a big public effort to find a perfect solution, meaning a self-improving computer system that is provably correct, which can be set going at full speed with zero risk of it deciding at a later date to wipe out humanity, maximise paperclips or do anything else that would make us regret having launched it. And hope that the good guys win the race, and their perfect solution gets found and launched before some random company presses its luck once too often.
The Pragmatic Solution
Accept that it is inevitable that someone will eventually set a program-improving program to improve itself, and that it is possible that this will happen before a 100% safe version of such a program has been discovered. Work towards making sure that, in the event that this happens, the ones launched by the 'good guys' (which have as high a chance of being safe as they have yet been able to devise) outnumber and out-resource any rogue programs (which, not being created by an international collaboration of top programmers, probably have a lower chance of being safe than the current best contenders from the 'good guys'). And work towards making the world's computing environment such that, in the event that non-perfect self-improving programs get launched, humanity has as high a chance as possible of detecting this and shifting control of existing computing resources away from rogue ones and towards the 'good guy' candidates.
Accept that a race might get started, even if nobody wants it to, and have a pre-agreed entrant ready to go, as a precaution against that eventuality. Of course that would only be important were circumstances such that a race like that were not a foregone conclusion with the winner always being the entrant who starts first (which is a question I have tackled in a different sequence).