Bio-Chemical Weapons Made by AI
One of the core risks involved with both narrow and general AIs is the creation of biological and chemical weapons. While the probability that AIs will be used for ever-stronger behavioral modification is close to 100%, the probability of AI-created biological weapons is smaller; however, the potential downsides are more significant. A more likely scenario is that AIs create many quasi-weapon chemical compounds that end up in manufacturing because their safety profiles are misunderstood.
The analogy between signal pollution and manufacturing pollution is worth considering, especially because we have yet to solve manufacturing pollution. There are credible hypotheses that many chemicals have polluted the land of industrial civilizations, and some may cause hormonal disruptions in people and animals. One of the challenges is that chemicals are usually considered innocent until proven guilty. A key civilizational challenge is to cost-effectively ascertain the guilt or innocence of chemicals, or at least their minimum dangerous doses. Creating cost-effective chemical safety profiles is an old challenge, but it works well as a metaphor for determining the extent of "signal pollution," or the minimum safe level of disconnect between a meme and its meaning.
However, this is not important only as a metaphor. As AI-driven chemical and biological manufacturing ramps up, it will create new compounds and increase the concentrations of existing ones. Without a proper pipeline for deciding how to handle novel chemicals, this presents a real challenge. Treating chemicals as innocent until proven guilty is wrong: most biological modifications to people are likely harmful, and overuse of chemicals is already causing problems. At the same time, forcing every chemical through ten years of safety checks may also be the wrong answer if information about its relative safety can be obtained faster. For example, if someone finds a less toxic plastic, putting it into production quickly to replace existing plastics is better, even if it is not 100% safe. Humanity needs to solve this challenge at the meta-level, with AIs that work on new chemical compounds having proper controls down to the utility-function level.
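To make the tradeoff concrete, here is a minimal expected-value sketch. All probabilities, harm values, and timelines are invented placeholders, not real toxicology data; the point is only that "approve fast with residual risk" versus "review for a decade while the known-worse incumbent stays in use" is a calculable comparison rather than a matter of fixed procedure.

```python
# Hypothetical expected-value sketch for the "faster but imperfect approval" tradeoff.
# All probabilities and harm values are invented placeholders, not real data.

def expected_harm(p_harmful: float, harm_if_bad: float, years_in_use: float) -> float:
    """Expected harm of a compound over a period, given its probability of being harmful."""
    return p_harmful * harm_if_bad * years_in_use

# Incumbent plastic: known to be moderately harmful, in use for the whole decade.
incumbent_only = expected_harm(p_harmful=1.0, harm_if_bad=1.0, years_in_use=10)

# Option A: approve the replacement after 1 year of testing. A small residual risk
# remains, but the replacement displaces the incumbent for the remaining 9 years.
option_a = expected_harm(1.0, 1.0, 1) + expected_harm(0.10, 2.0, 9)

# Option B: run a 10-year review; the incumbent stays in use the whole time.
option_b = expected_harm(1.0, 1.0, 10)

print(f"incumbent only: {incumbent_only:.2f}")  # 10.00
print(f"fast approval:  {option_a:.2f}")        # 2.80
print(f"slow review:    {option_b:.2f}")        # 10.00
```

Under these made-up numbers the faster approval dominates; with different numbers it would not. The design point is that the decision rule should be explicit expected harm, not a fixed review length.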
However, the situation worsens when committed actors use AI to create chemical and biological weapons. A likely example is drug cartels, which are interested in new drugs. Once again, we have AIs focused on behavioral modification; this time the medium is chemicals rather than screens. The US has lost the War on Drugs, and drugs will further erode national well-being as AIs and A/B-testing methodologies enter the picture. Pharmaceutical companies are likely to do something similar. While the drugs they produce are likely to be safer and less addictive than those made by cartels, their incentives do not align with making beneficial changes to their customers. The opioid epidemic, for example, is fueled by both illegal and prescription drugs.
Actual bio-weapons created by AI and leaked by accident are also a possibility. While COVID was happening, it became clear that the United States and large parts of the world had failed to react appropriately. The failures included under-reactions early on, such as not halting airplane travel and flip-flopping on masks, and over-reactions elsewhere, such as overly aggressive quarantines and unnecessary policies.
Nobody in the world learned the proper lessons from COVID. The EcoHealth Alliance, one of the NGOs potentially involved in gain-of-function research, was once again given money to "study" viruses. Another lab attempted gain-of-function research on a strain with high lethality in mice. Whether or not COVID leaked from a lab, the pandemic should have sparked many questions about whether viruses can leak from labs. What type of research is allowed to exist? What is the expected value of such research?
What benefit could a particular research project provide? What could it prevent? If the answer is "we do not know" or "we will find out after the research," that answer is not good enough. If the answer is "well, the researcher could publish a paper in some journal," that is not good enough either. Careless gain-of-function research is, once again, distributing massive costs and risks onto the rest of the world.
Once AI-based bio-design enters the picture, everything becomes more dangerous. The danger does not have to come from a bioweapons program: an insufficiently aligned biosafety research program is dangerous in itself, because its inner loop of adversarial organism creation could easily escape the lab without proper precautions. The overall capacity and willingness of civilization to put a stop to this are low. Weak social cohesion means researchers care little about the costs they impose on others, and the regime has toned down online discourse about this by censoring everyone who worried about lab leaks "too early." Overall, bio lab leaks are only going to become more frequent.
It is questionable whether even AI-based protein folding is a net positive. New beneficial drugs could have the massive upside of curing many diseases and slowing aging. However, civilizational incompetence means those drugs are unlikely to be deployed for a while, and hostile actors can act on the research more quickly than aligned actors. In a more intelligent civilization, narrow AI for bio-chemical discovery would have a higher expected value than it does in ours.
The militaries of great powers have a clear incentive to develop biological and chemical weapons, especially weapons with disparate targeting profiles, for example due to differences in vaccination. Treaties do prohibit this, yet nobody invoked those treaties even to investigate the origins of COVID. As a result, nation-states effectively have plausible deniability for biological weapons research by masquerading it as vaccine development or other work.
Once AI is in the picture, harmful development will speed up, causing more and more leaks and pandemics. Each pandemic is unlikely to wipe out humanity entirely, but it will strain the already burdened economies of the West and the rest of the world, potentially causing economic collapses.
Pandemics are another example where defense is significantly more challenging than offense. It took only one virus to create the COVID pandemic, and it would have required far more AI power to develop the drugs and preventive measures needed to stop it properly. We did not have that power. Even if we had hypothetically had AI to run simulations and find drugs for early COVID treatment, we would still have needed the civilizational capacity to implement its suggestions. Without that capacity, the development of protein-folding AIs may be a net negative.
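A minimal toy model of that asymmetry, with invented parameters rather than epidemiological estimates: unchecked exponential spread versus a countermeasure that only arrives after a fixed development-and-deployment delay.

```python
# Toy offense-vs-defense asymmetry: exponential spread vs. a delayed countermeasure.
# The doubling time and delays are invented placeholders, not epidemiological estimates.

def infections(day: int, doubling_time: float = 5.0) -> float:
    """Cases from a single index case growing exponentially with a fixed doubling time."""
    return 2 ** (day / doubling_time)

for countermeasure_delay in (30, 90, 180):  # days until a drug or vaccine is deployable
    print(f"delay {countermeasure_delay:3d} days -> ~{infections(countermeasure_delay):,.0f} cases to contain")
```

Each extra month of delay multiplies the problem the defender faces, which is why detection and implementation capacity matter as much as the underlying science.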
Eliezer has occasionally warned about an AGI that could use nanotechnological weapons to wipe out humanity essentially all at once. I consider that a less likely near-term scenario than a system of AIs and humans in the loop creating harmful novel chemicals, whether as environmental pollutants or as drugs.
Even AIs that optimize more scientific or exploratory metrics, such as "understanding compound interactions," could end up creating bio or chemical weapons as a side effect of discovery. As many x-risk researchers have pointed out, mathematically speaking, in the space of *all* metrics, the metrics that are "safe" to maximize are rare. Most metrics look good at first and turn bad once taken to their logical conclusion. Similarly, most chemicals an AI could create are more likely to be harmful than helpful.
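A hedged toy illustration of that "good at first, bad at the extreme" pattern, in the spirit of Goodhart's law; both functions below are made up for illustration and stand in for no real system.

```python
# Toy Goodhart-style divergence: a proxy metric tracks true value under light
# optimization pressure, then keeps rising while true value collapses.
# Both functions are invented for illustration only.

def true_value(x: float) -> float:
    return x - 0.02 * x ** 2      # benefits saturate and then reverse

def proxy_metric(x: float) -> float:
    return x                      # the measured score just keeps climbing

for effort in (5, 25, 50, 100):
    print(f"effort {effort:3d}: proxy={proxy_metric(effort):6.1f}  true={true_value(effort):6.1f}")
```

At low effort the two agree; at high effort the proxy says things are better than ever while the true value has gone negative.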
Compounding of Risk Factors
The two negative factors, harmful "behavioral modification" and "actual pollution and lab leaks," also compound each other's problems. We have seen Twitter discourse, shaped by Twitter's AI, fail to push media attention and political will toward proper solutions to COVID.
The discourse failure was evident early on, when journalists made fun of people worried about COVID and when the government de facto suspended COVID restrictions for protests. The long-term civilizational failure became clear when restrictions persisted for too long and settled into a weird combination of compromises; for example, children, the group least at risk from COVID, were forced to wear masks while the adults near them were not. Throughout the pandemic, social media, legacy media, and the governments paying attention to the discourse selected for something other than the best arguments for defeating the problem. Governments and media worry less about looking stupid because they are willing to ban critics or demoralize them using bots. Thus, intelligent public-square condemnation does not influence policy as much as one might expect.
Radicalization and fake news are other ways narrow AIs negatively interact with biological weapon risk. Great Powers may wish to use AIs and auto-generated propaganda to radicalize their populations against other nations and get them to go along with breaking taboos on biological weapon use.
Environmental degradation also decreases people's health, will to live, and capacity to coordinate with one another. The two destructive cycles feed into each other.
One lousy argument is that we might be able to use an AGI to solve "itself," that is, to solve philosophy, or to use a smaller AI to solve alignment. The simplest way to see why this is unlikely is to ask what the goal of the original AGI would need to be. If it is solving "itself" toward a goal, how do you specify that goal? Precisely specifying which kinds of Human Being we do not want to be altered toward is, in many ways, the whole challenge!
Remember, the challenge that already exists today is not that an AI modifies its environment in some negative way. If an AI modifies its environment negatively in a simulation, it can be relatively easy to see that something is wrong: if an AI suddenly creates nanobots in a video game where it should not, you can call it unsafe because it misbehaved in a test simulation. Simulation testing is not a reliable way to test AIs capable of deception, but it is a simple comparison to illustrate the difficulty. The dangerous narrow AIs of the coming years, which will give rise to dangerous AGIs, are likely to be modifying "people" rather than the environment, and correctly specifying which modifications of people are good and which are bad is challenging even in a simulation.
Biological Modification Metaphors and Fundamental Ontology
The difficulty of algorithmically understanding which compounds are safe at which doses is a good metaphor for "behavioral modification safety" and a solid intuition pump for the broader philosophical problem. Which compounds modify a person in negative ways? What are the proper health metrics for determining poisoning by foreign substances? These questions are solvable, but their difficulty points toward the more complex problem of determining which modifications of a person, in general, are good, neutral, or bad.
Slowly but surely, AIs can modify people into something that is not human. We might not accept the final result if it were shown to us, but each minor modification might not seem concerning enough for drastic action. LessWrong has a similar metaphor: Gandhi does not wish to modify himself toward being a murderer, even if the modification is only 1% of the way there. Safe "object modification" is even trickier to understand when the AI modifies *human culture* and the trustworthiness of specific signals.
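A back-of-the-envelope illustration of why small per-step modifications are dangerous; the 1% figure is just the metaphor's number, and treating the steps as uniform and compounding is my simplifying assumption.

```python
# Compounding of small modifications: each step preserves 99% of what was there before,
# which looks harmless locally, but the retained fraction shrinks geometrically.
# The 1%-per-step figure comes from the Gandhi metaphor; everything else is illustrative.

per_step_retention = 0.99

for steps in (1, 10, 100, 500):
    retained = per_step_retention ** steps
    print(f"{steps:3d} steps of 1% drift -> {retained:.1%} of the original remains")
```

The point is not the specific numbers but that no single step crosses an alarm threshold even as the end state becomes unrecognizable.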
To prevent this in a fully general way, we need a formal specification of what a human is or ought to be. We do not have that, and we are unlikely to get it in any real capacity without fairly serious philosophical and qualitative research. We need an understanding of ourselves on multiple levels. One is the behavioral/ontic level: what behavior is evidence of positive vs. negative modification of a human? Another is the phenomenological level: what does it feel like to be modified toward or away from being human? Another is the fundamental ontological level: what kinds of things should constitute the "fundamental" blocks from which "humanness" is designated?
The last one is tough even to explain, so here is an example: gut bacteria. Gut bacteria influence people's behavior, especially food cravings. Should gut bacteria be considered "a part of" a person? If we take a purely physical perspective, gut bacteria are "inside" a person, so they are part of him. If we take a medical perspective, gut bacteria are "ok" to remove or replace if they are more parasitic than symbiotic. If we take a behavioral perspective, we consider all of a person's behavior to originate from the person. If we take a social-responsibility view, there are arguments about whether a person *ought* to be responsible for their behavior if the "gut bacteria made them do it." If we take a phenomenological point of view: what does it feel like to make a decision known to be influenced by gut bacteria, and does the feeling of "being in control" matter significantly for designating self/other boundaries? The "fundamental ontology" question is which of these perspectives, if any, is "most correct" for understanding the Human Being, and what principles make it so. Arguably, this "ontology of Being" question is a core problem of existentialism and of all of continental philosophy.
Moreover, the gut bacteria problem above is likely philosophically easier than other subproblems. A harder one is distinguishing which ideas or memes are parasitic toward a human and which are symbiotic. A fully general account of which behaviors it is acceptable for an AI, or for other humans, to nudge people toward may be needed if we are to allow narrow AI to drive any part of the discourse.
While people may or may not agree on a complete solution to the above question, or on whether it is even the correct starting point, it is much easier to agree on which solutions are wrong. Social media's implicit answer, "a person is something that scrolls through the news feed," is an awful solution. Search engines' implicit answer, "a person is an entity whose time is valuable," is still far from perfect but is far better.
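As a hedged sketch of how these implicit answers become optimization targets, compare an engagement-style reward with a time-value-style reward. Both reward functions below are caricatures I am inventing for illustration, not any company's actual objective.

```python
# Two caricatured reward functions for a recommender system, illustrating how the
# implicit "definition of a person" becomes an optimization target. Both are invented.

from dataclasses import dataclass

@dataclass
class Session:
    minutes_scrolled: float      # raw time spent in the feed
    goal_completed: bool         # did the person get what they actually came for?
    self_reported_value: float   # 0..1, how worthwhile the person judged the session

def engagement_reward(s: Session) -> float:
    """'A person is something that scrolls': reward raw attention captured."""
    return s.minutes_scrolled

def time_value_reward(s: Session) -> float:
    """'A person's time is valuable': reward completing the goal quickly and well."""
    return (2.0 if s.goal_completed else 0.0) + s.self_reported_value - 0.05 * s.minutes_scrolled

doomscroll = Session(minutes_scrolled=90, goal_completed=False, self_reported_value=0.1)
quick_answer = Session(minutes_scrolled=4, goal_completed=True, self_reported_value=0.9)

for name, s in [("doomscroll", doomscroll), ("quick answer", quick_answer)]:
    print(f"{name:12s} engagement={engagement_reward(s):6.1f}  time_value={time_value_reward(s):6.2f}")
```

Optimizing the first reward pushes toward more scrolling regardless of value; the second at least penalizes wasted time, though it still leans on proxies like self-report.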
All parts
P1: Historical Priors
P2: Behavioral Modification
P3: Anti-economy and Signal Pollution
P4: Bioweapons and Philosophy of Modification
P5: X-risk vs. C-risk
P6: What Can Be Done