Does either the linked paywalled analysis, or the unpublished analysis, consider the question of what competitors might do outside of the management of their AI research projects and outside of the question of AI safety measures? How detailed and comprehensive is the view of everything they might do even within those limits?
Narrow game-theoretic models are easy to analyze, but likely to be badly misleading if they don't faithfully reflect the entire game that's actually being played.
If you're a participant who really believes that there's an existential risk in play, and that you're in a "hard takeoff" world, then a rational response to a suspicion that somebody is way ahead of you might be to nuke them. Or start a conventional war with them, or try to get them restricted by their or your government, or deprive them of resources, or do other things to hinder them outside the limits of the "game".
Even if you think that the important game will be between relatively normal private-sector actors who won't have super-dramatic options like nukes and total warfare, they probably will have all kinds of unmodelled wildcard actions available to them... like any commercial dirty trick you can think of. And if everybody including governments and the public become convinced that there's really existential risk in play, then even the most dramatic things are on the table. Even a third party who wasn't doing research to begin with might choose to take such actions, so the set of players could get complicated.
Also, it seems like there's a bit of a contradiction between the idea that a clear leader may feel it has breathing room to work on safety, and the idea of restricting information about the state of play. If there were secrecy and no effective spying, then how would you know whether you were the leader? Without information about what the other side was actually up to, the conservative assumption would be that they were at least as far along as you were, so you should make the minimum supportable investment in safety, and at the same time consider dramatic "outside the game" actions.
Not only does all of that make it more complicated to understand the positive or negative impact of spying on who wins the race, but some of the more dramatic moves might get pretty close to being existential risks in themselves.
I'm not sure I'd want to make any guesses, but it surely doesn't seem at all supported that preventing espionage should be a priority.
It doesn't even seem likely to me that it's possible to create any game theoretic model that gives reliable insight into such a question. If you forget to include even one possible real-world move, then the model may fail catastrophically.
By the way, if you don't like espionage, then another argument against adopting secrecy to begin with is that without secrecy there is no espionage, and therefore no potential risk associated with espionage (except, of course, for those risks shared with transparency itself). Of course, that might have to mean forcibly eliminating secrecy even if some or all of the players would prefer to have secrecy.
One should expect espionage to increase enmity between competitors
Why? When the stakes are high, even allies spy on one another all the time. If one side is spying, the other side is probably also spying. It's normal. So how would it increase enmity?
Intuitively, I would expect essentially all of the strength of opposition in "hard takeoff" arms races to come either from desires to own the results[1], or from mutual convictions that "an AI aligned with those guys is worse than a random AI". Spying seems like pretty small potatoes compared to those.
I tend to think that if something is actually powerful enough to represent an existential risk, then there's a very strong, if rebuttable, presumption that no private organization, and maybe no institution of any kind we have today, ought to be allowed to "own" it at all. But that doesn't mean that's how things will actually play out... ↩︎
I'm not sure I'd want to make any guesses, but it surely doesn't seem at all supported that preventing espionage should be a priority.
It doesn't even seem likely to me that it's possible to create any game theoretic model that gives reliable insight into such a question. If you forget to include even one possible real-world move, then the model may fail catastrophically.
I think your skepticism here is radical; it proves too much. If you consistently applied it you'd be reduced to basically not making any guesses about the future at all.
By the way, if you don't like espionage, then another argument against adopting secrecy to begin with is that without secrecy there is no espionage, and therefore no potential risk associated with espionage (except, of course, for those risks shared with transparency itself). Of course, that might have to mean forcibly eliminating secrecy even if some or all of the players would prefer to have secrecy.
What? The main reason for disliking espionage is that it decreases the lead of the leader. I suppose giving away everything (and thereby decreasing the lead even more) has the silver lining of maybe reducing enmity between projects... but it's not worth doing unless you thought the enmity was the overwhelmingly dominant factor.
One should expect espionage to increase enmity between competitors
Why? When the stakes are high, even allies spy on one another all the time. If one side is spying, the other side is probably also spying. It's normal. So how would it increase enmity?
But you don't even think espionage increases enmity? It may be normal, but historically it does seem to increase enmity, and I can think of some mechanisms by which it might.
I think your skepticism here is radical; it proves too much. If you consistently applied it you'd be reduced to basically not making any guesses about the future at all.
I think that proves too much. I'm saying that game theory in particular is brittle, and that I'm not convinced given that only that brittle method has been brought to bear. That doesn't mean that nothing can ever be convincing.
I will admit that I think something like espionage is probably unusually unpredictable, and maybe its effects can never be predicted very well... but that's only about espionage. It doesn't mean that nothing at all can be predicted.
On edit: ... and if I were reduced to making no predictions, that wouldn't mean I was wrong, just that useful predictions were, unfortunately, unavailable, no matter how desirable they might be...
The main reason for disliking espionage is that it decreases the lead of the leader. [...] unless you thought the enmity was the overwhelmingly dominant factor.
But, again, as you've described it, the value added by having a clear leader is mediated through their knowing that they're the clear leader. If they don't know, there's no value.
But you don't even think espionage increases enmity?
I think it's probably insignificant compared to the "intrinsic" enmity in the scenario.
I've seen private sector actors get pretty incensed about industrial espionage... but I'm not sure it changed their actual level of competition very much. On the government side, there's a whole ritual of talking about being upset when you find a spy, but it seems like it's basically just that.
Here is an unpaywalled version of the first model.
Also, it seems like there's a bit of a contradiction between the idea that a clear leader may feel it has breathing room to work on safety, and the idea of restricting information about the state of play. If there were secrecy and no effective spying, then how would you know whether you were the leader? Without information about what the other side was actually up to, the conservative assumption would be that they were at least as far along as you were, so you should make the minimum supportable investment in safety, and at the same time consider dramatic "outside the game" actions.
In the first model, the effect of a close race increasing risk through corner cutting only happens when projects know how they are doing relative to their competitors. I think it is useful to distinguish two different kinds of secrecy. It is possible for the achievements of a project to be secret, or the techniques of a project to be secret, or both. In the Manhattan Project case, the existence of the Manhattan Project and the techniques for building nuclear bombs were both secret. But you can easily imagine an AI arms race where techniques are secret but the existence of competing projects or their general level of capabilities is not secret. In such a situation you can know about the size of leads without espionage. And adding espionage could decrease the size of leads and increase enmity, making a bad situation worse.
I think the "outside the game" criticism is interesting. I'm not sure whether it is correct or not, and I'm not sure if these models should be modified to account for it, but I will think about it.
I've seen private sector actors get pretty incensed about industrial espionage... but I'm not sure it changed their actual level of competition very much. On the government side, there's a whole ritual of talking about being upset when you find a spy, but it seems like it's basically just that.
I don't think it's fair to say that governments getting upset about spies is just talk. Or rather: governments assume they are being spied on most of the time, so when they find spies they have already priced in, they don't react much. But discovering a hitherto unsuspected spy in an especially sensitive role probably increases enmity a lot (though the amount will vary with the nature of the government doing the discovering, the strategic situation, and the details of the case).
But you can easily imagine an AI arms race where techniques are secret but the existence of competing projects or their general level of capabilities is not secret.
How do you arrange for honest and credible disclosure of those things?
ABSTRACT:
Using both primary and secondary sources, I discuss the role of espionage in early nuclear history. Nuclear weapons are analogous to AI in many ways, so this period may hold lessons for AI governance. Nuclear spies successfully transferred information about the plutonium implosion bomb design and the enrichment of fissile material. Spies were mostly ideologically motivated. Counterintelligence was hampered by its fragmentation across multiple agencies and its inability to be choosy about talent used on the most important military research program in the largest war in human history. Furthermore, the Manhattan Project’s leadership prioritized avoiding domestic political oversight over preventing espionage. Nuclear espionage most likely sped up Soviet nuclear weapons development, but the Soviet Union would have been capable of developing nuclear weapons within a few years without spying. The slight gain in speed due to spying may nevertheless have been strategically significant.
Based on my study of nuclear espionage, I offer some tentative lessons for AI governance:
Acknowledgements: I am grateful to Matthew Gentzel for supervising this project and Michael Aird, Christina Barta, Daniel Filan, Aaron Gertler, Sidney Hough, Nat Kozak, Jeffery Ohl, and Waqar Zaidi for providing comments. This research was supported by a fellowship from the Stanford Existential Risks Initiative.
This post is a short version of the report, x-posted from EA Forum. The full version with additional sections, an appendix, and a bibliography, is available here.
1. Introduction
The early history of nuclear weapons is in many ways similar to hypothesized future strategic situations involving advanced artificial intelligence (Zaidi and Dafoe 2021, 4). And, in addition to this objective similarity, the situations may be made more similar by deliberate imitation of the Manhattan Project experience (see this report to the US House Armed Services Committee). So it is worth looking to the history of nuclear espionage for inductive evidence and conceptual problems relevant to AI development.
The Americans produced a detailed official history and explanation of the Manhattan Project, entitled the Smyth Report, and released it on August 11, 1945, five days after they dropped the first nuclear bomb on Japan (Wellerstein 2021, 126). For the Soviets, the Smyth Report “candidly revealed the scale of the effort and the sheer quantity of resources, and also hinted at some of the paths that might work and, by omission, some that probably would not” (Gordin 2009, 103). While it would not have allowed for copying the Manhattan Project in every detail, the Soviets were able to use the Smyth Report as “a general guide to the methods of isotope separation, as a checklist of problems that needed to be solved to make separation work, and as a primer in nuclear engineering for the thousands upon thousands of engineers and workers who were drafted into the project” (Gordin 2009, 104).
There were several reasons that the Smyth Report was released. One was a belief that, in a democratic country, the public ought to know about such an important matter as nuclear weapons. Another reason was a feeling that the Soviets would likely be able to get most of the information in the Smyth Report fairly easily regardless of whether it was released. Finally, releasing a single report would clearly demarcate information that was disseminable from information that was controlled, thereby stemming the tide of disclosures coming from investigative journalists and the tens of thousands of former Manhattan Project employees (Wellerstein 2021, 124-125). Those leaks would not be subject to strategic omission, and might, according to General Leslie Groves (Director of the Manhattan Project), “start a scientific battle which would end up in congress” (Quoted in Wellerstein 2021, 125). The historian Michael Gordin summarized the general state of debate between proponents and opponents of nuclear secrecy in the U.S. federal government in the late 1940s as follows:
In Gordin's view, "in the context of the late 1940s, his [Groves's] critics were more right than wrong" (Gordin 2009, 93), though it is important to note that the Smyth Report's usefulness was complemented by the extent of KGB spying of which neither Groves nor his critics were yet aware. Stalin decided to imitate the American path to the nuclear bomb as closely as possible because he believed that it would be both the “fastest” and the “most reliable” (Quoted in Gordin 2009, 152-153). The Smyth Report (and other publicly available materials on nuclear weapons) contained strategic omissions. The Soviets used copious information gathered by spies to fill in some of the gaps.
2. Types of information stolen
2.1 Highly abstract engineering: bomb designs
Bomb designs were one of the most important categories of information transferred by espionage. To illustrate why design transfer was so important, it is necessary to review some basic principles of nuclear weaponry (most of what follows on nuclear weapons design is adapted from a 2017 talk by Matt Bunn).
Fission weapons work by concentrating a critical mass of fissile material. A critical mass is enough fissile material to sustain a nuclear chain reaction. A critical mass by itself, however, is not a viable nuclear weapon, because it will heat up dramatically, turn into gas, expand in volume, and cease to constitute a critical mass, thereby stopping the chain reaction before it has had a chance to consume most of the fuel. The simplest possible nuclear bomb, a gun-type design, works by launching a shell of highly enriched uranium-235 into another piece of highly enriched uranium-235. Neither piece of uranium-235 is critical by itself, but together they amount to a critical mass. A heavy casing around the fissile material, the tamper, briefly holds the critical mass together rather than letting it expand into a diffuse cloud of gas. The chain reaction releases a massive amount of heat, turning the fissile material to gas and raising its temperature to that of the core of the sun. In a gas, a rise in temperature causes a corresponding rise in pressure, so the result is an enormous increase in pressure and an extremely energetic explosion. The bomb dropped on Hiroshima, Little Boy, was a gun-type bomb.
The amount of fissile material required to achieve a critical mass decreases with the square of the density. Compressing one’s fissile material therefore yields more explosive power for the same amount of fuel. This is the key to the more advanced plutonium implosion design, which was used for the Fat Man bomb dropped on Nagasaki. A plutonium implosion bomb has a core of fissionable plutonium, surrounded by a tamper, which is in turn surrounded by a layer of chemical explosive. The explosive detonates, pushing the tamper in towards the core and compressing it, which begins a nuclear chain reaction. This design uses plutonium-239, which is easier to obtain than the uranium-235 used in a gun-type bomb.
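As a rough way of stating the scaling relation mentioned above (a standard approximation, not a figure taken from the sources cited in this report), the bare critical mass $m_c$ falls off with the square of the density $\rho$:

$$
m_c \;\propto\; \frac{1}{\rho^{2}} \qquad\Longrightarrow\qquad \rho \to 2\rho_0 \;\;\text{reduces the required mass to roughly}\;\; \tfrac{1}{4}\,m_c .
$$

So compression pays off quadratically, which is why implosion makes a plutonium core go critical with far less material than an uncompressed assembly would need.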
The first Soviet nuclear test was not of the relatively simple gun type. Instead it was a far more complex plutonium implosion assembly. The Soviets received the American plutonium implosion design twice, from two spies, and copied it for their first nuclear bomb (Holloway 1994, 366; Haynes, Klehr, and Vassiliev 2009, 117, 119).
Having two sources for the design gave the Soviets confidence that the design would work and was not FBI disinformation, no small thing given that the leaders of the Soviet nuclear weapons effort had reason to believe they would be executed if the first test failed (Gordin 2009, 171; Holloway 1994, 218). Furthermore, the Soviets were hard pressed to separate enough uranium-235 from the more plentiful uranium-238 to make a gun-type uranium bomb work (gun-type plutonium bombs are not viable). This was because the Western Allies had taken pains to corner the world supply of high quality uranium ore. The low quality ore that the Soviets had was adequate to the task of breeding plutonium, but it would have been more expensive and slower for the Soviets to separate enough uranium-235 to build a gun-type bomb (Gordin 2009, 149-151). Often, controlling material and controlling information are thought of as different strategies for preventing nuclear proliferation. But in the first years after the creation of the atomic bomb, the West’s failure to control information about nuclear weapons design undermined its strategy of controlling fissile material to prevent nuclear proliferation.
2.2 Less abstract engineering
Most of the effort expended during the Manhattan Project went into the enrichment of fissile material. Ted Hall provided information about methods of uranium isotope (“25” in KGB code) separation, as reported in a decrypted cable sent from New York Station to Moscow Center on May 26, 1945:
Fuchs similarly provided data about electromagnetic techniques of isotope separation for uranium-235 (“ENORMOZ” in KGB code ordinarily referred to the Manhattan Project as a whole, but in this case it meant uranium-235 specifically), which was reported in a decrypted cable from Moscow to New York of April 10, 1945.
In addition to technical reports on enriching fissile material from Fuchs and Hall, the Soviets had plant designs for the Oak Ridge facility from Russell McNutt, data on plutonium from an unidentified spy, and data on the nuclear reactor at Canada’s Chalk River facility from Alan Nunn May (see the appendix of the full report for a list of Manhattan Project spies). The Soviets were also occasionally able to acquire physical samples from spies. They received 162 micrograms of uranium-235 from Alan Nunn May, and David Greenglass “provided the Soviets with a physical sample of part of the triggering mechanism [of a plutonium bomb]” (Klehr and Haynes 2019, 12).
2.3 Types of information and the AI case
To the extent that the information separating the most advanced AI projects from their closest competitors is highly abstract and easy to convey, the potential significance of spying is very large. Simple, abstract ideas (analogous to basic principles of bomb design in the nuclear case) are the easiest to transfer. The question of how important theoretical breakthroughs will be to the future development of AI is closely related to the debate over the scaling hypothesis. The scaling hypothesis holds that current techniques are sufficient to eventually produce transformative artificial intelligence (TAI) if the neural networks are just made large enough (Branwen 2020; for an explanation of the idea of TAI see Karnofsky 2016). The reason that TAI does not yet exist, per the scaling hypothesis, is that the hardware and the will to invest in scaling do not yet exist (Branwen 2020). To the extent that this is true, it seems that stealing highly abstract ideas about AI algorithms is unlikely to make much of an impact, and that there is unlikely to be an algorithmic analog of the plutonium implosion bomb design. On the other hand, abstract ideas about data types, data processing, or assembling the requisite computing power might be transferred by spies to great effect.
Spies transferred about 10,000 pages of technical material on nuclear weapons from the Manhattan Project to the Soviet Union (Haynes, Klehr, and Vassiliev 2009, 60). At that level of volume, one can convey information about engineering that is concrete and detailed rather than abstract and simple, such as the design of machinery and facilities used for the separation of uranium isotopes. Even devoted proponents of the scaling hypothesis acknowledge that when replicating an effort based on scaling up existing techniques, one should “never underestimate the amount of tweaking and special sauce it takes” (Branwen 2020).
But just how significant is engineering knowledge of an intermediate level of abstraction likely to be as a bottleneck on AI capabilities? Unlike the Manhattan Project, advanced AI does not obviously require a massive industrial effort to purify rare materials. However, if significant AI research begins to be conducted by governments and international trade in computer chips becomes more restricted, the ability to solve engineering problems in the physical world might again come to differentiate the leading nation from its closest competitors. In such a regime, spying on the details of electrical engineering, materials science, and industrial processes might prove important (see Khan and Mann 2020).
The Anglo-American effort to prevent nuclear proliferation by cornering the world supply of uranium (discussed in section 2.1 above) might have been far more effective but for the Soviets’ use of espionage data on the plutonium route to the bomb. Similarly, strategies to restrict AI proliferation that rely on restricting information, and strategies that rely on restricting access to physical materials (in the AI case rare earth metals, chips, and semiconductor manufacturing equipment rather than high quality uranium ore), might be mutually reinforcing.
Tacit knowledge seems to play an important role in AI research. Knowing what sorts of training data to select for a model might involve tacit knowledge. More significantly, knowing which research directions are likely to be promising is a key element of AI research (or any other kind of research), and such knowledge includes an important tacit dimension. In a discussion of what one learns in a computer science PhD program, Andrej Karpathy explained the importance of the tacit knowledge embedded in “taste” to computer science research:
Research taste cannot easily be transferred by espionage. It might be possible to formalize certain aspects of research taste, or to accelerate the process of learning it implicitly by mimicking the experience of training under a senior researcher. How much better is the taste of the best researchers on the most advanced AI project likely to be than the taste of the second-best researchers on the second-best AI project? Rohin Shah reports that advanced computer science PhD students at UC Berkeley have much better research taste than beginning PhD students, and that professors have better taste than advanced PhD students (Shah 2020). Is there a similar asymmetry in taste between the very best researchers in the world and their close competitors? This seems like a promising question for further study but, provisionally: Michael Polanyi, the philosopher whose work brought about the current focus on tacit knowledge in the history of science and technology, believed that the greatness of a discovery was proportional to the amount of tacit knowledge required to select the problem that led to it (Polanyi [1966] 2009, 23). The more that taste and other forms of tacit knowledge distinguish the leading AI projects from less advanced ones, the more difficult it will be for spies to significantly help the laggards catch up. Spies could work to transfer personnel from the leader to the laggards as a way of transferring tacit knowledge, but this would duplicate the issues with trust that limited the usefulness of exfiltrated Soviet spies such as George Koval and Oscar Seborer. Alternatively, spies might try some scheme of rendering tacit knowledge explicit.
3. Motivations for espionage
3.1 Klaus Fuchs: ideology and conscience
Klaus Fuchs was (along with Ted Hall) one of the two most important spies in the Manhattan Project. He was a theoretical physicist. Fuchs took refuge in England after the Nazis came to power in Germany because his history as a Communist Party activist made him a target of the Gestapo. While in England, Fuchs began to work on nuclear weapons research and informed a German Communist Party leader that he had information that might be of interest to Soviet intelligence. Fuchs was sent to America to work as a nuclear physicist on the Manhattan Project, and continued to spy for the U.S.S.R. (Haynes, Klehr, and Vassiliev 2009, 92-93).
Fuchs’s sister Kristel Heineman helped him on several occasions to make contact with his KGB courier in America, Harry Gold (Haynes, Klehr, and Vassiliev 2009, 95). Fuchs’s initial involvement in spying was clearly ideologically motivated. He later accepted money from the KGB. Fuchs claimed to his KGB courier that he did so to prove his loyalty to the Soviet Union, because he had been told that offering payment was a KGB strategy used to “morally bind” other spies to keep helping the KGB (Haynes, Klehr, and Vassiliev 2009, 128).
In 1949, British and American intelligence discovered Fuchs by decrypting KGB cables as part of the Venona counterintelligence project and correlating the covernames "Charles" and "Rest" with known facts about Fuchs’s background and whereabouts (Greenspan 2020, 193-228). By that time, Fuchs was back in England and working for the British nuclear weapons lab at Harwell. MI5 investigator James Skardon approached Fuchs and said that MI5 was certain Fuchs had been spying, but did not disclose how it knew: “Skardon… suggested that FUCHS had been passing information to the Russians.... Skardon then took him very carefully over the ground during the period when he [Fuchs] was in America... and said that if it was not FUCHS it ‘could only be his twin brother’” (Greenspan 2020, 239-240). Skardon repeatedly led Fuchs to believe he could keep his job at Harwell if he confessed (Greenspan 2020, 239, 259-260). At first Fuchs denied it, but after several interviews, he confessed to spying (Greenspan 2020, 257-258).
Later, Fuchs gave a written confession. The ideological motivations given in that confession were as follows: Fuchs’s father always emphasized to him the importance of following his conscience. In university, Fuchs started out as a social democrat, but switched to the Communist Party after what he saw as the social democrats’ failure to effectively oppose the rise of Hitler (Fuchs [1950] 1989, 182-183). While working as a Communist Party activist, he began to feel that he should subordinate his personal conscience and ideas about decency to party discipline (Fuchs [1950] 1989, 183). In his confession, he reported a kind of inward compartmentalization, allowing one part of himself to be at ease with his fellow scientists and another part to spy on them.
In Fuchs’s confession, he claimed to have come to reject his former beliefs that (1) standards of personal decency had to be suspended for political reasons, (2) one should subordinate one's thoughts to the Party, and (3) the Marxist theory of freedom through the mastery of the blind forces that control society could be put into practice in an individual's life by skillful manipulation of his own environment, including that part of his environment composed of the people around him (Fuchs [1950] 1989, 185-186). Fuchs claimed his newly re-awakened conscience required him to stop working with the KGB early in 1949 and to turn himself in in 1950 in order to spare his friends at Harwell from the suspicion that would be cast on them by ambiguity about who the spy in the British nuclear weapons program was (Fuchs [1950] 1989, 185-186). His confession shows that he continued to believe he would be allowed to remain at Harwell (Fuchs [1950] 1989, 185).
The primary source evidence is potentially consistent with ideological disillusionment serving as one factor motivating Fuchs’s decision to stop meeting with his KGB courier in early 1949 (although this might also have been due to Fuchs somehow discovering that he was being investigated, see Greenspan 2020, 271-284). Remarkably, Fuchs told a similar story of ideological development (but with a different valence) when he met with KGB agents in a Moscow restaurant after his release from British prison and relocation to East Germany. Fuchs told the agents that he had been unduly influenced by bourgeois ideology, but that he had since corrected himself (Haynes, Klehr, and Vassiliev 2009, 134-135).
3.2 Ted Hall: ideology and great power balancing
Ted Hall was the youngest physicist working on the Manhattan Project. He graduated from Harvard at 18. Hall was a communist, and had been active as a labor organizer while in college (Haynes, Klehr, and Vassiliev 2009, 110-112). In 1944, at age 19, he approached a representative of the Soviet Union in New York and offered to serve as a spy. His explanation of his motivations for giving the U.S.S.R. information about American nuclear weapons research is recorded in former KGB agent Alexander Vassiliev’s notes on the KGB’s archives, which have been translated into English and are hosted on the Wilson Center’s website:
Although Hall would later claim that he had originally set out only to inform the Soviet Union of the fact that the United States was developing nuclear weapons (Hall [1995] 1997, 288), that claim would seem to be belied by his statement that the "U.S.S.R. ought to... stay abreast of the progress of experiments and construction." Decrypted Venona cables revealed Hall’s status as a Soviet spy to American intelligence services after the war. However, Hall, unlike Fuchs, did not confess when questioned. Unwilling to reveal its access to secret Soviet communications, and unable to admit secret evidence in court, the U.S. government let Hall go (Haynes, Klehr, and Vassiliev 2009, 123-124). After his spying was revealed by the declassification of the Venona cables in 1995, Hall admitted to having been a Soviet spy:
Hall’s two justifications, more than fifty years apart, both focused on the international balance of power.
3.3 Reflections on nuclear spy motivations
Ideology was by far the biggest motivation for Manhattan Project spies. Financial motivations were less important than ideological motivations, probably because penalties for spying could include decades in prison or death. When the stakes are very high, spying requires a certain kind of altruism, as narrowly self-interested motivations are unlikely to be able to overcome fear of the penalties if one is caught. It is also striking how many spies (Klaus Fuchs, David Greenglass, Oscar Seborer) were helped by members of their families in their espionage. Family loyalties might have served to prevent spies from desisting from spying (although Greenglass overcame this obstacle when he testified against his sister and brother-in-law, sending them to the electric chair). Another factor, in addition to family loyalties, that served to make it easier to start spying for the Soviet Union than to stop was the KGB practice of paying spies even if they were originally ideologically motivated. Receiving payment from the KGB removed any possible ambiguity about what the spies were doing and increased expected penalties, reducing the odds that spies would confess.
3.4 Possible AI spy motivations
The Soviet Union was in an unusual position in the 1930s and 1940s. Its governing ideology commanded a significant following among educated people all over the world. This made it much easier to recruit spies. Unlike socialist internationalist loyalty to the Soviet Union, nationalism continues to be widespread and might motivate AI spying. This is true even of spying in the private sector, as spies might believe that by helping firms based in their homelands they are doing their patriotic duty. The most significant nuclear spy outside of the Manhattan Project, A. Q. Khan, was motivated by Pakistani nationalism. While security clearance investigations try to detect foreign loyalties, nothing like the security clearance system exists in the private sector. Furthermore, nation-states might force their otherwise unwilling nationals or firms to help with AI espionage. However, this issue must be treated with extreme care. There is an obvious risk of xenophobic or racist bias. Furthermore, there is a risk that attempting to prevent espionage by restricting the access to sensitive information of those with potential conflicts of national loyalties will backfire pragmatically as well as morally. During the Cold War, the United States deported a Chinese-born aerospace engineer, Qian Xuesen, based on unproven allegations that he was a spy. Qian went on to build missile systems for the People’s Republic of China. In addition to ideas that are widely popular (such as nationalism), ideas that are common among software engineers and computer scientists but rarer in the general population might prove significant as motivations for AI espionage. Belief in the open source or free software ideal, which opposes secrecy in software development, is one obvious example.
Despite the potential motivating force of American nationalism as an ideology for spies, it seems doubtful that the U.S. government or U.S. firms will be net beneficiaries of AI espionage if competition is most intense between countries (if an AI arms race is undertaken largely between U.S. firms, then some U.S. firms may well be net beneficiaries). Spying can help lagging participants in a race to develop new technologies catch up, but it is hard to see how it can help the leader improve its lead (unless the overall leader is behind in certain specific areas). The United States appears to be ahead of the rest of the world in AI, with China being its only plausible close competitor. One recent analysis broke down AI capabilities into four drivers: hardware; research and algorithms; data; and size of commercial AI sector. The United States led China by a wide margin in every category except for data (Ding 2018, 29).
The most important AI research today is conducted in the private sector. Unless that changes, the most important spying will have to be done on private firms. This changes the balance of motivations that might prove significant. Most obviously, given that most people approach their work with the goal of making money, it suggests that financial gain might be more significant as a motive for AI espionage than it was as a motive for nuclear espionage. Financially motivated public sector spies tend to be of lower quality than ideological spies because, given the legal penalties for spying, only irrational people or people in truly desperate need of money would agree to take on the requisite level of risk. But in the private sector, expected penalties are lower.
4. Manhattan Project counterintelligence
The historian Alex Wellerstein argues that counterintelligence efforts at the Manhattan Project had three main goals: (1) preventing Axis powers from spying, (2) preventing wartime allies (such as the Soviet Union) from spying, and (3) preventing scientists from getting a holistic understanding of the Manhattan Project and (more importantly) preventing politicians and the broader American public from discovering the Manhattan Project's existence. Broadly, goals 1 and 3 were achieved but goal 2 was not (Wellerstein 2021, 91-92). It may be that bureaucratic incentives to focus on secrecy from domestic political actors drew energy away from preventing Soviet espionage. General Leslie Groves was particularly concerned about Congress getting wind of the massive budget of the Manhattan Project and cutting it off, or subjecting Manhattan Project leaders to onerous postwar investigations (Wellerstein 2021, 81). During congressional hearings on atomic spying after the war, Groves “argued… that the Manhattan Project security apparatus had been primarily focused on preventing leaks and indiscretions, not rooting out disloyalty” (Wellerstein 2021, 224-225).
There were other reasons, besides Groves’s relative lack of interest in preventing Soviet spying, for the success of the Manhattan Project spies. Responsibility for detecting espionage was divided between two mutually hostile agencies, the FBI and army intelligence. And, most fundamentally, a significant portion of the world’s top scientific talent was sympathetic to the Soviet Union, which introduced a capability-alignment tradeoff (Walsh 2009).
5. The significance of nuclear espionage
The Soviet Union detonated its first nuclear bomb on August 29, 1949, four years after the first successful American nuclear test. In Stalin and the Bomb, David Holloway evaluated the impact of nuclear espionage on Soviet nuclear weapons development as follows:
The empirical outline of Holloway’s account does not appear to be open to serious doubt. The Soviets made significant use of espionage data and, on the other hand, Soviet scientists were world-class and could have developed the bomb within a few years of 1949 without espionage.
Michael Gordin makes an interesting argument in Red Cloud at Dawn. The Soviets laboriously checked, re-checked, and adapted spy data. Given the effort that the Soviets had to go through to assure themselves of the veracity of the information that they got from spies, Gordin suggests that it is an open question whether the Soviets really saved any time by using spy data (Gordin 2009, 153-154). Gordin concedes however that, even if the Soviets saved no time, they “surely saved much uncertainty” (Gordin 2009, 153).
Reducing uncertainty can change one’s strategy. If a country increases its confidence that it will soon have a powerful weapon hitherto monopolized by an enemy, it may become rational to behave more aggressively towards that enemy.
Ignoring the prospective effects of knowing (rather than merely guessing) that one will soon have a powerful weapon, saving uncertainty meant removing the chance that the Soviets were unlucky and would have had to wait longer to get nuclear weapons. Stalin himself did not believe that nuclear weapons were very strategically significant in and of themselves (Gordin 2009, 62). He did, however, understand the enormous importance that the Americans assigned to nuclear weapons. Thus, he refused Kim Il Sung’s request to support a North Korean invasion of South Korea in 1948 because he feared an American intervention on the South Korean side. In 1950, however, Stalin was willing to support Kim’s invasion, in part because he believed that the Soviet Union’s nuclear weapons would deter American intervention (Haynes, Klehr, and Vassiliev 2009, 62). Therefore, it seems that even if one takes maximally unfavorable assumptions and assumes that espionage saved the Soviet Union no time and only uncertainty, without espionage there would have been a substantially greater chance that the Korean War would have been delayed or, because of the other changes made possible by delay, avoided.
Furthermore, I do not think maximally unfavorable assumptions about the efficacy of nuclear espionage are justified. Absent further argument, it seems to me that we should default to the view that it is easier to check data and designs that one has in hand than it is to derive entirely new data and designs. Holloway’s estimate that intelligence saved the Soviets two to three years seems to be a subjective guess rather than the output of a quantitative model of bomb timelines. However, given that Holloway undertook the most thorough study of the Soviet nuclear weapons program (at least in English), he should be afforded some (small) amount of epistemic deference. Given the basic facts of the case, the Soviets saving something in the neighborhood of two to three years is not hard to believe. Because of the importance of the Korean War, that ought to qualify as a significant impact on world history.
In addition to the impact of espionage on the development of nuclear weapons, nuclear espionage may also have raised the temperature of the Cold War. Even if we grant, as we should, that the Cold War would have occurred anyway, the discovery of Alan Nunn May’s nuclear spying in 1946 may have reduced the odds that control of nuclear weapons would be ceded to multilateral international institutions (Zaidi and Dafoe 2021, 23, 42, 42n179). The distrust engendered by nuclear espionage highlights the potential of spying to increase enmity between the leader and the laggards in a technological race, and to reduce the odds of cooperation aimed at mitigating the risks of such a race. This effect emerges from the inherent dynamics of espionage and is likely to apply to AI races as well as nuclear races.
6. Secrecy
Among people concerned about existential risk, there sometimes seems to be a presumption in favor of secrecy. One plausible origin for this presumption is the 2016 article “The Unilateralist’s Curse and the Case for a Principle of Conformity” by Nick Bostrom, Thomas Douglas, and Anders Sandberg. Bostrom et al. argue that even a well-intentioned group of independent actors is likely to err in the direction of taking a risky action, because when any single actor can act unilaterally, whether the action is taken depends not on the group's average judgment but on the judgment of its most optimistic member. Bostrom et al.’s proposed solution to the unilateralist's curse is a principle of conformity in situations where unilateralism is possible. When the action in question is publishing or not publishing some information, the principle of conformity is equivalent to a presumption in favor of secrecy.
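To make the dynamic concrete, here is a minimal Monte Carlo sketch of the unilateralist's curse (my own illustration, not code from Bostrom et al.; the parameter values are assumed for the example). Each of several well-intentioned actors forms a noisy estimate of the value of releasing some genuinely harmful information and releases it if its own estimate comes out positive; the chance that at least one actor errs grows quickly with the number of actors.

```python
import random

def p_release(n_actors, true_value=-1.0, noise_sd=1.0, trials=100_000):
    """Estimate the probability that at least one of n independent actors
    releases information whose true value is negative, when each actor
    acts on its own noisy estimate of that value (illustrative toy only)."""
    releases = 0
    for _ in range(trials):
        # The information gets out if any single actor's estimate is positive.
        if any(random.gauss(true_value, noise_sd) > 0 for _ in range(n_actors)):
            releases += 1
    return releases / trials

for n in (1, 3, 10):
    print(f"{n:>2} actors -> release probability ≈ {p_release(n):.2f}")
```

With these assumed numbers, a single actor misjudges the release only rarely, but a group of ten independent actors releases the harmful information most of the time, even though every actor is trying to do the right thing.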
Note, though, that in “The Unilateralist’s Curse” Bostrom et al. do not argue for conformity all things considered. Rather, they argue that the unilateralist’s curse provides a defeasible reason for conformity. Their paper does not attempt to establish whether, in any given situation, our prior inclinations to conform or not to conform are correct. If one is concerned about the dissemination of information hazards, one should bear in mind that omissions might reveal as much as commissions in certain circumstances, and weigh carefully which strategy of releasing or withholding information is least hazardous (Bostrom 2019; Bostrom 2011).
One should also be concerned by the tendency of secrecy regimes to perpetuate themselves. Alex Wellerstein explains this point of view:
Many of the concerns Wellerstein raises seem rather remote from existential risk. This might lead researchers concerned with existential risk to assume that they have nothing to learn from the anti-secrecy perspective. I think that would be a mistake, because Wellerstein’s observation that regimes of secrecy tend to be self-perpetuating is highly relevant to existential risk. Secrecy serves to worsen our understanding of (and, therefore, our ability to control) emerging technologies. Secrecy may have had this effect in the early Cold War United States, where a large thermonuclear arsenal was accumulated alongside a failure to seriously study the catastrophic risks that thermonuclear war posed (Gentzel 2018). If secrecy is hard to uproot, it might further raise existential risk by preventing concerns about safety from spreading to all relevant actors.
In “What Failure Looks Like,” the AI researcher Paul Christiano explains some reasons why AI may pose an existential risk. Those reasons all involve imperfectly understood AI systems whose goals diverge from those of human beings and which are able to gain power and influence in part because of their creators' imperfect understanding of the systems' true goals. Christiano anticipates that this problem will arise due to competitive incentives to deploy powerful AI systems as soon as possible combined with the inherent opacity of contemporary machine learning techniques (Christiano 2019). But secrecy about advanced AI might compound the problem of recognizing misaligned AI systems. And if approaches to AI safety that rely on rendering AI systems interpretable prove essential to preventing misalignment, secrecy is likely to be a major barrier. Whether such considerations are important enough to establish a presumption against secrecy is beyond the scope of this post. But the empirical tendency of secrecy regimes to expand their remit and endure indefinitely should be taken seriously.
7. Conclusion: espionage and existential risk
Espionage is most likely to be significant if discontinuous progress in AI can be achieved on the basis of key abstract insights. To the extent that the scaling hypothesis is true, espionage is likely to be less important. But even if the scaling hypothesis is true, espionage may be significant if it transfers engineering knowledge or tacit knowledge (which can be transferred either by exfiltrating agents or rendering what was tacit explicit). Espionage during the Manhattan Project may have accelerated Soviet nuclear weapons development by two to three years, which does not sound like much, but may have altered the course of the early Cold War. This was achieved by the less than 0.1% of Manhattan Project employees who were Soviet spies (part of the effectiveness of this small group may have been due to the disproportionate representation of high-ranking employees among spies). If a technology is truly transformative, even a small gain in speed is strategically significant.
On balance, AI espionage is likely to increase existential risk. In “Racing to the Precipice” Stuart Armstrong, Nick Bostrom, and Carl Shulman create a game theoretic model of AI arms races’ effects on safety. Armstrong et al. find that risks are greatest when enmity between competitors is high, knowledge of other projects is available, and (conditional on knowledge of other projects being available) the leader has only a small lead. One should expect espionage to increase enmity between competitors, increase knowledge of competitors’ projects, and reduce the distance between the leader and the laggards. Thus, to the extent that Armstrong et al.’s model reflects the real strategic situation, the expected impact of espionage is to increase existential risk. Eoghan Stafford, Robert Trager, and Allan Dafoe’s forthcoming “International Strategic Dynamics of Risky Technology Races” builds a more complex model. Like Armstrong et al., Stafford et al. find that enmity increases risk in all situations. However, whereas Armstrong et al. find that a close race is more dangerous, Stafford et al. find that under certain circumstances, close races are less dangerous than very uneven races. If, in Stafford et al.’s model, enmity between leader and laggard is high and the laggard is far behind, compromising on safety might seem to be the only way that the laggard can have a chance of winning. But in a more even race, the laggard might be less willing to compromise on safety because they would have a chance of winning without taking extreme risks. Thus, granting for the sake of the argument that the assumptions of Stafford et al.’s model hold, espionage’s tendency to narrow gaps might, under some circumstances, reduce existential risk. However, this consideration would seem to me to be outweighed by espionage’s tendency to increase enmity.
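The qualitative contrast between the two results can be illustrated with a toy calculation. The sketch below is my own construction, not the model from either paper: the win probability, disaster probability, and payoffs are simple assumed functional forms chosen only to show how a laggard's optimal corner-cutting can rise with both its enmity toward the leader and the size of its deficit.

```python
import numpy as np

def optimal_corner_cutting(gap, enmity, disaster_scale=0.3):
    """Toy model (assumed functional forms, not from Armstrong et al. or
    Stafford et al.): return the level of safety corner-cutting, between
    0 (full caution) and 1 (maximal risk-taking), that maximizes the
    laggard's expected utility."""
    cuts = np.linspace(0.0, 1.0, 101)
    # Assumed: being `gap` behind lowers the chance of winning; cutting corners raises it.
    p_win = np.clip(0.5 - gap + 0.5 * cuts, 0.0, 1.0)
    # Assumed: the chance of a disaster grows linearly with corner-cutting.
    p_disaster = disaster_scale * cuts
    # Assumed payoffs: winning = +1, losing to the rival = -enmity, disaster = -1.
    expected_utility = (1 - p_disaster) * (p_win * 1.0 + (1 - p_win) * (-enmity)) \
        + p_disaster * (-1.0)
    return cuts[np.argmax(expected_utility)]

for gap in (0.1, 0.4):
    for enmity in (0.2, 0.9):
        cut = optimal_corner_cutting(gap, enmity)
        print(f"gap={gap:.1f}, enmity={enmity:.1f} -> corner-cutting ≈ {cut:.2f}")
```

Under these assumed forms, the laggard cuts more corners when it is far behind and when losing to the rival is nearly as bad as a disaster, which matches the direction of the Stafford et al. result as described above; a variant in which corner-cutting cannot close a large gap at all would instead concentrate the danger in close races, in the direction of Armstrong et al. None of the specific numbers should be taken seriously.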
It therefore may be valuable for people concerned about existential risk to contribute to preventing AI espionage even if they have no preference between the project being spied on and the project doing the spying. On the other hand, secrecy (the most obvious countermeasure to espionage) may increase existential risk by worsening issues with interpretability. And subjecting AI researchers to background checks may asymmetrically weaken conscientious projects, since competitors who are not worried about existential risk or espionage will gain the talent that the conscientious projects reject. All of these considerations should be carefully weighed by AI policy practitioners before deciding to prioritize or deprioritize preventing espionage.