This is part of a weekly reading group on Nick Bostrom's book, Superintelligence. For more information about the group, and an index of posts so far, see the announcement post. For the schedule of future topics, see MIRI's reading guide.
Welcome. This week we discuss the twenty-eighth section in the reading guide: Collaboration.
This post summarizes the section, offers a few relevant notes, and suggests ideas for further investigation. Some of my own thoughts and questions for discussion are in the comments.
There is no need to proceed in order through this post, or to look at everything. Feel free to jump straight to the discussion. Where applicable and I remember, page numbers indicate the rough part of the chapter that is most related (not necessarily that the chapter is being cited for the specific claim).
Reading: “Collaboration” from Chapter 14
Summary
- The degree of collaboration among those building AI might affect the outcome a lot. (p246)
- If multiple projects are close to developing AI, and the first will reap substantial benefits, there might be a 'race dynamic' where safety is sacrificed on all sides for a greater chance of winning. (p247-8)
- Averting such a race dynamic with collaboration should have these benefits:
  - More safety
  - Slower AI progress (allowing more considered responses)
  - Less other damage from conflict over the race
  - More sharing of ideas for safety
  - More equitable outcomes (for a variety of reasons)
- Equitable outcomes are good for various moral and prudential reasons. They may also be easier to compromise over than expected, because humans have diminishing returns to resources. However, in the future returns may diminish less (e.g. if resources can buy more time instead of entertainments one has no time for).
- Collaboration before a transition to an AI economy might affect how much collaboration there is afterwards. This might not be straightforward. For instance, if a singleton is the default outcome, then low collaboration before a transition might lead to a singleton (i.e. high collaboration) afterwards, and vice versa. (p252)
- An international collaborative AI project might deserve nearly infeasible levels of security, such as being almost completely isolated from the world. (p253)
- It is good to start collaboration early, to benefit from being ignorant about who will benefit more from it, but hard because the project is not yet recognized as important. Perhaps the appropriate collaboration at this point is to propound something like 'the common good principle'. (p253)
- 'The common good principle': Superintelligence should be developed only for the benefit of all of humanity and in the service of widely shared ethical ideals. (p254)
Another view
Miles Brundage on the Collaboration section:
This is an important topic, and Bostrom says many things I agree with. A few places where I think the issues are less clear:
- Many of Bostrom’s proposals depend on AI recalcitrance being low. For instance, a highly secretive international effort makes less sense if building AI is a long and incremental slog. Recalcitrance may well be low, but this isn’t obvious, and it is good to recognize this dependency and consider what proposals would be appropriate for other recalcitrance levels.
- Arms races are ubiquitous in our global capitalist economy, and AI is already in one. Arms races can stem from market competition by firms or state-driven national security-oriented R&D efforts, as well as complex combinations of these, suggesting the need for further research on the relationship between AI development, national security, and global capitalist market dynamics. It's unclear how well the simple arms race model here matches the reality of the current AI arms race or future variations of it. The model's main value is probably in probing assumptions and inspiring the development of richer models, as it's probably too simple to fit reality well as-is. For instance, it is unclear that safety and capability are close to orthogonal in practice today. If many AI people genuinely care about safety (which the quantity and quality of signatories to the FLI open letter suggests is plausible), or work on economically relevant near-term safety issues at each point is important, or consumers reward ethical companies with their purchases, then better AI firms might invest a lot in safety for self-interested as well as altruistic reasons. Also, if the AI field shifts to focus more on human-complementary intelligence that requires and benefits from long-term, high-frequency interaction with humans, then safety and capability may be synergistic rather than trading off against each other. Incentives related to research priorities should also be considered in a strategic analysis of AI governance (e.g. are AI researchers currently incentivized only to demonstrate capability advances in the papers they write, and could incentives be changed or the aims and scope of the field redefined so that more progress is made on safety issues?).
- ‘AI’ is too coarse-grained a unit for a strategic analysis of collaboration. The nature and urgency of collaboration depends on the details of what is being developed. An enormous variety of artificial intelligence research is possible, and the goals of the field are underconstrained by nature (e.g. we can model systems based on approximations of rationality, or on humans, or animals, or something else entirely, based on curiosity, social impact, and other considerations that could be more explicitly evaluated), and are thus open to change in the future. We need to think more about differential technology development within the domain of AI. This too will affect the urgency and nature of cooperation.
Notes
1. In Bostrom's description of his model, it is a bit unclear how safety precautions affect performance. He says 'one can model each team's performance as a function of its capability (measuring its raw ability and luck) and a penalty term corresponding to the cost of its safety precautions' (p247), which sounds as though precautions are purely a cost. However, this wouldn't make sense: if safety precautions were just a cost, then regardless of competition, nobody would invest in safety. In reality, whoever wins control over the world benefits a lot from whatever safety precautions have been taken. If the world is destroyed in the process of an AI transition, they have lost everything! I think this is the model Bostrom means to refer to. While he says it may lead to minimum precautions, note that in many models it would merely lead to less safety than one would want. If you are spending nothing on safety, and are thus going to take over a world that is worth nothing, you would often prefer to move to a lower probability of winning a more valuable world. Armstrong, Bostrom and Shulman discuss this kind of model in more depth. (A rough numerical sketch of a model along these lines appears after these notes.)
2. If you are interested in the game theory of conflicts like this, The Strategy of Conflict is a great book.
3. Given the gains to competitors from cooperating so as not to destroy the world they are trying to take over, research on how to arrange cooperation seems helpful for all sides. The situation is much like a tragedy of the commons, except for the winner-takes-all aspect: each person gains from neglecting safety, while exerting a small cost on everyone. Academia seems to be pretty interested in resolving tragedies of the commons, so perhaps that literature is worth trying to apply here. (A toy illustration of this incentive structure also appears after these notes.)
4. The most famous arms race is arguably the nuclear one. I wonder to what extent this was a major arms race because nuclear weapons were destined to be an unusually massive jump in progress. If this was important, it leads to the question of whether we have reason to expect anything similar in AI.
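To make the incentive structure in note 1 concrete, here is a minimal Monte Carlo sketch in Python. It is not Bostrom's exact model: the penalty size and the assumption that the winner's prize is worth their own safety level are illustrative choices.

```python
# A minimal sketch (not Bostrom's exact formulation) of the winner-takes-all
# race in note 1. Two teams pick safety levels in [0, 1]; performance is a
# random capability draw minus a penalty for safety work; the higher performer
# wins, and the prize is assumed to be worth the winner's own safety level
# (an unsafe win is worth nothing).
import random

def expected_payoff(my_safety, rival_safety, penalty=1.0, trials=200_000):
    """Estimate our expected payoff against a rival with a fixed safety level."""
    total = 0.0
    for _ in range(trials):
        my_perf = random.random() - penalty * my_safety
        rival_perf = random.random() - penalty * rival_safety
        if my_perf > rival_perf:    # we win the race...
            total += my_safety      # ...and the prize is worth our safety level
    return total / trials

for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"our safety {s:.2f} vs rival at 0.50 -> expected payoff {expected_payoff(s, 0.5):.3f}")
# Spending nothing on safety wins most often but the prize is worthless, so some
# safety is individually rational even in a race; with these numbers the best
# response is still well below the full safety a lone developer would choose.
```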
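And as a toy illustration of the commons-like structure mentioned in note 3 (with made-up numbers, and ignoring the winner-takes-all aspect): each team privately gains by cutting safety whatever the others do, yet everyone is worse off when all of them cut.

```python
# A toy payoff calculation for the commons-like structure in note 3
# (illustrative numbers only). Each of N teams can skip safety work, gaining a
# private competitive edge while slightly raising a shared risk of catastrophe
# whose cost falls on every team.
N = 10
private_gain = 1.0        # edge a team gets from skipping safety work
added_risk = 0.03         # extra probability of catastrophe per corner-cutting team
catastrophe_cost = 20.0   # loss to each team if catastrophe occurs

def team_payoff(skips_safety, others_skipping):
    cutters = others_skipping + (1 if skips_safety else 0)
    expected_loss = cutters * added_risk * catastrophe_cost
    return (private_gain if skips_safety else 0.0) - expected_loss

# Whatever the others do, skipping safety looks better to each individual team...
assert all(team_payoff(True, k) > team_payoff(False, k) for k in range(N))
# ...but universal corner-cutting leaves everyone worse off than universal caution.
print("everyone skips safety:", team_payoff(True, N - 1))   # 1.0 - 10*0.03*20 = -5.0
print("no one skips safety:  ", team_payoff(False, 0))      # 0.0
```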
In-depth investigations
If you are particularly interested in these topics, and want to do further research, these are a few plausible directions, some inspired by Luke Muehlhauser's list, which contains many suggestions related to parts of Superintelligence. These projects could be attempted at various levels of depth.
- Explore other models of competitive AI development.
- What policy interventions help in promoting collaboration?
- What kinds of situations produce arms races?
- Examine international collaboration on major innovative technology. How often does it happen? What blocks it from happening more? What are the necessary conditions? Examples: Concorde jet, LHC, international space station, etc.
- Conduct a broad survey of past and current civilizational competence. In what ways, and under what conditions, do human civilizations show competence vs. incompetence? Which kinds of problems do they handle well or poorly? Similar in scope and ambition to, say, Perrow’s Normal Accidents and Sagan’s The Limits of Safety. The aim is to get some insight into the likelihood of our civilization handling various aspects of the superintelligence challenge well or poorly. Some initial steps were taken here and here.
- What happens when governments ban or restrict certain kinds of technological development? What happens when a certain kind of technological development is banned or restricted in one country but not in other countries where technological development sees heavy investment?
- What kinds of innovative technology projects do governments monitor, shut down, or nationalize? How likely are major governments to monitor, shut down, or nationalize serious AGI projects?
- How likely is it that AGI will be a surprise to most policy-makers and industry leaders? How much advance warning are they likely to have? Some notes on this here.
How to proceed
This has been a collection of notes on the chapter. The most important part of the reading group though is discussion, which is in the comments section. I pose some questions for you there, and I invite you to add your own. Please remember that this group contains a variety of levels of expertise: if a line of discussion seems too basic or too incomprehensible, look around for one that suits you better!
Next week, we will talk about what to do in this 'crunch time'. To prepare, read Chapter 15. The discussion will go live at 6pm Pacific time next Monday 30 March. Sign up to be notified here.
The idea of an international collaboration reminds me of this article I read a while ago about the difficulties coordinating international efforts to create nuclear fusion: http://www.newyorker.com/magazine/2014/03/03/a-star-in-a-bottle

As a software developer, I tend to think that the best software is produced by small teams of elite software developers who know each other well, work together well, have been working together for a long time, work out of a single office, and are all native or extremely fluent speakers of the same language (English being the best language by a wide margin, because almost all programming languages are based on it and the majority of tool documentation is written in it, especially for the most cutting edge development tools and libraries). This is the rough model that you see used in Silicon Valley, and it seems to have won out over other models like outsourcing half your team to a foreign country where developers are not extremely fluent in English and hiring managers aren’t ruthlessly obsessed with finding the most brilliant and qualified people possible. (There are a few differences, such as the fact that Silicon Valley workers change jobs rather often and that Silicon Valley companies are now being forced to hire people who aren’t quite as brilliant or fluent as they would like. But I think I’ve described the type of team that many or most of the best CTOs in the valley would like to have.)
An international collaboration pattern-matches to one of those horror stories you read about in a book like The Mythical Man-Month, about a project that takes way longer than expected, goes way over budget, and might succeed in delivering a poorly designed, bug-ridden piece of software if it isn’t cancelled or started over from scratch first. Writing great software is a big topic that I don’t feel very qualified to speak on, but Bostrom’s plan doesn’t pass my sniff test; it makes me worry that he spent too much time theorizing from first principles and not enough in discussion with domain experts.
Either way, I think this discussion might benefit from surveying the literature on software development best practices, international research collaborations, safety-critical software development, etc. There might be some strategy besides an international collaboration that accomplishes the same thing, e.g. a core development team in a single location writing all of the software, with external teams monitoring its development, taking the time to understand it, and checking for flaws. This would both give those external teams domain expertise in producing AGIs (useful if it turns out AGIs are only very powerful rather than extremely powerful) and add another layer of safety checks. (To provide proper incentives, perhaps the prestige of writing the AI could revert to any monitoring team that succeeded in identifying a bug in the main team’s work. Apparently something like this adversarial structure works for a company writing safety-critical space shuttle software: http://www.fastcompany.com/28121/they-write-right-stuff )
Another idea I’ve been toying with recently is the idea that some people who are concerned with AI safety should go off and start a company that writes safety-critical AI software now, say for piloting killer drones. That would give them the opportunity to develop the soft skills and expertise necessary to write really high quality, bug-free AI software. In the ideal case they might spend half their time writing code and the other half of their time improving processes to reduce the incidence of bugs. Then we’d have a team in place to build FAI when it became possible.