A source tells me there's a fair bit of non-public discussion of AGI-safety-relevant strategy/policy/governance issues, but it often takes a while for those discussions to cohere into a form that is released publicly (e.g. in a book or paper), and some of it is kept under wraps due to worries about infohazards (and worries about the unilateralist's curse w.r.t. infohazards).
I have since been given access to a sample of such non-public discussions. (The sample is small but I think at least somewhat representative.) Worryingly, it seems that there's a disconnect between the kind of global coordination that AI governance researchers are thinking and talking about, and the kind that technical AI safety researchers often talk about nowadays as necessary to ensure safety.
In short, the Google docs I've seen all seem to assume that a safe and competitive AGI can be achieved at some reasonable level of investment into technical safety, and the main coordination problem is how to prevent a "race to the bottom" whereby some actors try to obtain a lead in AI capabilities by underinvesting in safety. However, current discussion among technical AI safety researchers suggests that a safe and competitive AGI perhaps can't be achieved at any feasible level of investment into technical safety, and at a certain point we'll probably need global coordination to stop, limit, or slow down progress in and/or deployment/use of AI capabilities.
Questions I'm trying to answer now: 1) Is my impression from the limited sample correct? 2) If so, how best to correct this communications gap (and prevent similar gaps in the future) between the two groups of people working on AI risk?
I appreciate how you turned the most useful private info into public conversation while largely minimising the amount of private info that had to become public.
To respond directly, yes, your observation matches my impression of folks working on governance issues who aren’t very involved in technical alignment (with the exception of Bostrom). I have no simple answer to the latter question.
I have a feeling it's not that simple. See the last part of the post “Generate evidence of difficulty” as a research purpose (the part on biases). So for example I know at least one person who quit an AI safety org (in part) because they became convinced that it's too difficult to achieve safe, competitive AI (or at least the approach pursued by the org wasn't going to work). Another person privately told me they have little idea how their research will eventually contribute to a safe, competitive AI, but hasn't written anything like that publicly AFAIK. (And note that I don't actually have that many opportunities to speak privately with other AI safety researchers.) Another thing is that most AI safety researchers probably don't think it's part of their job to "generate evidence of difficulty" so I have to convince them of that first.
Unless these problems are solved, I might be able to convince a few safety researchers to go to governance researchers and tell them they think it's not possible to get safe, competitive AI, but their concerns will probably just be dismissed as outliers. I think a better step forward would be to build a private forum where these kinds of concerns can be more frankly discussed, as well as a culture where doing so is normative. This would address some of the possible biases, though I'm still not sure about the others.
> My question is, who is thinking directly about how to achieve such coordination (aside from FHI's Center for the Governance of AI, which I'm aware of) and where are they talking about it?
OpenAI has a policy team (this 80,000 Hours podcast episode is an interview with three people from that team), and I think their research areas include models for coordination between top AI labs, and improving publication norms in AI (e.g. maybe striving for norms that are more like those in computer security, where people are expected to follow some responsible disclosure process when publishing about new vulnerabilities). For example, the way OpenAI is releasing their new language model GPT-2 seems like a useful way to learn about the usefulness/feasibility of new publication norms in AI (see the "Release Strategy" section here).
I think related work is also being done at the Centre for the Study of Existential Risk (CSER).
I want to focus on your second question: "Human coordination ability seems within an order of magnitude of what's needed for AI safety. Why the coincidence? (Why isn’t it much higher or lower?)"
Bottom line up front: Humanity has faced a few potentially existential crises in the past: world wars, nuclear standoffs, and the threat of biological warfare. The fact that we survived those, plus selection bias, seems like a sufficient explanation of why we are near the threshold for our current crises.
I think this is a straightforward argument. At the same time, I'm not going to get deep into the anthropic reasoning, which is critical here but which I don't understand well enough to discuss clearly. (Side note: Stuart Armstrong recently mentioned to me that there are reasons I'm not yet familiar with for why anthropic shadows aren't large, which is assumed in the below model.)
If we assume that large-scale risks are distributed in some manner, such as from Bostrom's urn of technologies (see: Vulnerable World Hypothesis), we should expect that the attributes of the problems, including the coordination needed to withstand / avoid them, are distributed with some mean and variance. Whatever that mean and variance are, we expect that there should be more "easy" risks (near or below the mean) than "hard" ones. Unless the tail is very, very fat, this means that we are likely to see several moderate risks before we see more extreme ones. For a toy model, let's assume risks show up at random yearly, and follow a standard normal distribution in terms of capability needed. If we had capability in the low single digits, we would be wiped out already with high probability. Given that we've come worryingly close, however, it seems clear that we aren't in the high double digits either.
Given all of that, and the selection bias of asking the question when faced with larger risks, I think it's a posteriori likely that most salient risks we face are close to our level of ability to overcome.
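To make the toy model a bit more concrete, here is a rough Python sketch. It is only one reading of the setup above, and it assumes things the comment doesn't spell out: exactly one risk per year, a fixed civilizational capability measured in the same units as the standard normal draw, and survival meaning every yearly draw stays at or below that capability. The function names (`survival_probability`, `simulate_margins`) and the 100-year horizon are made up for illustration.

```python
import math
import random


def normal_cdf(x: float) -> float:
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def survival_probability(capability: float, years: int) -> float:
    """Chance that every yearly risk draw is at or below `capability`.

    Assumption: one independent standard-normal risk per year.
    """
    return normal_cdf(capability) ** years


def simulate_margins(capability: float, years: int, trials: int, seed: int = 0) -> list:
    """Simulate histories; for the surviving ones, return the closest call,
    i.e. capability minus the hardest risk that was actually faced."""
    rng = random.Random(seed)
    margins = []
    for _ in range(trials):
        hardest = max(rng.gauss(0.0, 1.0) for _ in range(years))
        if hardest <= capability:  # this history survived
            margins.append(capability - hardest)
    return margins


if __name__ == "__main__":
    years = 100  # hypothetical horizon, not from the original comment
    for capability in (1.0, 3.0, 5.0):
        p = survival_probability(capability, years)
        margins = simulate_margins(capability, years, trials=2000)
        closest = min(margins) if margins else float("nan")
        print(f"capability={capability}: P(survive {years}y)={p:.3g}, "
              f"survivors={len(margins)}, closest call margin={closest:.3g}")
```

Under these assumptions, survival over a long horizon drops off sharply as capability falls toward the mean of the risk distribution, while the surviving histories at moderate capability are the ones that record narrow margins. It says nothing about fat tails or anthropic shadows, which the argument above flags as the real uncertainties.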
Oh interesting, I wasn't aware of this prize. Where are these papers being discussed? It seems like it's mostly in person, at conferences, and through published papers? Are you aware of an online forum similar to LW/AF where such papers and ideas are being discussed?
ETA: Are the papers being discussed, or are people just publishing their own papers and not really commenting on each other's ideas?
RE the title, a quick list:
I think a lot of orgs that are more focused on social issues which can or do arise from present-day AI / ADM (automated decision making) technology should be thinking more about global coordination, but seem focused on national (or subnational, or EU) level policy. It seems valuable to make the most compelling case for stronger international coordination efforts to these actors. Examples of this kind of org that I have in mind are AI Now and the Montreal AI Ethics Institute (MAIEI).
As mentioned in other comments, there are many private conversations among people concerned about AI-Xrisk, and (IMO, legitimate) info-hazards / unilateralist curse concerns loom large. It seems prudent to make progress on those meta-level issues (i.e. how to engage the public and policymakers on AI(-Xrisk) coordination efforts) as a community as quickly as possible, because:
My answers to your 6 questions:
1. Hopefully the effect will be transient and minimal.
2. I strongly disagree. I think we (ultimately) need much better coordination.
3. Good question. As an incomplete answer, I think personal connections and trust play a significant (possibly indispensable) role.
4. I don't know. Speculating/musing/rambling: the kinds of coordination where IT has made a big difference (recently, i.e. starting with the internet) are primarily economic and consumer-facing. For international coordination, the stakes are higher; it's geopolitics, not economics; you need effective international institutions to provide enforcement mechanisms.
5. Yes, but this doesn't seem like a crucial consideration (for the most part). Do you have specific examples in mind?
6. Social science and economics seem really valuable to me. Game theory, mechanism design, behavioral game theory. I imagine there's probably a lot of really valuable stuff on how people/orgs make collective decisions that the stakeholders are satisfied with in some other fields as well (psychology? sociology? anthropology?). We need experts in these fields (especially the softer fields, which I think are underrepresented) to inform the AI-Xrisk community about existing findings and create research agendas.
For question 2, I think the human-initiated nature of AI risk could partially explain the small distance between ability and need. If we were completely incapable of working as a civilization, other civilizations might be a threat, but we wouldn’t have any AIs of our own, let alone general AIs.
> When humans made advances in coordination ability in the past, how was that accomplished? What are the best places to apply leverage today?
I am confused by the general lack of interest I've encountered in how joint stock corporations came to be and underwent selection to get us to where we are now. It may be that I'm not looking in the right places. I know the founders of McKinsey are quite interested in this.
> 4. Information technology has massively increased certain kinds of coordination (e.g., email, eBay, Facebook, Uber), but at the international relations level, IT seems to have made very little impact. Why?
I note the coordination is entirely at a lower level than those companies: mostly individuals are using these services for coordination, as well as small groups. It seems like coordination innovations aren't bottom-up, but rather top-down (even if the IT examples are mostly opt-in). This seems to match other large coordination improvements, like empire, monotheism, or corporations. There is no higher level of abstraction than governments from which to improve international relations, it seems to me.
Quite separately, we could ask: what are the specific challenges in international relations that IT could address? The problems mostly revolve around questions of trust, questions of the basic competence of human agents (diplomats, ambassadors, heads of state, etc), and fundamental conflicts of interest. International relations has an irreducible component of face-to-face personal relationships, so I would expect tools built around that or to facilitate it to be the most relevant.
That being said, it's also clear that Facebook and Uber aren't even trying to target problems related to international relations. We know contracting with multiple governments is achievable, because companies like Google, Microsoft, and Palantir all manage it when selling IT for intelligence purposes. Dominic Cummings has a blog post, "High performance government, ‘cognitive technologies’, Michael Nielsen, Bret Victor, & ‘Seeing Rooms’", that speculates about how international relations could be improved by making the stupendous complexity of the information at work more readily available to decision makers, both for educational purposes and in real time. Maybe there would be an opportunity for a Situation Room Company, or similar. Following on the personal relationship observation, perhaps something like Salesforce-but-for-diplomacy would have some value.
Many AI safety researchers these days are not aiming for a full solution to AI safety (e.g., the classic Friendly AI), but just trying to find good enough partial solutions that would buy time for or otherwise help improve global coordination on AI research (which in turn would buy more time for AI safety work), or trying to obtain partial solutions that would only make a difference if the world had a higher level of global coordination than it does today.
My question is, who is thinking directly about how to achieve such coordination (aside from FHI's Center for the Governance of AI, which I'm aware of) and where are they talking about it? I personally have a bunch of questions related to this topic (see below) and I'm not sure what's a good place to ask them. If there's not an existing online forum, it seems a good idea to start thinking about building one (which could perhaps be modeled after the AI Alignment Forum, or follow some other model).