FHI has released a new tech report:

Armstrong, Bostrom, and Shulman. Racing to the Precipice: a Model of Artificial Intelligence Development.

Abstract:

The paper is short and readable; discuss it here!

But my main reason for posting is to ask this question: What is the most similar work that you know of? I'd expect people to do this kind of thing for modeling nuclear security risks, and maybe other things, but I don't happen to know of other analyses like this.

It's not tremendously similar, but for some reason I thought of the Diamond-Dybvig model of bank runs as a (distant) analogy. It has multiple equilibria: everyone might take money in & out of the bank as usual, or a bank run might kick off. The AI risk equivalent, I guess, would be a model where either every development team exercises optimal caution (whatever that would be), or every team rushes to be first. That said, I don't know whether any realistic-ish model of AI development would have those particular equilibria.

As for the FHI paper, I'm glad its abstract mentions the model's prediction that more information can increase the risk. That's a cute result.

I wonder what'd happen in a model that incorporates time passing over multiple rounds. The teams' decisions in each round could expose information about their judgements of capabilities & risks. Might lead to an intractable model, though.
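To make the multiple-equilibria analogy above concrete, here is a minimal stag-hunt-style sketch of two development teams choosing between caution and rushing. The payoff numbers are invented purely for illustration; they are not taken from the FHI model or from Diamond-Dybvig. With these numbers, both all-cautious and all-rush are Nash equilibria, mirroring the "everyone banks normally" vs. "bank run" pair:

```python
# Toy symmetric game for two AI development teams (illustrative payoffs only).
from itertools import product

STRATEGIES = ("cautious", "rush")

# PAYOFF[(my_move, their_move)] = my payoff (hypothetical numbers)
PAYOFF = {
    ("cautious", "cautious"): 4,  # safe development, shared benefit
    ("cautious", "rush"):     0,  # the rushing team wins the race
    ("rush",     "cautious"): 3,  # win the race, but bear extra risk
    ("rush",     "rush"):     1,  # risky race, diluted benefit
}

def is_nash(profile):
    """True if neither team gains by deviating unilaterally from `profile`."""
    for i in (0, 1):
        mine, theirs = profile[i], profile[1 - i]
        for alt in STRATEGIES:
            if PAYOFF[(alt, theirs)] > PAYOFF[(mine, theirs)]:
                return False
    return True

equilibria = [p for p in product(STRATEGIES, repeat=2) if is_nash(p)]
print(equilibria)  # both all-cautious and all-rush survive as equilibria
```

Whether any realistic model of AI development actually has this payoff structure is, of course, exactly the open question.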