Strategic research on AI risk

lukeprog

Series: How to Purchase AI Risk Reduction

Norman Rasmussen's analysis of the safety of nuclear power plants, written before any nuclear accidents had occurred, correctly predicted several details of the Three Mile Island incident in ways that previous experts had not (see McGrayne 2011, p. 180). Had Rasmussen's analysis been heeded, the Three Mile Island incident might not have occurred.

This is the kind of strategic analysis, risk analysis, and technological forecasting that could help us to pivot the world in important ways.

Our AI risk situation is very complicated. There are many uncertainties about the future, and many interacting strategic variables. Though it is often hard to see whether a strategic analysis will pay off, the alternative is to act blindly.

Here are some examples of strategic research that may help (or have already helped) to inform our attempts to shape the future:

FHI's Whole Brain Emulation roadmap and SI's WBE discussion at the Summit 2011 workshop.
Nick Bostrom's forthcoming book on machine superintelligence.
Global Catastrophic Risks, which locates AI risk in the context of other catastrophic risks.
A model of AI risk currently being developed in MATLAB by Anna Salamon and others.
A study of past researchers who abandoned certain kinds of research when they came to believe it might be dangerous, and what might have caused such action. (This project is underway at SI.)

Here are some additional projects of strategic research that could help inform x-risk decisions, if funding were available to perform them:

A study of opportunities for differential technological development, and how to actually achieve them.
A study of microeconomic models of WBEs and self-improving systems.
A study of which research topics should and should not be discussed in public for the purposes of x-risk prevention. (E.g. we may wish to keep AGI discoveries secret for the same reason we'd want to keep the DNA of a synthetically developed supervirus secret, but we may wish to publish research on safe AGI goals because they are safe for a broader community to work on. But it's often difficult to see whether a subject fits into one category or the other.)

I'll note that for as long as FHI is working on AI risk, FHI probably has an advantage over SI in producing actionable strategic research, given past successes like the WBE roadmap and the GCR volume. But SI is also performing actionable strategic research, as described above.

Why no love for this project?

http://www.theuncertainfuture.com/

My perception as an outsider is that SI put a fair amount of manpower into it, finished it, submitted it to Hacker News, and then folks largely forgot about it. Is it even linked from the SI website?

I do have it in my to-do list to consider the possibility of mining the work in that project for a paper about predicting AI.

I see "Java" and close the browser tab.

Norman Rasmussen's analysis of the safety of nuclear power plants, written before any nuclear accidents had occurred, correctly predicted several details of the Three Mile Island incident in ways that previous experts had not (see McGrayne 2011, p. 180).

Is there any way that a policy maker could have known in advance to pay attention to Rasmussen rather than other experts? Is this a case of retroactively selecting the predictor who happened to be right out of a large group of varied, but roughly equally justified, predictors, or did Rasmussen use systematically better methods for making his predictions?

It's worth noting that stories of catastrophes that were successfully averted because someone listened to an expert may be hard to find.

If an expert tells you to add a safety mechanism, and you end up using that mechanism, you know that the expert helped you.

Right, but the story won't be written up, or will be harder to find.

Or the expert caused you to waste money on a needless safety mechanism.

I mean a safety mechanism like a button that shuts down the assembly line. If someone gets caught in the machinery and you push the button to prevent them from getting (more) hurt, you will be happy the expert told you to install that button.

Aha. I was reading "use" as "install", not "activate during emergency". I agree.

Is there any way that a policy maker could have known in advance to pay attention to Rasmussen rather than other experts?

Yes. Rasmussen used Bayes, while everyone else used the methods of (1) Frequentism or (2) Experts Must Have Great Intuitions.

All else being equal, I would put more trust in the report that uses Bayesian statistics than a report that uses Frequentist statistics, but I wouldn't expect that strong an effect from that alone. (I would expect a strong increase in accuracy for using any kind of statistics over intuition.)

Following your link, I notice that Rasmussen's report used a fault tree. I would expect that considering the failure modes of each component of a nuclear reactor played a huge role in his accuracy, and that Bayesian and Frequentist statistics would largely agree on how to get individual failure rates from historical data and how to synthesize this information into a failure rate for the whole reactor. Assuming the other experts did not also use fault trees, I would credit the fault trees more than Bayes for Rasmussen's success. (And if they did, I would wonder where they went wrong.)
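
To make the fault-tree point concrete, here is a minimal sketch in Python of how component failure probabilities combine into a top-level figure. The component names and numbers are invented for illustration and are not taken from Rasmussen's report. The gates also assume the basic events are statistically independent, which a real reactor analysis cannot take for granted.

```python
from math import prod


def or_gate(probabilities):
    """Probability that at least one of several independent events occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def and_gate(probabilities):
    """Probability that all of several independent events occur."""
    return prod(probabilities)


# Hypothetical per-year failure probabilities (made up for this sketch).
pump_a = 0.01
pump_b = 0.01
valve = 0.001
operator_error = 0.005

# Cooling is lost only if both redundant pumps fail.
cooling_loss = and_gate([pump_a, pump_b])

# The top event occurs if cooling is lost, the valve sticks, or the operator errs.
top_event = or_gate([cooling_loss, valve, operator_error])

print(f"cooling loss: {cooling_loss:.6f}")
print(f"top event:    {top_event:.6f}")
```

In this toy tree the redundant pumps contribute very little to the top-event probability, while the single valve and operator error dominate. That kind of structural insight is available whichever statistical method is used to estimate the leaf probabilities.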

This is not a convincing argument to a policy maker.

A model of AI risk currently being developed in MATLAB by Anna Salamon and others

I hadn't heard about this one. Is there a list somewhere of all the little projects that SI is working on? (Or should there be?) Posts like this one (and the monthly status reports) are very useful, but since each post only lists some of the things going on, I'm worried that I'll miss something interesting, or that something I thought SI was working on has been quietly dropped.