UPDATE October 18, 2020: The AI Forecasting Resolution Council has been retired due to lack of demand for its services. I'm keeping this post up for historical value.

---------------

This post introduces the AI Forecasting Resolution Council, a group of researchers with technical expertise in AI who will allow us to expand the space of effectively forecastable questions. It is the second part in a series of blog posts which motivate and introduce pieces of infrastructure intended to improve our ability to forecast novel and uncertain domains like AI.

The Council is currently in beta, and we're launching early to get feedback from the community and quickly figure out how useful it is.

Background and motivation

A key challenge in (AI) forecasting is writing good questions. This is tricky because we want questions that both capture important uncertainties and are sufficiently concrete that we can resolve them and award points to forecasters in hindsight.

Here are some types of questions within AI that make this especially difficult:

Counterfactual questions

Suppose in 2000 you use “superhuman Othello from self-play” as a benchmark of a certain kind of impressive AI progress, and forecast it to be possible by 2020. It seems you were correct -- very plausibly the AlphaZero architecture should work for this. However, in a strict sense your forecast was wrong -- because no one has actually bothered to build a powerful Othello agent without relying on handcrafted evaluation functions (EDIT: thanks to Vanessa Kosoy for pointing out that otherwise superhuman Othello systems exist).

So if a calibrated forecaster faces this question in 2000, considerations regarding who will bother to pursue what project “screen off” considerations regarding fundamental drivers of AI progress and their gradients. Yet the latter concern is arguably more interesting.

This problem could be solved if we instead forecasted the question “If someone were to run an experiment using the AI technology available in 2020, given certain resource constraints, would it seem, with >95% confidence, that they’d be able to create a superhuman Othello agent that learnt only from self-play?”

Doing so requires a way of evaluating the truth value of that counterfactual, such as by asking a group of experts.
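Purely as an illustration (and not the Council's actual procedure), here is a minimal Python sketch of one way such a counterfactual could be resolved: poll a panel of experts for their credences that the experiment would succeed, and resolve positively if the panel's median credence clears the threshold stated in the question. The panel, numbers, and aggregation rule are all hypothetical assumptions.

```python
from statistics import median

def resolve_counterfactual(expert_credences, threshold=0.95):
    """Resolve positively if the panel's median credence that the
    counterfactual experiment would succeed meets the stated threshold."""
    return median(expert_credences) >= threshold

# Hypothetical panel polled in 2020 on the self-play Othello counterfactual.
panel = [0.97, 0.99, 0.95, 0.98, 0.96]
print(resolve_counterfactual(panel))  # True -> question resolves positively
```

Aggregating with the median rather than the mean keeps the verdict robust to a single outlier judge, though other aggregation rules could serve just as well.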

Similarity questions

Suppose we try to capture performance by appealing to a particular benchmark. There's a risk that the community will change its focus to another benchmark. We don’t want forecasters to spend their effort thinking about whether this change will occur, as opposed to the fundamental question of the speed of progress (even if we would want to track such sociological facts about which benchmarks were prominent, that should be handled by a different question where it’s clear that this is the intent).

So to avoid this we need a sufficiently formal way of doing things like comparing performance of algorithms across multiple benchmarks (for example, if RL agents are trained on a new version of Dota, can we compare performance to OpenAI Five’s on Dota 2?).
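One illustrative approach (a sketch, not an established comparison method) is to normalise scores on each benchmark against fixed reference points, such as a random-play baseline and a strong human baseline, so that results from benchmarks with different scales land on a common one. The function and all numbers below are hypothetical.

```python
def normalised_score(agent_score, random_baseline, human_baseline):
    """Express an agent's raw score as a fraction of the gap between
    random play and strong human play on that benchmark."""
    return (agent_score - random_baseline) / (human_baseline - random_baseline)

# Hypothetical numbers for two benchmarks with very different raw scales.
on_dota2 = normalised_score(agent_score=0.90, random_baseline=0.0, human_baseline=1.0)
on_new_game = normalised_score(agent_score=55.0, random_baseline=10.0, human_baseline=60.0)
print(on_dota2, on_new_game)  # 0.9 and 0.9 -> roughly comparable levels of play
```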

Definition-of-terms questions

This is more straightforward and related to the AI Forecasting Dictionary. For example, how do we define sufficiently clearly what counts as “hard-coded domain knowledge”, and how much reward shaping can be added before the system no longer learns from “first principles”?

Valuation questions

Not every important uncertainty we care about can be turned into a concretely operationalised future event. For example, instead of trying to operationalise how plausible the IDA agenda will seem in 3 years by making a long, detailed specification of the outcome of various experiments, we might just ask “How plausible will IDA seem to this evaluator in 3 years?” and then try to forecast that claim.

Making this work will require carefully choosing the evaluators such that, for example, it is generally easier and less costly to forecast their opinions than to operationalise and forecast the underlying event, and such that we trust that the evaluation actually tracks some important, natural, hard-to-define measure.
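To make the mechanics concrete, here is a small hedged sketch: forecasts of the evaluator's eventual plausibility rating (both on a 0–1 scale) could be scored with a simple squared-error loss once the evaluator reports their judgement. The names, numbers, and scoring rule are illustrative assumptions, not the Council's method.

```python
def valuation_loss(forecast, evaluator_rating):
    """Squared-error loss for a forecast of the evaluator's eventual
    plausibility rating; lower is better."""
    return (forecast - evaluator_rating) ** 2

# Hypothetical: two forecasters predict how plausible IDA will seem in 3 years.
evaluator_rating = 0.6  # the evaluator's reported plausibility, 3 years later
print(valuation_loss(0.55, evaluator_rating))  # ~0.0025 -> close forecast
print(valuation_loss(0.90, evaluator_rating))  # ~0.09   -> overconfident forecast
```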

Prediction-driven evaluation is a deep topic, but if we could make it work it would be very powerful. See e.g. this post for more details.

AI Forecasting Resolution Council

As a step towards solving the above problems, we’re setting up the AI Forecasting Resolution Council, a group of researchers with technical expertise in AI, who are volunteering their judgement to resolve questions like the above.

The services of the council are available to any forecasting project, and all operations for the council will be managed by Parallel Forecast. In case there is more demand for resolutions than can be filled, Parallel will decide which requests to meet.

We think that this Council will create streamlined, standardised procedures for dealing with tricky cases like the above, thereby greatly expanding the space of effectively forecastable questions.

There are still many questions to figure out regarding incentives, mechanism design, and question operationalisation. We think that by setting up the Resolution Council we are laying some of the groundwork to begin experimenting in this direction, and to discover best practices and ideas for new, exciting experiments.

The initial members of the council are:

We expect to be adding several more members over the coming months.

The database of previous verdicts and upcoming resolution requests can be found here.

How to use the council if you run a forecasting project

If you’re attempting to forecast AI and have a problem that could be solved by querying the expert council at a future date, let us know by filling in this resolution request form.

How to join the council

If you have technical expertise in AI and would be interested in contributing to help expand the space of forecastable questions, let us know using this form.

There is no limit on the number of judges, since we can always randomise who will vote on each distinct verdict.
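As a sketch of this idea (not how the Council actually assigns judges), panels could be drawn by seeding a random generator with the question identifier, so the draw is reproducible and auditable afterwards. The judge names, panel size, salt, and question id below are all hypothetical.

```python
import random

def draw_panel(judges, panel_size, question_id, salt="resolution-council"):
    """Deterministically sample a voting panel for a given question so the
    same draw can be reproduced and audited later."""
    rng = random.Random(f"{salt}:{question_id}")
    return rng.sample(judges, panel_size)

judges = ["judge_a", "judge_b", "judge_c", "judge_d", "judge_e"]  # placeholder names
print(draw_panel(judges, panel_size=3, question_id="othello-counterfactual-2020"))
```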

Comments

Suppose in 2000 you use “superhuman Othello from self-play” as a benchmark of a certain kind of impressive AI progress, and forecast it to be possible by 2020. It seems you were correct -- very plausibly the AlphaZero architecture should work for this. However, in a strict sense your forecast was wrong -- because no one has actually bothered to build a powerful Othello agent.

This might be a bad example? Quoting Wikipedia: "There are many Othello programs... that can be downloaded from the Internet for free. These programs, when run on any up-to-date computer, can play games in which the best human players are easily defeated." Well, arguably they are not "from self-play" because they use hand-crafted evaluation functions. But "no one has actually bothered to build a powerful Othello agent" seems just plain wrong.

Thanks for pointing that out. I was aware of such superhuman programs, but the last sentence failed to make the self-play condition sufficiently clear. Have updated it now to reflect this.