How important are accurate AI timelines for the optimal spending schedule on AI risk interventions?

Tristan Cook

16th Dec 2022

Summary

I present an extension of my model of the optimal timing of spending on AGI safety. The extension calculates the value of information about AGI timelines, where that value comes from informing one’s spending schedule.

I show, using my best guess of the model parameters, that for an AI risk funder uncertain between a ‘short timelines’ model and ‘medium timelines’ model:

- Updating from near certainty in medium timelines to short timelines (and following the new optimal spending strategy) leads to a 40% increase in utility.
- Updating from near certainty in short timelines to medium timelines (and following the new optimal spending strategy) leads to a 20% increase in utility.

The gains are greater when considering a model of the community’s capacity, rather than capital.

I also show that small changes in one’s credence in short or medium timelines have relatively little impact on one’s optimal spending schedule, especially when one starts out with roughly equal credence in each^[1].

You can enter your own parameters - such as AGI timelines, discount rate and diminishing returns to spending - here.

In an appendix I apply a basic model to consider the opportunity cost of timelines work. This model does not assume novel research is done.

The setup

Suppose you have two ‘models’ of AGI timelines, $A$ and $B$, with credence $p_{A}$ in $A$ and $(1 - p_{A})$ in $B$^[2]. You use your mixture distribution for AGI timelines, $p_{A} \cdot A + (1 - p_{A}) \cdot B$, to calculate the optimal spending schedule for AI risk interventions.

You could do some thinking and arrive at a new credence $p_{A}^{'}$ in $A$ and $(1 - p_{A}^{'})$ in $B$. How much better is the optimal spending schedule implied by $p_{A}^{'}$ than the optimal spending schedule implied by $p_{A}$, with both evaluated supposing $p_{A}^{'}$?

Writing $S_{p_{A}}$ for the optimal spending schedule according to $p_{A}$, and $U (S_{p_{A}} | p_{A})$ for the utility of $S_{p_{A}}$ supposing $p_{A}$, I compare $U (S_{p_{A}^{'}} | p_{A}^{'})$ to $U (S_{p_{A}} | p_{A}^{'})$ (the former is, by definition of optimality, greater than or equal to the latter).
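The comparison can be illustrated with a deliberately tiny stand-in for the spending model. This is a minimal sketch, not the model from the post: the one-dimensional "schedule" (a fraction of a unit budget spent early), the square-root returns, the `GROWTH` factor and every function name are hypothetical assumptions chosen only to make the two utilities computable. Utility here is in arbitrary units rather than a probability.

```python
import math

GROWTH = 1.5  # hypothetical: factor by which saved funds grow before late spending


def utility(early_fraction: float, p_a: float) -> float:
    """Toy utility of spending `early_fraction` of a unit budget early,
    evaluated with credence `p_a` in short timelines (A).

    Hypothetical assumptions: under A only early spending arrives in time;
    under B, unspent funds grow by GROWTH and are spent later; returns to
    total effective spending are square-root (diminishing).
    """
    u_if_a = math.sqrt(early_fraction)
    u_if_b = math.sqrt(early_fraction + (1.0 - early_fraction) * GROWTH)
    return p_a * u_if_a + (1.0 - p_a) * u_if_b


def optimal_schedule(p_a: float, grid_size: int = 1001) -> float:
    """Grid-search the early-spending fraction that maximises utility under p_a."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda x: utility(x, p_a))


def relative_gain(p_old: float, p_new: float) -> float:
    """U(S_new | p_new) / U(S_old | p_new) - 1: fractional gain from re-optimising."""
    s_old = optimal_schedule(p_old)
    s_new = optimal_schedule(p_new)
    return utility(s_new, p_new) / utility(s_old, p_new) - 1.0


if __name__ == "__main__":
    for p_old, p_new in [(0.1, 0.9), (0.9, 0.1), (0.45, 0.55)]:
        print(f"{p_old:.2f} -> {p_new:.2f}: {relative_gain(p_old, p_new):+.1%}")
```

Because both schedules are scored under the new credence, the gain can never be negative, which mirrors the optimality remark above. The numbers this toy prints say nothing about the post's actual results.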

In the model, utility is the discount-adjusted probability that we ‘succeed’ in making AGI go well. The maximum utility is $\int_{0}^{\infty} p (t) \cdot e^{- δ t} \, d t \leq 1$, where $p (t)$ is the probability density of AGI arriving at time $t$ (the AGI timelines) and $δ$ is the discount rate.
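As a rough numerical check on this bound, the sketch below integrates a timelines density against the discount factor. The exponential density, the rate `lam`, the discount rate `delta` and the truncation horizon are all hypothetical choices made for illustration; the post's own timelines distributions are not reproduced here. The exponential case is convenient because the integral has the closed form $λ / (λ + δ)$.

```python
import math


def max_utility(pdf, delta: float, horizon: float = 500.0, steps: int = 100_000) -> float:
    """Trapezoid-rule estimate of the integral of pdf(t) * exp(-delta * t) on [0, horizon].

    `horizon` truncates the infinite upper limit; it must be long enough that
    the remaining tail of the integrand is negligible.
    """
    h = horizon / steps
    total = 0.5 * (pdf(0.0) + pdf(horizon) * math.exp(-delta * horizon))
    for i in range(1, steps):
        t = i * h
        total += pdf(t) * math.exp(-delta * t)
    return total * h


if __name__ == "__main__":
    lam = 0.1     # hypothetical: AGI arrival rate (mean arrival time of 10 years)
    delta = 0.02  # hypothetical: discount rate per year

    def exponential_pdf(t: float) -> float:
        return lam * math.exp(-lam * t)

    numeric = max_utility(exponential_pdf, delta)
    closed_form = lam / (lam + delta)
    print(f"numeric = {numeric:.4f}, closed form = {closed_form:.4f}")
```

With no discounting the integral is just the total probability that AGI ever arrives, so any positive discount rate pushes the maximum utility strictly below that.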

Example results for short and medium timelines

I take $A$ and $B$ as the following, and other parameters as described here. (See the appendix for some basic statistics on the mixtures of $A$ and $B$.)

I compute the results for both the ‘main’ model and the ‘alternate’ model described in the previous post.

- The main model gives the optimal spending rate (in $ per year) for every time step on (1) research and (2) influence (being able to get AI risk reduction ideas implemented).
  - Supposing there was only one funder of AI risk interventions, they should follow this spending strategy.
- The alternate model gives the optimal ‘crunch’ rate at each time. At each time point we can either invest in our own capacity (be it through training, hiring, some types of research, investing) or ‘crunch’ - spend capacity to produce work directly beneficial to reducing AI risk.
  - This model gives more abstract results and is more applicable for individuals and teams.

I first show two examples of the optimal spending schedule for each of the models: one that is optimal supposing $A$ , short timelines, and one that is optimal supposing medium timelines, $B$ .

Results & analysis

Disclaimer: these results are based on my guesses of the model parameters. Further, the models are of course not without limitations. I expect the results to be directionally equal, but lower, for the robust spend-save model. Overall, I’m less confident in these results than the previous spending results.

For each pairing of model (main capital spending model, alternate community capacity spending model) with difficulty of AGI success (easy, medium, hard), I show three graphs. The leftmost plot shows the % increase in discount-adjusted probability of success (utility). The central plot shows this % increase after first factoring out the utilons we get for free - our probability of success if we contributed nothing. The rightmost plot shows the absolute increase in utility.
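One way to read the three plotted quantities is sketched below. The function name, argument names and example numbers are hypothetical, and the formula for the central plot is an interpretation of "factoring out the utilons we get for free" (subtracting the no-contribution baseline from the denominator), not something the post states explicitly.

```python
def gain_metrics(u_old: float, u_new: float, u_baseline: float):
    """Return (relative gain, relative gain net of baseline, absolute gain).

    u_old      -- U(S_old | p_new): utility of the old schedule under the new credence
    u_new      -- U(S_new | p_new): utility of the re-optimised schedule
    u_baseline -- utility obtained if we contributed nothing
    """
    absolute = u_new - u_old
    relative = absolute / u_old
    # Assumed reading of the central plot: only the part of utility that our
    # spending is responsible for appears in the denominator.
    relative_net = absolute / (u_old - u_baseline)
    return relative, relative_net, absolute


if __name__ == "__main__":
    # Hypothetical utilities, purely for illustration.
    rel, rel_net, abs_gain = gain_metrics(u_old=0.20, u_new=0.25, u_baseline=0.10)
    print(f"relative: {rel:.1%}, net of baseline: {rel_net:.1%}, absolute: {abs_gain:.3f}")
```

Under this reading the central figure is always at least as large as the leftmost one whenever the baseline is positive, since the same absolute gain is divided by a smaller number.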

Both the main and alternate model show that:

- The greatest gains are achievable in the hard case where we are highly confident in short timelines but mistaken. The easy case has the lowest gains: this is likely because we can do well regardless of our spending schedule.
- There are greater gains to be had when one is very confident in short timelines than when one is very confident in medium to long timelines.
- The maximum increases in utility are comparable to the approximate gain the community could achieve by moving from its current spending strategy to the optimal spending strategy.
- There is little to no gain in utility when one’s credence in $A$ changes by less than 10% (this is the diagonal band of purple running from bottom-left to top-right).
- The main model has a greater difference between strategies than the alternate model (compare the examples above), which I believe leads to the rectangular-like areas in the plots: small increases in $p_{A}$ can push greatly towards early spending.
- The alternate model sees greater utility gains from accurate timelines than the main model. This is likely due to:
  - slightly greater marginal returns to ‘crunching’ than there are to spending capital in the main model
  - greater opportunity cost of spending early (since there are higher returns to saving).

In practice

Naturally, those with low credal resilience in their AGI timelines have better returns to work on timelines.

The greatest potential gains from timelines work are for people already highly convinced of short timelines. This is particularly true if they are sacrificing gains in capacity now in order to ‘crunch’, or spending at a high rate now (sacrificing a greater amount of capital later). However, it seems that:

- many crunch-like activities may also be building capacity (e.g. building skills necessary for crunchtime), especially because it is relatively neglected in the community
- the current community spending rate is much lower than the optimal schedule implied even by medium timelines, and so marginally pushing for greater spending is supported regardless of one's credence in short timelines.

In the other direction, the value of information from moving from high confidence in medium timelines to high confidence in short timelines is likely to be higher than the results suggest, because the current spending rate and crunching rate are too low. That is, if you become convinced of short timelines, your marginal spending/crunching has lower diminishing returns (because, by your lights, the rest of the community is at a suboptimally low spending/crunching rate).

Acknowledgements: thanks to Daniel Kokotajlo for the idea, comments and suggestions. Thanks to Tom Barnes for comments. All remaining errors are my own.

Appendix

Statistics about the mixture of A and B

Potentially useful for calculating your own $p_{A}$ .

$p_{A}$	25th percentile	Median	75th percentile
1 ( $A$ )	2025	2027	2030
0.75	2025	2028	2034
0.5	2026	2030	2041
0.25	2027	2033	2051
0 ( $B$ )	2030	2039	2060
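To produce similar statistics for your own pair of models, the percentiles of a mixture can be found by inverting the mixture's CDF numerically. The sketch below is a minimal illustration under stated assumptions: the lognormal shapes, their medians and spreads, and all function names are hypothetical placeholders and are not the distributions $A$ and $B$ used in the post. Times are in years from now.

```python
import math
from statistics import NormalDist


def lognormal_cdf(years: float, median_years: float, sigma: float) -> float:
    """CDF of a lognormal arrival time with the given median and log-scale spread."""
    if years <= 0.0:
        return 0.0
    return NormalDist(mu=math.log(median_years), sigma=sigma).cdf(math.log(years))


def mixture_cdf(years: float, p_a: float, cdf_a, cdf_b) -> float:
    """CDF of the mixture p_a * A + (1 - p_a) * B."""
    return p_a * cdf_a(years) + (1.0 - p_a) * cdf_b(years)


def mixture_quantile(q: float, p_a: float, cdf_a, cdf_b,
                     lo: float = 0.0, hi: float = 1000.0, iters: int = 100) -> float:
    """Bisection search for the time at which the mixture CDF reaches q."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if mixture_cdf(mid, p_a, cdf_a, cdf_b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical placeholder models: A has a median of 5 years, B of 17 years.
    def cdf_a(t: float) -> float:
        return lognormal_cdf(t, median_years=5.0, sigma=0.6)

    def cdf_b(t: float) -> float:
        return lognormal_cdf(t, median_years=17.0, sigma=0.9)

    for p_a in (1.0, 0.75, 0.5, 0.25, 0.0):
        q25, q50, q75 = (mixture_quantile(q, p_a, cdf_a, cdf_b) for q in (0.25, 0.5, 0.75))
        print(f"p_A={p_a:.2f}: 25th={q25:.1f}y, median={q50:.1f}y, 75th={q75:.1f}y")
```

Note that a mixture's percentiles are not weighted averages of the components' percentiles, which is why the table's median does not move linearly with $p_{A}$.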

Toy model

This model is highly flawed but potentially illustrative. Please take it with a massive grain of salt!

Suppose someone deliberates for $T$ years and updates from $p_{A}$ to $p_{A}^{'}$. Had they not done the deliberation, they would have done $T$ years - optimal according to $p_{A}$ - leading to units of output. After the update, each year they generate units of output per year. Suppose this lasts until AGI arrives, which in expectation is .^[9] The person does not regret having done this if

.

Taking and as above, and gives the following:^[10]

Note: the white areas show $T > 2$. I hide these because I don’t condition the expected time remaining on $T$ years having passed.

The asymmetry in the plot is due to the fact that, in the case where one in fact has longer than they’d guessed, their impact multiplier applies for more years (this could make sense if there are things you can do now that grow over time, even if in 2042 both the you-who-did-timelines-work-in-2022 and the you who didn’t have similar AGI timelines).

To use these results, one must have a prior over how their credence in A will change over the duration $T$ (in expectation it must not change at all though!). Someone with low credal resilience will have a 'wider' distribution than someone with higher credal resilience. One could simplify this step further by supposing discrete credences. For example, one could have 1/4 probability of staying at their current credence in A of 0.6, 1/4 credence in moving to 0.2 and 1/2 credence in moving to 0.8.
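The discrete example can be checked mechanically: the probability-weighted average of the possible future credences has to equal the current credence. The sketch below uses exact fractions to confirm that for the numbers in the example, and shows how such a prior could be combined with a gain function. The names are hypothetical, and the `placeholder_gain` function is a stand-in with no connection to the post's results; in practice the gain for each move would be read off the plots.

```python
from fractions import Fraction

current_credence = Fraction(3, 5)  # 0.6 credence in A today

# Prior over where the credence ends up after deliberating: {new credence: probability}.
prior_over_posteriors = {
    Fraction(3, 5): Fraction(1, 4),  # stay at 0.6
    Fraction(1, 5): Fraction(1, 4),  # move to 0.2
    Fraction(4, 5): Fraction(1, 2),  # move to 0.8
}


def expected_posterior(prior: dict) -> Fraction:
    """Probability-weighted mean of the possible future credences."""
    return sum(credence * weight for credence, weight in prior.items())


def expected_gain(current: Fraction, prior: dict, gain) -> Fraction:
    """Expected value of gain(current, new) over the prior on new credences."""
    return sum(weight * gain(current, new) for new, weight in prior.items())


def placeholder_gain(p_old: Fraction, p_new: Fraction) -> Fraction:
    """Hypothetical stand-in: gain proportional to the size of the update."""
    return abs(p_new - p_old)


if __name__ == "__main__":
    assert sum(prior_over_posteriors.values()) == 1
    # In expectation the credence must not change.
    assert expected_posterior(prior_over_posteriors) == current_credence
    print("expected gain (placeholder units):",
          expected_gain(current_credence, prior_over_posteriors, placeholder_gain))
```

A prior that failed the second check would mean you already expect to update in a particular direction, in which case you should simply adopt that credence now.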

The toy model could be greatly improved. For example, one could model the situation as spending non-time resources, in which the above limitation does not occur. Further, one could allow for parallel work on timelines.

^[1] For example, moving from 40% to 60% credence in short timelines (or vice versa) changes the value of the optimal spending schedule very little.
^[2] For example, $A$ is ‘scale is all you need to AGI’ and $B$ is ‘we need more difficult insights’. Or $A$ is your independent impression, which you could become more confident in, and $B$ is deference towards others (having accounted for information cascades etc).
^[3] 25% probability of AGI going well if it arrived this year, and slope parameter $l = 0.15$
^[4] 10% probability of AGI going well if it arrived this year, and slope parameter $l = 0.10$
^[5] 4% probability of AGI going well if it arrived this year, and slope parameter $l = 0.05$
^[6] 25% probability of AGI going well if it arrived after a year and the community had been crunching at rate 1 unit of capacity per year (we start with one unit of capacity) and the slope parameter $κ = 5$
^[7] 10% probability of AGI going well if it arrived after a year and the community had been crunching at rate 1 unit of capacity per year (we start with one unit of capacity) and the slope parameter $κ = 2$
^[8] 5% probability of AGI going well if it arrived after a year and the community had been crunching at rate 1 unit of capacity per year (we start with one unit of capacity) and the slope parameter $κ = 0.5$
^[9] Assuming $T$ is sufficiently small such that conditioning on no AGI in that time makes little difference.
^[10] This roughly approximates the results above.