Toy model: convergent instrumental goals
tl;dr: Toy model to illustrate convergent instrumental goals.
Steve Omohundro identified 'AI drives' (also called 'convergent instrumental goals') that almost all intelligent agents would converge to:
- Self-improve
- Be rational
- Protect utility function
- Prevent counterfeit utility
- Self-protective
- Acquire resources and use them efficiently
This post will attempt to illustrate some of these drives, by building on the previous toy model of the control problem, which was further improved by Jaan Tallinn.
Systemic risk: a moral tale of ten insurance companies
Once upon a time...
Imagine there were ten insurance sectors, each sector being a different large risk (or possibly the same risks, in different geographical areas). All of these risks are taken to be independent.
To simplify, we assume that all the risks follow the same yearly payout distribution. The details of the distribution don't matter much for the argument, but in this toy model, the payouts follow the binomial distribution with n=10 and p=0.5, with millions of pounds as the unit:

This means that the probability that each sector pays out £n million each year is (0.5)^10 × 10!/(n!(10-n)!).
All these companies are bound by Solvency II-like requirements, which mandate that they have to be 99.5% sure of paying out all their policies in a given year - or, put another way, that they only fail to pay out once in every 200 years on average. To do so, in each sector, the insurance companies have to have capital totalling £9 million available every year (the red dashed line).
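As a quick check of the £9 million figure, here is a minimal sketch that finds the smallest whole-million capital level covering at least 99.5% of years under the binomial model. The function names `binomial_cdf` and `required_capital` are illustrative choices, not from the original post.

```python
from math import comb

def binomial_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def required_capital(n, p=0.5, confidence=0.995):
    """Smallest k (in £ million) such that P(payout <= k) >= confidence."""
    for k in range(n + 1):
        if binomial_cdf(k, n, p) >= confidence:
            return k
    return n  # fallback in case rounding keeps the final CDF value just below 1

print(required_capital(10))  # 9
```

Because the distribution is discrete, £8 million would cover only about 98.9% of years, so £9 million is the smallest amount that clears the bar; it actually covers about 99.9% of years. The post treats this as a round 1-in-200 failure rate.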
Assume that each sector expects £1 million in total yearly expected profit. Then since the expected payout is £5 million, each sector will charge £6 million a year in premiums. They must thus maintain a capital reserve of £3 million each year (they get £6 million in premiums, and must maintain a total of £9 million). They thus invest £3 million to get an expected profit of £1 million - a tidy profit!
Every two hundred years, one of the insurance sectors goes bust and has to be bailed out somehow; every hundred billion trillion years, all ten insurance sectors go bust at the same time. We assume this is too big to be bailed out, and there's a grand collapse of the whole insurance industry with knock-on effects throughout the economy.
But now assume that insurance companies are allowed to invest in each other's sectors. The most efficient way of doing so is to buy equally in each of the ten sectors. The payouts across the market as a whole are now described by the discrete binomial distribution with n=100 and p=0.5:

This is a much narrower distribution (relative to its mean). In order to have enough capital to pay out 99.5% of the time, the whole industry needs only keep £63 million in capital (the red dashed line). Note that this is far less than the combined capital for each sector when they were separate, which would be ten times £9 million, or £90 million (the pink dashed line). There is thus a profit-taking opportunity in this area (it comes from the fact that, for independent X and Y, the standard deviation of X+Y is less than the standard deviation of X plus the standard deviation of Y).
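The comparison between separate and pooled capital can be reproduced with a short sketch under the same binomial assumptions. The helper name `required_capital` and the variable names are illustrative, not from the post.

```python
from math import comb, sqrt

def required_capital(n, p=0.5, confidence=0.995):
    """Smallest k (in £ million) such that P(payout <= k) >= confidence."""
    cumulative = 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p)**(n - k)
        if cumulative >= confidence:
            return k
    return n  # fallback in case rounding keeps the running total just below 1

# Ten sectors each holding their own capital vs. one pooled industry.
separate = 10 * required_capital(10)
pooled = required_capital(100)

# Standard deviations: ten separate sd's added up vs. the sd of the pooled sum.
sum_of_sds = 10 * sqrt(10 * 0.5 * 0.5)
pooled_sd = sqrt(100 * 0.5 * 0.5)

print(separate, pooled)                  # 90 63
print(round(sum_of_sds, 2), pooled_sd)   # 15.81 5.0
```

The pooled standard deviation grows with the square root of the number of independent sectors rather than linearly, which is where the saving in capital comes from.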
If the industry still expects to make an expected profit of £1 million per sector, this comes to £10 million total. The expected payout is £50 million, so they will charge £60 million in premiums. To meet their Solvency II obligations, they still need to hold an extra £3 million in capital (since £63 million - £60 million = £3 million). However, this is now across the whole insurance industry, not just per sector.
Thus they expect profits of £10 million based on holding capital of £3 million - astronomical profits! Of course, that assumes that the insurance companies capture all the surplus from cross-investing; in reality there would be competition, and a buyer surplus as well. But the general point is that there is a vast profit opportunity available from cross-investing, and thus if these investments are possible, they will be made. This conclusion is not dependent on the specific assumptions of the model, but captures the general result that insuring independent risks reduces total risk.
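To make the per-sector and pooled figures directly comparable, here is a small sketch of the arithmetic. The function `capital_economics` is a hypothetical helper, and it assumes, as the post does, that insurers capture the whole surplus.

```python
def capital_economics(required_capital, expected_payout, target_profit):
    """Return (premiums, own capital held, expected return on that capital).

    All amounts are in £ million per year.
    """
    premiums = expected_payout + target_profit
    own_capital = required_capital - premiums
    return premiums, own_capital, target_profit / own_capital

# One sector on its own: £9M required, £5M expected payout, £1M profit.
print(capital_economics(9, 5, 1))     # premiums 6, capital 3, return about 0.33

# Whole industry pooled: £63M required, £50M expected payout, £10M profit.
print(capital_economics(63, 50, 10))  # premiums 60, capital 3, return about 3.33
```

The same £3 million of held capital supports ten times the expected profit once the risks are pooled.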
But note what has happened now: once every 200 years, an insurance company that has spread its investments across the ten sectors will be unable to pay out what it owes. However, every company will be following this strategy! So when one goes bust, they all go bust. Thus the complete collapse of the insurance industry is no longer a one-in-a-hundred-billion-trillion-year event, but a one-in-two-hundred-year event. The risk for each company has stayed the same (and their profits have gone up), but the systemic risk across the whole insurance industry has gone up tremendously.
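The two collapse frequencies follow from simple arithmetic, sketched below. It takes the post's round figure of a 1-in-200 yearly failure rate as given; the constant and variable names are illustrative.

```python
# Assumption (from the text): each sector, or the pooled industry, fails to
# pay out with probability 1/200 in any given year.
YEARS_BETWEEN_FAILURES = 200
SECTORS = 10

# Separate sectors: failures are independent, so all ten fail in the same
# year with probability (1/200)**10.
years_between_total_collapse_separate = YEARS_BETWEEN_FAILURES ** SECTORS

# Fully cross-invested: every company holds the same pooled portfolio, so a
# single failure is everyone's failure.
years_between_total_collapse_pooled = YEARS_BETWEEN_FAILURES

print(f"{years_between_total_collapse_separate:.3e}")  # 1.024e+23
print(years_between_total_collapse_pooled)             # 200
```

About 10^23 years is the "hundred billion trillion" in the text; cross-investing shrinks that to 200.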
...and they failed to live happily ever after for very much longer.
Toy problem: increase production or use production?
There is a class of problems that I noticed comes up again and again in various scenarios. Abstractly, you can formulate it like this: given a time limit, how much time should you spend increasing your production capacity, and then how much time should you use your production capacity to produce utility? Let's take a look at two versions of this problem:
Version 1:
You have N days. You start with a production capacity C=0 and accumulated utility U=0. Each day you can either: 1) increase your production capacity (C=C+1) or 2) use your current production capacity to produce utility (U=U+C).
Question: On what days should you increase your production, and on what days should you produce utility to maximize total accumulated utility at the end of the N days?
It's trivial to prove that the optimal solution looks like increasing capacity for T days, and then switching to producing utility for N-T days. What is T? In this case it's really straightforward to figure it out. We can compute final utility as U(T)=(N-T)*T. The maximum is at T=N/2. So, you should spend the first half increasing your production and the second half producing utility. Interesting...
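For small N, the claim can be checked by brute force over every possible schedule. This is a sketch for illustration; the function names and the "grow"/"produce" labels are my own, not from the post.

```python
from itertools import product

def simulate(schedule):
    """Run a Version 1 schedule and return the final accumulated utility."""
    capacity, utility = 0, 0
    for action in schedule:
        if action == "grow":
            capacity += 1        # C = C + 1
        else:
            utility += capacity  # U = U + C
    return utility

def best_by_brute_force(n_days):
    """Best utility over all 2**n_days possible schedules."""
    schedules = product(("grow", "produce"), repeat=n_days)
    return max(simulate(s) for s in schedules)

def best_by_formula(n_days):
    """Best utility over 'grow for T days, then produce': max of (N-T)*T."""
    return max((n_days - t) * t for t in range(n_days + 1))

for n in range(1, 11):
    assert best_by_brute_force(n) == best_by_formula(n)

print(best_by_formula(10))  # 25, reached at T = 5
```

For odd N the two integers on either side of N/2 give the same utility.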
Version 2:
You have N days. You start with a production capacity C=1 and accumulated utility U=0. Each day you can either: 1) increase your production capacity by a factor F (C=C*F), where F>1 or 2) use your current production capacity to produce utility (U=U+C).
Same question. Now the final utility is U(T)=F^T*(N-T). Doing basic calculus, we find the optimal T=max(0, N-1/ln(F)). A few interesting points you can take away from this solution:
1) If your growth factor F is not large enough, you might have to stick with your original production capacity of 1 and never increase it. E.g. F=1.01 and N=100, where optimal T=max(0,-0.499171)=0.
2) The bigger N is, the lower the growth factor that is still worth using, i.e. that gives T>0.
3) For most scenarios, you should spend 80-90% of the time increasing production. With larger F, T will approach N. This reminds me of the Reducing Astronomical Waste post.
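The formula comes from setting dU/dT = F^T*(ln(F)*(N-T) - 1) to zero, which gives T = N - 1/ln(F). Here is a minimal sketch of that result, plus an integer search for comparison; the function names are illustrative. It also computes the break-even growth factor from point 2, which follows from requiring N > 1/ln(F), i.e. F > e^(1/N).

```python
from math import exp, log

def continuous_optimum(n_days, factor):
    """Optimal switch time T = max(0, N - 1/ln(F)) for real-valued T."""
    return max(0.0, n_days - 1 / log(factor))

def best_integer_switch(n_days, factor):
    """Whole-day switch time maximising U(T) = F**T * (N - T)."""
    return max(range(n_days + 1), key=lambda t: factor**t * (n_days - t))

def break_even_factor(n_days):
    """Growth factor above which spending any time on growth pays off."""
    return exp(1 / n_days)

print(continuous_optimum(100, 1.01))   # 0.0, growth is not worth it
print(break_even_factor(100))          # slightly above 1.01
print(continuous_optimum(100, 1.1))    # roughly 89.5 days out of 100
print(best_integer_switch(20, 1.3))    # 16
```

With F=1.01 and N=100 the growth factor sits just under the break-even value, which is why the example in point 1 lands on T=0. When restricted to whole days, two adjacent values of T can tie (for instance F=2 gives the same utility at T=N-2 and T=N-1).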
Questions for you: where have you seen these types of problems come up in your life? Is this a known class of problems?