FAI Research Constraints and AGI Side Effects

The venture wants to make sure that t_f < t_g so that the eventual AI is friendly (assumption 3). With this, we find that:

$F'_{i}>C_{0}+C_{1}\cdot G'_{i}$

Where the values of C₀ and C₁ both include the friendliness ratio $f_{global}=\frac{F_{remaining}}{G_{remaining}}$ .

$C_{0}=f_{global}G'_{e}-F'_{e}$

$C_{1}=f_{global}$

This implies a linear relationship between F'_i and G'_i. The more FAI research the FAI venture can produce, the more AGI research it is allowed to leak.

This gives us a clean way to go from a G'_i value the venture could expect to the F'_i it would need to be successful.

The C₀ value describes the absolute minimum amount of FAI research necessary in order to have a chance at a successful outcome. While the resulting acceptable leakage ratio at this point would be impossible to meet, the baseline is easy to calculate. Assuming that F'_e << f_global · G'_e, we can estimate that

$C_{0}\approx f_{global}G'_{e}$

If we instead wanted to calculate G'_i from F'_i, we could use the following equation. This may match the intentions of a venture more directly (finding the acceptable amount of AGI leakage after estimating FAI productivity).

$G'_{i}<-(G'_{e}-\frac{F'_{e}}{f_{global}})+(\frac{1}{f_{global}})\cdot F'_{i}$
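The two directions of this bound can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the upper bound on G'_i is written in the form obtained by solving t_f < t_g directly for G'_i, namely (F'_i + F'_e) / f_global - G'_e. The input values are made up for the example.

```python
def min_fai_rate(g_internal, g_external, f_external, f_global):
    """Lower bound on F'_i for a given G'_i: C_0 + C_1 * G'_i."""
    c0 = f_global * g_external - f_external
    c1 = f_global
    return c0 + c1 * g_internal


def max_agi_leak(f_internal, g_external, f_external, f_global):
    """Upper bound on G'_i for a given F'_i: (F'_i + F'_e) / f_global - G'_e."""
    return (f_internal + f_external) / f_global - g_external


# Hypothetical inputs: f_global = 10 r_f/r_g, G'_e = 10 r_g/year, F'_e = 0.
bound = min_fai_rate(g_internal=2.0, g_external=10.0, f_external=0.0, f_global=10.0)
print(bound)                                  # 120.0
print(max_agi_leak(bound, 10.0, 0.0, 10.0))   # 2.0
```

Feeding the lower bound on F'_i back into the second function returns the original G'_i, which is a quick way to check that the two forms describe the same boundary line.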

Model 2 Example

For example, let's imagine that $f_{global}=10\frac{r_{f}}{r_{g}}$ and $G'_{e}=10\frac{r_{g}}{year}$ , with negligible external FAI research. In this case, $C_{0}=100\frac{r_{f}}{year}$ . This means that even if the venture could make sure to leak exactly $0\frac{r_{g}}{year}$ , it would need to average an FAI research rate of 10 times the entire world's output of AGI research. With a project leakage ratio l_project, so that G'_i = l_project · F'_i, the required amount increases as 100 / (1 - 10 * l_project). If the venture expects a leakage ratio of 0.05, it would need to double its research output to $F'_{i}=200\frac{r_{f}}{year}$ , or 20 times global AGI output.
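The dependence on the leakage ratio can be reproduced with a short sketch. It assumes, as an illustration, that internal AGI leakage is proportional to internal FAI output (G'_i = l_project · F'_i) and that external FAI research is negligible; the function name and default values are hypothetical choices matching the example.

```python
def required_fai_rate(leakage_ratio, f_global=10.0, g_external=10.0, f_external=0.0):
    """F'_i needed when G'_i = leakage_ratio * F'_i.

    Substituting into F'_i > C_0 + f_global * G'_i gives
    F'_i > C_0 / (1 - f_global * leakage_ratio).
    """
    c0 = f_global * g_external - f_external
    margin = 1.0 - f_global * leakage_ratio
    if margin <= 0:
        # At or above the global leakage ratio no finite rate is enough.
        return float("inf")
    return c0 / margin


for l_project in (0.0, 0.05, 0.1):
    print(l_project, required_fai_rate(l_project))
```

With these inputs the requirement is 100 r_f/year at zero leakage and 200 r_f/year at a leakage ratio of 0.05. It becomes unbounded once the leakage ratio reaches 1 / f_global = 0.1.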

Figure 2. F'_i per unit of maximum permissible G'_i

What to do?

The numbers in the example above are a bit depressing. There is so much global AI research that it seems difficult to imagine the world averaging an even higher rate of FAI research, which would be necessary if the friendliness ratio is greater than 1.

There are some upsides. First, much hard AI work is done privately in technology companies without being published, limiting G'_i. Second, the amounts of r_g and r_f don't perfectly correlate with the difficulty of producing them. It may be that we have diminishing marginal returns at our current levels of r_g, so similar levels of r_f will be easier to reach.

It's possible that F_remaining may be surprisingly low or that G_remaining may be surprisingly high.

Projects with high leakage ratios don't have to be completely avoided or hidden. The G'_i value is specifically for research that will be in the hands of the group that eventually creates an AGI, so it would make sense for FAI research organizations to share high-risk information with each other as long as it doesn't leak externally. The FAI venture mentioned above could be viewed as a collection of organizations rather than a single one. It may even be difficult for AGI research implications to move externally, if the FAI academic literature is significantly separated from the AGI academic literature. This logic provides a heuristic guide to choosing research projects, deciding whether to publish research already done, and managing concentrations of information.

Model 2 Assumptions:

1-3. The same 3 assumptions for the previous model.

4. The rates of research creation will be fairly constant.

5. External and internal rates of research do not influence each other.

Conclusion

The friendliness ratio provides a high-level understanding of the amount of global FAI research needed per unit of AGI research to create an FAI. The leakage ratio is the inverse of the friendliness ratio; applied to a specific FAI project, it indicates whether that project is net friendliness positive. Together these can be used to understand the challenge for AGI research and to tell whether a particular project is net beneficial or net harmful.

To understand the challenges facing an FAI Venture, we found the simple inequality

$F'_{i}>C_{0}+C_{1}\cdot G'_{i}$

where

$C_{0}=f_{global}G'_{e}-F'_{e}$

$C_{1}=f_{global}$

This paper focused on establishing the models rather than estimating their input values. If the models are considered useful, more research should be done to estimate these numbers. The models could also be improved to incorporate uncertainty, the growing returns of research, and other important limitations that we haven't considered. Finally, the friendliness ratio concept naturally generalizes to other technology-induced existential risks.

Appendix

a. Math manipulation for Model 2

$t_{f}<t_{g}$

$\frac{F_{remaining}}{F'_{i}+F'_{e}}<\frac{G_{remaining}}{G'_{i}+G'_{e}}$

Assuming the combined research rates are positive, cross-multiplying gives

$F_{remaining}\cdot(G'_{i}+G'_{e})<G_{remaining}\cdot(F'_{i}+F'_{e})$

$G'_{i}<(\frac{G_{remaining}\cdot F'_{e}}{F_{remaining}}-G'_{e})+(\frac{G_{remaining}}{F_{remaining}}\cdot F'_{i})$

$F'_{i}>(\frac{F_{remaining}\cdot G'_{e}}{G_{remaining}}-F'_{e})+(\frac{F_{remaining}\cdot G'_{i}}{G_{remaining}})$

This last equation can be written as

$F'_{i}>C_{0}+C_{1}\cdot G'_{i}$

Where

$C_{0}=\frac{F_{remaining}\cdot G'_{e}}{G_{remaining}}-F'_{e}$

$C_{1}=\frac{F_{remaining}}{G_{remaining}}$

Recalling the friendliness ratio, $f_{global}=\frac{F_{remaining}}{G_{remaining}}$ , we can simplify these constructs further.

$C_{0}=f_{global}G'_{e}-F'_{e}$

$C_{1}=f_{global}$



by JustinShovelain

3rd Jun 2015


Ozzie Gooen and Justin Shovelain

Summary

Introduction

Model 1. The Friendliness and Leakage Ratios for an FAI Project

The Friendliness Ratio

Which threshold is higher? According to much of the research in this field, F_remaining. We need significantly more research to create a friendly AI than an unfriendly one.

Figure 1. Example research thresholds for AGI and FAI.

To understand the relationship between these thresholds, we use the following equation.

$f_{global}=\frac{\mathit{F_{remaining}}}{\mathit{G_{remaining}}}$

The Leakage Ratio

For specific projects it may be useful to have a measure that focuses directly on the negative outcome.

For this we can use the leakage ratio, which represents the amount of undesired AGI research created per unit of FAI research. It is simply the inverse of the friendliness ratio.

$l_{global}=\frac{\mathit{G_{remaining}}}{\mathit{F_{remaining}}}$

$l_{project}=\frac{\mathit{G_{project}}}{\mathit{F_{project}}}$

In order for a project to be net beneficial,

$l_{project}<l_{global}$

Estimating if a Project is Net Friendliness Positive

Question: How can one estimate if a project is net friendliness-positive?

A naive answer would be to make sure that it falls over the global friendliness ratio or under the global leakage ratio. If instead a project produces F_project and G_project such that

$\frac{F_{project}}{G_{project}}<f_{global}$

then the project is net friendliness negative. Later research would need to make up for this under-balance.

AI Research Example

	Description	AGI Research $G_{p}$	FAI Research $F_{p}$	Friendliness Ratio $f_{p}$	Leakage Ratio $l_{p}$
Project 1	Rat simulation	$10r_{g}$	$60r_{f}$	6	0.17
Project 2	Math Paper	$2r_{g}$	$22r_{f}$	11	0.09
Project 3	Technical FAI Advocacy	$1r_{g}$	$14r_{f}$	14	0.07

In this case, only Projects 2 and 3 have a leakage ratio of less than 0.1, meaning that only these would be net beneficial. Even though Project 1 has generated safety research, it would be net negative.
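The table's ratios and the net-beneficial check can be reproduced with a small sketch. The global leakage ratio of 0.1 is taken from the threshold used in the example; the variable and function names are hypothetical.

```python
# (name, description, AGI research in r_g, FAI research in r_f), from the table.
projects = [
    ("Project 1", "Rat simulation", 10.0, 60.0),
    ("Project 2", "Math Paper", 2.0, 22.0),
    ("Project 3", "Technical FAI Advocacy", 1.0, 14.0),
]

L_GLOBAL = 0.1  # global leakage ratio assumed by the example


def assess(g_project, f_project, l_global=L_GLOBAL):
    """Return (friendliness ratio, leakage ratio, net beneficial?)."""
    friendliness = f_project / g_project
    leakage = g_project / f_project
    return friendliness, leakage, leakage < l_global


results = {name: assess(g, f) for name, _, g, f in projects}
for name, (friendliness, leakage, ok) in results.items():
    print(f"{name}: f_p={friendliness:.0f}, l_p={leakage:.2f}, net beneficial={ok}")
```

Only the leakage comparison is needed for the verdict. The friendliness ratio is computed alongside it because the two are reciprocals and the table reports both.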

Model 1 Assumptions:

1. There exists some threshold G_remaining of research necessary to generate an unfriendly artificial intelligence.

2. There exists some threshold F_remaining of research necessary to generate a friendly artificial intelligence.

3. If G_remaining is reached before F_remaining, a UFAI will be created. If after, an FAI will be created.

Model 2. AGI Leakage Limits of an FAI Venture

Question: How can an FAI venture ensure the creation of an FAI?

Let's imagine a group that plans to ensure that an FAI is created. We call this an FAI Venture.

G'_i = AGI research produced internally per year

F'_i = FAI research produced internally per year

G'_e = AGI research produced externally per year

F'_e = FAI research produced externally per year

There exist times, t_f and t_g, at which the friendly and general remaining thresholds are met.

t_f = Year in which F_remaining is met

t_g = Year in which G_remaining is met

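Assuming constant research rates, these times (measured in years from now) can be estimated as follows:

$t_{f}=\frac{F_{remaining}}{F'_{i}+F'_{e}}$

$t_{g}=\frac{G_{remaining}}{G'_{i}+G'_{e}}$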
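As a minimal sketch, assuming constant rates and that each time is simply the remaining research divided by the combined internal and external rate, the comparison of t_f and t_g can be computed as below. All numbers and names are hypothetical and chosen only so that the friendliness ratio is 10.

```python
def years_until(remaining, internal_rate, external_rate):
    """Years until a research threshold is met at constant combined rates."""
    return remaining / (internal_rate + external_rate)


# Hypothetical thresholds: F_remaining / G_remaining = 10.
F_REMAINING = 1000.0  # r_f
G_REMAINING = 100.0   # r_g

t_f = years_until(F_REMAINING, internal_rate=150.0, external_rate=0.0)
t_g = years_until(G_REMAINING, internal_rate=2.0, external_rate=10.0)
fai_first = t_f < t_g

print(f"t_f = {t_f:.2f} years, t_g = {t_g:.2f} years, FAI first: {fai_first}")
```

In this made-up case the venture produces 150 r_f/year while leaking 2 r_g/year. That is enough for the friendly threshold to be reached first, whereas a lower internal FAI rate such as 110 r_f/year would not be.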
