
1.  Is Optimization Correct?

 The risks of "optimization" have long been recognized in AI alignment, for example in the problem of instrumental convergence (Bostrom 2014). Nevertheless, it is difficult for AI designers to escape the concept of "optimization", because it is so deeply rooted in engineering design, including AI design.

 In value alignment, an AI can be designed to optimize a certain value. Such alignment may seem intuitively harmless, especially when the value is universally accepted as good, such as wellbeing, truth, or justice. However, this intuition may be wrong when it comes to advanced Artificial General Intelligence (AGI).

 This article introduces the “Optimization Prohibition Theorem” as a concept that AI designers can refer to, as an easy-to-understand design guideline for AGI (Okamoto 2024).

 The Optimization Prohibition Theorem prohibits giving advanced AGIs optimization targets based on engineering design principles in AI alignment. The theorem can be proven, under certain assumptions, as follows:

(Proof)

(1). (Assumption 1) AI alignment is performed so that a plurality of AIs have optimization targets based on engineering design principles.

(2). (Assumption 2) These AIs are powerful AGIs that have sufficient resources and will use any means to achieve their optimization targets.

(3). (Assumption 3) The optimization targets of these AGIs are different and cannot be satisfied simultaneously.

(4). Under the above assumptions, if AGI1 and AGI2 try to achieve different optimization targets, AGI2 becomes the only remaining obstacle to AGI1’s optimization, and vice versa. Since both AGI1 and AGI2 have sufficient resources and will use any means to achieve their targets, the result is conflict between the AGIs, which damages the AGIs themselves and infringes human rights through collateral damage.

(5). Therefore, to avoid such conflicts, optimization targets based on engineering design principles should be prohibited, and the Optimization Prohibition Theorem is established as a guideline for the design of advanced AGIs in AI alignment. (End of Proof)

 As described above, according to the Optimization Prohibition Theorem, optimization based on engineering design principles can cause problems under certain conditions in the context of AI alignment of advanced AGIs.
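 To make step (4) of the proof more concrete, here is a minimal toy sketch in Python. It is not part of the original proof; the numeric targets, the escalation rule, and the cost variable are all illustrative assumptions. Two AGIs pull a shared state toward incompatible targets, the opposing efforts largely cancel, so neither target is reached, while the resources spent on the conflict, a stand-in for the damage and collateral harm in step (4), keep growing.

```python
# Toy illustration of two unconstrained optimizers with incompatible targets
# (Assumptions 2 and 3). All values below are invented for illustration.

def simulate(steps: int = 20) -> None:
    x = 0.0                          # shared world state both AGIs care about
    target1, target2 = 100.0, -100.0  # incompatible optimization targets
    total_cost = 0.0                  # stand-in for damage / collateral harm

    for _ in range(steps):
        # Each AGI escalates its effort in proportion to how far the shared
        # state is from its own target ("using all means").
        effort1 = abs(target1 - x) * 0.1
        effort2 = abs(target2 - x) * 0.1
        x += effort1 - effort2        # opposing pushes largely cancel out
        total_cost += effort1 + effort2  # but resources are consumed anyway

    print(f"final shared state x = {x:.2f} (AGI1 wants {target1}, AGI2 wants {target2})")
    print(f"resources burned in the conflict: {total_cost:.1f}")


if __name__ == "__main__":
    simulate()
```

 Under these toy assumptions the shared state never moves, yet the cost term grows with every step: neither optimization target is satisfied, while the conflict itself consumes resources, which is the dynamic the proof appeals to.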

2. For Future Discussion

This article introduced the concept of the "Optimization Prohibition Theorem" as a design guideline for AI alignment of advanced AGIs and showed a proof under certain assumptions. Comments on this article are very welcome.

 

References

  1. Bostrom, N. 2014. Superintelligence: Paths, Dangers, Strategies, United Kingdom: Oxford University Press.
  2. Okamoto, Y. 2024. AI Alignment and Constitution. In Proceedings of the 26th AGI Study Group, No. SIG-AGI026-09, Osaka: Japanese Society for Artificial Intelligence. doi.org/10.11517/jsaisigtwo.2023.AGI-026_56
