Why modelling multi-objective homeostasis is essential for AI alignment (and how it helps with AI safety as well). Subtleties and open challenges.
I notice that there has been very little discussion of why and how considering homeostasis is significant, even essential, for AI alignment and safety. The current post aims to contribute to amending that situation. In this post I will treat alignment and safety as explicitly separate subjects, both of which benefit from homeostatic approaches.

This text is a distillation and reorganisation of three of my older blog posts at Medium:

* Making AI less dangerous: Using homeostasis-based goal structures (2017)
* Project proposal: Corrigibility and interruptibility of homeostasis based agents (2018)
* Diminishing returns and conjunctive goals: Mitigating Goodhart’s law with common sense. Towards corrigibility and interruptibility via the golden middle way. (2018)

I will probably share more such distillations or weaves of my old writings in the future.

Introduction

Much of AI safety discussion revolves around the potential dangers posed by goal-driven artificial agents. In many of these discussions, the agent is assumed to maximise some utility metric over an unbounded timeframe. This simplification, while mathematically convenient, can yield pathological outcomes. A classic example is the so-called “paperclip maximiser”, a “utility monster” which steamrolls over all other objectives to pursue a single goal (e.g. creating as many paperclips as possible) indefinitely. “Specification gaming”, Goodhart’s law, and even “instrumental convergence” are closely related phenomena.

However, in nature, organisms do not typically behave like pure maximisers. Instead, they operate under homeostasis: a principle of maintaining various internal and external variables (e.g. temperature, hunger, social interaction) within certain “good enough” ranges. Going far outside those ranges (too hot, too hungry, too socially isolated) leads to dire consequences, so an organism continually balances multiple needs. Crucially, “too much of a good thing” can be as harmful as too little.
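To make the contrast concrete, here is a minimal sketch of the structural difference between the two kinds of objectives. It is a toy illustration of my own, not code from the linked posts; every variable name, setpoint, and tolerance in it is a hypothetical choice made for illustration:

```python
# Illustrative sketch: an unbounded single-objective maximiser versus a
# homeostatic, setpoint-based objective over several variables at once.

SETPOINTS = {"temperature": 37.0, "satiation": 0.7, "social_contact": 0.5}
TOLERANCES = {"temperature": 0.5, "satiation": 0.2, "social_contact": 0.3}

def maximiser_score(paperclips: float) -> float:
    """Unbounded utility: 'more' is always strictly better."""
    return paperclips

def homeostatic_score(state: dict) -> float:
    """Negative sum of squared, tolerance-scaled deviations from the
    setpoints. The best achievable score is zero, reached only when every
    tracked variable sits at its target; both overshooting and
    undershooting any variable lower the score."""
    return -sum(
        ((state[k] - SETPOINTS[k]) / TOLERANCES[k]) ** 2
        for k in SETPOINTS
    )

# Maximum score (zero): all variables at their setpoints.
print(homeostatic_score(
    {"temperature": 37.0, "satiation": 0.7, "social_contact": 0.5}))
# Negative: overshooting a target is penalised just like undershooting it.
print(homeostatic_score(
    {"temperature": 40.0, "satiation": 1.0, "social_contact": 0.0}))
```

The relevant structural property of the homeostatic form is that it is bounded above: once every tracked variable is at its target there is nothing left to gain, so the agent has no incentive to push any single variable, or the surrounding world, arbitrarily far in one direction.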