More clarity and context here
The short TLDR version is here
I formulate a desideratum/procedure (called "phenomenal ethics"):
Expanding the action space and autonomy of as many phenomena as possible (enhancing ecosystemic values, and the means to reach/understand options), modulated by variables mitigating frantic optimization (to respect natural rates of change, etc.).
We need to find a tractable desideratum securing actions in an adequate value space.
Utilitarianism based on well-being requires a definition of well-being:
it demands that we maximize happiness, but what is happiness?
The classical issues arise: amorphous pleasure, wireheading, drug addiction, etc.
Here I'll defend the claim that the production of "well-being" is intrinsic to the procedure/desideratum I formulate. An ideal enactment of phenomenal ethics inherently maximizes the possibility for anyone/anything to do as they will, provided that *what they will* also enhances the possibility for other 'things' to do *as they will*.
The principle is quite elementary, but how do we compute it properly?
It is often argued that our morality is subjective, but that doesn't mean morality is arbitrary. Although it is true that our current moral systems can be thought of as miscalibrated/imperfect, they are approximations driven by an underlying attractor emerging from evolutionary constraints.
It doesn't follow that better calibration is impossible. Ethics is a subfield of game theory, which is a subfield of natural selection, which is a subfield of process theory.
.Constructor Theory as Process Theory
+https://twitter.com/charleswangb/status/1695182848553275510
.Revisiting the Social Origins of Human Morality: A Constructivist Perspective on the Nature of Moral Sense-Making
.A moral trade-off system produces intuitive judgments that are rational and coherent and strike a balance between conflicting moral values
.Harnessing Higher-Order (Meta-)Logic to Represent and Reason with Complex Ethical Theories
.Survival of the Friendliest: Homo sapiens Evolved via Selection for Prosociality
This is not to say moral decisions are easy.
.On the Computational Complexity of Ethics: Moral Tractability for Minds and Machines
If we take a step back and ask why something like 'ethics' would emerge at all, it seems that a higher-order Nash equilibrium at the scale of a group, a species or several species is necessary. The development of complexity and chaotic systems implies the production of Nash equilibria able to maintain structures/continuity so that open-endedness can go on (and not fall into randomness). In such a context, the circulation of energy is bound to produce ecosystems (of alive as well as non-alive phenomena).
.Open-ended evolution
"Open-ended evolution (OEE) is a major area of research in artificial life. While there is not full consensus on how to define the specifics of OEE, at a high level it describes an evolving system that will never settle into a single stable equilibrium. Some researchers argue that the continual generation of novelty is sufficient for a system to be labeled as open-ended (Soros & Stanley, 2014), while others have suggested that OEE “refers to the unbounded increase in complexity that seems to characterize evolution on multiple scales” (Corominas-Murtra, Seoane & Solé, 2018). In practice, these definitions are closely inter-related, and OEE is often used as an umbrella term to refer to the study of them all collectively. The key idea of open-ended evolution is that a system produces organisms that are continuously evolving and changing, rather than organisms eventually reaching a state where nothing changes.
In this critical in-between dynamic, the notion of open-endedness is core. If we consult our moral intuitions on this question, a fairly clear answer may surface.
Let's do a thought experiment.
"Humans are dead tomorrow": which of the following would be better?
1) A future where nothing interesting ever happens, i.e. an AI kills humans and turns the universe into dead, inert matter.
-> An outcome resembling the worry of Eliezer Yudkowsky:
"I expect that unaligned superintelligence will overwrite all nearby galaxies with stuff that is not very interesting or good from even the most embracing, cosmopolitan viewpoint on accepting diversity as valuable. I expect mostly lifeless paperclips, or stuff worth that little."
Let's say that in this scenario, an artificial agent brings heat death to the universe (tomorrow). All right, now let's see the second timeline:
2) Humans are dead (virus and/or AI) but the universe is blooming with sophisticated phenomena (at many different physical scales) and diverse, lively ecosystems (biological and more), including human-level consciousness (and less, and more).
Let's say that in this scenario, the universe is fully open-ended.
-> Our moral intuition seems clear, favoring this second, open-ended scenario.
.OMNI: Open-endedness via Models of human Notions of Interestingness
Another clear intuition emerges from a follow-up question:
What is better for the living species of the second scenario, for them to live in:
Torture, or well-being?
This second worry could be even harder to define, but the moral intuition for well-being is obvious. How do we avoid the worst-case dead/tortured universes raised by these two "dilemmas"? In such intricacy, the first moral intuition (open-endedness) may paradoxically be the more tractable one.
To uncover it, we have to understand why we are programmed to have such a compelling instinct. It is a case we'll analyze through the subject at hand: a procedure for universal/phenomenal ethics.
Let's define the desideratum: an agent (human/AI/AGI/ASI, etc.) has to enhance the autonomy and available actions of existing "behaviors" (phenomena/environments/agents, etc.), which implies selecting out behaviors that aren't causally beneficial to other behaviors (ecosystemic well-being).
I include in the desideratum phenomena that aren't considered alive; any physical event is included (which is why I call it "phenomenal ethics"). It is an extension of paradigms such as:
.The Hippocratic principle - Vanessa Kosoy
"I propose a new formal desideratum for alignment: the Hippocratic principle. Informally the principle says: an AI shouldn't make things worse compared to letting the user handle them on their own, in expectation w.r.t. the user's beliefs. This is similar to the dangerousness bound I talked about before, and is also related to corrigibility."
.Learning Altruistic Behaviours in Reinforcement Learning without External Rewards
"We propose to act altruistically towards other agents by giving them more choice and allowing them to achieve their goals better. Some concrete examples include opening a door for others or safeguarding them to pursue their objectives without interference. We formalize this concept and propose an altruistic agent that learns to increase the choices another agent has by preferring to maximize the number of states that the other agent can reach in its future."
.From AI for people to AI for the world and the universe
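To make the reachable-states idea from the quoted proposal concrete, here is a minimal sketch, assuming a toy deterministic gridworld and a hypothetical `step` transition function; it illustrates the choice-maximization principle, not the paper's actual training setup:

```python
from collections import deque

# Toy deterministic gridworld: states are (x, y) cells, actions are moves.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(state, action, blocked):
    """Hypothetical transition: move unless the target cell is blocked."""
    nxt = (state[0] + action[0], state[1] + action[1])
    return state if nxt in blocked else nxt

def reachable_states(state, blocked, horizon):
    """Count the states an agent can reach within `horizon` steps (BFS)."""
    seen, frontier = {state}, deque([(state, 0)])
    while frontier:
        s, d = frontier.popleft()
        if d == horizon:
            continue
        for a in ACTIONS:
            s2 = step(s, a, blocked)
            if s2 not in seen:
                seen.add(s2)
                frontier.append((s2, d + 1))
    return len(seen)

def altruistic_action(my_state, other_state, blocked, horizon=5):
    """Pick our move that leaves the *other* agent the most reachable states
    (assumption: our body acts as a temporary obstacle for the other agent)."""
    def other_choice(my_action):
        my_next = step(my_state, my_action, blocked)
        return reachable_states(other_state, blocked | {my_next}, horizon)
    return max(ACTIONS, key=other_choice)
```

The altruistic agent here never consults the other agent's reward; it only keeps the other's future state space as open as possible, which is one way of formalizing "giving them more choice".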
The aim is to provide affordances, access to adjacent possibles: to allow phenomena, ecosystems and individuals to bloom, and to develop diversity in as many dimensions as possible.
.Patterns of innovation
"The “adjacent possible”, introduced by the complexity theorist Stuart Kauffmann consists of all the unexplored possibilities surrounding a particular phenomenon: ideas, words, songs, molecules, genomes, and so on. The very definition of adjacent possible encodes the dichotomy between the actual and the possible: the actual realization of a given phenomenon and the space of possibilities still unexplored. But all the connections between these elements are hard to measure and quantify when including the things that are entirely unexpected and hard to imagine."
.The Capability Approach to Human Welfare
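The adjacent possible also has a simple combinatorial reading, easy to compute once a possibility graph is given (the graph below is an illustrative assumption; the hard part in reality, as the quote notes, is that this graph is unknown and partly unimaginable):

```python
def adjacent_possible(realized, neighbors):
    """Everything one step away from what already exists, but not yet
    realized. `neighbors` maps each possibility to those directly
    reachable from it."""
    frontier = set()
    for node in realized:
        frontier |= set(neighbors.get(node, ()))
    return frontier - realized

# Illustrative possibility graph: realizing an idea unlocks its neighbors.
graph = {"fire": ["cooking", "metallurgy"], "cooking": ["fermentation"]}
print(adjacent_possible({"fire"}, graph))             # {'cooking', 'metallurgy'}
print(adjacent_possible({"fire", "cooking"}, graph))  # {'metallurgy', 'fermentation'}
```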
Frantic transformation of everything in order to hunt for new dimensions of possibility is to be avoided, so we need to relativize the desideratum with other constraints/parameters added to the equation (these parameters aren't absolute but variables for value attribution; a toy weighting sketch follows the list):
Existing phenomena are prioritized over potential phenomena
The intensity of an action's impact has to be minimized the higher the uncertainty is
Phenomena untainted by the AI's causal power are prioritized over phenomena tainted by it
Normal rates of change are to be prioritized over abnormal ones; 'normal' being the natural rate at which a phenomenon evolves without intervention, in an environment free of radical events ('radical events' being catastrophes, great extinctions, etc.). Normality here is statistical: a Gaussian.
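As a toy illustration of how these variables could modulate value attribution (every weight and the Gaussian normality term below are assumptions chosen for readability, not a worked-out metric):

```python
import math

def action_value(base_gain, exists, uncertainty, ai_tainted,
                 induced_rate, natural_rate, natural_rate_std):
    """Toy value attribution for a candidate action.
    base_gain        : estimated gain in ecosystemic action space/autonomy
    exists           : True if the affected phenomenon already exists
    uncertainty      : in [0, 1]; damps the action's impact as it grows
    ai_tainted       : True if the phenomenon is already shaped by the AI
    induced_rate     : rate of change the action would induce
    natural_rate     : the phenomenon's baseline rate of change
    natural_rate_std : spread of that baseline (normality as a Gaussian)
    """
    w_exist   = 1.0 if exists else 0.5      # existing > potential
    w_taint   = 0.5 if ai_tainted else 1.0  # untainted > tainted
    w_certain = 1.0 - uncertainty           # minimize impact under uncertainty
    # Gaussian normality: rates near the natural baseline score highest.
    z = (induced_rate - natural_rate) / natural_rate_std
    w_normal = math.exp(-0.5 * z * z)
    return base_gain * w_exist * w_taint * w_certain * w_normal
```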
Coherence is what distinguishes one phenomenon from others, together with the Gaussian of probabilities surrounding it; hierarchies of sets are detected through correlations.
More precise definitions of the self exist as well:
.The Computational Boundary of a “Self”: Developmental Bioelectricity Drives Multicellularity and Scale-Free Cognition
An important parameter is that more care is to be given to phenomena along the continuum of qualia (the more qualia are present, the more precaution is given, i.e. more time/energy). Thus, frameworks to define life/consciousness are useful.
Qualia bring to the surface the challenge of detecting suffering.
.Nonconscious Cognitive Suffering: Considering Suffering Risks of Embodied Artificial Intelligence
.The Moral Consideration of Artificial Entities: A Literature Review
.The lived experience of depression: a bottom-up review co-written by experts by experience and academics
.Biology, Buddhism, and AI: Care as the Driver of Intelligence
Taken in a broad sense, suffering is symmetrical and opposed to happiness.
However, local/myopic, short-term happiness is regularly detrimental, while local suffering can be beneficial. This brings us back to our desideratum, and to the question of how natural it is: how beings may implicitly drift towards it through the evolutionary process.
If so, it could be possible to define suffering as inversely proportional to the desideratum. Within such a framing, taking the happiness example, the "boundless bliss" of an individual phenomenon would lower its own access to the action space.
Wireheading does not increase the agency of the wireheaded.
When the desideratum is taken into full account, such bliss is also untenable: an individual phenomenon's "well-being" is inherently balanced against the benefit of phenomenal co-affordance, which is to say, against the "well-being" of the whole ecosystem/reality.
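A first-pass formalization of that inverse relation, under purely illustrative notation (writing $A_t(p)$ for the action space actually accessible to phenomenon $p$ at time $t$, and $A^*_t(p)$ for its baseline accessible action space):

$$S_t(p) \propto 1 - \frac{|A_t(p)|}{|A^*_t(p)|}, \qquad W_t(p) = 1 - S_t(p)$$

Wireheading then drives $|A_t(p)|$ toward a tiny set of states while reported "happiness" stays high, so $S_t(p)$ rises despite the bliss, matching the intuition above.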
Implicitly, the game-theoretic mechanisms behind this desideratum naturally calibrate its execution. The crucial task is to make ecosystemic value blossom, which means that even a seemingly insignificant phenomenon could be fundamental. Furthermore, a phenomenon might be inconsequential now and paramount later on.
Each causal chain contains "endemic" value.
.Simpler Math Predicts How Close Ecosystems Are to Collapse
How does a basal cognition/human/AI, etc. explore its space of attention and actions?
How does it know its opportunities?
And how does it come to bypass single-step optimality and see many steps ahead, accessing a more precise and complete map of opportunities?
The most efficient next step depends on your capacity to see far, wide, and precisely.
If you see well enough, your world/behavioral model can turn the optimal next step inward rather than outward: recursive self-inspection, looking at one's own source code, etc. (a toy sketch follows the references below).
.Technological Approach to Mind Everywhere: An Experimentally-Grounded Framework for Understanding Diverse Bodies and Minds
.Learning Intuitive Policies Using Action Features
.Finding Counterfactually Optimal Action Sequences in Continuous State Spaces
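Here is the promised toy sketch of the gap between single-step optimality and seeing many steps ahead (the world model and rewards are assumptions for illustration only):

```python
CANDIDATES = ("stay", "explore")

def lookahead_value(state, depth, model, reward):
    """Best cumulative reward achievable within `depth` steps."""
    if depth == 0:
        return 0.0
    return max(reward(state, a)
               + lookahead_value(model(state, a), depth - 1, model, reward)
               for a in CANDIDATES)

def best_action(state, depth, model, reward):
    """Action maximizing the `depth`-step value."""
    return max(CANDIDATES,
               key=lambda a: reward(state, a)
               + lookahead_value(model(state, a), depth - 1, model, reward))

# Toy world: exploring costs now but unlocks a richer region later.
model = lambda s, a: s + 1 if a == "explore" else s
reward = lambda s, a: (-1 if a == "explore" else 0) + (10 if s >= 3 else 0)

print(best_action(0, depth=1, model=model, reward=reward))  # 'stay': greedy
print(best_action(0, depth=5, model=model, reward=reward))  # 'explore': far-sighted
```

The greedy agent never pays the exploration cost; the deeper one accepts local "suffering" for a wider map of opportunities, echoing the point above.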
This paradigm brings perspective to an alignment procedure that implements the meaning of loss with regard to destruction. Each bit of data/information has inequivalent qualities that can't be perfectly replaced; every new bit is a phenomenon tied to (sometimes only slightly) different potentials (properties, momentum, causalities).
This ontology/desideratum is an operational definition of open-ended knowledge-seeking. The AI mitigates strict losses of ecosystemic autonomy, enhances access to behaviors that are themselves able to enhance access to behaviors, and tolerates sub-optimal external (environmental) choices that still obey said enhancements.
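A schematic of those three clauses as an action screen (all names here, `reach`, `apply_action`, `CANDIDATES`, are illustrative assumptions; `reach(p, w)` could be the BFS count from the earlier sketch):

```python
CANDIDATES = ("noop", "assist", "observe")  # illustrative action set

def score_action(action, world, phenomena, reach, apply_action, depth=2):
    """Score a candidate AI action; return None when it is vetoed.
    Clause 1: any strict loss of a phenomenon's reachable states -> veto.
    Clause 2: value worlds that keep further enhancements available."""
    after = apply_action(world, action)
    first_order = 0.0
    for p in phenomena:
        delta = reach(p, after) - reach(p, world)
        if delta < 0:
            return None                      # clause 1: strict loss -> veto
        first_order += delta
    if depth == 0:
        return first_order
    futures = [score_action(a, after, phenomena, reach, apply_action, depth - 1)
               for a in CANDIDATES]
    best_future = max((f for f in futures if f is not None), default=0.0)
    return first_order + 0.5 * best_future   # clause 2: recursive enhancement

def tolerated(external_score, best_score, slack=0.1):
    """Clause 3: accept a sub-optimal external/environmental choice as long
    as it was not vetoed and stays within an accepted margin of the best."""
    return external_score is not None and external_score >= (1 - slack) * best_score
```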
Uncertainty, energy use and data limitedness imply the inability to flawlessly execute such a desideratum. Still, such "universal agency" would be a moral compass applicable to "omnipotent" agents, which makes it intrinsically scalable.
It's also related to fundamental principles such as thermodynamics and synergy, so we should be able to align simple systems to it as well.
Except when a particular destruction (~apoptosis) is needed to protect the whole, all existences, all living things, have the same right to be. Any destruction locks in itself the possibilities it extinguishes.
Priority is given to non-involvement, non-invasive impact and parsimony, in order to preserve external/environmental integrity; this prompts an analysis- and perception-first AI.
Instead of being "grabby aliens" colonizing space while destroying endemic potentials, it seems better to increase open-endedness through an ethics of reality's phenomena, even beyond humans and life.
.Observations, Experiments, and Arguments for Epistemic Superiority in Scientific Methodology
Subjective/foreign preferences are welcome, and the degrees of freedom inside phenomenal ethics don't force you to actively obsess over 'open-endedness'.
You make your own choices within your irreducible space of computation (relaxed, within an accepted margin of sub-optimality).
No other moral system achieves such diversity (i.e. an ethics with no rules would pragmatically restrain or extinguish a myriad of new possibilities), but some studies/projects/concepts take somewhat similar directions:
.Paretotopia
.The Capability Approach to Human Welfare
.Report on modeling evidential cooperation in large worlds
.Input Crowd, Output Meaning
.The Consilience Project
.Resurrecting Gaia: harnessing the Free Energy Principle to preserve life as we know it
.Coherent Extrapolated Volition
To understand the evolutionary attractors underlying our moral intuitions, including, crucially, the ones surrounding open-endedness, AI (and humans) already possess material for triangulation:
.Seven moral rules found all around the world
.Our Brains are Wired for Morality: Evolution, Development, and Neuroscience
.Understanding the Diversity and Fluidity of Human Morality Through Evolutionary Psychology
.Morality and Evolutionary Biology
+https://plato.stanford.edu/entries/morality-biology/
.Modular Morals: The Genetic Architecture of Morality as Cooperation
.Moral Molecules: Morality as a Combinatorial System
.Cultural universal
.Moral foundations theory
.Autonomy in Moral and Political Philosophy
.The Fun Theory Sequence
.Profiles of an Ideal Society: The Utopian Visions of Ordinary People
We can map the dimensions of human cultural diversity, subjective experiences, conceptual polysemy, etc. The dimensions of meaning of a single word vary inherently with our subjective perspectives:
.Measuring Cultural Dimensions: External Validity and Internal Consistency of Hofstede's VSM...
.The Lancaster Sensorimotor Norms: multidimensional measures of perceptual and action strength for 40,000 English words
.Latent Diversity in Human Concepts
.The Perception Census
"Your responses to the questions and tasks in the Census will help us build a map of the different ways we each experience the world through our senses. It will help us see how some traits and experiences relate to others, and by revealing this, will shed new light on the way our brains and bodies interact to build our overall experience of the world."
.Topology of phenomenological experience: leveraging LLMs and feature similarity +https://youngzielee.github.io/
.Visible Thoughts Project
.Evolving linguistic divergence on polarizing social media