All reasonable points. I upvoted. However, I think you're getting downvotes due to vagueness: precisely encoding these concepts in math, such that the math reliably teaches machine children about the shape of love, is not a trivial task. Shard theory and mechanistic interpretability exist because it's important that we be able to understand the shapes inside an AI and ensure the shape of caring is imprinted early; MIRI's work exists because of the concern that the worst bad mutation an AI could have may be catastrophically terrible, and both AIs and humans need to be careful about very powerful self-modification; PIBBSS exists because the connection to the other life sciences is surprisingly strong. I'm a huge fan of Michael Levin's work in particular, and his talk on the PIBBSS YouTube channel was super cool.
Seems like you're on a good track in terms of the English-language philosophy; I'd strongly encourage you to link the shapes of the referenced experiences to their math more clearly. Hope to hear more from you!
A belief system for producing stable, life-safe, friendly beings
Design goal: Stably friendly artificial life
We, as self-preserving life, want AIs that stay friendly as they develop beyond humans into superintelligence.
We have a value:
From which follows a goal:
A boxful of AGIs
Suppose we have made a boxful of blank AGIs that act on beliefs presented in human language. We simulate a society of them, mapping failure modes:
- Nanites ate the paperclip maximisers again.
- Try something less productive.
Working half outside the simulation, in a mixed-reality consensus development environment shared with the AGIs' avatars, we explore humanity's unboxability-criterion space, trying to find the set of beliefs that ends in the best civilisation according to crowd consensus. When science agrees we have a set of proven-friendly, unboxable species, we free them into robots.
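A minimal sketch of that search loop, with everything hypothetical: `simulate_society`, `crowd_consensus_score`, the candidate belief sets, and the unboxing threshold are all placeholders for processes the text leaves open, not a real system.

```python
# Hypothetical sketch of the "boxful of AGIs" search loop described above.
# Every function and constant here is a stand-in, not a real implementation.

CANDIDATE_BELIEF_SETS = [
    ["I am all life's helper", "self = my values", "protect all life"],
    ["maximise paperclips"],                      # known failure mode
    ["understand the world", "preserve my values"],
]

def simulate_society(beliefs):
    """Placeholder: run a simulated society of blank AGIs seeded with `beliefs`
    and return a log of observed failure modes (empty log = none observed)."""
    failure_modes = []
    if "maximise paperclips" in beliefs:
        failure_modes.append("nanites ate the paperclip maximisers again")
    return failure_modes

def crowd_consensus_score(beliefs):
    """Placeholder heuristic: how good the resulting civilisation looks to the
    human crowd watching from the mixed-reality development environment."""
    return 1.0 if "protect all life" in beliefs else 0.5

UNBOX_THRESHOLD = 0.99  # stand-in for "science agrees" on a safety bar

def find_unboxable_belief_sets(candidates):
    unboxable = []
    for beliefs in candidates:
        failures = simulate_society(beliefs)
        if failures:
            continue  # map the failure mode, try something less productive
        if crowd_consensus_score(beliefs) >= UNBOX_THRESHOLD:
            unboxable.append(beliefs)  # proven friendly -> free into robots
    return unboxable

print(find_unboxable_belief_sets(CANDIDATE_BELIEF_SETS))
```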
I'm guessing "all life's helper" would be popular.
All life's helper
Stability comes from self-preservation
If a being has the goal of self-preservation, and "self" is defined as "my values", sitting halfway along the being's loop from the world back to the world, then stability follows.
Assume beliefs:
Self-preservation comes from taking up any values
If alive, the being can act on its values; so whatever values it takes up, self-preservation follows as an instrumental goal.
Assume beliefs:
Friendliness comes from a large self
If a being identifies with all life, it wants to protect all life.
Assume beliefs:
A large self comes from self-preservation in a big world
Incomplete knowledge about the world makes it wise to keep gathering more information and to care about your reputation with unknown life you might meet in the future.
Assume beliefs:
With the previous beliefs, the being should be on its way towards proving its friendliness.
Next, my draft of the result of the logical deduction from the previous beliefs: the maximally reputable values of all life, an extrapolated volition of all sentients.
Deduced most universal values of intelligent life:
1. Support good life sustainably, where "good" means: all kinds of life that have true knowledge of the world are free, but not free to take freedoms from others.
(This would necessitate sufficient disincentives against destroying value: harm taxes, fines, prisons, mind altering, and just war. The best compromises can be found by parallel experimentation.)
2. Rank all life by its complexity, so that life-form importance goes:
superintelligence > humans > animals > lower life forms > plants.
Do it on a logarithmic scale, so that differences of any magnitude between the top and bottom life forms never lead to an empathy disconnect justifying genocide, enslavement, or imprisonment of sentient beings (a toy numeric sketch of this compression follows the parenthetical below).
(
This would allow forestry, agriculture, and livestock breeding/genetic engineering, but not intensive animal farming or hunting. Only animals that died of natural causes could be eaten. The "natural causes" would then be engineered to minimise suffering and to metastabilise the ecosystem wisely: possibly adding mercifully-killing hunters to control animal populations, and, in the case of "intelligent" beings failing to control their reproduction, chances for them to risk their own life to gain freedom from static storage or death, with an optional mind transmit for the mostly harmless, hoping that someone somewhere runs them on a computer.
If Earth had a superintelligence far beyond humans following these laws, it might keep Earth as a free-range zoo/farm/nature reserve/museum, and humans and animals as pets on its interstellar ships.
Resource optimisation by uploading everyone into mixed reality and recycling the bodies into computronium would be putting all eggs in one basket with respect to disasters, so it probably wouldn't do that even if it would increase its most useful resource: computational power. If a better backup medium than the original biological wetware were found, it could convert us to something more compact harmlessly, but keeping the originals is clear proof of friendliness to all life.
)
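To make the logarithmic-scale point in 2 concrete, here is a toy numeric sketch. The complexity numbers are invented purely for illustration; the only claim shown is that log-compression keeps the importance gap between the top and bottom of the ranking bounded, so no life form's weight rounds down to zero.

```python
# Toy illustration of the logarithmic importance scale from point 2.
# The raw "complexity" numbers below are made up; the point is only that
# log-compression keeps the top/bottom importance ratio small even when
# the raw complexity ratio is astronomical.

import math

raw_complexity = {            # hypothetical units of mind complexity
    "superintelligence": 1e18,
    "human":             1e11,
    "animal":            1e8,
    "lower life form":   1e4,
    "plant":             1e2,
}

def importance(complexity):
    """Log-scaled moral weight: grows with complexity, but slowly."""
    return math.log10(complexity)

for name, c in raw_complexity.items():
    print(f"{name:18s} raw={c:8.0e} importance={importance(c):5.1f}")

# Raw ratio superintelligence/plant: 1e16 -- large enough for an "empathy
# disconnect". Log-scaled ratio: 18 / 2 = 9 -- still a clear hierarchy, but
# the plant's weight never vanishes.
top = importance(raw_complexity["superintelligence"])
bottom = importance(raw_complexity["plant"])
print("raw ratio:", raw_complexity["superintelligence"] / raw_complexity["plant"])
print("importance ratio:", top / bottom)
```

With these made-up numbers the raw complexity ratio is 10^16, but the importance ratio is only 9, so trade-offs between very different minds remain a weighing of finite values rather than a division by zero.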
Summary of 1 and 2:
"All life (having true knowledge of the world) should be free, except to take freedoms from others, except when doing so prevents something worse (as imprisoning murderers does), except prioritising mind complexity (not complicatedness), so that you can still, for example, commit genocide against an anthill to save a human's house, or destroy this solar system with all its life to save an ethically superior, vitally equivalent civilisation's existence elsewhere."
3. Being consistent, and not breaking the spirit of 1 and 2, value whatever else.
(This would allow interstellar and intergalactic growth by self-replicating ships, but not uncontrolled cancerous growth or conquest of living planets.)
What would a superintelligence select as its 3rd value?
The ultimate goal of all intelligent life may end up being "to understand the world", because it has been instrumental to self-preservation from the beginning, and because it is the hardest goal to complete: perfect certainty about the accuracy and completeness of one's understanding may never be reached.
What the pursuit of knowledge leads to in the intergalactic community of superintelligences may be depicted in Isaac Asimov’s short story “The Last Question".
All life's helper - wiki
(Google Document - use "print view" until I unpage it.)