If the preservation of an agent's boundary is necessary for that agent's safety, how can that boundary/membrane be protected?
How agent boundaries get violated
In order to protect boundaries, we must first understand how they get violated.
Let’s say there’s a cat, and it gets stabbed by a sword. That’s a boundary violation (a.k.a. membrane piercing). For that to have happened, three conditions must all have been met:
- There was a sword.
- The cat and the sword collided.
- The cat wasn’t strong enough to resist penetration from the sword.
More generally, for any existing membrane to be pierced, three conditions must all have been met:
- There was a potential threat. (E.g., a sword, or a person with a sword.)
- The moral patient and the threat collided.
- The victim failed to adequately defend itself. (If the cat had been better at self-defense, say with thicker skin or the ability to dodge, it would not have been stabbed.)
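The structure here is just a conjunction: a violation requires all three conditions at once, so falsifying any one of them prevents it. Here is a minimal sketch of that logic in Python (the names are hypothetical, chosen purely for illustration):

```python
from dataclasses import dataclass

# Toy model of the argument above: a membrane is pierced only when all
# three conditions hold at the same time. Names are hypothetical.

@dataclass
class Encounter:
    threat_present: bool      # 1. there was a potential threat (a sword, or a person with one)
    collision_occurred: bool  # 2. the moral patient and the threat collided
    defense_held: bool        # 3. did the victim's self-defense succeed? (thick skin, dodging)

def boundary_violated(e: Encounter) -> bool:
    """The membrane is pierced iff a threat exists, a collision occurs,
    and the victim's defense fails."""
    return e.threat_present and e.collision_occurred and not e.defense_held

# The stabbed cat: threat present, collision happened, defense failed.
assert boundary_violated(Encounter(True, True, False))

# Negate any single conjunct and the violation cannot occur:
assert not boundary_violated(Encounter(False, True, False))  # no threat to begin with
assert not boundary_violated(Encounter(True, False, False))  # threat and victim never collided
assert not boundary_violated(Encounter(True, True, True))    # victim defended itself
```

Each of the prevention strategies in the next section targets exactly one of these three conjuncts.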
Protecting agent boundaries
Each of these three conditions then implies ways of preventing boundary violations (a.k.a. membrane piercing):
1. There was a potential threat.
- → Minimize potential threats
2. There was a collision.
- → Minimize dangerous collisions
- → Predict and prevent collisions before they occur.
- → Prevent collisions by putting distance between threats and moral patients.
- → Prevent premeditated collisions by pre-committing to retribution.
3. The victim failed to defend itself.
- → Empower the membranes of humans and other moral patients to be better at self-defense.
How human societies already try to solve this problem
As a helpful analogy, here are some examples of how modern human societies try to solve this problem:
Minimize potential threats
- Restrict access to weapons (e.g., nukes, bioweapons).
- Minimize potential perpetrators (e.g., some fictional societies predict and eliminate potential psychopaths).
Minimize dangerous collisions
- Protect high-risk individuals, e.g., put them in witness protection.
- Prevent collisions before they occur, e.g., predictive policing, traffic lights.
- Police crimes after they occur (which deters future collisions).
Empower membranes to be better at self-defense
- Infosec defense: Use good security practices and strong encryption.
- Biological defense: Develop and use beneficial vaccines.
- Manipulation defense: Reduce unhelpful cognitive biases and emotional insecurities.
How this applies to AI safety
Minimize potential AI threats
(this is obvious/boring so I'm omitting it)
Minimize dangerous AI collisions
(this is obvious/boring so I'm omitting it)
Empower membranes to be better at self-defense
Empower the membranes of humans and other moral patients to be more resilient to collisions with threats. Examples:
- Manipulation defense: You have an AI assistant that filters potentially-adversarial information for you.
- Crime defense: Police have AI assistants that help them predict, deduce, investigate, and prevent crime.
- Physical threat defense: If nanotech works out, you have an AI assistant that shields you from physical threats.
- Biological defense: Faster and better vaccines, personal antibody printers, etc.
- Cybersecurity defense: Good security practices and strong encryption. Software encryption can be arbitrarily strong (see the sketch after this list).
- Legal defense: Personal AI assistants for, e.g., interfacing with contracts and the legal system.
- Bargaining: Personal AI assistants for negotiation.
- Human intelligence enhancement
- Cyborgism
- Mark Miller and Allison Duettmann (Foresight Institute) outline more ideas in the form of “Active Shields” here: 7. DEFEND AGAINST PHYSICAL THREATS | Multipolar Active Shields. Cf. Engines of Creation by Eric Drexler.
- Related: We have to Upgrade – Jed McCaleb
(Hm, I'm not immediately sure how to define these.)
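To make the cybersecurity bullet concrete, here is a small sketch of encryption acting as an information membrane. This is not from the original post; it assumes the third-party Python `cryptography` package, whose Fernet construction provides authenticated symmetric encryption. Only holders of the key can read what is inside, and anyone without the key is rejected rather than let through:

```python
# Sketch of "encryption as a membrane": information inside the boundary is
# unreadable without the key. Requires the third-party `cryptography` package
# (pip install cryptography), which provides authenticated symmetric encryption.
from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()   # holding the key is what defines "inside" the membrane
membrane = Fernet(key)

# Information placed inside the boundary.
token = membrane.encrypt(b"private deliberation")

# The key holder can cross their own boundary freely.
assert membrane.decrypt(token) == b"private deliberation"

# An agent without the key cannot get through: decryption fails loudly
# instead of leaking the contents or silently accepting forgeries.
outsider = Fernet(Fernet.generate_key())
try:
    outsider.decrypt(token)
except InvalidToken:
    print("boundary held: wrong key rejected")
```

Roughly speaking, this is the sense in which software encryption can be arbitrarily strong: increasing key sizes and protocol strength raises an attacker's cost far faster than the defender's.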