I'm writing a post about S-risks, and I need access to some clean, established terminology/background material for discussing AI-based long-term outcomes for humanity.

My current (very limited) vocabulary can be summarized with the following categories: 

  1. Outcomes which are roughly maximally bad: Hyperexistential risk/S-risk/Unfriendly AI/Existential risk
  2. Outcomes which are nontrivially worse than paperclipping-equivalents but better than approximate minimization of human utility: Hyperexistential risk/S-risk/Unfriendly AI/Existential risk
  3. Outcomes which are produced by agents essentially orthogonal to human values: Paperclipping/Unfriendly AI/Existential risk
  4. Outcomes which are nontrivially better than paperclipping but worse than Friendly AI: ???
  5. Outcomes which are roughly maximally good: Friendly AI

The problems are manifold: 

  • I haven't read any discussion that addresses outcome 1 or outcome 2 individually. I have only read general discussion of the two combined, under names such as "Outcomes worse than death", "Hyperexistential risk", and "S-risk".
  • My current terminology overlaps too heavily to uniquely identify outcomes 1 and 2.
  • I have no terminology or background information for outcome 4.

I've done a small amount of investigation and determined that less brainpower would be wasted by just asking for links.
