OpenAI have announced the approach they intend to use to ensure humans stay in control of AIs smarter than they are: Our goal is to build a roughly human-level automated alignment researcher. We can then use vast amounts of compute to scale our efforts, and iteratively align superintelligence. To align...
8. Believable Promises Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems. Summary of this Article: How valuable is it for an AI to be able to believably pre-commit to carrying out a later action that won't, at that later time, benefit it? Links to...
7. Metamorphosis Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems. Summary of this Article: Under what circumstances can an AI be trusted to negotiate in good faith an alteration to its core values? Links to all the articles in the series: 1. Optimum number...
6. Trustworthy Computing Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems. Summary of this Article: How to use non-open-ended task programs to permit a computing environment in which AIs could trust each other enough to cooperate on ganging up on defectors sufficiently effectively that...
5. The advantage of not being open-ended Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems. Summary of this Article: When setting a computer a task, there are advantages to defining the task in such a way that a finite budget of some resource (such...
4. Environments for killing AIs Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems. Summary of this Article: Killing a rogue AI may be impossible in our current environment, but we can change that by changing the environment. Links to all the articles in the...
3. Defect or Cooperate Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems. Summary of this Article: If the risks from being punished for not cooperating are high enough, then even for some types of Paperclip Maximiser that don't care at all about human survival,...