Douglas_Reay

Superalignment

OpenAI have announced the approach they intend to use to ensure humans stay in control of AIs smarter than they are: Our goal is to build a roughly human-level automated alignment researcher. We can then use vast amounts of compute to scale our efforts, and iteratively align superintelligence. To align...

Nov 18, 2023

Believable Promises

8. Believable Promises
Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems.
Summary of this Article: How valuable is it for an AI to be able to believably pre-commit to carrying out a later action that won't, at that later time, benefit it?
Links to...

Apr 16, 2018

Metamorphosis

7. Metamorphosis
Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems.
Summary of this Article: Under what circumstances can an AI be trusted to negotiate in good faith an alteration to its core values?
Links to all the articles in the series: 1. Optimum number...

Apr 12, 2018

Trustworthy Computing

6. Trustworthy Computing
Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems.
Summary of this Article: How to use non-open-ended task programs to create a computing environment in which AIs could trust each other enough to cooperate in ganging up on defectors sufficiently effectively that...

Apr 10, 2018

The advantage of not being open-ended

5. The advantage of not being open-ended
Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems.
Summary of this Article: When setting a computer a task, there are advantages to defining the task in such a way that a finite budget of some resource (such...

Mar 18, 2018

Environments for killing AIs

4. Environments for killing AIs
Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems.
Summary of this Article: Killing a rogue AI may be impossible in our current environment, but we can change that by changing the environment.
Links to all the articles in the...

Mar 17, 2018

Defect or Cooperate

3. Defect or Cooperate
Summary of entire Series: An alternative approach to designing Friendly Artificial Intelligence computer systems.
Summary of this Article: If the risk of being punished for not cooperating is high enough, then even for some types of Paperclip Maximiser that don't care at all about human survival,...

Mar 16, 2018

How minimal is our intelligence?

Examine your assumptions

How to deal with someone in a LessWrong meeting being creepy

LiveJournal Memes
