Corrigibility as Constrained Optimisation
This post is coauthored with Ryan Carey. Much of the work on developing a corrigible agent has focused on ensuring that an AI will not manipulate the shutdown button or any other kind of device that the human operator would use to control it. Suppose, however, that the AI lacked...
Apr 11, 2019