Post 5 of Towards Causal Foundations of Safe AGI, preceded by Post 1: Introduction, Post 2: Causality, Post 3: Agency, and Post 4: Incentives. By Francis Rhys Ward, Tom Everitt, Sebastian Benthall, James Fox, Matt MacDermott, Milad Kazemi, and Ryan Carey, representing the Causal Incentives Working Group. Thanks also to Toby...
Post 4 of Towards Causal Foundations of Safe AGI, preceded by Post 1: Introduction, Post 2: Causality, and Post 3: Agency. By Tom Everitt, James Fox, Ryan Carey, Matt MacDermott, Sebastian Benthall, and Jon Richens, representing the Causal Incentives Working Group. Thanks also to Toby Shevlane and Aliya Ahmad. “Show...
Post 2 of Towards Causal Foundations of Safe AGI, preceded by Post 1: Introduction. By Lewis Hammond, Tom Everitt, Jon Richens, Francis Rhys Ward, Ryan Carey, Sebastian Benthall, and James Fox, representing the Causal Incentives Working Group. Thanks also to Alexis Bellot, Toby Shevlane, and Aliya Ahmad. Causal models are...
By Tom Everitt, Lewis Hammond, Rhys Ward, Ryan Carey, James Fox, Sebastian Benthall, Matt MacDermott, and Shreshth Malik, representing the Causal Incentives Working Group. Thanks also to Toby Shevlane, MH Tessler, Aliya Ahmad, Zac Kenton, Maria Loks-Thompson, and Alexis Bellot. Over the next few years, society, organisations, and individuals will...
I've put a preprint up on arXiv that this community might find relevant. The argument is over a year old, so it may be dated. I haven't kept up with the field much since I wrote it, so I welcome any feedback, especially on where the crux of...
For the sake of argument, let's consider an agent to be autonomous if: * It has sensors and actuators (important for an agent) * It has an internal representation of its goals. I will call this internal representation its desires. * It has some kind of internal planning function that...
Suppose you want to predict the behavior of an agent. I stand corrected. To make the prediction, as a predictor you need: * observations of the agent * the capacity to model the agent to a sufficient degree of accuracy "Sufficient accuracy" here is a threshold on, for example, KL...