This is Part III of a long essay. Part I introduced the concept of morality-as-cooperation (MAC) in human societies. Part II discussed moral reasoning and introduced a framework for moral experimentation. Part III: Failure modes Part I described how human morality has evolved over time to become ever more...
This is Part II of a long essay. Part I introduced the concept of morality-as-cooperation (MAC), and discussed how the principle could be used to understand moral judgements in human societies. Part III will discuss failure modes. Part II: Theory and Experiment The prior discussion of morality was human-centric,...
Abstract The AI alignment problem is usually framed in terms of power and control. Given a single, solitary AGI, how can we constrain its behavior so that its actions remain aligned with human interests? Unfortunately, the answer, to a first approximation, appears to be "we can't." There are myriad reasons,...
This was originally supposed to be a response to the new AGI Safety FAQ-in-progress, but it got a bit too long. Anonymous writes: > A lot of the AI risk arguments seem to come... with a very particular transhumanist aesthetic about the future (nanotech, ... etc.). I find these things...