[Link] Suffering-focused AI safety: Why “fail-safe” measures might be particularly promising
The Foundational Research Institute just published a new paper: "Suffering-focused AI safety: Why “fail-safe” measures might be our top intervention".
It is important to consider that [AI outcomes] can go wrong to very different degrees. For value systems that place primary importance on the prevention of suffering, this aspect is crucial: the best way to avoid bad-case scenarios specifically may not be to try to get everything right. Instead, it makes sense to focus on the worst outcomes (in terms of the suffering they would contain) and on tractable methods to avert them. As others are trying to shoot for a best-case outcome (and hopefully they will succeed!), it is important that some people also take care of addressing the biggest risks. This approach to AI safety is especially promising both because it is currently neglected and because it is easier to avoid a subset of outcomes than to shoot for one highly specific outcome. Finally, it is something that people with many different value systems could get behind.
Nick Beckstead: On the Overwhelming Importance of Shaping the Far Future
ABSTRACT: In slogan form, the thesis of this dissertation is that shaping the far future is overwhelmingly important. More precisely, I argue that:
Main Thesis: From a global perspective, what matters most (in expectation) is that we do what is best (in expectation) for the general trajectory along which our descendants develop over the coming millions of years or longer.
The first chapter introduces some key concepts, clarifies the main thesis, and outlines what follows in later chapters. Some of the key concepts include: existential risk, the world's development trajectory, proximate benefits and ripple effects, speeding up development, trajectory changes, and the distinction between broad and targeted attempts to shape the far future. The second chapter is a defense of some methodological assumptions for developing normative theories which makes my thesis more plausible. In the third chapter, I introduce and begin to defend some key empirical and normative assumptions which, if true, strongly support my main thesis. In the fourth and fifth chapters, I argue against two of the strongest objections to my arguments. These objections come from population ethics, and are based on Person-Affecting Views and views according to which additional lives have diminishing marginal value. I argue that these views face extreme difficulties and cannot plausibly be used to rebut my arguments. In the sixth and seventh chapters, I discuss a decision-theoretic paradox which is relevant to my arguments. The simplest plausible theoretical assumptions which support my main thesis imply a view I call fanaticism, according to which any non-zero probability of an infinitely good outcome, no matter how small, is better than any probability of a finitely good outcome. I argue that denying fanaticism is inconsistent with other normative principles that seem very obvious, so that we are faced with a paradox. I have no solution to the paradox; I instead argue that we should continue to use our inconsistent principles, but we should use them tastefully. We should do this because, currently, we know of no consistent set of principles which does better.
[If there's already been a discussion post about this, my apologies, I couldn't find it.]