The paperclip maximizer (PM) encounters an alien civilization and causes lots of suffering warring with it
I don't think that it is likely that it will encounter anything that has equal resources and if it does that suffering would occur (see below).
PM decides there's a chance that it's in a simulation run by a sadistic being who will punish it (prevent it from making paperclips) unless it creates trillions of conscious beings and tortures them
That seems like one of the problems that have to be solved in order to build an AI that transforms the universe into an inanimate state. But I think it is much easier to make an AI not simulate any other agents than to create a friendly AI. Much more can go wrong by creating a friendly AI, including the possibility that it tortures trillions of beings. In the case of a transformer you just have to make sure that it values an universe that is as close as possible to a state where no computation takes place and that does not engage in any kind of trade, acausal or otherwise.
PM is itself capable of suffering
I believe that any sort of morally significant suffering is an effect of (natural) evolution, and may in fact be dependent on that. I think that the kind of maximizer that SI has in mind is more akin to a transformation process that isn't consciousness, does not have emotions and cannot suffer. If those qualities would be necessary requirements then I don't think that we will build an artificial general intelligence any time soon and that if we do it will happen slowly and not be able to undergo dangerous recursive self-improvement.
somebody steals PM's source code before it's launched, and makes a sadistic AI
I think that this is more likely to be the case with friendly AI research because it takes longer.
Suppose you buy the argument that humanity faces both the risk of AI-caused extinction and the opportunity to shape an AI-built utopia. What should we do about that? As Wei Dai asks, "In what direction should we nudge the future, to maximize the chances and impact of a positive intelligence explosion?"
This post serves as a table of contents and an introduction for an ongoing strategic analysis of AI risk and opportunity.
Contents:
Why discuss AI safety strategy?
The main reason to discuss AI safety strategy is, of course, to draw on a wide spectrum of human expertise and processing power to clarify our understanding of the factors at play and the expected value of particular interventions we could invest in: raising awareness of safety concerns, forming a Friendly AI team, differential technological development, investigating AGI confinement methods, and others.
Discussing AI safety strategy is also a challenging exercise in applied rationality. The relevant issues are complex and uncertain, but we need to take advantage of the fact that rationality is faster than science: we can't "try" a bunch of intelligence explosions and see which one works best. We'll have to predict in advance how the future will develop and what we can do about it.
Core readings
Before engaging with this series, I recommend you read at least the following articles:
Example questions
Which strategic questions would we like to answer? Muehlhauser (2011) elaborates on the following questions:
Salamon & Muehlhauser (2013) list several other questions gathered from the participants of a workshop following Singularity Summit 2011, including:
These are the kinds of questions we will be tackling in this series of posts for Less Wrong Discussion, in order to improve our predictions about which direction we can nudge the future to maximize the chances of a positive intelligence explosion.