I think the problem is in the definition of 'optimum'. In order to be able to call a state optimum, you must presuppose the laws of physics in order to rule out any better states that are physically impossible. Once we recognize this, it seems that any society must either achieve an optimum or suffer an existential disaster (not necessarily extinction). Value is fragile, but minds are powerful and if they ever get on the right track they will never get off, barring problems that are impossible to foresee.
The only cases that remain to be considered are extinction and non-extinction existential risk. I'm pretty sure that my value system is indifferent between the existence and nonexistence of a region with no conscious life, but there is no reason for other value systems to share that property. I am unsure how the average value system would judge its surroundings, partially because I am unsure what to average over. Even a group that manages to optimize its surroundings may describe its universe's existence as bad due to the existence of variables that it cannot optimize or other factors, such as a general dislike of anything existing.
If an existential risk does not fully wipe out its species, there is a chance that an optimization process will survive, but with different values from its parent species. On average, the parent species would probably regard this as better than extinction, because the optimization process would share some of its values, while being indifferent to the rest. As weak evidence that this applies to our species, there are many fictional dystopias that, while much worse than our current world, seem preferable to extinction.
Related To: Eliezer's Zombies Sequence, Alicorn's Pain
Today you volunteered for what was billed as an experiment in moral psychology. You enter into a small room with a video monitor, a red light, and a button. Before you entered, you were told that you'll be paid $100 for participating in the experiment, but for every time you hit that button, $10 will be deducted. On the monitor, you see a person sitting in another room, and you appear to have a two-way audio connection with him. That person is tied down to his chair, with what appears to be electrical leads attached to him. He now explains to you that your red light will soon turn on, which means he will be feeling excruciating pain. But if you press the button in front of you, his pain will stop for a minute, after which the red light will turn on again. The experiment will end in ten minutes.
You're not sure whether to believe him, but pretty soon the red light does turn on, and the person in the monitor cries out in pain, and starts struggling against his restraints. You hesitate for a second, but it looks and sounds very convincing to you, so you quickly hit the button. The person in the monitor breathes a big sigh of relief and thanks you profusely. You make some small talk with him, and soon the red light turns on again. You repeat this ten times and then are released from the room. As you're about to leave, the experimenter tells you that there was no actual person behind the video monitor. Instead, the audio/video stream you experienced was generated by one of the following ECPs (exotic computational processes).
Then she asks, would you like to repeat this experiment for another chance at earning $100?
Presumably, you answer "yes", because you think that despite appearances, none of these ECPs actually feels pain when the red light turns on. (To some of these ECPs, your button presses would constitute positive reinforcement or lack of negative reinforcement, but mere negative reinforcement, when happening to others, doesn't seem to be a strong moral disvalue.) Intuitively this seems to be the obviously correct answer, but how would we describe the difference between actual pain and the appearance of pain or mere negative reinforcement, at the level of bits or atoms, if we were specifying the utility function of a potentially super-intelligent AI? (If we cannot even clearly define what seems to be one of the simplest values, then the approach of trying to manually specify such a utility function would appear completely hopeless.)
One idea to try to understand the nature of pain is to sample the space of possible minds, look for those that seem to be feeling pain, and check if the underlying computations have anything in common. But as in the above thought experiment, there are minds that can convincingly simulate the appearance of pain without really feeling it.
Another idea is that perhaps what is bad about pain is that it is a strong negative reinforcement as experienced by a conscious mind. This would be compatible with the thought experiment above, since (intuitively) ECPs 1, 2, and 4 are not conscious, and 3 does not experience strong negative reinforcements. Unfortunately it also implies that fully defining pain as a moral disvalue is at least as hard as the problem of consciousness, so this line of investigation seems to be at an immediate impasse, at least for the moment. (But does anyone see an argument that this is clearly not the right approach?)
What other approaches might work, hopefully without running into one or more problems already known to be hard?