This analogy seems like a good one. Let me try extending it a bit. Suppose that in our ancestral environment the only things banana shaped were bananas, and the ability to perceive yellowness had no other fitness benefits. Then wouldn't it be surprising that we even evolved the ability to perceive yellowness, much less to care about it?
In our actual EEA, there were no human-shaped objects that were not humans, so if caring about humans was adaptive, evolution could have just made us care about, say, human-shaped objects that are alive and act intelligently. Why did we evolve the ability (i.e., intuition) to determine whether something is conscious, and to care about that?
Believing that other people are conscious doesn't require any special selection pressure: it falls out of the general ability to understand their utterances as referring to something that's "actually out there", which is useful for other reasons. Also we seem to have a generalized adaptation that says "if all previously encountered instances possessed a certain trait, but this instance doesn't, then begin doubting if this instance is genuine".
Related To: Eliezer's Zombies Sequence, Alicorn's Pain
Today you volunteered for what was billed as an experiment in moral psychology. You enter a small room with a video monitor, a red light, and a button. Before you entered, you were told that you'll be paid $100 for participating in the experiment, but that $10 will be deducted each time you press the button. On the monitor, you see a person sitting in another room, and you appear to have a two-way audio connection with him. That person is tied down to his chair, with what appear to be electrical leads attached to him. He now explains to you that your red light will soon turn on, which means he will be feeling excruciating pain. But if you press the button in front of you, his pain will stop for a minute, after which the red light will turn on again. The experiment will end in ten minutes.
You're not sure whether to believe him, but pretty soon the red light does turn on, and the person on the monitor cries out in pain and starts struggling against his restraints. You hesitate for a second, but it looks and sounds very convincing to you, so you quickly hit the button. The person on the monitor breathes a big sigh of relief and thanks you profusely. You make some small talk with him, and soon the red light turns on again. You repeat this ten times and then are released from the room. As you're about to leave, the experimenter tells you that there was no actual person behind the video monitor. Instead, the audio/video stream you experienced was generated by one of the following ECPs (exotic computational processes).
Then she asks, would you like to repeat this experiment for another chance at earning $100?
Presumably, you answer "yes", because you think that despite appearances, none of these ECPs actually feels pain when the red light turns on. (To some of these ECPs, your button presses would constitute positive reinforcement or lack of negative reinforcement, but mere negative reinforcement, when happening to others, doesn't seem to be a strong moral disvalue.) Intuitively this seems to be the obviously correct answer, but how would we describe the difference between actual pain and the appearance of pain or mere negative reinforcement, at the level of bits or atoms, if we were specifying the utility function of a potentially super-intelligent AI? (If we cannot even clearly define what seems to be one of the simplest values, then the approach of trying to manually specify such a utility function would appear completely hopeless.)
One idea to try to understand the nature of pain is to sample the space of possible minds, look for those that seem to be feeling pain, and check if the underlying computations have anything in common. But as in the above thought experiment, there are minds that can convincingly simulate the appearance of pain without really feeling it.
Another idea is that perhaps what is bad about pain is that it is a strong negative reinforcement as experienced by a conscious mind. This would be compatible with the thought experiment above, since (intuitively) ECPs 1, 2, and 4 are not conscious, and 3 does not experience strong negative reinforcements. Unfortunately it also implies that fully defining pain as a moral disvalue is at least as hard as the problem of consciousness, so this line of investigation seems to be at an immediate impasse, at least for the moment. (But does anyone see an argument that this is clearly not the right approach?)
What other approaches might work, hopefully without running into one or more problems already known to be hard?