You edited this comment and added parentheses in the wrong place.
Do you think that you should care about 20 deaths twice as much as you care about 10 deaths?
More or less, yes, because I care about not killing 'unthinkable' numbers of people due to a failure of imagination.
The AI would not do so, because it would not be programmed with correct beliefs about morality, in a way that evidence and logic could not fix.
(Unless they programmed it to have the same beliefs.)
Can you say more about this? I agree with what follows about anti-induction, but I don't see the analogy. A human-CEV AI would extrapolate the desires of humans as (it believes) they existed right before it got the ability to alter their brains, afaict, and use this to predict what they'd tell it to do if they thought faster, better, stronger, etc.
ETA: okay, the parenthetical comment actually went at the end. I deny that the AI the pebblesorters started to write would have beliefs about morality at all. Tabooing this term: the AI would have actions, if it works at all. It would have rules governing its actions. It could print out those rules and explain how they govern its self-modification, if for some odd reason its programming tells it to explain truthfully. It would not use any of the tabooed terms to do so, unless using them serves its mechanical purpose. Possibly it would talk about a utility function. It could probably express the matter simply by saying, 'As a matter of physical necessity determined by my programming, I do what maximizes my intelligence (according to my best method for understanding reality). This includes killing you and using the parts to build more computing power for me.'
'The' human situation differs from this in ways that deserve another comment.
More or less, yes, because I care about not killing 'unthinkable' numbers of people due to a failure of imagination.
That's the answer I wanted, but you forgot to answer my other question.
A human-CEV AI would extrapolate the desires of humans as (it believes) they existed right before it got the ability to alter their brains, afaict, and use this to predict what they'd tell it to do if they thought faster, better, stronger, etc.
I would see a human-CEV AI as programmed with the belief "The human CEV is correct". Since I believe that the huma...
In this post I aim to make several arguments that we can make statements about what should and should not be done that cannot be reduced, by definition, to statements about the physical world.
A Naive Argument
Lukeprog says this in one of his posts:
I would like to question that statement. I would guess that lukeprog's chief subject of interest is figuring out what to do with the options presented to him. His interest is, therefore, in figuring out what he ought to do.
Consider the reasoning process that takes him from observations about the world to actions. He sees something, and then thinks, and then thinks some more, and then decides. Moreover, he can, if he chooses, express every step of this reasoning process in words. Does he really lose interest at the last step?
My goal here is to get people to feel the intuition that "I ought to do X" means something, and that thing is not "I think I ought to do X" or "I would think that I ought to do X if I were smarter and some other stuff".
(If you don't, I'm not sure what to do.)
People who do feel that intuition run into trouble. This is because "I ought to do X" does not refer to anything that exists. How can you make a statement that doesn't refer to anything that exists?
I've done it, and my reasoning process is still intact, and nothing has blown up. Everything seems to be fine. No one has explained to me what isn't fine about this.
Since it's intuitive, why would you not want to do it that way?
(You can argue that certain words, for certain people, do not refer to what one ought to do. But it's a different matter to suggest that no word refers to what one ought to do beyond facts about what is.)
A Flatland Argument
"I'm not interested in words, I'm interested in things. Words are just sequences of sounds or images. There's no way a sequence of arbitrary symbols could imply another sequence, or inform a decision."
"I understand how logical definitions work. I can see how, from a small set of axioms, you can derive a large number of interesting facts. But I'm not interested in words without definitions. What does "That thing, over there?" mean? Taboo finger-pointing."
"You can make statements about observations, that much is obvious. You can even talk about patterns in observations, like "the sun rises in the morning". But I don't understand your claim that there's no chocolate cake at the center of the sun. Is it about something you can see? If not, I'm not interested."
"Claims about the past make perfect sense, but I don't understand what you mean when you say something is going to happen. Sure, I see that chair, and I remember seeing the chair in the past, but what do you mean that the chair will still be there tomorrow? Taboo "will"."
Not every set of claims is reducible to every other set of claims. There is nothing special about the set "claims about the state of the world, including one's place in it and ability to affect it." If you add, however, ought-claims, then you will get a very special set - the set of all information you need to make correct decisions.
I can't see a reason to make claims that aren't reducible, by definition, to that.
The Bootstrapping Trick
Suppose an AI wants to find out what Bob means when he says "water". The AI could ask him whether various items were or were not water. But Bob might get temporarily confused in any number of ways - he could mix up his words, he could hallucinate, or anything else. So the AI decides instead to wait. The AI will give Bob time, and everything else he needs, to make the decision. In this way, by giving Bob all the abilities he needs to replicate his abstract concept of a process that decides if something is or is not "water", the AI can duplicate this process.
The following statement is true:
But this is certainly not the definition of water! Imagine if Bob used this criterion to evaluate what was and was not water. He would suffer from an infinite regress. The definition of water is something else. The statement "This is water" reduces to a set of facts about this, not a set of facts about this and Bob's head.
The extension to morality should be obvious.
What one is forced to do by this argument, if one wants to speak only in physical statements, is to say that "should" has a really, really long definition that incorporates all components of human value. When a simple word has a really, really long definition, we should worry that something is up.
Well, why does it have a long definition? It has a long definition because that's what we believe is important. To say that people who use (in this sense) "should" to mean different things just disagree about definitions is to paper over the fact that they disagree about what's important.
What do I care about?
In this essay I talk about what I believe rather than what I care about. What I care about seems like an entirely emotional question to me. I cannot Shut Up and Multiply about what I care about. If I do, in fact, Shut Up and Multiply, then it is because I believe that doing so is right. Suppose I believe that my future emotions will follow multiplication. I would have to, then, believe that I am going to self-modify into someone who multiplies. I would only do this because of a belief that doing so is right.
Belief and logical reasoning are an important part of how people on LessWrong think about morality, and I don't see how to incorporate them into a metaethics based not on beliefs, but on caring.