A Defense of Naive Metaethics
In this post I aim to make several arguments that we can make statements about what should and should not be done which cannot be reduced, by definition, to statements about the physical world.
A Naive Argument
Lukeprog says this in one of his posts:
If someone makes a claim of the 'ought' type, either they are talking about the world of is, or they are talking about the world of is not. If they are talking about the world of is not, then I quickly lose interest because the world of is not isn't my subject of interest.
I would like to question that statement. I would guess that lukeprog's chief subject of interest is figuring out what to do with the options presented to him. His interest is, therefore, in figuring out what he ought to do.
Consider the reasoning process that takes him from observations about the world to actions. He sees something, and then thinks, and then thinks some more, and then decides. Moreover, he can, if he chooses, express every step of this reasoning process in words. Does he really lose interest at the last step?
My goal here is to get people to feel the intuition that "I ought to do X" means something, and that thing is not "I think I ought to do X" or "I would think that I ought to do X if I were smarter and some other stuff".
(If you don't feel that intuition, I'm not sure what to do.)
People who do feel that intuition run into trouble. This is because "I ought to do X" does not refer to anything that exists. How can you make a statement that doesn't refer to anything that exists?
I've done it, and my reasoning process is still intact, and nothing has blown up. Everything seems to be fine. No one has explained to me what isn't fine about this.
Since it's intuitive, why would you not want to do it that way?
(You can argue that certain words, for certain people, do not refer to what one ought to do. But it's a different matter to suggest that no word refers to what one ought to do beyond facts about what is.)
A Flatland Argument
"I'm not interested in words, I'm interested in things. Words are just sequences of sounds or images. There's no way a sequence of arbitrary symbols could imply another sequence, or inform a decision."
"I understand how logical definitions work. I can see how, from a small set of axioms, you can derive a large number of interesting facts. But I'm not interested in words without definitions. What does "That thing, over there?" mean? Taboo finger-pointing."
"You can make statements about observations, that much is obvious. You can even talk about patterns in observations, like "the sun rises in the morning". But I don't understand your claim that there's no chocolate cake at the center of the sun. Is it about something you can see? If not, I'm not interested."
"Claims about the past make perfect sense, but I don't understand what you mean when you say something is going to happen. Sure, I see that chair, and I remember seeing the chair in the past, but what do you mean that the chair will still be there tomorrow? Taboo "will"."
Not every set of claims is reducible to every other set of claims. There is nothing special about the set "claims about the state of the world, including one's place in it and ability to affect it." If, however, you add ought-claims, then you get a very special set - the set of all the information you need to make correct decisions.
I can't see a reason to make claims that aren't reducible, by definition, to that.
The Bootstrapping Trick
Suppose an AI wants to find out what Bob means when he says "water". The AI could ask him whether various items are or are not water. But Bob might get temporarily confused in any number of ways - he could mix up his words, he could hallucinate, or anything else. So the AI decides instead to wait. The AI will give Bob time, and everything else he needs, to make the decision. In this way, by giving Bob everything he needs to carry out his abstract concept of a process that decides whether something is or is not "water", the AI can duplicate that process.
The following statement is true:
A substance is water (in Bob's language) if and only if Bob, given all the time, intelligence, and other resources he wants, decides that it is water.
But this is certainly not the definition of water! Imagine if Bob used this criterion to evaluate what was and was not water. He would suffer from an infinite regress. The definition of water is something else. The statement "This is water" reduces to a set of facts about this, not a set of facts about this and Bob's head.
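A minimal sketch may make the regress vivid. The function names below are my own hypothetical illustrations, not anything from the thought experiment itself; the point is only that if Bob's definition of "water" were the extrapolation criterion, evaluating it would never bottom out.

```python
# Hypothetical sketch: why the extrapolation criterion can't be Bob's definition.

def extrapolated_bob_decides(sample):
    # The AI's trick: give Bob unlimited time, intelligence, and resources,
    # then ask him. Bob, in turn, must apply *his* definition of "water".
    return bobs_definition_of_water(sample)

def bobs_definition_of_water(sample):
    # If Bob's definition just *is* the extrapolation criterion,
    # evaluation loops back on itself forever:
    return extrapolated_bob_decides(sample)  # infinite regress

# bobs_definition_of_water("the clear liquid in this glass")  # never terminates
#
# The regress disappears only if the definition bottoms out in facts about
# the sample itself (e.g. its composition), not in facts about Bob's head.
```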
The extension to morality should be obvious.
What one is forced to do by this argument, if one wants to speak only in physical statements, is to say that "should" has a really, really long definition that incorporates all components of human value. When a simple word has a really, really long definition, we should worry that something is up.
Well, why does it have a long definition? It has a long definition because that's what we believe is important. To say that people who use "should" (in this sense) to mean different things merely disagree about definitions is to paper over the fact that they disagree about what's important.
What do I care about?
In this essay I talk about what I believe rather than what I care about. What I care about seems like an entirely emotional question to me. I cannot Shut Up And Multiply about what I care about. If I do, in fact, Shut Up and Multiply, it is because I believe that doing so is right. Suppose I believe that my future emotions will follow multiplication. I would then have to believe that I am going to self-modify into someone who multiplies, and I would only do this because of a belief that doing so is right.
Belief and logical reasoning are an important part of how people on lesswrong think about morality, and I don't see how to incorporate them into a metaethics based not on beliefs, but on caring.
What is/are the definition(s) of "Should"?
My Model
Consider an AI. This AI goes out into the world, observing things and doing things. This is a special AI, though. In converting observations into actions, it first transforms them into beliefs in some kind of propositional language. This may or may not be the optimal way to build an AI. Regardless, that's how it works.
The AI has a database, filled with propositions. The AI also has some code.
- It has code for turning propositions into logically equivalent propositions.
- It has code for turning observations about the world into propositions about these observations, like "The pixel at location 343, 429 of image #8765 is red".
- It has code for turning propositions about observations into propositions about the state of the world, like "The apple in front of my camera is red."
- It has code for turning those propositions into propositions that express prediction, like "There will still be a red apple there until someone moves it."
- It has code for turning those propositions into propositions about shouldness, like "I should tell the scientists about that apple."
- It has code for turning propositions about shouldness into actions.
What is special about this code is that it can't be expressed as propositions. Just as one can't argue morality into a rock, the AI doesn't function if it doesn't have this code, no matter what propositions are stored in its memory. The classic example of this is the Tortoise, from "What the Tortoise Said to Achilles".
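Here is a minimal sketch of the pipeline just described. All of the names and the toy propositions are my own illustrations, assuming nothing beyond the description above; the point is that each stage is code that maps one kind of proposition to the next, and the final stage produces an action rather than another proposition.

```python
from dataclasses import dataclass

@dataclass
class Proposition:
    kind: str      # "observation", "world", "prediction", or "should"
    content: str

def observe(raw_image) -> Proposition:
    # observations -> propositions about observations
    return Proposition("observation", "The pixel at (343, 429) of image #8765 is red")

def interpret(obs: Proposition) -> Proposition:
    # propositions about observations -> propositions about the world
    return Proposition("world", "The apple in front of my camera is red")

def predict(world: Proposition) -> Proposition:
    # propositions about the world -> predictions
    return Proposition("prediction", "There will still be a red apple there until someone moves it")

def evaluate(pred: Proposition) -> Proposition:
    # predictions -> propositions about shouldness
    return Proposition("should", "I should tell the scientists about that apple")

def act(should: Proposition) -> None:
    # propositions about shouldness -> action.  This is the step the Tortoise
    # refuses to take: no proposition stored in memory can stand in for this
    # code actually running.
    print(f"ACT: {should.content}")

act(evaluate(predict(interpret(observe(raw_image=None)))))
```

On this picture, the later distinction between AI_should and Actually_should amounts to asking whether "should" is defined by whatever the evaluate stage happens to output, or by whatever the act stage treats as action-guiding.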
Axioms - Assumptions and Definitions
When we observe the AI from the outside, however, we can put everything into words. We can express its starting state, both the propositions and the code, as a set of axioms. We can then watch as it logically draws conclusions from these axioms, ending in a decision about which action to take.
The important thing, therefore, is to check that its initial axioms are correct. There is a key distinction here, because it seems like there are two kinds of axioms going on. Consider two from Euclid:
First, his definition of a point, which I believe is "a point is that which has no part". It does not seem like this could be wrong. If it turns out that nothing has no part, then that just means that there aren't any points.
Second, consider one of the postulates, such as the parallel postulate. This one could be wrong. When applied to the real world, because of General Relativity, it is in fact wrong.
This boundary can be somewhat fluid when we interpret the propositions in a new light. For instance, if we prefix each axiom with, "In a Euclidean space, ", then the whole system is just the definition of a Euclidean space.
A necessary condition for an axiom to be a definition of a term is that you can't use it to draw any new conclusions that don't involve that term. The same applies to groups of definitions. That is:
"Stars are the things that produce the lights we see in the sky at night that stay fixed relative to each other, and also the Sun" and "Stars are gigantic balls of plasma undergoing nuclear reactions" can both be definitions of the word "Star", since we know them to be equivalent. If, however, we did not know those concepts to be equivalent (if we needed to be really really really confident in our conclusions, for instance) then only one of those could be the definition.
What are the AI's definitions?
Logic: It seems to me that this code both defines logical terms and makes assumptions about their properties. In logic it is very hard to tell the difference.
Observation: Here it is easier. The meaning of a statement like "I see something red" is given by the causal process that leads to seeing something red. This code, therefore, provides a definition.
The world: What is the meaning of a statement about the world? This meaning, as we know from the litany of Tarski, comes from the world. One can't define facts about the world into existence, so this code must consist of assumptions.
Predictions: The meaning of a prediction, clearly, is not determined by the process used to create it. It's determined by what sort of events would confirm or falsify a prediction. The code that deduces predictions from states of the world might include part of the definition of those states. For instance, red things are defined as those things predicted to create the appearance of redness. It cannot, though, define the future - it can only assume that its process for producing predictions is accurate.
Shouldness: Now we come to the fun part. There are two separate ways to define "should" here. We can define it by the code that produces should statements, or by the code that uses them.
When you have two different definitions, one thing you can do is to decide that they're defining two different words. Let's call the first one AI_should and the second Actually_Should.
ETA: Another way to state this is: "Which kinds of statements can be reduced, through definition, to which other kinds of statements?" I argue that while one can reduce facts about observations to facts about the world (as long as qualia don't exist), one cannot reduce statements backwards - something new is introduced at each step. People think that we should be able to reduce ethical statements to the previous kinds of statements. Why, when so many other reductions are not possible?
Is the second definition acceptable?
The first claim I would like to make is that Actually_Should is a well-defined term - that AI_should and me_should and you_should are not all that there is.
One counter-claim that can be made against it is that it is ambiguous. If it were, though, one would presumably be able to clarify the definition with additional statements about it. But this cannot be done, because then one would be able to define certain actions into existence.
Is the second definition better?
I'm not sure "better" is well-defined here, but I'm going to argue that it is. I think these arguments should at least shed light on the first question.
The first thing that strikes me as off about AI_should is that if you use it, then your definition will be very long and complex (because Human Value is Complex) but your assumption will be short (because the code that takes "You should do X" and then does X can be minimal). This is backwards - definitions should be clean, elegant, and simple, while assumptions often have to be messy and complex.
The second thing is that it doesn't seem to be very useful for communication. When I tell someone something, it is usually because I want them to do something. I might tell someone that there is a deadly snake nearby, so that they will be wary. I may not always know exactly what someone will do with information, but I can guess that some information will probably improve the quality of their decisions.
How useful, then, to communicate directly in "Actually_should": to say to Alice the proposition that, if Alice believed it, would cause her to do X, and from which she can conclude that "Bob believes the proposition that, if I believed it, would cause me to do X".
If, on the other hand, Bob says that Alice Bob_should do X, then Alice might respond with "that's interesting, but I don't care", and would not be able to respond with "You're wrong." This would paper over the very real moral disagreement between them. This is often advisable in social situations, but rarely useful epistemically.
How do we know that there is a disagreement? Well, if the issue is significant enough, both sides would feel justified in starting a war over it. This is true even if they agree on Alice_Should and Bob_Should and so on for everyone on Earth. That seems like a pretty real disagreement to me.