In this post I aim to argue that we can make statements about what should and should not be done which cannot be reduced, by definition, to statements about the physical world.

A Naive Argument

Lukeprog says this in one of his posts:

If someone makes a claim of the 'ought' type, either they are talking about the world of is, or they are talking about the world of is not. If they are talking about the world of is not, then I quickly lose interest because the world of is not isn't my subject of interest.

I would like to question that statement. I would guess that lukeprog's chief subject of interest is figuring out what to do with the options presented to him. His interest is, therefore, in figuring out what he ought to do.

 Consider the reasoning process that takes him from observations about the world to actions. He sees something, and then thinks, and then thinks some more, and then decides. Moreover, he can, if he chooses, express every step of this reasoning process in words. Does he really lose interest at the last step?

My goal here is to get people to feel the intuition that "I ought to do X" means something, and that thing is not "I think I ought to do X" or "I would think that I ought to do X if I were smarter and some other stuff".

(If you don't, I'm not sure what to do.)

People who do feel that intuition run into trouble. This is because "I ought to do X" does not refer to anything that exists. How can you make a statement that doesn't refer to anything that exists?

 I've done it, and my reasoning process is still intact, and nothing has blown up. Everything seems to be fine. No one has explained to me what isn't fine about this.

Since it's intuitive, why would you not want to do it that way?

(You can argue that certain words, for certain people, do not refer to what one ought to do. But it's a different matter to suggest that no word refers to what one ought to do beyond facts about what is.)

A Flatland Argument

"I'm not interested in words, I'm interested in things. Words are just sequences of sounds or images. There's no way a sequence of arbitrary symbols could imply another sequence, or inform a decision."

"I understand how logical definitions work. I can see how, from a small set of axioms, you can derive a large number of interesting facts. But I'm not interested in words without definitions. What does "That thing, over there?" mean? Taboo finger-pointing." 

"You can make statements about observations, that much is obvious. You can even talk about patterns in observations, like "the sun rises in the morning". But I don't understand your claim that there's no chocolate cake at the center of the sun. Is it about something you can see? If not, I'm not interested."

"Claims about the past make perfect sense, but I don't understand what you mean when you say something is going to happen. Sure, I see that chair, and I remember seeing the chair in the past, but what do you mean that the chair will still be there tomorrow? Taboo "will"."

Not every set of claims is reducible to every other set of claims. There is nothing special about the set "claims about the state of the world, including one's place in it and ability to affect it." If, however, you add ought-claims, then you get a very special set: the set of all the information you need to make correct decisions.

I can't see a reason to make claims that aren't reducible, by definition, to that set.

The Bootstrapping Trick

Suppose an AI wants to find out what Bob means when he says "water". The AI could ask him whether various items were or were not water. But Bob might get temporarily confused in any number of ways: he could mix up his words, he could hallucinate, or anything else. So the AI decides instead to wait. The AI will give Bob time, and everything else he needs, to make the decision. In this way, by giving Bob all the abilities he needs to replicate his abstract concept of a process that decides whether something is or is not "water", the AI can duplicate this process.

The following statement is true:

A substance is water (in Bob's language) if and only if Bob, given all the time, intelligence, and other resources he wants, decides that it is water. 

But this is certainly not the definition of water! Imagine if Bob used this criterion to evaluate what was and was not water. He would suffer from an infinite regress. The definition of water is something else. The statement "This is water" reduces to a set of facts about this, not a set of facts about this and Bob's head. 
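(A minimal sketch of the regress in Python; the function names are hypothetical and only illustrate the structure of the proposed criterion, not anything in the original argument:)

```python
def is_water(substance):
    # Proposed criterion: a substance is water iff Bob, given unlimited
    # time, intelligence, and other resources, decides that it is water.
    return bob_decides(substance)

def bob_decides(substance):
    # But if Bob himself applies that same criterion when deciding,
    # the evaluation never bottoms out.
    return is_water(substance)

# is_water("the clear liquid in this glass")  # would recurse forever
```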

The extension to morality should be obvious.

What one is forced to do by this argument, if one wants to speak only in physical statements, is to say that "should" has a really, really long definition that incorporates all components of human value. When a simple word has a really, really long definition, we should worry that something is up.

Well, why does it have a long definition? It has a long definition because that's what we believe is important. To say that people who use "should" (in this sense) to mean different things merely disagree about definitions is to paper over the fact that they disagree about what's important.

What do I care about?

In this essay I talk about what I believe rather than what I care about. What I care about seems like an entirely emotional question to me. I cannot Shut Up And Multiply about what I care about. If I do, in fact, Shut Up and Multiply, then it is because I believe that doing so is right. Suppose I believe that my future emotions will follow multiplication. I would then have to believe that I am going to self-modify into someone who multiplies. I would only do this because of a belief that doing so is right.

Belief and logical reasoning are an important part of how people on lesswrong think about morality, and I don't see how to incorporate them into a metaethics based not on beliefs, but on caring.

 

A Defense of Naive Metaethics
295 comments

I share your skepticism about Luke's statement (but I've been waiting to criticize until he finishes his sequence to see if he addresses the problems later).

My goal here is to get people to feel the intuition that "I ought to do X" means something, and that thing is not "I think I ought to do X" or "I would think that I ought to do X if I were smarter and some other stuff".

To help pump that intuition, consider this analogy:

"X is true" (where X is a mathematical statement) means something, and that thing is not "I think X is true" or "I would think that X is true if I were smarter and some other stuff".

On the other hand, I think it's also possible that "I ought to do X" doesn't really mean anything. See my What does a calculator mean by "2"?. (ETA: To clarify, I mean some usages of "ought" may not really mean anything. There are some usages that clearly do, for example "If you want to accomplish X, then you ought to do Y" can in principle be straightforwardly reduced to a mathematical statement about decision theory, assuming that our current strong intuition that there is such a thing as "the right decision theory" is correct.)

4lukeprog
Wei Dai, I would prefer to hear the source of your skepticism now, if possible. I anticipate not actually disagreeing. I anticipate that we will argue it out and discover that we agree but that my way of expressing my position was not clear to you at first. And then I anticipate using this information to improve the clarity of my future posts.

I'll first try to restate your position in order to check my understanding. Let me know if I don't do it justice.

People use "should" in several different ways. Most of these ways can be "reducible to physics", or in other words can be restated as talking about how our universe is, without losing any of the intended meaning. Some of these ways can't be so reduced (they are talking about the world of "is not") but those usages are simply meaningless and can be safely ignored.

I agree that many usages of "should" can be reduced to physics. (Or perhaps instead to mathematics.) But there may be other usages that can't be so reduced, and which are not clearly safe to ignore. Originally I was planning to wait for you to list the usages of "should" that can be reduced, and then show that there are other usages that are not obviously talking about "the world of is" but are not clearly meaningless either. (Of course I hope that your reductions do cover all of the important/interesting usages, but I'm not expecting that to be the case.)

Since you ask for my criticism now, I'll just give an example that seems to be one of the hardest to... (read more)

4lukeprog
I'm not planning to list all the reductions of normative language. There are too many. People use normative language in too many ways. Also, I should clarify that when I talk about reducing ought statements into physical statements, I'm including logic. On my view, logic is just a feature of the language we use to talk about physical facts. (More on that if needed.) I'm not sure I would say "most." What do you mean by "safe to ignore"? If you're talking about something that doesn't reduce (even theoretically) into physics and/or a logical-mathematical function, then what are you talking about? Fiction? Magic? Those are fine things to talk about, as long as we understand we're talking about fiction or magic. What about this is hard to reduce? We can ask for what you mean by 'should' in this question, and reduce it if possible. Perhaps what you have in mind isn't reducible (divine commands), but then your question is without an answer. Or perhaps you're asking the question in the sense of "Please fix my broken question for me. I don't know what I mean by 'should'. Would you please do a stack trace on the cognitive algorithms that generated that question, fix my question, and then answer it for me?" And in that case we're doing empathic metaethics. I'm still confused as to what your objection is. Will you clarify?
4Wei Dai
You said that you're not interested in an "ought" sentence if it reduces to talking about the world of is not. I was trying to make the same point by "safe to ignore". I don't know, but I don't think it's a good idea to assume that only things that are reducible to physics and/or math are worth talking about. I mean it's a good working assumption to guide your search for possible meanings of "should", but why declare that you're not "interested" in anything else? Couldn't you make that decision on a case by case basis, just in case there is a meaning of "should" that talks about something else besides physics and/or math and its interestingness will be apparent once you see it? Maybe I should have waited until you finish your sequence after all, because I don't know what "doing empathic metaethics" actually entails at this point. How are you proposing to "fix my question"? It's not as if there is a design spec buried somewhere in my brain, and you can check my actual code against the design spec to see where the bug is... Do you want to pick up this conversation after you explain it in more detail?
3lukeprog
Maybe this is because I'm fairly confident of physicalism? Of course I'll change my mind if presented with enough evidence, but I'm not anticipating such a surprise. 'Interest' wasn't the best word for me to use. I'll have to fix that. All I was trying to say is that if somebody uses 'ought' to refer to something that isn't physical or logical, then this punts the discussion back to a debate over physicalism, which isn't the topic of my already-too-long 'Pluralistic Moral Reductionism' post. Surely, many people use 'ought' to refer to things non-reducible to physics or logic, and they may even be interesting (as in fiction), but in the search for true statements that use 'ought' language they are not 'interesting', unless physicalism is false (which is a different discussion, then). Does that make sense? I'll explain empathic metaethics in more detail later, but I hope we can get some clarity on this part right now.
2Wei Dai
First I would call myself a radical platonist instead of a physicalist. (If all universes that exist mathematically also exist physically, perhaps it could be said that there is no difference between platonism and physicalism, but I think most people who call themselves physicalists would deny that premise.) So I think it's likely that everything "interesting" can be reduced to math, but given the history of philosophy I don't think I should be very confident in that. See my recent How To Be More Confident... That You're Wrong.
2lukeprog
Right, I'm pretty partial to Tegmark, too. So what I call physicalism is compatible with Tegmark. But could you perhaps give an example of what it would mean to reduce normative language to a logical-mathematical function - even a silly one?
3Wei Dai
(It's late and I'm thinking up this example on the spot, so let me know if it doesn't make sense.) Suppose I'm in a restaurant and I say to my dinner companion Bob, "I'm too tired to think tonight. You know me pretty well. What do you think I should order?" From the answer I get, I can infer (when I'm not so tired) a set of joint constraints on what Bob believes to be my preferences, what decision theory he applied on my behalf, and the outcome of his (possibly subconscious) computation. If there is little uncertainty about my preferences and the decision theory involved, then the information conveyed by "you should order X" in this context just reduces to a mathematical statement about (for example) what the arg max of a set of weighted averages is. (I notice an interesting subtlety here. Even though what I infer from "you should order X" is (1) "according to Bob's computation, the arg max of ... is X", what Bob means by "you should order X" must be (2) "the arg max of ... is X", because if he means (1), then "you should order X" would be true even if Bob made an error in his computation.)
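(To make this reduction concrete, here is a minimal sketch with made-up dishes, probabilities, and utilities; none of the numbers come from the comment, they only illustrate what "the arg max of a set of weighted averages" looks like:)

```python
# Hypothetical numbers: Bob's model of my preferences (utilities) and his
# beliefs about each dish (probability the kitchen executes it well).
menu = {
    "soup":  {"good": 0.9, "u_good": 6, "u_bad": 2},
    "steak": {"good": 0.6, "u_good": 9, "u_bad": 1},
    "pasta": {"good": 0.8, "u_good": 7, "u_bad": 3},
}

def expected_utility(dish):
    d = menu[dish]
    return d["good"] * d["u_good"] + (1 - d["good"]) * d["u_bad"]

# On this reading, "you should order X" cashes out as the claim that X is
# the arg max of the weighted averages below -- a purely mathematical claim.
recommendation = max(menu, key=expected_utility)
print(recommendation, {dish: expected_utility(dish) for dish in menu})
```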
1lukeprog
Yeah, that's definitely compatible with what I'm talking about when I talk about reducing normative language to natural language (that is, to math/logic + physics). Do you think any disagreements or confusion remains in this thread?
6Wei Dai
Having thought more about these matters over the last couple of weeks, I've come to realize that my analysis in the grandparent comment is not very good, and also that I'm confused about the relationship between semantics (i.e., study of meaning) and reductionism. First, I learned that it's important (and I failed) to distinguish between (A) the meaning of a sentence (in some context), (B) the set of inferences that can be drawn from it, and (C) what information the speaker intends to convey. For example, suppose Alice says to Bob, "It's raining outside. You should wear your rainboots." The information that Alice really wants to convey by "it's raining outside" is that there are puddles on the ground. That, along with for example "it's probably not sunny" and "I will get wet if I don't use an umbrella", belongs to the set of inferences that can be drawn from the sentence. But clearly the meaning of "it's raining outside" is distinct from either of these. Similarly, the fact that Bob can infer that there are puddles on the ground from "you should wear your rainboots" does not show that "you should wear your rainboots" means "there are puddles on the ground". Nor does it seem to make sense to say that "you should wear your rainboots" reduces to "there are puddles on the ground" (why should it, when clearly "it's raining outside" doesn't reduce that way?), which, by analogy, calls into question my claim in the grandparent comment that "you should order X" reduces to "the arg max of ... is X". But I'm confused about what reductionism even means in the context of semantics. The Eliezer post that you linked to from Pluralistic Moral Reductionism defined "reductionism" as: But that appears to be a position about ontology, and it not clear to me what implications it has for semantics, especially for the semantics of normative language. (I know you posted a reading list for reductionism, which I have not gone though except to skim the encyclopedia entry. Please let me k
1lukeprog
Excellent. We should totally be clarifying such things. There are many things we might intend to communicate when we talk about the 'meaning' of a word or phrase or sentence. Let's consider some possible concepts of 'the meaning of a sentence', in the context of declarative sentences only: (1) The 'meaning of a sentence' is what the speaker intended to assert, that assertion being captured by truth conditions the speaker would endorse when asked for them. (2) The 'meaning of a sentence' is what the sentence asserts if the assertion is captured by truth conditions that are fixed by the sentence's syntax and the first definition of each word that is provided by the Oxford English Dictionary. (3) The 'meaning of a sentence' is what the speaker intended to assert, that assertion being captured by truth conditions determined by a full analysis of the cognitive algorithms that produced the sentence (which are not accessible to the speaker). There are several other possibilities, even just for declarative sentences. I tried to make it clear that when doing austere metaethics, I was taking #1 to be the meaning of a declarative moral judgment (e.g. "Murder is wrong!"), at least when the speaker of such sentences intended them to be declarative (rather than intending them to be, say, merely emotive or in other ways 'non-cognitive'). The advantage of this is that we can actually answer (to some degree, in many cases) the question of what a moral judgment 'means' (in the austere metaethics sense), and thus evaluate whether it is true or untrue. After some questioning of the speaker, we might determine that meaning~1 of "Murder is wrong" in a particular case is actually "Murder is forbidden by Yahweh", in which case we can evaluate the speaker's sentence as untrue given its truth conditions (given its meaning~1). But we may very well want to know instead what is 'right' or 'wrong' or 'good' or 'bad' when evaluating sentences that use those words using the third sense of
5Wei Dai
As I indicated in a recent comment, I don't really see the point of austere metaethics. Meaning~1 just doesn't seem that interesting, given that meaning~1 is not likely to be closely related to actual meaning, as in your example when someone thinks that by "Murder is wrong" they are asserting "Murder is forbidden by Yahweh". Empathic metaethics is much more interesting, of course, but I do not understand why you seem to assume that if we delve into the cognitive algorithms that produce a sentence like "murder is wrong" we will be able to obtain a list of truth conditions. For example if I examine the algorithms behind an Eliza bot that sometimes says "murder is wrong" I'm certainly not going to obtain a list of truth conditions. It seems clear that information/beliefs about math and physics definitely influence the production of normative sentences in humans, but it's much less clear that those sentences can be said to assert facts about math and physics. Can you show me an example of such idea transfer? (Depending on what ideas you want to transfer, perhaps you do not need to "fully" solve metaethics, in which case our interests might diverge at some point.) This is probably a good idea. (Nesov previously made a general suggestion along those lines.)
2lukeprog
What do you mean by 'actual meaning'? The point of pluralistic moral reductionism (austere metaethics) is to resolve lots of confused debates in metaethics that arise from doing metaethics (implicitly or explicitly) in the context of traditional conceptual analysis. It's clearing away the dust and confusion from such debates so that we can move on to figure out what I think is more important: empathic metaethics. I don't assume this. Whether this can be done is an open research question. My entire post 'Pluralistic Moral Reductionism' is an example of such idea transfer. First I specified that one way we can talk about morality is to stipulate what we mean by terms like 'morally good', so as to resolve debates about morality in the same way that we resolve a hypothetical debate about 'sound' by stipulating our definitions of 'sound.' Then I worked through the implications of that approach to metaethics, and suggested toward the end that it wasn't the only approach to metaethics, and that we'll explore empathic metaethics in a later post.
8Wei Dai
I don't know how to explain "actual meaning", but it seems intuitively obvious to me that the actual meaning of "murder is wrong" is not "murder is forbidden by Yahweh", even if the speaker of the sentence believes that murder is wrong because murder is forbidden by Yahweh. Do you disagree with this? But the way we actually resolved the debate about 'sound' is by reaching the understanding that there are two distinct concepts (acoustic vibrations and auditory experience) that are related in a certain way and also happen to share the same signifier. If, prior to reaching this understanding, you ask people to stipulate a definition for 'sound' when they use it, they will give you confused answers. I think saying "let's resolve confusions in metaethics by asking people to stipulate definitions for 'morally good'", before we reach a similar level of understanding regarding morality, is to likewise put the cart before the horse.
1lukeprog
That doesn't seem intuitively obvious to me, which illustrates one reason why I prefer to taboo terms rather than bash my intuitions against the intuitions of others in an endless game of intuitionist conceptual analysis. :) Perhaps the most common 'foundational' family of theories of meaning in linguistics and philosophy of language belong to the mentalist program, according to which semantic content is determined by the mental contents of the speaker, not by an abstract analysis of symbol forms taken out of context from their speaker. One straightforward application of a mentalist approach to meaning would conclude that if the speaker was assuming (or mentally representing) a judgment of moral wrongness in the sense of forbidden-by-God, then the meaning of the speaker's sentence refers in part to the demands of an imagined deity. But "reaching this understanding" with regard to morality was precisely the goal of 'Conceptual Analysis and Moral Theory' and 'Pluralistic Moral Reductionism.' I repeatedly made the point that people regularly use a narrow family of signifiers ('morally good', 'morally right', etc.) to call out a wide range of distinct concepts (divine attitudes, consequentialist predictions, deontological judgments, etc.), and that this leads to exactly the kind of confusion encountered by two people who are both using the signifier 'sound' to call upon two distinct concepts (acoustic vibrations and auditory experience).
6Wei Dai
With regard to "sound", the two concepts are complementary, and people can easily agree that "sound" sometimes refers to one or the other or often both of these concepts. The same is not true in the "morality" case. The concepts you list seem mutually exclusive, and most people have a strong intuition that "morality" can correctly refer to at most one of them. For example a consequentialist will argue that a deontologist is wrong when he asserts that "morality" means "adhering to rules X, Y, Z". Similarly a divine command theorist will not answer "well, that's true" if an egoist says "murdering Bob (in a way that serves my interests) is right, and I stipulate 'right' to mean 'serving my interests'". It appears to me confusion here is not being caused mainly by linguistic ambiguity, i.e., people using the same word to refer to different things, which can be easily cleared up once pointed out. I see the situation as being closer to the following: in many cases, people are using "morality" to refer to the same concept, and are disagreeing over the nature of that concept. Some people think it's equivalent to or closely related to the concept of divine attitudes, and others think it has more to do with well-being of conscious creatures, etc.
3cousin_it
When many people agree that murder is wrong but disagree on the reasons why, you can argue that they're referring to the same concept of morality but confused about its nature. But what about less clear-cut statements, like "women should be able to vote"? Many people in the past would've disagreed with that. Would you say they're referring to a different concept of morality?
2lukeprog
I'm not sure what it means to say that people have the same concept of morality but disagree on many of its most fundamental properties. Do you know how to elucidate that? I tried to explain some of the cause of persistent moral debate (as opposed to e.g. sound debate) in this way:
3Wei Dai
Let me try an analogy. Consider someone who believes in the phlogiston theory of fire, and another person who believes in the oxidation theory. They are having a substantive disagreement about the nature of fire, and not merely causing unnecessary confusion by using the same word "fire" to refer to different things. And if the phlogiston theorist were to say "by 'fire' I mean the release of phlogiston" then that would just be wrong, and would be adding to the confusion instead of helping to resolve it. I think the situation with "morality" is closer to this than to the "sound" example. (ETA: I could also try to define "same concept" more directly, for example as occupying roughly the same position in the graph of relationships between one's concepts, or playing approximately the same role in one's cognitive algorithms, but I'd rather not take an exact position on what "same concept" means if I can avoid it, since I have mostly just an intuitive understanding of it.)
0lukeprog
This is the exact debate currently being hashed out by Richard Joyce and Stephen Finlay (whom I interviewed here). A while back I wrote an article that can serve as a good entry point into the debate, here. A response from Joyce is here and here. Finlay replies again here. I tend to side with Finlay, though I suspect not for all the same reasons. Recently, Joyce has admitted that both languages can work, but he'll (personally) talk the language of error theory rather than the language of moral naturalism.
2Wei Dai
I'm having trouble understanding how the debate between Joyce and Finlay, over Error Theory, is the same as ours. (Did you perhaps reply to the wrong comment?)
-1lukeprog
Sorry, let me make it clearer... The core of their debate concerns whether certain features are 'essential' to the concept of morality, and thus concerns whether people share the same concept of morality, and what it would mean to say that people share the concept of morality, and what the implications of that are. Phlogiston is even one of the primary examples used throughout the debate. (Also, witches!)
2Wei Dai
I'm still not getting it. From what I can tell, both Joyce and Finlay implicitly assume that most people are referring to the same concept by "morality". They do use phlogiston as an example, but seemingly in a very different way from me, to illustrate different points. Also, two of the papers you link to by Joyce don't cite Finlay at all and I think may not even be part of the debate. Actually the last paper you link to by Joyce (which doesn't cite Finlay) does seem relevant to our discussion. For example this paragraph: I will read that paper over more carefully, and in the mean time, please let me know if you still think the other papers are also relevant, and point to specific passages if yes.
0lukeprog
This article by Joyce doesn't cite Finlay, but its central topic is 'concessive strategies' for responding to Mackie, and Finlay is a leading figure in concessive strategies for responding to Mackie. Joyce also doesn't cite Finlay here, but it discusses how two people who accept that Mackie's suspect properties fail to refer might nevertheless speak two different languages about whether moral properties exist (as Joyce and Finlay do). One way of expressing the central debate between them is to say that they are arguing over whether certain features (like moral 'absolutism' or 'objectivity') are 'essential' to moral concepts. (Without the assumption of absolutism, is X a 'moral' concept?) Another way to say that is to say that they are arguing over the boundaries of moral concepts; whether people can be said to share the 'same' concept of morality but disagree on some of its features, or whether this disagreement means they have 'different' concepts of morality. But really, I'm just trying to get clear on what you might mean by saying that people have the 'same' concept of morality while disagreeing on fundamental features, and what you think the implications are. I'm sorry my pointers to the literature weren't too helpful.
4Wei Dai
Unfortunately I'm not sure how to explain it better than I already did. But I did notice that Richard Chappell made a similar point (while criticizing Eliezer): Does his version make any more sense?
3Vladimir_Nesov
Chappell's discussion makes more and more sense to me lately. Many previously central reasons for disagreement turn out to be my misunderstanding, but I haven't re-read enough to form a new opinion yet.
0lukeprog
Sure, except he doesn't make any arguments for his position. He just says: I don't think normative debates are always "merely verbal". I just think they are very often 'merely verbal', and that there are multiple concepts of normativity in use. Chappell and I, for example, seem to have different intuitions (see comments) about what normativity amounts to.
6Wei Dai
Let's say a deontologist and a consequentialist are on the board of SIAI, and they are debating which kind of seed AI the Institute should build.

D: We should build a deontic AI.

C: We should build a consequentialist AI.

Surely their disagreement is substantive. But if by "we should do X" the deontologist just means "X is obligatory (by deontic logic) if you assume axiomatic imperatives Y and Z" and the consequentialist just means "X maximizes expected utility under utility function Y according to decision theory Z", then they are talking past each other and their disagreement is "merely verbal". Yet these are the kinds of meanings you seem to think their normative language does have. Don't you think there's something wrong about that?

(ETA: To any bystanders still following this argument, I feel like I'm starting to repeat myself without making much progress in resolving this disagreement. Any suggestions on what to do?)
0Peterdjones
I completely agree with what you are saying. Disagreement requires shared meaning. Cons. and Deont. are rival theories, not alternative meanings. Good question. There's a lot of momentum behind the "meaning theory".
0lukeprog
If the deontologist and the consequentialist have previously stipulated different definitions for 'should' as used in sentences D and C, then they aren't necessarily disagreeing with each other by having one state D and the other state C. But perhaps we aren't considering propositions D and C using meaning_stipulated. Perhaps we decide to consider propositions D and C using meaning-cognitive-algorithm. And perhaps a completed cognitive neuroscience would show us that they both mean the same thing by 'should' in the meaning-cognitive-algorithm sense. And in that case they would be having a substantive disagreement, when using meaning-cognitive-algorithm to determine the truth conditions of D and C. Thus: meaning-stipulated of D is X, meaning-stipulated of C is Y, but X and Y need not be mutually exclusive. meaning-cognitive-algorithm of D is A, meaning-cognitive-algorithm of C is B, and in my story above A and B are mutually exclusive. Since people have different ideas about what 'meaning' is, I'm skipping past that worry by tabooing 'meaning.' [Damn I wish LW would let me use underscores or subscripts instead of hyphens!]
3Vladimir_Nesov
You_can_do_that, just use a backslash '\' to escape the underscores '\_', although people quoting your text would need to repeat the trick.
0lukeprog
Thanks!
2Wei Dai
Suppose the deontologist and the consequentialist have previously stipulated different definitions for 'should' as used in sentences D and C, but if you ask them they also say that they are disagreeing with each other in a substantive way. They must be wrong about either what their sentences mean, or about whether their disagreement is substantive, right? (*) I think it's more likely that they're wrong about what their sentences mean, because meanings of normative sentences are confusing and lack of substantive disagreement in this particular scenario seems very unlikely. (*) If we replace "mean" in this sentence by "mean_stipulated", then it no longer makes sense, since clearly it's possible that their sentences mean_stipulated D and C, and that their disagreement is substantive. Actually now that I think about it, I'm not sure that "mean" can ever be correctly taboo'ed into "mean_stipulated". For example, suppose Bob says "By 'sound' I mean acoustic waves. Sorry, I misspoke, actually by 'sound' I mean auditory experiences. [some time later] To recall, by 'sound' I mean auditory experiences." The first "mean" does not mean "mean_stipulated" since Bob hadn't stipulated any meanings yet when he said that. The second "mean" does not mean "mean_stipulated" since otherwise that sentence would just be stating a plain falsehood. The third "mean" must mean the same thing as the second "mean", so it's also not "mean_stipulated". To continue along this line, suppose Alice inserts after the first sentence, "Bob, that sounds wrong. I think by 'sound' you mean auditory experiences." Obviously not "mean_stipulated" here. Alternatively, suppose Bob only says the first sentence, and nobody bothers to correct him because they've all heard the lecture several times and know that Bob means auditory experiences by 'sound', and think that everyone else knows. Except that Carol is new and doesn't know, and write in her notes "In this lecture, 'sound' means acoustic waves." in her note
0lukeprog
It seems a desperate move to say that stipulative meaning just isn't a kind of meaning wielded by humans. I use it all the time, it's used in law, it's used in other fields, it's taught in textbooks... If you think stipulative meaning just isn't a legitimate kind of meaning commonly used by humans, I don't know what to say. I agree, but 'tabooing' 'meaning' to mean (in some cases) 'stipulated meaning' shouldn't be objectionable because, as I said above, it's a very commonly used kind of 'meaning.' We can also taboo 'meaning' to refer to other types of meaning. And like I said, there often is substantive disagreement. I was just trying to say that sometimes there isn't substantive disagreement, and we can figure out whether or not we're having a substantive disagreement by playing a little Taboo (and by checking anticipations). This is precisely the kind of use for which playing Taboo was originally proposed:
0Wei Dai
To come back to this point, what if we can't translate a disagreement into disagreement over anticipations (which is the case in many debates over rationality and morality), nor do the participants know how to correctly Taboo (i.e., they don't know how to capture the meanings of certain key words), but there still seems to be substantive disagreement or the participants themselves claim they do have a substantive disagreement? Earlier, in another context, I suggested that we extend Eliezer's "make beliefs pay rent in anticipated experiences" into "make beliefs pay rent in decision making". Perhaps we can apply that here as well, and say that a substantive disagreement is one that implies a difference in what to do, in at least one possible circumstance. What do you think?
0Vladimir_Nesov
But I missed your point in the previous response. The idea of disagreement about decisions in the same sense as usual disagreement about anticipation caused by errors/uncertainty is interesting. This is not bargaining about outcome, for the object under consideration is agents' belief, not the fact the belief is about. The agents could work on correct belief about a fact even in the absence of reliable access to the fact itself, reaching agreement.
0Vladimir_Nesov
It seems that "what to do" has to refer to properties of a fixed fact, so disagreement is bargaining over what actually gets determined, and so probably doesn't even involve different anticipations.
0lukeprog
Wei Dai & Vladimir Nesov, Both your suggestions sound plausible. I'll have to think about it more when I have time to work more on this problem, probably in the context of a planned LW post on Chalmers' Verbal Disputes paper. Right now I have to get back to some other projects.
0lukeprog
Also perhaps of interest is Schroeder's paper, A Recipe for Concept Similarity.
0Wei Dai
But that assumes that two sides of the disagreement are both Taboo'ing correctly. How can you tell? (You do agree that Taboo is hard and people can easily get it wrong, yes?) ETA: Do you want to try to hash this out via online chat? I added you to my Google Chat contacts a few days ago, but it's still showing "awaiting authorization".
-1lukeprog
Not sure what 'correctly' means, here. I'd feel safer saying they were both Tabooing 'acceptably'. In the above example, Albert and Barry were both Tabooing 'acceptably.' It would have been strange and unhelpful if one of them had Tabooed 'sound' to mean 'rodents on the moon'. But Tabooing 'sounds' to talk about auditory experiences or acoustic vibrations is fine, because those are two commonly used meanings for 'sound'. Likewise, 'stipulated meaning' and 'intuitive meaning' and a few other things are commonly used meanings of 'meaning.' If you're saying that there's "only one correct meaning for 'meaning'" or "only one correct meaning for 'ought'", then I'm not sure what to make of that, since humans employ the word-tool 'meaning' and the word-tool 'ought' in a variety of ways. If whatever you're saying predicts otherwise, then what you're saying is empirically incorrect. But that's so obvious that I keep assuming you must be saying something else. Also relevant: Another point. Switching back to a particular 'conventional' meaning that doesn't match the stipulative meaning you just gave a word is one of the ways words can be wrong (#4). And frankly, I'm worried that we are falling prey to the 14th way words can be wrong: And, the 17th way words can be wrong: Now, I suspect you may be trying to say that I'm committing mistake #20: But I've pointed out that, for example, stipulative meaning is a very common usage of 'meaning'...
2Wei Dai
Could you please take a look at this example, and tell me whether you think they are Tabooing "acceptably"?
2lukeprog
That's a great example. I'll reproduce it here for readability of this thread: I'd rather not talk about 'wrong'; that makes things messier. But let me offer a few comments on what happened:

1. If this conversation occurred at a decision theory meetup known to have an even mix of CDTers and EDTers, then it was perhaps inefficient (for communication) for either of them to use 'rational' to mean either CDT-rational or EDT-rational. That strategy was only going to cause confusion until Tabooing occurred.

2. If this conversation occurred at a decision theory meetup for CDTers, then person A might be forgiven for assuming the other person would think of 'rational' in terms of 'CDT-rational'. But then person A used Tabooing to discover that an EDTer had snuck into the party, and they don't disagree about the solutions to Newcomb's problem recommended by EDT and CDT.

3. In either case, once they've had the conversation quoted above, they are correct that they don't disagree about the solutions to Newcomb's problem recommended by EDT and CDT. Instead, their disagreement lies elsewhere. They still disagree about what action has the highest expected value when an agent is faced with Newcomb's dilemma. Now that they've cleared up their momentary confusion about 'rational', they can move on to discuss the point at which they really do disagree.

Tabooing for the win.
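(For concreteness, a minimal sketch of the two computations being agreed upon, using the standard Newcomb payoffs and an assumed predictor accuracy of 0.99; the numbers are illustrative and not from the thread:)

```python
ACCURACY = 0.99                      # assumed predictor accuracy (illustrative)
MILLION, THOUSAND = 1_000_000, 1_000

def edt_values():
    # EDT conditions on the action: one-boxing is strong evidence that
    # the opaque box contains the million.
    return {
        "one-box": ACCURACY * MILLION,
        "two-box": (1 - ACCURACY) * (MILLION + THOUSAND) + ACCURACY * THOUSAND,
    }

def cdt_values(p_box_full):
    # CDT treats the box contents as causally fixed: whatever the contents,
    # two-boxing adds an extra thousand.
    return {
        "one-box": p_box_full * MILLION,
        "two-box": p_box_full * MILLION + THOUSAND,
    }

edt, cdt = edt_values(), cdt_values(0.5)
print("EDT recommends:", max(edt, key=edt.get))   # one-box
print("CDT recommends:", max(cdt, key=cdt.get))   # two-box
```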
4Wei Dai
An action does not naturally "have" an expected value, it is assigned an expected value by a combination of decision theory, prior, and utility function, so we can't describe their disagreement as "about what action has the highest expected value". It seems that we can only describe their disagreement as about "what is rational" or "what is the correct decision theory" because we don't know how to Taboo "rational" or "correct" in a way that preserves the substantive nature of their disagreement. (BTW, I guess we could define "have" to mean "assigned by the correct decision theory/prior/utility function" but that doesn't help.) But how do they (or you) know that they actually do disagree? According to their Taboo transcript, they do not disagree. It seems that there must be an alternative way to detect substantive disagreement, other than by asking people to Taboo? ETA: If people actually disagree, but through the process of Tabooing conclude that they do not disagree (like in the above example), that should count as a lose for Tabooing, right? In the case of "morality", why do you trust the process of Tabooing so much that you do not give this possibility much credence?
0lukeprog
Fair enough. Let me try again: "They still disagree about what action is most likely to fulfill the agent's desires when the agent is faced with Newcomb's dilemma." Or something like that. According to their Taboo transcript, they don't disagree over the solutions to Newcomb's problem recommended by EDT and CDT. But they might still disagree about whether EDT or CDT is most likely to fulfill the agent's desires when faced with Newcomb's problem. Yes. Ask about anticipations. That didn't happen in this example. They do not, in fact, disagree over the solutions to Newcomb's problem recommended by EDT and CDT. If they disagree, it's about something else, like who is the tallest living person on Earth or which action is most likely to fulfill an agent's desires when faced with Newcomb's dilemma. Of course Tabooing can go wrong, but it's a useful tool. So is testing for differences of anticipation, though that can also go wrong. No, I think it's quite plausible that Tabooing can be done wrong when talking about morality. In fact, it may be more likely to go wrong there than anywhere else. But it's also better to Taboo than to simply not use such a test for surface-level confusion. It's also another option to not Taboo and instead propose that we try to decode the cognitive algorithms involved in order to get a clearer picture of our intuitive notion of moral terms than we can get using introspection and intuition.
0Vladimir_Nesov
This introduces even more assumptions into the picture. Why is fulfillment of desires, or specifically the agent's desires, relevant? Why is "most likely" in there? You are trying to make things precise at the expense of accuracy; that's the big Taboo failure mode: increasingly obscure lost purposes.
0lukeprog
I'm just providing an example. It's not my story. I invite you or Wei Dai to say what it is the two speakers disagree about even after they agree about the conclusions of CDT and EDT for Newcomb's problem. If all you can say is that they disagree about what they 'should' do, or what it would be 'rational' to do, then we'll have to talk about things at that level of understanding, but that will be tricky.
2Vladimir_Nesov
What other levels of understanding do we have? The question needs to be addressed on its own terms. Very tricky. There are ways of making this better, platonism extended to everything seems to help a lot, for example. Toy models of epistemic and decision-theoretic primitives also clarify things, training intuition.
0lukeprog
We're making progress on what it means for brains to value things, for example. Or we can talk in an ends-relational sense, and specify ends. Or we can keep things even more vague but then we can't say much at all about 'ought' or 'rational'.
0Vladimir_Nesov
The problem is that it doesn't look any better than figuring out what CDT or EDT recommend. What the brain recommends is not automatically relevant to the question of what should be done.
0lukeprog
If by 'should' in this sense you mean the 'intended' meaning of 'should' that we don't have access to, then I agree.
0lukeprog
Note: Wei Dai and I chatted for a while, and this resulted in three new clarifying paragraphs at the end of the is-ought section of my post 'Pluralistic Moral Reductionism'.
1Wei Dai
Some remaining issues: Even given your disclaimer, I suspect we still disagree on the merits of Taboo as it applies to metaethics. Have you tried having others who are metaethically confused play Taboo in real life, and if so, did it help? People like Eliezer and Drescher, von Neumann and Savage, have been able to make clear progress in understanding the nature of rationality, and the methods they used did not involve much (if any) neuroscience. On "morality" we don't have such past successes to guide us, but your focus on neuroscience still seems misguided according to my intuitions.
0lukeprog
Yes. The most common result is that people come to realize they don't know what they mean by 'morally good', unless they are theists. If it looks like I'm focusing on neuroscience, I think that's an accident of looking at work I've produced in a 4-month period rather than over a longer period (that hasn't occurred yet). I don't think neuroscience is as central to metaethics or rationality as my recent output might suggest. Humans with meat-brains are strange agents who will make up a tiny minority of rational and moral agents in the history of intelligent agents in our light-cone (unless we bring an end to intelligent agents in our light-cone).
2Wei Dai
Huh, I think that would have been good to mention in one of your posts. (Unless you did and I failed to notice it.) It occurs to me that with a bit of tweaking to Austere Metaethics (which I'll call Interim Metaethics), we can help everyone realize that they don't know what they mean by "morally good". For example:

Deontologist: Should we build a deontic seed AI?

Interim Metaethicist: What do you mean by "should X"?

Deontologist: "X is obligatory (by deontic logic) if you assume axiomatic imperatives Y and Z."

Interim Metaethicist: Are you sure? If that's really what you mean, then when a consequentialist says "should X" he probably means "X maximizes expected utility according to decision theory Y and utility function Z". In which case the two of you do not actually disagree. But you do disagree with him, right?

Deontologist: Good point. I guess I don't really mean that by "should". I'm confused.

(Doesn't that seem like an improvement over Austere Metaethics?)
2lukeprog
I guess one difference between us is that I don't see anything particularly 'wrong' with using stipulative definitions as long as you're aware that they don't match the intended meaning (that we don't have access to yet), whereas you like to characterize stipulative definitions as 'wrong' when they don't match the intended meaning. But perhaps I should add one post before my empathic metaethics post which stresses that the stipulative definitions of 'austere metaethics' don't match the intended meaning - and we can make this point by using all the standard thought experiments that deontologists and utilitarians and virtue ethicists and contractarian theorists use against each other.
0Wei Dai
After the above conversation, wouldn't the deontologist want to figure out what he actually means by "should" and what its properties are? Why would he want to continue to use the stipulated definition that he knows he doesn't actually mean? I mean, I can imagine something like:

Deontologist: I guess I don't really mean that by "should", but I need to publish a few more papers for tenure, so please just help me figure out whether we should build a deontic seed AI under that stipulated definition of "should", so I can finish my paper and submit it to the Journal of Machine Deontology.

But even in this case it would make more sense for him to avoid "stipulative definition" and instead say:

Deontologist: Ok, by "should" I actually mean a concept that I can't define at this point. But I guess it has something to do with deontic logic, and it would be useful to explore the properties of deontic logic in more detail. So, can you please help me figure out whether building a deontic seed AI is obligatory (by deontic logic) if we assume axiomatic imperatives Y and Z?

This way, he clarifies to himself and others that "X is obligatory (by deontic logic) if you assume axiomatic imperatives Y and Z" is not what he means by "should X", but instead a guess about the nature of morality (a concept that we can't yet precisely define). Perhaps you'd answer that a stipulated meaning is just that, a guess about the nature of something. But as you know, words have connotations, and I think the connotation of "guess" is more appropriate here than "meaning".
-3lukeprog
The problem is that we have to act in the world now. We can't wait around for metaethics and decision theory to be solved. Thus, science books have glossaries in the back full of highly useful operationalized and stipulated definitions for hundreds of terms, whether or not they match the intended meanings (that we don't have access to) of those terms for person A, or the intended meanings of those terms for person B, or the intended meanings for those terms for person C. I think this glossary business is a familiar enough practice that calling that thing a glossary of 'meanings' instead of a glossary of 'guesses at meanings' is fine. Maybe 'meaning' doesn't have the connotations for me that it has for you. Science needs doing, laws need to be written and enforced, narrow AIs need to be programmed, best practices in medicine need to be written, agents need to act... all before metaethics and decision theory are solved. In a great many cases, we need to have meaning_stipulated before we can figure out meaning_intended.
1Wei Dai
Sigh... Maybe I should just put a sticky note on my monitor that says REMEMBER: You probably don't actually disagree with Luke, because whenever he says "X means Z by Y", he might just mean "X stipulated Y to mean Z", which in turn is just another way of saying "X guesses that the nature of Y is Z".
-1lukeprog
That might work. We humans have different intuitions about the meanings of terms and the nature of meaning itself, and thus we're all speaking slightly different languages. We always need to translate between our languages, which is where Taboo and testing for anticipations come in handy. I'm using the concept of meaning from linguistics, which seems fair to me. In linguistics, stipulated meaning is most definitely a kind of meaning (and not merely a kind of guessing at meaning), for it is often "what is expressed by the writer or speaker, and what is conveyed to the reader or listener, provided that they talk about the same thing."
3Vladimir_Nesov
Whatever the case, this language looks confusing/misleading enough to avoid. It conflates the actual search for intended meaning with all those irrelevant stipulations, and assigns misleading connotations to the words referring to these things. In Eliezer's sequences, the term was "fake utility function". The presence of "fake" in the term is important, it reminds of incorrectness of the view. So far, you've managed to confuse me and Wei with this terminology alone, probably many others as well.
0lukeprog
Perhaps, though I've gotten comments from others that it was highly clarifying for them. Maybe they're more used to the meaning of 'meaning' from linguistics. Does this new paragraph at the end of this section in PMR help?
0Vladimir_Nesov
It's not clear from this paragraph whether "intuitive concept" refers to the oafish tools in human brain (which have the same problems as stipulated definitions, including irrelevance) or the intended meaning that those tools seek. Conceptual analysis, as I understand, is concerned with analysis of the imperfect intuitive tools, so it's also unclear in what capacity you mention conceptual analysis here. (I do think this and other changes will probably make new readers less confused.)
0lukeprog
Here's the way I'm thinking about it. Roger has an intuitive concept of 'morally good', the intended meaning of which he doesn't fully have access to (but it could be discovered by something like CEV). Roger is confused enough to think that his intuitive concept of 'morally good' is 'that which produces the greatest pleasure for the greatest number'. The conceptual analyst comes along and says: "Suppose that an advanced team of neuroscientists and computer scientists could hook everyone's brains up to a machine that gave each of them maximal, beyond-orgasmic pleasure for the rest of their abnormally long lives. Then they will blast each person and their pleasure machine into deep space at near light-speed so that each person could never be interfered with. Would this be morally good?"

ROGER: Huh. I guess that's not quite what I mean by 'morally good'. I think what I mean by 'morally good' is 'that which produces the greatest subjective satisfaction of wants in the greatest number'.

CONCEPTUAL ANALYST: Okay, then. Suppose that an advanced team of neuroscientists and computer scientists could hook everyone's brains up to 'The Matrix' and make them believe and feel that all their wants were being satisfied, for the rest of their abnormally long lives. Then they will blast each person and their pleasure machine into deep space at near light-speed so that each person could never be interfered with. Would this be morally good?

ROGER: No, I guess that's not what I mean, either. What I really mean is...

And around and around we go, for centuries. The problem with trying to access our intended meaning for 'morally good' by this intuitive process is that it brings into play, as you say, all the 'oafish tools' in the human brain. And philosophers have historically not paid much attention to the science of how intuitions work. Does that make sense?
0Vladimir_Nesov
That intuition says the same thing as "pleasure-maximization", or that intended meaning can be captured as "pleasure-maximization"? Even if intuition is saying exactly "pleasure-maximization", it's not necessarily the intended meaning, and so it's unclear why one would try to replicate the intuitive tool, rather than search for a characterization of the intended meaning that is better than the intuitive tool. This is the distinction I was complaining about. (This is an isolated point unrelated to the rest of your comment.)
3lukeprog
Understood. I think I'm trying to figure out if there's a better way to talk about this 'intended meaning' (that we don't yet have access to) than to say 'intended meaning' or 'intuitive meaning'. But maybe I'll just have to say 'intended meaning (that we don't yet have access to)'. New paragraph version:
0Vladimir_Nesov
You think this applies to figuring out decision theory for FAI? If not, how is that relevant in this context?
0lukeprog
Vladimir, I've been very clear many times that 'austere metaethics' is for clearing up certain types of confusions, but won't do anything to solve FAI, which is why we need 'empathic metaethics'.
0Vladimir_Nesov
I was discussing that particular comment, not rehashing the intention behind 'austere metaethics'. More specifically, you made a statement "We can't wait around for metaethics and decision theory to be solved." It's not clear to me what purpose is being served by what alternative action to "waiting around for metaethics to be solved". It looks like you were responding to Wei's invitation to justify the use of word "meaning" instead of "guess", but it's not clear how your response relates to that question.
0lukeprog
Like I said over here, I'm using the concept of 'meaning' from linguistics. I'm hoping that fewer people are confused by my use of 'meaning' as employed in the field that studies meaning than if I had used 'meaning' in a more narrow and less standard way, like Wei Dai's. Perhaps I'm wrong about that, but I'm not sure. My comment above about how "we have to act in the world now" gives one reason why, I suspect, the linguist's sense of 'meaning' includes stipulated meaning, and why stipulated meaning is so common. In any case, I think you and Wei Dai have helped me think about how to be more clear to more people by adding such clarifications as this.
0Vladimir_Nesov
(This is similar to my reaction expressed here.)
0Vladimir_Nesov
In those paragraphs, you add intuition as an alternative to stipulated meaning. But this is not what we are talking about; we are talking about some unknown, but normative meaning that can't be presently stipulated, and is referred to partly through intuition in a way that is more accurate than any currently available stipulation. What intuition tells us is as irrelevant as what the various stipulations tell us; what matters is the thing that the imperfect intuition refers to. This idea doesn't require a notion of automated stipulation ("empathic" discussion).
0lukeprog
"some unknown, but normative meaning that can't be presently stipulated" is what I meant by "intuitive meaning" in this case. I've never thought of 'empathic' discussion as 'automated stipulation'. What do you mean by that? Even our stipulated definitions are only promissory notes for meaning. Luckily, stipulated definitions can be quite useful for achieving our goals. Figuring out what we 'really want', or what we 'rationally ought to do' when faced with Newcomb's problem, would also be useful. Such terms are carry even more vague promissory notes for meaning than stipulated definitions, and yet they are worth pursuing.
3Vladimir_Nesov
My understanding of this topic is as follows. Treat intuition as just another stipulated definition, one that happens to be expressed as a pattern of mind activity, as opposed to a sequence of words. The intuition itself doesn't define the thing it refers to; it can be slightly wrong, or very wrong. The same goes for words. Both intuition and the various words we might find are tools for referring to some abstract structure (the intended meaning) that is not accurately captured by any of these tools. The purpose of intuition, and of words, is in capturing this structure accurately, accessing its properties. We can develop better understanding by inventing new words, training new intuitions, etc. None of these tools holds a privileged position with respect to the target structure; some of them just happen to refer to it more carefully.

At the beginning of any investigation, we would typically only have intuitions, which specify the problem that needs solving. They are inaccurate fuzzy lumps of confusion, too. At the same time, any early attempt at finding better tools will be unsuccessful: explicit definitions will fail to capture the intended meaning, even as intuition doesn't capture it precisely. Attempts at guiding intuition to better precision can likewise make it a less accurate tool for accessing the original meaning. On the other hand, when the topic is well understood, we might find an explicit definition that is much better than the original intuition. We might train new intuitions that reflect the new explicit definition, and are much better tools than the original intuition.
0lukeprog
As far as I can tell, I agree with all of this.
5Vladimir_Nesov
And as far as I can tell, you don't agree. You express agreement too much; like your stipulated-meaning thought experiments, this is one of the problems. But I'd probably need a significantly clearer presentation of what feels wrong to make progress on our disagreement.
0lukeprog
I look forward to it. I'm not sure what you mean by "you agree too much", though. Like I said, as far as I can tell I agree with everything in this comment of yours.
1Vladimir_Nesov
I agree with Wei. There is no reason to talk about "highest expected value" specifically, that would be merely a less clear option on the same list as CDT and EDT recommendations. We need to find the correct decision instead, expected value or not. Playing Eliezer-post-ping-pong, you are almost demanding "But what do you mean by truth?". When an idea is unclear, there will be ways of stipulating a precise but even less accurate definition. Thus, you move away from the truth, even as you increase the clarity of discussion and defensibility of the arguments.
0lukeprog
I updated the bit about expected value here. No, I agree there are important things to investigate for which we don't have clear definitions. That's why I keep talking about 'empathic metaethics.' Also, by 'less accurate definition' do you just mean that a stipulated definition can differ from the intuitive definition that we don't have access to? Well of course. But why privilege the intuitive definition by saying a stipulated definition is 'less accurate' than it is? I suspect that intuitive definitions are often much less successful at capturing an empirical cluster than some stipulated definitions. Example: 'planet'.
2Vladimir_Nesov
Not "just". Not every change is an improvement, but every improvement is a change. There can be better definitions of whatever the intuitions are talking about, and they will differ from the intuitive definitions. But when the purpose of discussion is referred by an unclear intuition with no other easy ways to reach it, stipulating a different definition would normally be a change that is not an improvement. It's not easy to find a more successful definition of the same thing. You can't always just say "taboo" and pick the best thought that decades of careful research failed to rule out. Sometimes the intuitive definition is still better, or, more to the point, the precise explicit definition still misses the point.
0Vladimir_Nesov
(They perhaps shouldn't have done that.)
2Vladimir_Nesov
An analogy for "sharing common understanding of morality". In the sound example, even though the arguers talk about different situations in a confusingly ambiguous way, they share a common understanding of what facts hold in reality. If they were additionally ignorant about reality in different ways (even though there would still be the same truth about reality, they just wouldn't have reliable access to it), that would bring the situation closer to what we have with morality.
0lukeprog
Can you elaborate this a bit more? I don't follow.
-7Peterdjones
0Vladimir_Nesov
Even by getting such confused answers out in the open, we might get them to break out of complacency and recognize the presence of confusion. (Fat chance, of course.)
0Vladimir_Nesov
This makes sense. My impression of the part of the sequence written so far would've been significantly affected if I had understood this intention (I don't fully believe it now, but more so than I did before reading your comment).
0lukeprog
What is 'it', here? My intention? If you have doubts that my intention has been (for many months) to first clear away the dust and confusion of mainstream metaethics so that we can focus more clearly on the more important problems of metaethics, you can ask Anna Salamon, because I spoke to her about my intentions for the sequence before I put up the first post in the sequence. I think I spoke to others about my intentions, too, but I can't remember which parts of my intentions I spoke about to which people (besides Anna). There's also this comment from me more than a month ago.
0Vladimir_Nesov
I believe that you believe it, but I'm not sure it's so. There are many reasons for any event. Specifically, you use austere debating in real arguments, which suggests that you place more weight on the method than just as a tool for exposing confusion. (You seem to have reacted emotionally to a question of simple fact, and thus conflated the fact with your position on the fact, which status intuitions love to make people do. I think it's a bad practice.)
0lukeprog
What do you mean by 'austere debating'? Do you just mean tabooing terms and then arguing about facts and anticipations? If so then yes, I do that all the time...
3Wei Dai
I'm not sure if we totally agree, but if there is any disagreement left in this thread, I don't think it's substantial enough to keep discussing at this point. I'd rather that we move on to talking about how you propose to do empathic metaethics. BTW, I'd like to give another example that shows the difficulty of reducing (some usages of) normative language to math/physics. Suppose I'm facing Newcomb's problem, and I say to my friend Bob, "I'm confused. What should I do?" Bob happens to be a causal decision theorist, so he says "You should two-box." It's clear that Bob can not just mean "the arg max of ... is 'two-box'" (where ... is the formula given by CDT), since presumably "you should two-box" is false and "the arg max of ... is 'two-box'" is true. Instead he probably means something like "CDT is the correct decision theory, and the arg max of ... is 'two-box'", but how do we reduce the first part of this sentence to physics/math?
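A toy calculation may make the arg max point vivid. The sketch below uses illustrative payoffs and an assumed 0.99-accurate predictor; it is not anyone's official formalization of CDT or EDT. It shows that "the arg max of the CDT formula is 'two-box'" is a plain mathematical fact, which leaves the further claim "CDT is the correct decision theory" entirely untouched.

```python
# Toy Newcomb calculation: the arg max of each decision-theoretic formula is a
# mathematical fact, separate from the claim that the formula is the right one.
ACTIONS = ["one-box", "two-box"]
ACCURACY = 0.99            # assumed reliability of the predictor
M, K = 1_000_000, 1_000    # opaque-box prize and transparent-box prize

def payoff(action, predicted):
    opaque = M if predicted == "one-box" else 0
    return opaque + (K if action == "two-box" else 0)

def other(action):
    return "two-box" if action == "one-box" else "one-box"

def cdt_value(action, p_full=0.5):
    # CDT treats the prediction as causally fixed: use one unconditional
    # probability that the opaque box is full (0.5 is arbitrary; two-boxing
    # wins for any fixed value).
    return p_full * payoff(action, "one-box") + (1 - p_full) * payoff(action, "two-box")

def edt_value(action):
    # EDT conditions the prediction on the action actually taken.
    return ACCURACY * payoff(action, action) + (1 - ACCURACY) * payoff(action, other(action))

print(max(ACTIONS, key=cdt_value))   # "two-box": the arg max of the CDT formula
print(max(ACTIONS, key=edt_value))   # "one-box": the arg max of the EDT formula
```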
1lukeprog
I'm not saying that reducing to physics/math is easy. Even ought language stipulated to refer to, say, the well-being of conscious creatures is pretty hard to reduce. We just don't have that understanding yet. But it sure seems to be pointing to things that are computed by physics. We just don't know the details. I'm just trying to say that if I'm right about reductionism, and somebody uses ought language in a way that isn't likely to reduce to physics/math, then their ought language isn't likely to refer successfully. We can hold off the rest of the dialogue until after another post or two; I appreciate your help so far. As a result of my dialogue with you, Sawin, and Nesov, I'm going to rewrite the is-ought part of 'Pluralistic Moral Reductionism' for clarity.
0Will_Sawin
Do you accept the conclusion I draw from my version of this argument?
0Wei Dai
I agree with you up to this part: I made the same argument (perhaps not very clearly) at http://lesswrong.com/lw/44i/another_argument_against_eliezers_metaethics/ But I'm confused by the rest of your argument, and don't understand what conclusion you're trying to draw apart from "CEV can't be the definition of morality". For example you say: I don't understand why believing something to be important implies that it has a long definition.
2Will_Sawin
Ah. So this is what I am saying. If you say "I define should as [Eliezer's long list of human values]" then I say: "That's a long definition. How did you pick that definition?" and you say: "Well, I took whatever I thought was morally important, and put it into the definition." In the part you quote I am arguing (or at least claiming) that other responses to my query are wrong. I would then continue: "Using the long definition is obscuring what you really mean when you say 'should'. You really mean 'what's important', not [the long list of things I think are important]. So why not just define it as that?"
3Vladimir_Nesov
One more way to describe this idea. I ask, "What is morality?", and you say, "I don't know, but I use this brain thing here to figure out facts about it; it errs sometimes, but can provide limited guidance. Why do I believe this "brain" is talking about morality? It says it does, and it doesn't know of a better tool for that purpose presently available. By the way, it's reporting that certain things are morally relevant, and is probably right."
0Wei Dai
Where do you get "is probably right" from? I don't think you can get that if you take an outside view and consider how often a human brain is right when it reports on philosophical matters in a similar state of confusion...
0Vladimir_Nesov
Salt to taste; the specific estimate is irrelevant to my point, so long as the brain is seen as collecting at least some moral information, and not defining the whole of morality. The level of certainty in the brain's moral judgment won't be stellar, but it will be more reliable for simpler judgments. Here, I referred to "morally relevant", which is a rather weak matter-of-priority kind of judgment, as opposed to deciding which of the given options are better.
0Will_Sawin
Beautiful. I would draw more attention to the "Why.... ? It says it does" bit, but that seems right.
1Vladimir_Nesov
You'd need the FAI able to change its mind as well, which requires that you retain this option in its epistemology. To attack the communication issue from a different angle, could you give examples of the kinds of facts you deny? (Don't say "god" or "magic", give a concrete example.)
0lukeprog
Yes, we need the FAI to be able to change its mind about physicalism. I don't think I've ever been clear about what people mean to assert when they talk about things that don't reduce to physics/math. Rather, people describe something non-natural or supernatural and I think, "Yeah, that just sounds confused." Specific examples of things I deny because of my physicalism are Moore's non-natural goods and Chalmers' conception of consciousness.
1Peterdjones
Since you can't actually reduce[*] 99.99% of your vocabulary, you're either so confused you couldn't possibly think or communicate... or you're only confused about the nature of confusion. [*] Try reducing "shopping" to quarks, electrons and photons. You can't do it, and if you could, it would tell you nothing useful. Yet nothing is involved that is not made of quarks, electrons and photons.
0Vladimir_Nesov
Not much better than "magic", doesn't help.
0lukeprog
Is this because you're not familiar with Moore on non-natural goods and Chalmers on consciousness, or because you agree with me that those ideas are just confused?
1Vladimir_Nesov
They are not precise enough to carefully examine. I can understand the distinction between a crumbling bridge and 3^^^^3>3^^^3, it's much less clear what kind of thing "Chalmers' view on consciousness" is. I guess I could say that I don't see these things as facts at all unless I understand them, and some things are too confusing to expect understanding them (my superpower is to remain confused by things I haven't properly understood!). (To compare, a lot of trouble with words is incorrectly assuming that they mean the same thing in different contexts, and then trying to answer questions about their meaning. But they might lack a fixed meaning, or any meaning at all. So the first step before trying to figure out whether something is true is understanding what is meant by that something.)
-2Peterdjones
How are you on dark matter? (No new idea is going to be precise, because precise definitions come from established theories, and established theories come from speculative theories, and speculative theories are theories about something that is defined relatively vaguely. The Oxygen theory of combustion was a theory about "how burning works"-- it was not, circularly, the Oxygen theory of Oxidisation).
-1Peterdjones
Dude, you really need to start distinguishing between reducible-in-principle, usefully-reducible, and doesn't-need-reducing.
2Will_Sawin
That's making a pre-existing assumption that everyone speaks in physics language. It's circular. Speaking in physics language about something that isn't in the actual physics is fiction. I'm not sure what magic is. What is physics language? Physics language consists of statements that you can cash out, along with a physical world, to get "true" or "false". What is moral language? Moral language consists of statements that you can cash out, along with a preference order on the set of physical worlds, to get "true" or "false". ETA: If you don't accept this, the first step is accepting that the statement "Flibber fladoo." does not refer to anything in physics, and is not a fiction.
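A toy rendering of these two "languages", with invented types and example statements (not Will_Sawin's own formalism), might look like this:

```python
# Physical-language vs. moral-language statements, as cash-out functions.
from typing import Callable, Dict

World = Dict[str, object]                    # stand-in for a complete physical world
PrefOrder = Callable[[World, World], int]    # returns -1, 0, or 1

# A physical-language statement: cash it out against a world to get True/False.
def sign_is_red(w: World) -> bool:
    return w.get("sign_color") == "red"

# A moral-language statement: cash it out against a world *and* a preference
# order to get True/False.  "You ought to donate" ~ "the donating-world is not
# ranked below the alternative."
def ought_to_donate(w: World, better: PrefOrder) -> bool:
    return better({**w, "you_donate": True}, {**w, "you_donate": False}) >= 0

# Nothing above says which preference order to plug in; that extra ingredient
# is exactly what the physical facts alone do not supply.
```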
0lukeprog
No, of course lots of people use 'ought' terms and other terms without any reduction to physics in mind. All I'm saying is that if I'm right about reductionism, those uses of ought language will fail to refer. Sure, that's one way to use moral language. And your preference order is computed by physics.
6Will_Sawin
That's the way I'm talking about, so you should be able to ignore the other ways in your discussion with me. You are proposing a function MyOrder from {states of the world} to {preference orders}. This gives you a natural function from {statements in moral language} to {statements in physical language}, but this is not a reduction; it's not what those statements mean, because it's not what they're defined to mean.
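The move being described, and why it is said not to be a reduction, can be sketched as a composition. The names (my_order, the "brain_weights" field, the donation example) are invented for illustration:

```python
# MyOrder: {states of the world} -> {preference orders}, and the composite
# physical-language statement it induces.
from typing import Callable, Dict

World = Dict[str, object]
PrefOrder = Callable[[World, World], int]    # -1, 0, or 1

def my_order(w: World) -> PrefOrder:
    """Read a preference order off the physical state of the world
    (e.g. off the brains it contains)."""
    weights = w.get("brain_weights", {"you_donate": 1.0})
    def better(a: World, b: World) -> int:
        score = lambda x: sum(v for k, v in weights.items() if x.get(k))
        return (score(a) > score(b)) - (score(a) < score(b))
    return better

def ought_to_donate(w: World, better: PrefOrder) -> bool:
    return better({**w, "you_donate": True}, {**w, "you_donate": False}) >= 0

# Composing the two yields a statement in purely physical language...
def ought_to_donate_physical(w: World) -> bool:
    return ought_to_donate(w, my_order(w))

# ...but the claim above is that this composite is a different statement from
# the moral one, unless "ought" was stipulated to mean exactly this.
```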
2lukeprog
I think I must be using the term 'reduction' in a broader sense than you are. By reduction I just mean the translation of (in this case) normative language to natural language - cashing things out in terms of lower-level natural statements.
2Will_Sawin
But you can't reduce an arbitrary statement. You can only do so when you have a definition that allows you to reduce it. There are several potential functions from {statements in moral language} to {statements in physical language}. You are proposing that for each meaningful use of moral language, one such function must be correct by definition. I am saying, no, you can just make statements in moral language which do not correspond to any statements in physical language.
0lukeprog
Not what I meant to propose. I don't agree with that. Of course you can. People do it all the time. But if you're a physicalist (by which I mean to include Tegmarkian radical platonists), then those statements fail to successfully refer. That's all I'm saying.
4Will_Sawin
I am standing up for the usefulness and well-definedness of statements that fail to successfully refer.
0lukeprog
Okay, we're getting nearer to understanding each other, thanks. :) Perhaps you could give an example of a non-normative statement that is well-defined and useful even though it fails to refer? Perhaps then I can grok better where you're coming from. Elsewhere, you said: Goodness, no. I'm not arguing that all translations of 'ought' are equally useful as long as they successfully refer! But now you're talking about something different than the is-ought gap. You're talking about a gap between "hypothetical-ought-statements and categorical-ought-statements." Could you describe the gap, please? 'Categorical ought' in particular leaves me with uncertainty about what you mean, because that term is used in a wide variety of ways by philosophers, many of them incoherent. I genuinely appreciate you sticking this out with me. I know it's taking time for us to understand each other, but I expect serious fruit to come of mutual understanding.
0Will_Sawin
I don't think any exist, so I could not do so. I'm saying that the fact that you can use a word to have a meaning in class X does not provide much evidence that the other uses of that word have a meaning in class X. Hypothetical-ought statements are a certain kind of statement about the physical world. They're the kind that contain the word "ought", but they're just an arbitrary subset of the "is"-statements. Categorical-ought statements are statements of support for a preference order (not statements about support). Since no fact can imply a preference order, no is-statement can imply a categorical-ought-statement.
1Vladimir_Nesov
(Physical facts can inform you about what the right preference order is, if you expect that they are related to the moral facts.)
0Will_Sawin
Perhaps the right thing to say is "No fact can alone imply a preference order."
1Vladimir_Nesov
But no fact can alone imply anything (in this sense), it's not a point specific to moral values, and in any case a trivial uninteresting point that is easily confused with a refutation of the statement I noted in the grandparent.
5torekp
No fact alone can imply anything: true and important. For example, a description of my brain at the neuronal level does not imply that I'm awake. To get the implication, we need to add a definition (or at least some rule) of "awake" in neuronal terms. And this definition will not capture the meaning of "awake." We could ask, "given that a brain is in such-and-such a neuronal state, is it awake?" and intuition will tell us that it is an open question. But that is beside the point, if what we want to know is whether the definition succeeds. The definition does not have to capture the meaning of "awake". It only needs to get the reference correct. Reduction doesn't typically involve capturing the meaning of the reduced terms. Is the (meta)ethical case special? If so, why and how?
0Wei Dai
Great question. It seems to me that normative ethics involves reducing the term "moral" without necessarily capturing the meaning, whereas metaethics involves capturing the meaning of the term. And the reason we want to capture the meaning is so that we know what it means to do normative ethics correctly (instead of just doing it by intuition, as we do now). It would also allow an AI to perform normative ethics (i.e., reduce "moral") for us, instead of humans reducing the term and programming a specific normative ethical theory into the AI.
0torekp
I doubt that metaethics can wholly capture the meaning of ethical terms, but I don't see that as a problem. It can still shed light on issues of epistemics, ontology, semantics, etc. And if you want help from an AI, any reduction that gets the reference correct will do, regardless of whether meaning is captured. A reduction need not be a full-blown normative ethical theory. It just needs to imply one, when combined with other truths.
0Vladimir_Nesov
This is not a problem in the same sense as astronomical waste that will occur during the rest of this year is not a problem: it's not possible to do something about it.
0Vladimir_Nesov
(I agree with your comment.) A formal logical definition often won't capture the full meaning of a mathematical structure (there may be non-standard models of the logical theory, and true statements it won't infer), yet it has the special power of allowing you to correctly infer lots of facts about that structure without knowing anything else about the intended meaning. If we are given just a little bit less, then the power to infer stuff gets reduced dramatically. It's important to get a definition of morality in a similar sense and for similar reasons: it won't capture the whole thing, yet it must be good enough to generate right actions even in currently unimaginable contexts.
6Wei Dai
Formal logic does seem very powerful, yet incomplete. Would you be willing to create an AI with such limited understanding of math or morality (assuming we can formalize an understanding of morality on par with math), given that it could well obtain supervisory power over humanity? One might justify it by arguing that it's better than the alternative of trying to achieve and capture fuller understanding, which would involve further delay and risk. See for example Tim Freeman's argument in this line, or my own. Another alternative is to build an upload-based FAI instead, like Stuart Armstrong's recent proposal. That is, use uploads as components in a larger system, with lots of safety checks. In a way Eliezer's FAI ideas can also be seen as heavily upload based, since CEV can be interpreted (as you did before) as uploads with safety checks. (So the question I'm asking can be phrased as, instead of just punting normative ethics to CEV, why not punt all of meta-math, decision theory, meta-ethics, etc., to a CEV-like construct?) Of course you're probably just as unsure of these issues as I am, but I'm curious what your current thoughts are.
2Vladimir_Nesov
Humans are also incomplete in this sense. We already have no way of capturing the whole problem statement. The goal is to capture it as well as possible using some reflective trick of looking at our own brains or behavior, which is probably way better than what an upload singleton that doesn't build a FAI is capable of. If there are uploads, they could be handed the task of solving the problem of FAI in the same sense in which we try to, but this doesn't get us any closer to the solution. There should probably be a charity dedicated to designing upload-based singletons as a kind of high-impact applied normative ethics effort (and SIAI might want to spawn one, since rational thinking about morality is important for this task; we don't want fatalistic acceptance of a possible Malthusian dystopia or unchecked moral drift), but this is not the same problem as FAI.
1Wei Dai
Humans are at least capable of making some philosophical progress, and until we solve meta-philosophy, no de novo AI is. Assuming that we don't solve meta-philosophy first, any de novo AIs we build will be more incomplete than humans. Do you agree? It gets closer to the solution in the sense that there is no longer a time pressure, since it's easier for an upload-singleton to ensure their own value stability, and they don't have to worry about people building uFAIs and other existential risks while they work on FAI. They can afford to try harder to get to the right solution than we can.
1Vladimir_Nesov
There is a time pressure from existential risk (also, astronomical waste). Just as in FAI vs. AGI race, we would have a race between FAI-building and AGI-building uploads (in the sense of "who runs first", but also literally while restricted by speed and costs). And fast-running uploads pose other risks as well, for example they could form an unfriendly singleton without even solving AGI, or build runaway nanotech. (Planning to make sure that we run a prepared upload FAI team before a singleton of any other nature can prevent it is an important contingency, someone should get on that in the coming decades, and better metaethical theory and rationality education can help in that task.)
4Wei Dai
I should have made myself clearer. What I meant was assuming that an organization interested in building FAI can first achieve an upload-singleton, it won't be facing competition from other uploads (since that's what "singleton" means). It will be facing significantly less time pressure than a similar organization trying to build FAI directly. (Delay will still cause astronomical waste due to physical resources falling away into event horizons and the like, but that seems negligible compared to the existential risks that we face now.)
2Vladimir_Nesov
But this assumption is rather unlikely/difficult to implement, so in the situation where we count on it, we've already lost a large portion of the future. Also, this course of action (unlikely to succeed as it is in any case) significantly benefits from massive funding to buy computational resources, which is a race. The other alternative, which is educating people in a way that increases the chances of a positive upload-driven outcome, is also a race, for development of better understanding of metaethics/rationality and for educating more people better.
0Vladimir_Nesov
Philosophical progress is just a special kind of physical action that we can perform, valuable for abstract reasons that feed into what constitutes our values. I don't see how this feature is fundamentally different from pointing to any other complicated aspect of human values and saying that AI must be able to make that distinction or destroy all value with its mining claws. Of course it must.
0Will_Sawin
Agreed, however, it is somewhat useful in pointing out a specific, common, type of bad argument.
0lukeprog
Okay, so you think that the only class of statements that are well-defined and useful but fail to refer is the class of normative statements? Why are they special in this regard? Agreed. What do you mean by this? Do you mean that a categorical-ought statement is a statement of support as in "I support preference-ordering X", as opposed to a statement about support as in "preference-ordering X is 'good' if 'good' is defined as 'maximizes Y'"? What do you mean by 'preference order' such that no fact can imply a preference order? I'm thinking of a preference order as a brain state, including parts of the preference ordering that are extrapolated from that brain state. Surely physical facts about that brain state and extrapolations from it imply (or entail, or whatever) the preference order...
0Will_Sawin
Because a positive ("is") statement + a normative ("ought") statement is enough information to determine an action, and once actions are determined you don't need further information. "Information" may not be the right word. I believe "I ought to do X" if and only if I support preference-ordering X. I'm thinking of a preference order as just that: a map from the set of {states of the world} x {states of the world} to the set {>, =, <}. The brain state encodes a preference order but it does not constitute a preference order. I believe "this preference order is correct" if and only if there is an encoding in my brain of this preference order. Much like how: I believe "this fact is true" if and only if there is an encoding in my brain of this fact.
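The encode-versus-constitute distinction can be illustrated with a toy decoder; the byte strings standing in for brain states, and the decoding convention, are invented:

```python
# A byte string (standing in for a brain state) *encodes* a preference order
# under some interpretation, but the order itself is just the abstract map
# {states} x {states} -> {>, =, <}.
def decode(encoding: bytes):
    """Interpret b"a>b>c" as: earlier items are preferred to later ones."""
    ranking = encoding.decode().split(">")
    def compare(a: str, b: str) -> str:
        if a == b:
            return "="
        return ">" if ranking.index(a) < ranking.index(b) else "<"
    return compare

brain_state = b"peace>truce>war"        # one possible physical encoding
backup_copy = bytearray(brain_state)    # a physically distinct object
assert decode(brain_state)("peace", "war") == ">"
assert decode(bytes(backup_copy))("truce", "war") == ">"
# Two distinct physical objects, one and the same abstract preference order.
```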
0lukeprog
I've continued our dialogue here.
0Vladimir_Nesov
What if it's encoded outside your brain, in a calculator for example, while your brain only knows that the calculator shows the indication "28" on its display iff the fact is true? Or, say, I know that my computer contains a copy of "Understand" by Ted Chiang, even though I don't remember its complete text. Finally, some parts of my brain don't know what other parts of my brain know. The brain doesn't hold a privileged position with respect to where the data must be encoded to be referred to; it can as easily point elsewhere.
0Will_Sawin
Well if I see the screen then there's an encoding of "28" in my brain. Not of the reason why 28 is true, but at least that the answer is "28". You believe that "the computer contains a copy of Understand", not "the computer contains a book with the following text: [text of Understand]". Obviously, on the level of detail in which the notion of "belief" starts breaking down, the notion of "belief" starts breaking down. But still, it remains: when we say that I know a fact, the statement of that fact is encoded in my brain. Not the referent, not an argument for that statement, just: a statement.
0Vladimir_Nesov
Yet you might not know the question. "28" only certifies that the question makes a true statement. Exactly. You don't know [text of Understand], yet you can reason about it, and use it in your designs. You can copy it elsewhere, and you'll know that it's the same thing somewhere else, all without having an explicit or any definition of the text, only diverse intuitions describing its various aspects and tools for performing operations on it. You can get an md5 sum of the text, for example, and make a decision depending on its value, and you can rely on the fact that this is an md5 sum of exactly the text of "Understand" and nothing else, even though you don't know what the text of "Understand" is. This sort of deep wisdom needs to be the enemy (it strikes me often enough). It acts as a curiosity-stopper, covering the difficulty in understanding things more accurately. (What's "just a statement"?)
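For concreteness, the md5 point looks like this in code (hashlib is in Python's standard library; the file paths are hypothetical):

```python
# Branching on a digest of a text you have never read: your state only encodes
# facts *about* the text, not the text itself.
import hashlib

def md5_of_file(path: str) -> str:
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

if md5_of_file("understand.txt") == md5_of_file("backup/understand.txt"):
    print("same text in both places, whatever that text says")
```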
0Will_Sawin
In certain AI designs, this problem is trivial. In humans, this problem is not simple. The complexities of the human version of this problem do not have relevance to anything in this overarching discussion (that I am aware of).
-2Peterdjones
So you say. Many would say that you need the argument (proof, justification, evidence) for a true belief for it to qualify as knowledge.
0Will_Sawin
Obviously, this doesn't prevent me from saying that I know something without an argument.
-2Peterdjones
You can say that you are the Queen of Sheba. It remains the case that knowledge is not lucky guessing, so an argument, evidence or some other justification is required.
0Will_Sawin
Yes, but this is completely and totally irrelevant to the point I was making, that: I will profess that a statement, X, is true, if and only if "X" is encoded in a certain manner in my brain. Yet "X is true" does not mean "X is encoded in this manner in my brain."
0lukeprog
Been really busy, will respond to this in about a week. I want to read your earlier discussion post, first, too.
0Vladimir_Nesov
Encodings are relative to interpretations. Something has to decide that a particular fact encodes particular other fact. And brains don't have a fundamental role here, even if they might contain most of the available moral information, if you know how to get it. The way in which decisions are judged to be right or wrong based on moral facts and facts about the world, where both are partly inferred with use of empirical observations, doesn't fundamentally distinguish the moral facts from the facts about the world, so it's unclear how to draw a natural boundary that excludes non-moral facts without excluding moral facts also.
0Will_Sawin
My ideas work unless it's impossible to draw the other kind of boundary, including only facts about the world and not moral facts. Is it? If it's impossible, why?
0Vladimir_Nesov
It's the same boundary, just the other side. If you can learn of moral facts by observing things, if your knowledge refers to a joint description of moral and physical facts, with the state of your brain, say, as the physical counterpart, and so your understanding of moral facts benefits from better knowledge and further observation of physical facts, then you shouldn't draw this boundary.
0Will_Sawin
There is an asymmetry. We can only make physical observations, not moral observations. This means that every state of knowledge about moral and physical facts maps to a state of knowledge about just physical facts, and the evolution of the second is determined only by evidence, with no reference to moral facts.
0Vladimir_Nesov
To the extent we haven't defined what "moral observations" are exactly, so that the possibility isn't ruled out in a clear sense, I'd say that we can make moral observations, in the same sense in which we can make arithmetical observations by looking at a calculator display or consulting one's own understanding of mathematical facts maintained by the brain.
0Will_Sawin
That is, by deducing mathematical facts from new physical facts. Can you deduce physical facts from new moral facts?
0Vladimir_Nesov
Not necessarily, you can just use physical equipment without having any understanding of how it operates or what it is, and the only facts you reason about are non-physical (even though you interact with physical facts, without reasoning about them). Why not?
0Will_Sawin
Because your only sources of new facts are your senses.
0Peterdjones
You can't infer new (to you) facts from information you already have? You can't just be told things? A Martian, being told that premarital sex became less of an issue after the sixties, might be able to deduce the physical fact that contraceptive technology was improved in the sixties.
2Will_Sawin
I guess you could but you couldn't be a perfect Bayesian. Generally, when one is told something, one becomes aware of this from one's senses, and then infers things from the physical fact that one is told. I'm definitely not saying this right. The larger point I'm trying to make is that it makes sense to consider an agent's physical beliefs and ignore their moral beliefs. That is a well-defined thing to do.
-2Peterdjones
Where does it say that? One needs good information, but the senses can err, and hearsay can be reliable. The senses are of course involved in acquiring second-hand information, but there is still a categorical difference between showing and telling. In order to achieve what?
2Will_Sawin
Simplicity, maybe?
-2Peterdjones
A simple way of doing what?
2Will_Sawin
Answering questions like "What are true beliefs? What is knowledge? How does science work?'
-2Peterdjones
How can you answer questions about true moral beliefs whilst ignoring moral beliefs?
0Will_Sawin
Well, that's one of the things you can't do whilst ignoring moral beliefs.
0wedrifid
All the same comprehension of the state of the world, including how beliefs about "true morals" remain accessible. They are simply considered to be physical facts about the construction of certain agents.
0Peterdjones
That's an answer to the question "how do you deduce moral beliefs from physical facts", not the question at hand: "how do you deduce moral beliefs from physical beliefs".
-4wedrifid
Physical beliefs are constructed from physical facts. Just like everything else!
0Peterdjones
But the context of the discussion was what can be inferred from physical beliefs.
0Vladimir_Nesov
Also your thoughts, your reasoning, which is machinery for perceiving abstract facts, including moral facts.
2Will_Sawin
How might one deduce new physical facts from new moral facts produced by abstract reasoning?
0Vladimir_Nesov
You can predict that (physical) human babies won't be eaten too often. Or that a calculator will have a physical configuration displaying something that you inferred abstractly.
2Will_Sawin
You can make those arguments in an entirely physical fashion. You don't need the morality. You do need the mathematical abstraction to bundle and unbundle physical facts.
0Vladimir_Nesov
You can use calculators without knowing abstract math too, but it makes sense to talk of mathematical facts independent of calculators.
2Will_Sawin
But it also makes sense to talk about calculators without abstract math. That's all I'm saying.
0Vladimir_Nesov
I agree. But it's probably not all that you're saying, since this possibility doesn't reveal problems with inferring physical facts from moral facts.
0Will_Sawin
There is a mapping from physical+moral belief structures to just-physical belief structures. Correct physical-moral deductions map to correct physical deductions. The end physical beliefs are purely explained by the beginning physical beliefs + new physical observations.
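A schematic sketch of that mapping, with invented field names and a dictionary merge standing in for a real Bayesian update:

```python
# Project away the moral beliefs; the remaining physical beliefs evolve by
# physical observations alone.
from dataclasses import dataclass, field

@dataclass
class Beliefs:
    physical: dict = field(default_factory=dict)   # e.g. {"sign_color": "red"}
    moral: dict = field(default_factory=dict)      # e.g. {"murder_wrong": True}

def project(b: Beliefs) -> dict:
    """Map a physical+moral belief structure to a just-physical one."""
    return dict(b.physical)

def update_on_observation(b: Beliefs, obs: dict) -> Beliefs:
    # Observations are physical, so only the physical part is forced to change.
    return Beliefs(physical={**b.physical, **obs}, moral=dict(b.moral))

# The projection commutes with updating: project-then-update on the same
# observations agrees with update-then-project.
b0 = Beliefs(physical={"sky": "blue"}, moral={"murder_wrong": True})
obs = {"sign_color": "red"}
assert project(update_on_observation(b0, obs)) == {**project(b0), **obs}
```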
0Peterdjones
Meaning what? Are you saying you can get oughts from ises?
2Will_Sawin
No, I'm saying you can distinguish oughts from ises. I am saying that you can move from is to is to is and never touch upon oughts. That you can solve all is-problems while ignoring oughts.
1Vladimir_Nesov
Logic can be used to talk about non-physical facts. Do you allow referring to logic even where the logic is talking about non-physical facts, or do you only allow referring to the logic that is talking about physical facts? Or maybe you taboo intended interpretation, however non-physical, but still allow the symbolic game itself to be morally relevant?
0lukeprog
Alas, I think this is getting us into the problem of universals. :) With you, too, Vladimir, I suspect our anticipations do not differ, but our language for talking about these subtle things is slightly different, and thus it takes a bit of work for us to understand each other. By "logic referring to non-physical facts", do you have in mind something like "20+7=27"?
3Vladimir_Nesov
"3^^^^3 > 3^^^3", properties of higher cardinals, hyperreal numbers, facts about a GoL world, about universes with various oracles we don't have. Things for which you can't build a trivial analogy out of physical objects, like a pile of 27 rocks (which are not themselves simple, but this is not easy to appreciate in the context of this comparison).
0lukeprog
Certainly, one could reduce normative language into purely logical-mathematical facts, if that was how one was using normative language. But I haven't heard of people doing this. Have you? Would a reduction of 'ought' into purely mathematical statements ever connect up again to physics in a possible world? If so, could you give an example - even a silly one? Since it's hard to convey tone through text, let me explicitly state that my tone is a genuinely curious and collaboratively truth-seeking one. I suspect you've done more and better thinking on metaethics than I have, so I'm trying to gain what contributions from you I can.
2Vladimir_Nesov
Why do you talk of "language" so much? Suppose we didn't have language (and there was only ever a single person), I don't think the problem changes. Say, I would like to minimize ((X-2)*(X-2)+3)^^^3, where X is the number I'm going to observe on the screen. This is a pretty self-contained specification, and yet it refers to the world. The "logical" side of this can be regarded as a recipe, a symbolic representation of your goals. It also talks about a number that is too big to fit into the physical world.
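Reading "^^^" as Knuth's up-arrow notation (an assumption about the intended reading), the goal can be written as

$$\min_{X}\;\bigl((X-2)^2 + 3\bigr)\uparrow\uparrow\uparrow 3,$$

which is attained by bringing about the observation \(X = 2\), for a value of \(3\uparrow\uparrow\uparrow 3\) - a number too large to fit into the physical world, even though the quantity minimized depends on a physically observed number.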
0lukeprog
Okay, sure. We agree about this, then.
0Vladimir_Nesov
This would require that we both have positions that accurately reflect reality, or are somehow synchronously deluded. This is a confusing territory; I know that I don't know enough to be anywhere near confident in my position, and even that position is too vague to be worth systematically communicating, or to describe some important phenomena (I'm working on that). I appreciate the difficulty of communication, but I don't believe that we would magically meet at the end without having to change our ideas in nontrivial ways.
0lukeprog
I just mean that our anticipations do not differ in a very local sense. As an example, imagine that we were using 'sound' in different ways like Albert and Barry. Surely Albert and Barry have different anticipations in many ways, but not with respect to the specific events closely related to the tree falling in a forest when nobody is around.
-2Peterdjones
Or maybe things that just don't usefully reduce.
0[anonymous]
I'd be very grateful if you could take a look at my recent question and the comments. Your statement is interesting to me. What is a counter-argument to the claim that the only way that one could claim that '"X is true" means something' is to unpack the statement "X is true" all the way down to amplitudes over configurations (perhaps in a subspace of configuration space that highly factorizes over 'statistically common arrangements of particles in human brains correlating to mathematical conclusions' or something)?

Where do the intuition-sympathizers stand on the issue of logical names? I don't think something like 'ought' can intuitively point to something that has ontological ramifications. If there is any "intuition" to it, why is it unsatisfactory to think it's merely an evolutionary effect?

From the original post above, I find a point of contention: 'I ought to do X' does correspond to something that exists... namely, some distribution over configurations of human minds. It's a proposition like any other, like 'that sign is red' for example. You can track down a fully empirical and quantifiable descriptor of 'I ought to do X' with some sufficiently accurate model and measuring devices with sufficient precision. States of knowledge about what one 'ought' to do are states of knowledge like any others. When tracking down the physics of 'ought', it will be fleshed out with some nuanced, perhaps situationally specific, definition that relates it to other existing entities.

I guess more succinctly, there is no abstract concept of 'ought'. The label 'ought' just refers to an algorithm A, an outcome desired from that algorithm O, an input space of things the algorithm can operate on, X, and an assessment of the probability that the outcome happens under the algorithm, P(A(X) = O). Up to the limit of sensory fidelity, this is all in principle experimentally detectable, no?
0Will_Sawin
I don't believe in an ontology of morals, only an epistemology of them. Do you think that "The sign is red" means something different from "I believe the sign is red"? (In the technical sense of believe, not the pop sense.) Do you think that "Murder is wrong" means something different from "I believe that murder is wrong."?
-2[anonymous]
The verb believe goes without saying when making claims about the world. To assert that 'the sign is red' is true would not make sense if I did not believe it, by definition. I would either be lying or unaware of my own mental state. To me, your question borders more on opinions and their consequences. Quoting from there: "But your beliefs are not about you; beliefs are about the world. Your beliefs should be your best available estimate of the way things are; anything else is a lie." What I'm trying to say is that the statement (Murder is wrong) implies the further slight linguistic variant (I believe murder is wrong) (modulo the possibility that someone is lying or mentally ill, etc.) The question then is whether (I believe murder is wrong) -> (murder is wrong). Ultimately, from the perspective of the person making these claims, the answer is 'yes'. It makes no sense for me to feel that my preferences are not universally and unequivocally true. I don't find this at odds with a situation where a notorious murderer who is caught, say Hannibal Lecter, can simultaneously choose his actions and say "murder is wrong". Maybe the person is mentally insane. But even if they aren't, they could simply choose a preference ordering such that the local wrongness of failing to gratify their desire to murder is worse than the local wrongness of murder itself in their society. Thus, they can see that to people who don't have the same preference for murdering someone for self-gratification, the computation of beliefs works out that (murder is wrong) is generally true, but not true when you substitute their local situations into their personal formula for computing the belief. In this case it just becomes an argument over words because the murderer is tacitly substituting his personal local definitions for things when making choices, but then using more general definitions when making statements of beliefs. In essence, the murderer believes it is not wrong for him to murder and g
2Will_Sawin
The difference is here:

Alice: "I bet you $500 that the sign is red"
Bob: "OK"
Later, they find out it's blue.
Bob: "Pay up!"

Alice: "I bet you $500 that I believe the sign is red"
Bob: "OK"
Later, they find out it's blue.
Alice: "But I thought it was red! Pay up!"

That's the difference between "X" and "I believe X". We say them in the same situation, but they mean different things. The way statements like "murder is wrong" communicate facts about preference orders is pretty ambiguous. But suppose someone says that "Murder is wrong, and this is more important than gratifying my desire, possible positive consequences of murder, and so on" and then murders, without changing their mind. Would they therefore be insane? If yes, you agree with me. "Correct" is at issue, not "true". Why? Why do you say this? Does "I believe the sky is green" imply "the sky is green"? Sure, you believe that, when you believe X, X is probably true, but that's a belief, not a logical implication. I am suggesting a similar thing for morality. People believe that "(I believe murder is wrong) => (murder is wrong)" and that belief is not reducible to physics. Assertions aren't about the state of your mind! At least some of them are about the world - that thing, over there.
1[anonymous]
I don't understand this. If Alice bet Bob that she believed that the sign was red, then going and looking at the sign would in no way settle the bet. They would have to go look at her brain to settle that bet, because the claim, "I believe the sign is red" is a statement about the physics of Alice's brain. I want to think more about this and come up with a more coherent reply to the other points. I'm very intrigued. Also, I think that I accidentally hit the 'report' button when trying to reply. Please disregard any communication you might get about that. I'll take care of it if anyone happens to follow up.
0Will_Sawin
You are correct in your first paragraph, I oversimplified.
-2[anonymous]
I think this addresses this topic very well. The first-person experience of belief is one and the same as fact-assertion. 'I ought to do X' refers to a 4-tuple of actions, outcomes, utility function, and conditional probability function.

W.r.t. your question about whether a murderer who, prior to and immediately after committing murder, attests to believing that murder is wrong, I would say it is a mistaken question to bring their sanity into it. You can't decide that question without debating what is meant by 'sane'. How a person's preference ordering and resulting actions look from the outside does not necessarily reveal that the person failed to behave rationally, according to their utility function, on the inside. If I choose to label them as 'insane' for seeming to violate their own belief, this is just a verbal distinction about how I will label such third-person viewings of that occurrence. Really though, their preference ordering might have been temporarily suspended due to clouded judgment from rage or emotion. Or, they might not be telling the full truth about their preference ordering and may not even be aware of some aspects of it.

The point is that beliefs are always statements of physics. If I say, "murder is wrong", I am referring to some quantified subset of states of matter and their consequences. If I say, "I believe murder is wrong", I am telling you that I assert that "murder is wrong" is true, which is a statement about my brain's chemistry.
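One way to cash out that 4-tuple is as an expected-utility calculation; the numbers and action names below are invented for illustration, and this is only one possible reading, not necessarily the commenter's exact intent:

```python
# "I ought to do X" unpacked against a 4-tuple: actions, outcomes,
# a utility function, and conditional probabilities P(outcome | action).
from typing import Dict

actions = ["keep_promise", "break_promise"]
outcomes = ["trust_kept", "trust_lost"]

p = {  # P(outcome | action) -- illustrative numbers
    ("keep_promise", "trust_kept"): 0.9, ("keep_promise", "trust_lost"): 0.1,
    ("break_promise", "trust_kept"): 0.2, ("break_promise", "trust_lost"): 0.8,
}
utility: Dict[str, float] = {"trust_kept": 10.0, "trust_lost": -5.0}

def expected_utility(action: str) -> float:
    return sum(p[(action, o)] * utility[o] for o in outcomes)

# On this reading, "I ought to do X" says that X maximizes a physically
# computable quantity.
ought = max(actions, key=expected_utility)   # -> "keep_promise"
```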
2Will_Sawin
Everyone keeps saying that, but they never give convincing arguments for it.
1lukeprog
I also disagree with this.
-1[anonymous]
Pardon me, but I believe the burden of proof here is for you to supply something non-physical that's being specified and then produce evidence that this is the case. If the thing you're talking about is supposed to be outside of a magisterium of evidence, then I fail to see how your claim is any different than that we are zombies. At a coarse scale, we're both asking about the evidence that we observe, which is the first-person experience of assertions about beliefs. Over models that can explain this phenomenon, I am attempting to select the one with minimum message length, as a computer program for producing the experience of beliefs out of physical material can have some non-zero probability attached to it through evidence. How are we to assign probability to the explanation that beliefs do not point to things that physically exist? Is that claim falsifiable? Are there experiments we can do which depend on the result? If not, then the burden of proof here is squarely on you to present a convincing case why the same-old same-old punting to complicated physics is not good enough. If it's not good enough for you, and you insist on going further, that's fine. But physics is good enough for me here and that's not a cop out or an unjustified conclusion in the slightest.
2Will_Sawin
Suppose I say "X is red". That indicates something physical - it indicates that I believe X is red but it means something different, and also physical - it means that X is red Now suppose I say "X is wrong" That indicates something physical - it indicates that I believe X is wrong using the same-old, same-old principle, we include that it means something different. but there is nothing else physical that we could plausibly say it means.
-3[anonymous]
Why do you say this? Flesh out the definition of 'wrong' and you're done. 'Wrong' refers to arrangements of matter and their consequences. It doesn't attempt to refer to intrinsic properties of objects that exist apart from their physicality. If (cognitive object X) is (attribute Y) this just means that (arrangements of matter that correspond to what I give the label X) have (physical properties that I group together into the heading Y). It doesn't matter if you're saying "freedom is good" or "murder is wrong" or "that sign is red". 'Freedom' refers to arrangements of matter and physical laws governing them. 'Good' refers to local physical descriptions of the ways that things can yield fortunate outcomes, where fortunate outcomes can be further chased down in its physical meaning, etc. "X is wrong" unpacks to statements about the time evolution of physical systems. You can't simply say Have you gone and checked every possible physical thing? Have you done experiments showing that making correspondences between cognitive objects and physical arrangements of matter somehow "fails" to capture its "meaning"? This seems to me to be one of those times where you need to ask yourself: is it really the case that cognitive objects are not just linguistic devices for labeling arrangements of matter and laws governing the matter......... or do I just think that's the case?
2Will_Sawin
Your whole argument rests on this, since you have not provided a counterexample to my claim. You've just repeated the fact that there is some physical referent, over and over. This is not how burden of proof works! It would be simply impossible for me to check every possible physical thing. Is it, therefore, impossible for you to be convinced that I am right? I expect better from lesswrong posters.
0[anonymous]
This is what it means for a claim to fail falsifiability. It's easy to generate claims whose proof would only be constituted by fact-checking against every physical thing. This is a far cry from a decision-theoretic claim where, though we can't have perfect evidence, we can make useful quantifications of the evidence we do have and our uncertainty about it. The empty set has many interesting properties. It's impossible to quantify your claim without having all of the evidence up front. What I'm trying to say is that I can test the hypothesis of whether or not there is a physical referent. If someone says to me, "Is there or isn't there a physical referent?" and I have to respond, then I have to do so on the strength of evidence alone. I may not be able to provide a referent explicitly, but I know that non-zero probability can be assigned to a physical system in which cognitive objects are placeholders for complicated sets of matter and governing laws of physics. I cannot make the same claim about the hypothesis that cognitive objects do not have utterly physical referents, and therefore, whether or not I have explicit examples of referents, the hypothesis that there must be underlying physical referents wins hands down. The criticism you're making of me, that I insist there are referents without supplying the actual referents, is physically backwards in this case. For example, someone might say "consciousness is a process that does not correspond to any physically existing thing." If I then reply, "But consciousness is a property of material and varies directly with changes in that material (or some similar, more detailed argument about cognition), and therefore, I can assign non-zero probability to its being a physical computation, and since I do not have the capacity to assign probabilities to non-physical entities, the hypothesis that consciousness is physical wins." this is a convincing argument, up to the quantification of the evidence. If you personally
0Will_Sawin
That's the good kind of claim, the falsifiable kind, like the Law of Universal Gravitation. That's the kind of claim I'm making. Your argument seems to depend on the idea that the only way to evaluate a claim is to list the physical universes in which it is true and the physical universes in which it is not true. This, obviously, is circular. Do you acknowledge that your reasoning is circular and defend it, presumably with Eliezer's defense of circular reasoning? Or do you claim that it is not circular? Sure you can. You take a world, find all the cognitive objects in it, then find all the corresponding physical referents, cross those objects off the list. I am saying that there are beliefs (strings of symbols with meaning) endowed meaning by their place in a functional mind but for which the set of physical referents they correspond to is the empty set. Surely you can admit the existence of strings of symbols without physical referents, like this one: "fj4892fjsoidfj390ds;j9d3". There's nothing non-physical about it. If "X" is quantitative and experimentally relevant, how could "not-X" be irrelevant? If X makes predictions, how could not-X not make the opposite predictions? Who said that all beliefs have referents?
0[anonymous]
My claim is that if one had really done this, then by definition of "find", they have the physical referents for the cognitive objects. If a cognitive object has the empty set as the set of physical referents, then it is the null cognitive object. The string of symbols "fj4892fjsoidfj390ds;j9d3" might have no meaning to you when thinking in English, say, but then it just means it is an instantiation of the empty cognitive object, any string of symbols failing to point to a physical referent. I'm trying to say that if the cognitive object is to be considered as pointing to something, that is, it is in some sense not the null cognitive object, then the thing which is its referent is physical. It's incoherent to say that a string of symbols refers to something that's not physical. What do you mean by 'refer' in that setting? There is no existing thing to be referred to, hence the symbol does no action of referring. So when someone speaks about "X" being right or wrong, either they are speaking about physical events or else "X" fails to be a cognitive object. I claim that my reasoning is not circular. It depends on what you mean by 'evaluate'. What I'm taking for that definition right now is that if I want to assess whether proposition P is true, then I can only do so in a setting of decision theory and degrees of evidence and uncertainty. This means that I need a model for the action P and a way of corresponding probabilities to the various hypotheses about P. In this case, P = "Some cognitive objects have referents that are not the null referent and are also not physical". I claim that all referents are either physical or else they are the null referent. The set of non-physical referents is empty. Just because a string fails to have a physical referent does not mean that it succeeds in having a non-physical one. What evidence do I have that there exist non-physical referents. What model of cognitive objects exists with which it is possible to achieve experimental
0Will_Sawin
This is the heart of the matter. You are saying that the only relevant properties of a cognitive object are its referents. Thus, no referents = no relevant properties = null object. I say that, on the contrary, a cognitive object has other relevant properties. One such property is its place in the code of a brain.

Imagine an AI with a set of objects that it marks as either "true" or "false". These objects have no referents, but they influence the formation of objects with referents. I think it's fair to say that:

  • Such an AI could exist, with the objects having referents/no referents as appropriate.
  • These objects are not just the null object.
  • The AI is thinking irrationally.

Now imagine an AI with a set of objects that it marks as either "true" or "false". These objects have no referents, but they influence the AI's choices. I think it's fair to say that:

  • Such an AI could exist, with the objects having referents/no referents as appropriate.
  • These objects are not just the null object.
  • The AI could be thinking rationally.
0[anonymous]
I think that the meaning of "The AI could be thinking rationally" is that it could turn out to be the case that the objects labeled true and false have a correspondence to physically existing things and that correspondence allows the A.I. to construct decision rules which correspond to reality within some computable range of uncertainty. If we are unable to map the inputs to the A.I.'s decision process (in this case objects labeled true or false and whose referents, if any, are unknown to us at the start) back to physical reality, then it is still mysterious to us and we can't claim that it's rational in any sense other than pure statistical experience (it could just be that when asked to make a series of decisions using the true/false labeled objects, the A.I. got incredibly lucky). In order to conclude (in any more than a superficial way) that the A.I. is rational, there must be an explicit correspondence between the labeled objects and the physical world, and hence we would have found their referents. If this is, in principle, an impossible task (as you claim), then the concept of rationality doesn't apply to the A.I. In what sense would it be said to actually be rational, rather than just produce a sequence of outputs that appear to be rational to us for mysterious reasons?
0Will_Sawin
I think that the meaning of "The AI could be thinking rationally" is that it could turn out to be the case that the objects labeled true and false are integral components of a program that provably calculates rational decisions every time.
0[anonymous]
A proof that a program calculates rational decisions every time necessarily provides the physical referents of its calculation. There's no difference between knowing that a program calculates rational decisions every time and knowing how it is that it calculates rational decisions every time. If you don't know the explicit correspondence between its calculations and reality then your state of knowledge cannot include the fact that the program always yields rational conclusions. You can have degrees of certainty that it is rational without having full knowledge of its referents, but not factual knowledge as in a mathematical proof. It may be that a slick mathematical argument reduces the connection to symbols that don't readily convey the physical connection, but if your state of knowledge (brain chemistry) is updated to include special knowledge of the rationality of an agent, then there is entanglement between you and that agent, for that is what knowledge is. You can't know that an agent is rational without knowing the physical connection between its cognitive objects and reality. To whatever degree you lack knowledge about the physical referents of its cognitive objects, that is the degree to which you lack knowledge about whether or not it is rational.
0asr
Hm? It's easy to form beliefs about things that aren't physical. Suppose I tell you that the infinite cardinal aleph-1 is strictly larger than aleph-0. What's the physical referent of the claim? I'm not making a claim about the messy physical neural structures in my head that correspond to those sets -- I'm making a claim about the nonphysical infinite sets. Likewise, I can make all sorts of claims about fictional characters. Those aren't claims about the physical book, they're claims about its nonphysical implications.
0[anonymous]
Why do you think that nonphysical implications are ontologically existing things? I argue that what you're trying to get at by saying "nonphysical implications" are actual quantified subsets of matter. Ideas, however abstract, are referring to arrangements of matter. The vision in your mind when you talk about aleph-1 is of a physically existing thing. When's the last time you imagined something that wasn't physical? A unicorn? You mean a horse with a horn glued onto it? Mathematical objects represent states of knowledge, which are as physical as anything else. The color red refers to a particular frequency of light and the physical processes by which it is a common human experience. There is no idea of what red is apart from this. Red is something different to a blind man than it is to you, but by speaking about your physical referent, the blind man can construct his own useful physical referent. Claims about fictional characters are no better. What do you mean by Bugs Bunny other than some arrangement of colors brought to your eyes by watching TV in the past? That's what Bugs Bunny is. There's no separately existing entity which is Bugs Bunny that can be spoken about as if it ontologically was. Every person who refers to Bugs Bunny refers to physical subsets of matter from their experience, whether that's because they witnessed the cartoon and were told through supervised learning what cognitive object to attach it to or they heard about it later through second-hand experience. A blind person can have a physical referent when speaking about Bugs Bunny, albeit one that I have a very hard time mentally simulating. In any case, merely asserting that something fails to have a physical referent is not a convincing reason to believe so. Ask yourself why you think there is no physical referent and whether one could construct a computational system that behaves that way.
3Alicorn
No.
0asr
I have no very firm ontological beliefs. I don't want to make any claim about whether fictional characters or mathematical abstractions "really exist". I do claim that I can talk about abstractions without there being any set of physical referents for that abstraction. I think it's utterly routine to write software that manipulates things without physical referents. A type-checker, for instance, isn't making claims about the contents of memory; it's making higher-order claims about how those values will be used across all possible program executions -- including ones that can't physically happen. I would cheerfully agree with you that the cognitive process (or program execution) is carried out by physical processes. Of course. But the subject of that process isn't the mechanism. There's nothing very strange about this, as far as I can tell. It's routine for programs and programmers to talk about "infinite lists"; obviously there is no such thing in the physical world, but it is a very useful abstraction. By the way, I think your Bugs Bunny example fails. When I talk to somebody about Bugs Bunny, I am able to make myself understood. The other person and I are able to talk, in every sense that matters, about the same thing. But we don't share the same mental states. Conversely, my mental picture isn't isomorphic to any particular set of photons; it's a composite. Somehow, that doesn't defeat practical communication. The case might be clearer for purely literary characters. When I talk about the character King Lear, I certainly am not saying something about the physical copy I read! Consider the perfectly ordinary (and true) sentence "King Lear had three daughters." That's not a claim about ink, it's a claim about the mental models created in competent speakers of English by the work (which itself is an abstraction, not a physical thing). Those models are physically embodied, but they are not physical things! There's no set of quarks you can point to and say "there
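As a side illustration of the "infinite list" point (a minimal sketch of my own, not asr's, assuming GHC Haskell): the program below manipulates an infinite list as an ordinary value, and the type checker verifies claims about how it is used, even though no physical arrangement of memory ever contains the whole list.

```haskell
-- A lazily evaluated "infinite list": no physical referent holds all of it,
-- yet the program reasons about it and uses it without trouble.
naturals :: [Integer]
naturals = [0 ..]

-- Take the first n squares of even naturals; the type checker accepts claims
-- about this function over all possible executions, including ones that never
-- physically happen.
firstEvenSquares :: Int -> [Integer]
firstEvenSquares n = take n [ x * x | x <- naturals, even x ]

main :: IO ()
main = print (firstEvenSquares 5)  -- prints [0,4,16,36,64]
```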
0[anonymous]
This is where we disagree. Those mental models are simply arrangements of matter. The fact that it feels like you're referring to something separate from an arrangement of matter-memory in your brain is another thing altogether. The reason that practical communication works at all is that there is an extreme amount of mutual information held between the set of features which you use to categorize the physical memory of, say, Bugs Bunny, and the features used to categorize Bugs in someone else's mind. You can reference your brain's physical memory in such a way as to cause another's physical memory to reference something, and if an algorithm sorts the mutual information of these concepts until it finds a maximum, and common experience then forms all sorts of additional memories about what wound up being referenced, it is not surprising at all that a purely physical model of concepts would allow communication. I don't see how anything you've said represents more than an assertion that it feels to you as if abstractions are not simply the brain matter that they are made out of in your mind. It's not a convincing reason for me to think abstractions have ontological properties. I think the hypothesis that it just feels that way since my brain is made of meat and I can't look at the wiring schematics is more likely.
0asr
This is starting to feel like a shallow game of definition-bending. I don't think we're disagreeing about any testable claim. So I'm not going to argue about why your definition is wrong, but I will describe why I think it's less useful in expressing the sorts of claims we make about the world. When we talk about whether two mental models are similar, the similarity function we use is representation-independent. You and I might have very similar mental models, even if you are thinking with superconducting wires in liquid helium and our physical brains have nothing in common. Not being willing to talk honestly about abstractions makes it hard to ask how closely aligned two mental models are -- and that's a useful question to ask, since it helps predict speech-acts. Conversely, saying that "everything is a physical property" deprives us of what was previously a useful category. A toaster is physical in a way that an eight-dimensional vector space is not and in a way that a not-yet-produced toaster is not. I want a word to capture that difference. In particular, a physical object, as most of the world uses the term, is something that has position and mass that evolve in predictable ways. It's sensible to ask what a toaster weighs. It's not sensible to ask what a mental model weighs. I think your definitions here mean that you can't actually explain ordinary ostensive reference. There is a toaster over there, and a mental model over here, and there is some correspondence. And the way most of the world uses language, I can have the same referential relationship to a fictional person as to a real person, as to a toaster. And I think I'm now done with the topic.
0[anonymous]
First, I didn't say anything at all about the usefulness of treating abstractions the way we do. I don't believe in actual free will but I certainly believe that the way we walk around acting as if free will was a real attribute that we have is very useful. You can arrange a network of neurons in such a way that it will allow identification of a concept, and we use natural language to talk about this sort of arrangement of matter. Talking about it that way is just fine, and indeed very useful. But this thread was about a defense of metaethics and partially about the defense of beliefs as non-physical, but still really existing, entities. For purposes of debating that point, I think it starts to matter whether someone does or does not recognize that concepts are just arrangements of matter: information which can be extracted from brain states but does not in and of itself point to any actual, ontological entity. I think I am quite willing to talk about abstractions and their usefulness ... just not willing to agree that they are fundamental parts of reality rather than merely hallucinations the same way that free will is. In conversations about the ontology of physical categories, it's better to say that the category of toasters in my brain is just a pattern of matter that happens to score high correlations with image, auditory, and verbal feature vectors generated by toasters. In conversations about making toast, it's better to talk about the abstraction of the category of toasters as if it was itself something. It's the same as talking about the wing of an airplane.
0asr
Thank you, that explained where you were coming from. But I don't see that any of this ontology gets you the meta-ethical result you want to show. I think all you've shown is that ethical claims aren't more true than, say, mathematical truth or physical law. But by any normal standard, "as true as the proof of Fermat's last theorem" is a very high degree of truth. I think to get the ethical result you want, you should be showing that moral terms are strictly less meaningful than mathematical ones. Certainly you need to somehow separate mathematical truth from "ethical truth" -- and I don't see that ontology gets you there.
0[anonymous]
Actually, I am opposed to the argument for an ontology of belief, which is why I was trying to argue that beliefs are encoded states of matter. If I assert that "X is wrong" it must mean I assert "I believe X is wrong" as well. If I assert "I believe X is wrong" but don't assert "X is wrong", something's clearly amiss. As pointed out here, beliefs are reflections of best available estimates about physically existing things. If I do assert that I believe X is wrong but don't assert that X is wrong, then either I am lying about the belief, or else there's some muddling of definitions and maybe I mean some local version of X or some local version of "wrong", or I am unaware of my actual state of beliefs (possibly due to insanity, etc.) But my point is that in a sane person, from that person's first-person experience, the two statements, "I believe X is wrong" and "X is wrong" contain exactly the same information about the state of my brain. They are the same statement. My point in all this was that "I believe X is wrong" has the same first-person referent as "X is wrong". If X = murder, say, and I assert that "murder is wrong", then once you unpack whatever definitions in terms of physical matter and consequence that I mean by "murder" and "wrong", you're left with a pointer to a physical arrangement of matter in my brain that resonates when feature vectors of my sensory input correlate with the pattern that stores "murder" and "wrong" in my brain's memory. It's a physical thing. The wrongness of murder is that thing, it isn't an ontological concept that exists outside my brain as some non-physical attribute of reality. Even though other humans have remarkably similar brain-matter-patterns of wrongness and murder, enough so that the mutual information between the pattern allows effective communication, this doesn't suddenly cause the idea that murder is wrong to stop being just a local manifestation in my brain and start being a separate idea that many humans share point
0[anonymous]
I think more salient examples that make this question hard are not going to be borne out of trying to come up with something increasingly abstract. The more puzzling cognitive objects to explain are when you apply unphysical transformations to obvious objects... like taking a dog and imagining it stretched out to the length of a football field. Or a person with a torus-like hole in their abdomen. But these are simply images in the brain. That the semantic content of the image can be interpreted as strange unions of other cognitive objects is not a reason to think that the cognitive object itself isn't physical.
0[anonymous]
Just to be a little clearer: saying that "I ought to do X" means "There exists some goal Y such that I want to achieve Y; there exists some set of variables D which I can manipulate to bring about the achievement of Y; X is an algorithm for manipulating variables in D to produce effect Y, and according to my current state of knowledge, I assess that the probability of this model of X(D) yielding Y is high enough such that whatever physical resources it costs me to attempt X(D), as a Bayesian, the trade-off works out in favor of actually doing it. That is, Payoff(Y) * P(I was right in modeling the algorithm X(D) as producing Y) > Cost(~Y) * P(I was incorrect in modeling the algorithm X(D)), or some similar decision rule."
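In symbols, that decision rule is roughly the following (a minimal restatement in my notation, not the commenter's):

```latex
\[
\mathrm{Payoff}(Y)\cdot P\!\left(\text{the model of } X(D) \text{ producing } Y \text{ is right}\right)
\;>\;
\mathrm{Cost}(\lnot Y)\cdot P\!\left(\text{the model of } X(D) \text{ producing } Y \text{ is wrong}\right)
\]
```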

People who do feel that intuition run into trouble. This is because "I ought to do X' does not refer to anything that exists. How can you make a statement that doesn't refer to anything that exists?

It refers to my preferences which are physically encoded in my brain. It feels like it doesn't refer to anything that exists because I don't have complete introspective access to the mechanisms by which my brain decides that it wants something.

On top of that, ought refers to lots of different things, and as far as I can tell, most ought statements are su... (read more)

-3Will_Sawin
What is a preference? How do you suggest I infer your preferences? If it is from your actions, then your definition sounds very similar to mine. If it is from your statements, then your definition is circular. If it is from your emotions, then how do people express moral beliefs that contradict their emotions? Is it something else?
0Manfred
This is all irrelevant to atucker's comment, unless you're denying that preferences are patterns in your brain. If his definition sounds similar to yours, good, that means you don't believe "ought does not refer to anything that exists."
0Will_Sawin
It's all irrelevant if his answer is "from your actions". Is it obvious that it is? If so, I apologize. Here are some problems:

1. The correlates of a phrase are not its meaning. If I say "X will happen", I'm not saying "The current physical world has a pattern that is likely to cause X to happen", I'm just saying "X is going to happen".
2. If I say "you should do X, but I know you're going to do Y" it doesn't seem like I mean "Parts of your brain want to do X but the rest will overrule them" or "In your situation, I would do X" or "I will punish you for doing Y".
3. You don't accurately describe internal reasoning. There are many X such that I prefer to do X because of my explicit belief that X is right, not vice versa.
1atucker
Intuitively, I feel like I have various competing desires/preferences floating around in my head that I process further in order to decide what to do. They're actually physically encoded, but that's just asserting that they exist for real. Some salient desires of mine right now are:

  • The desire to eat (I want my breakfast)
  • The desire to finish this comment
  • The desire to scratch my stomach
  • The desire to look up when graduation rehearsal is

As you can see, many of these are contradictory, so you can't infer all of them from my actions. Some of these desires are fairly basic, like the one to eat. Neural circuitry controlling hunger has been found. Others are far more complicated, like the one about finishing this comment. I think that that is probably aggregated from various smaller desires, like "explain myself clearly", "get karma points", or "contribute to Less Wrong". I think that desires might be packaged by my unconscious mind so that my conscious mind can figure out how to accomplish them, without me needing to think through what I should want before doing anything.

The word preference/desire probably refers to multiple things. For many of my preferences, you could infer them by changing the world to fulfill them, and then seeing if I'm happier. Normally I will be happier/more satisfied (a physical arrangement of my brain) as a result of my preferences being fulfilled. Other preferences like "speak in English" or "always be nice to people" seem to be more like imperatives that I should follow for coordination or signalling purposes. But they still feel pretty much the same as my normal preferences. But it's all still physically encoded in my brain, and does refer to the world-as-is, even if it doesn't feel like it does.
0Will_Sawin
Do you contribute to charity? Do you make explicit long-term plans about how you will help the world? (Or other, similar things)
0Matt_Simpson
What is this sentence asking? Is "actions" supposed to be "preferences"?
0Will_Sawin
Yes.

Will and I just spoke on the phone, so here's another way to present our discussion:

Imagine a species of artificial agents. These agents have a list of belief statements that relate physical phenomena to normative properties (let's call them 'moral primitives'):

  • 'Liking' reward signals in human brains are good.
  • Causing physical pain in human infants is forbidden.
  • etc.

These agents also have a list of belief statements about physical phenomena in general:

  • Sweet tastes on the tongue produces reward signals in human brains.
  • Cutting the fingers of infants p
... (read more)
5Wei Dai
That doesn't seem right. Compare (note that I don't necessarily endorse the rest of this paper): If you examine just one particular sense of the word "ought", even if you make clear which sense, but without systematically enumerating all of the meanings of the word, how can you know that the concept you end up studying is the one that is actually important, or one that other people are most interested in?
0lukeprog
I suspect there are many senses of a word like 'ought' that are important. As 'pluralistic moral reductionism' states, I'm happy to use and examine multiple important meanings of a word.
6Wei Dai
Let me expand my comment a bit, because it didn't quite capture what I wanted to say. If Will is anything like a typical human, then by "ought" he often means something other than, or more than, the sense referred to by "that sense", and it doesn't make sense to say that perhaps he wants to use "ought" in that sense. When you say "I'm fine with ..." are you playing the role of the Austere Metaethicist who says "Tell me what you mean by 'right', and I will tell you what is the right thing to do."? But I think Austere Metaethics is not a tenable metaethical position, because when you ask a person to tell you what they mean by "right", they will almost certainly fail to give you a correct answer, simply because nobody really understands (much less can articulate) what they mean by "right". So what is the point of that? Or perhaps what you meant to say instead was "I'm fine with Will studying 'ought' in that sense if he wants"? In that case see my grandparent comment (but consider it directed mostly towards Will instead of you).
0Will_Sawin
I don't love all your terminology, but obviously my preferred terminology's ability to communicate my ideas on this matter has been shown to be poor. I would emphasize less the relationships between similar moral beliefs, and more the assembly-line process converting general to specific. I'm pretty sure the first statement here only makes sense as a consequence of the second:
0Vladimir_Nesov
This doesn't make sense to me. Does 28 reduce to physics in this sense? How is this "ought" thing distinguished from all the other factors (moral errors, say) that contribute to behavior (that is, how is its role located)?
0Will_Sawin
First, I would say that reducibility is a property of statements. In the sense I use it:

  • The statement "14+14=28" is reducible to aether.
  • The statement "I have 28 apples" is reducible to physics.
  • The statement "There are 28 fundamental rules that one must obey to lead a just life" is reducible to ethics.

Moral statements are irreducible to physics in the sense that "P is red" is irreducible to physics - for any particular physical "P", it is reducible. The logical properties of P-statements, like "P is red or P is not red", are given as a set of purely logical statements - that's their analogue of the ought-function. If P-statements had some useful role in producing behavior, they would have a corresponding meaning.

Random, probably unnecessary math: A reducible-class is a subalgebra of the Boolean algebra of statements, closed under logical equivalence. The statements reducible to aether are those in the reducible-class generated by True and False. The statements reducible to physics are those in the reducible-class generated by "The world is in exactly state X". The statements reducible to morality are those in the reducible-class generated by "Exactly set-of-actions Y are forbidden and set-of-actions Z are obligatory".
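To spell out that "random, probably unnecessary math" (a sketch in my notation, not Will's): take B to be the Boolean algebra of statements modulo logical equivalence, and define the reducible-class generated by a set G of statements as the smallest subalgebra of B containing G.

```latex
% Minimal sketch, assuming B is the Boolean algebra of statements modulo logical equivalence.
\[
  R(G) \;=\; \bigcap \bigl\{\, A \;\bigm|\; A \text{ is a subalgebra of } B \text{ and } G \subseteq A \,\bigr\}
\]
% Examples from the comment:
%   reducible to aether:   R(\{\mathrm{True}, \mathrm{False}\})
%   reducible to physics:  R(\{\text{"the world is in exactly state } X\text{"} : \text{all states } X\})
%   reducible to morality: R(\{\text{"exactly } Y \text{ are forbidden and } Z \text{ obligatory"} : \text{all } Y, Z\})
```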

How can you make a statement that doesn't refer to anything that exists? I've done it, and my reasoning process is still intact, and nothing has blown up. Everything seems to be fine. No one has explained to me what isn't fine about this. Since it's intuitive, why would you not want to do it that way?

Clearly, you can make statements about things that don't exist. People do it all the time, and I don't object to it. I enjoy works of fiction, too. But if the aim of our dialogue is true claims about reality, then you've got to talk about things that exist ... (read more)

0Will_Sawin
But the aim of our dialogue isn't really true claims. It's useful claims - claims that one can incorporate into one's decision-making process. Claims about Darth Vader, you can't, but claims about ought, you can. What about that other word (that is also spelled "should") that you don't have to stipulate the meaning of because people already know what it means? What about the regular kind of imperatives? If I define "fa" to mean "any object which more than 75% of the claims in this long book of no previous importance accurately describes", I have done something very strange, even if I later say "If 'fa' means 'red', that's fa."
0lukeprog
I don't understand what you mean, here. I'm not sure what you mean by 'true' or 'useful', I guess. I'm talking about true claims in this sense. Which one is that, and what does everybody already know it to mean?
0Will_Sawin
I mean what you mean by "true", or maybe something very similar. By "useful" I mean "those claims that could help someone come to a decision about their actions". It's what people say when they say "should" but don't precede it with "if".

Some people on lesswrong think it means: [you should do X] = [X maximizes this complicated function that can be computed from my brain state]

Some think it means: [you should do X] = [X maximizes whatever complicated function is computed from my brain state]

and I think: [you should do X] = [the statement that, if believed, would cause one to do X]
0Vladimir_Nesov
You can find that there is a bug in your brain that causes you to react to a certain belief, but you'd fix it if you noticed it was there, since you don't think that belief should cause that action.
0Will_Sawin
I could say [the statement that, if believed by a rational agent, would cause it to do X] but that's circular. But one of the points I've been trying to make is that it's okay for the definition of something to be, in some sense, circular. As long as you can describe the code for a rational agent that manipulates that kind of statement.
2Vladimir_Nesov
Some things you can't define exactly, only refer to them with some measure of accuracy. Physical facts are like this. Morality is like this. Rational agents don't define morality, they respond to it; they are imperfect detectors of moral facts who would use their moral expertise to improve their own ability to detect moral facts or build other tools capable of that. There is nothing circular here, just constant aspiration for referencing the unreachable ideal through changeable means.
0Will_Sawin
But there aren't causal arrows pointing from morality to rational agents, are there? Just acausal/timeless arrows. You do have to define "morality" as meaning "that thing that we're trying to refer to with some measure of accuracy", whereas "red" is not defined to refer to the same thing. If you agree, I think we're on the same page.
1Vladimir_Nesov
I think the idea of acausal/logical control captures what causality was meant to capture in more detail, and is a proper generalization of it. So I'd say that there are indeed "causal" arrows from morality to decisions of agents, to the extent the idea of "causal" dependence is used correctly and not restricted to the way we define physical laws on a certain level of detail. Why would I define it so? It's indeed what we are trying to refer to, but what it is exactly we cannot know. Lost me here. We know enough about morality to say that it's not the same thing as "red", yes.
1Will_Sawin
Sure. Let me rephrase a bit. "That thing, over there (which we're trying to refer to with some measure of accuracy), point point". I'm defining it extensionally, except for the fact that it doesn't physically exist. There has to be some kind of definition or else we wouldn't know what we were talking about, even if it's extensional and hard to put into words. "red" and "right" have different extensional definitions.
0Vladimir_Nesov
I suspect there is a difference between knowing things and being able to use them, neither generally implying the other.
0Will_Sawin
This is true, but my claim that words have to have a (possibly extensional) definition for us to use them, and that "right" has an extensional definition, stands.
0Vladimir_Nesov
Does "whatever's written in that book" work as the appropriate kind of "extensional definition" for this purpose? If so, I agree, that's what I mean by "using without knowing". (As I understand it, it's not the right way of using the term "extensional definition", since you are not giving examples, you are describing a procedure for interacting with the fact in question.)
0Will_Sawin
It's sort of subtle. "Whatever's written in the book at the location given by this formula: " defines a word totally in terms of other words, which I would call intensional. "Whatever's written in THAT book, point point" points at the meaning, what I would call extensional.
-2Peterdjones
All definitions should be circular. "The president is the Head of State" is a correct definition. "The president is Obama" is true, but not a definition.
0Will_Sawin
Non-circular definitions can certainly be perfectly fine: "A bachelor is an unmarried man." This style is used in math to define new concepts to simplify communication and thought.
-2Peterdjones
"A bachelor is an unmarried man.' If that is non circular, so is [the statement that, if believed by a rational agent, would cause it to do X] I'm quite confused. By circular do you mean anaylitcal, or recursive? (example of the latter: a setis something that can contain elemetns or other sets)
0Will_Sawin
I'm not sure what I mean. The definition I am using is in the following category: It may appear problematically self-referential, but it is in fact self-referential in a non-problematic manner. Agreed?
-2Peterdjones
I don't think your statement was self-referential or problematic.
-1Peterdjones
or rather [you should do X] = [the statement that, if believed, would cause one to do X if one were an ideal and completely non-akratic agent]
0Will_Sawin
Correct.
-2Peterdjones
Which would mean either that mathematical knowledge is false, or that there is a Platonic world of mathematical objects for it to correspond to. OTOH, one could just adopt the Dogma of Empiricism that there is analytical truth which is neither 'about' physical reality nor 'about' any metaphysical one (and that mathematical truth is analytical). And if it is an analytical truth that, for instance, you should do as you would be done by, then that is still applicable to real world situations by filling in "as you would be done by" for your own case.
20[anonymous]

I made a more topical comment to Wei_Dai's reply to this thread, but I felt it was worth adding that if anyone is interested in a work of fiction that touches on this subject, the novel The Broom of the System, by David Foster Wallace, is worth a look.

Can you explain what implications (if any) this "naive" metaethics has for the problem of how to build an FAI?

0Will_Sawin
Arguably, none. (If you already believe in CEV.)
2hairyfigment
Well, you say below you don't believe that (in my words) Specifically, you say You also say, in a different comment, you nevertheless believe this process Do you think humans can do better when it comes to AI? Do you think we can do better in philosophy? If you answer yes to the latter, would this involve stating clearly how we physical humans define 'ought'?
0Will_Sawin
Did I misread you? I meant to say:

  • A FOOM'd self-modifying AI would not likely do what I consider 'right'.
  • A FOOM'd self-modifying AI that cares about humanity's CEV would likely do what I consider 'right'.

I probably misread you.

Let me have another go at this, since I've now rewritten the is-ought section of 'Pluralistic Moral Reductionism' (PMR).

This time around, I was more clear that of course it's true that, as you say:

we can make statements about what should be done and what should not be done that cannot be reduced, by definition, to statements about the physical world

We can make reducible 'ought' statements. We can make irreducible 'ought' statements. We can exhibit non-cognitive (non-asserting) verbal behaviors employing the sound 'ought'.

For the purposes of the PMR pos... (read more)

0Vladimir_Nesov
What are your alternatives (at this level of detail)? If I could be using two different definitions, ought1 and ought2, then I expect there are distinguishing arguments that form a decision problem about which of the two I should've been using, which in turn determines which of these definitions is the one.
0Will_Sawin
Well there are cases when I should be using two different words. For instance, if morality is only one component of the correct decision procedure, then MoralOught and CorrectOught are two different things. But you're not talking about those types of cases, right?
0Vladimir_Nesov
Don't understand what you said. Probably not.
2Will_Sawin
Well, suppose that sometimes, depending on context cues, I use "ought" to mean "paperclip-maximizing", "prime-pile-maximizing", and "actually-ought". There's nothing wrong about the first two definitions, they're totally reasonable definitions a word might have, they just shouldn't be confused with the third definition, which specifies correct actions.
0Will_Sawin
Well, I am saying that there is a meaning of "ought" that is hugely different in meaning from the other senses. PMR identifies a sort of cluster of different meanings of the word "ought". I am saying, hey, over here, there's this one, singular meaning. This meaning is special because it has a sense but no referent. It doesn't refer to any property of the physical world, or obviously, of any property of any non-physical world. It just means. [Not CEV, will explain later with time.]
1lukeprog
Okay. I look forward to it.
0Will_Sawin
So in this perspective what I "want" is really a red herring. I want to do lots of things that I oughtn't do. What matters is my beliefs about what is right and wrong.

----------------------------------------

Now, by necessity, I believe that my EV is the best possible approximation of what is right. Because, if I knew of a better approximation, I would incorporate it into my beliefs, and if I didn't know of it, my volition must not have been extrapolated far enough. But this is not a definition of what is right. To do so would be circular.

----------------------------------------

If I believe that my EV is very close to humanity's CEV, then I believe that humanity's CEV is almost the best approximation as to what is right. I do, so I do.

----------------------------------------

So, to start reasoning, I need assumptions. My assumptions would look like: or or something else, just as the assumptions I use to generate physical beliefs would consist of my intuitions about the proper techniques for induction (Bayesianism, Occam's Razor, and so on).

----------------------------------------

There doesn't have to be any Book O' Right sitting around for me to engage in this reasoning, I can just, you know, do it.

----------------------------------------

(It is very ironic that I first developed this edifice because I was bothered by unstated moral assumptions.)
1lukeprog
I'm confused by your way of presenting your arguments and conclusion. On my end this comment looks like a list of unconnected thoughts, with no segues between them. Does somebody else think they know what Will is saying, such that they can explain it to me?
0Will_Sawin
I drew some boundaries between largely-though-not-totally unconnected thoughts. Does everything within those boundaries look connected to you? I think Vladimir Nesov agrees with me on this.
1lukeprog
Thanks, but it's still not clear to me. Nesov, do you want to take a shot and arguing for Will's position, especially if you agree with it?
-2[anonymous]
No, I have very little idea about what Will is talking about, and strongly suspect that he doesn't either (I only have a vague idea of what I'm talking about as well; the recent discussion uses relatively recent ideas). His intuitions seem to be pointing roughly in a direction I believe is much more aligned with reality than your pluralistic moral "everyone call a rigid designator" reductionism, though (waiting for that empathic metaethics post for a possible correction in understanding your position), so I can understand why there would be grounds for an argument.

Suppose an AI wants to find out what Bob means when he says "water".

This thought experiment can be sharpened by asking what Bob means by "1027". By using Bob as an intermediary, we inevitably lose some precision.

0Will_Sawin
Morals seem less abstract than numbers, but more abstract than substances. Is that the dimension you are trying to vary? What does "Bob as an intermediary" mean here?
1Vladimir_Nesov
We know what 1027 is very well, better than what water is, which simplifies this particular aspect of the thought experiment. We can try constructing mirror-like definitions in terms of what Bob believes "1027" is, what he should believe it is, what he would believe it is on reflection, and so on. These can serve as models of various "extrapolated volition" constructions. By examining these definitions, we can see their limitations and the problems with achieving high reliability at capturing the concept of 1027.
-1Will_Sawin
Alright. So 1027 is defined by its place in the axioms of arithmetic. In any system modeled by the axioms of arithmetic, 1027 has a local meaning. The global meaning of 1027, then, is given by those axioms. Bob imperfectly implements the axioms of arithmetic. If he's a mathematician, and you asked him what they were, he would get it right with a strong likelihood. A non-mathematician, exposed to various arguments that arithmetic was different things would eventually figure out the correct axioms. So Extrapolated 1027 would work.
1Vladimir_Nesov
What counts as an axiom? You could as well burn them instead of appraising their correctness. There are many ways of representing knowledge of an abstract fact, but those representations won't themselves embody the fact, there is always an additional step where you have an interpretation in mind, so that the representation only matters as a reference to the fact through your interpretation, or a reflection of that fact in a different form. It might be useful to have a concrete representation, as it can be used as an element of a plan and acted upon, while an abstract fact isn't readily available for that. For example, if your calculator (or brain) declares that "12*12<150" is true, its decision can be turned into action. 1027 items could be lined up in a field, so that you can visually (or by running from one side to the other) appreciate the amount. Alternatively, a representation of a reasoning process can be checked for errors, yielding a more reliable conclusion. But you never reach the fact itself, with the rare exception of physical facts that are interesting in themselves and not as tools for inferring or representing some other facts, physical or not (then a moment passes, and you can only hold to a memory).
0Will_Sawin
I don't understand what point you're making here.
0Vladimir_Nesov
You can't get 1027 itself out of an extrapolated volition procedure, or any other procedure. All you can get (or have in your brain) is a representation, that is only meaningful to the extent you expect it to be related to the answer. Similarly, if you want to get information about morality, all you can get is an answer that would need to be further interpreted. As a special exception (that is particularly relevant for morality and FAI), you can get the actual right actions getting done, so that no further interpretation is necessary, but you still won't produce the idea of morality itself.

People who do feel that intuition run into trouble. This is because "I ought to do X' does not refer to anything that exists. How can you make a statement that doesn't refer to anything that exists? I've done it, and my reasoning process is still intact, and nothing has blown up. Everything seems to be fine. No one has explained to me what isn't fine about this.

Ok, I'll bite. Why does "I ought to X" have to refer to any thing?

When I see atucker's comment, for instance:

It refers to my preferences which are physically encoded in my brain.

... (read more)
0Will_Sawin
How deriving meaning from signaling works is sort of unclear. The strategic value of a signal depends only on what facts cause it, which is not the same as what it means. If the causal graph is X => Y => Z => [I say "fruit loop"], then "fruit loop" could mean Z, or Y, or X, or none of the above. So I think your meaning is compatible with mine?
0Zetetic
If what is stated above is your meaning, then I think yes. However, if that is the case then this: Doesn't make as much sense to me. Maybe you could clarify it for me? In particular, it is unclear to me why ought-claims in general, as opposed to some strict subset of ought-claims like "Action X affords me maximum expected utility relative to my utility function" <=> "I ought to do X", are relevant to making decisions. If that is the case, why not dispense with "ought" altogether? Or is that what you're actually aiming at? Maybe because the information they signal is useful? But then there are other utterances that fall into this category too, some of which are not, strictly speaking, words. So taking after that sense, the set would be incomplete. So I assume that probably isn't what you mean either. Also judging by this: Would it be safe to say that your stance is essentially an emotivist one? Or is there a distinction I am missing here?
1Will_Sawin
Well, I guess strictly speaking not all "ought" claims are relevant to decision-making. So then I guess the argument that they form a natural category is more subtle. I mean, technically, you don't have to describe all aspects of the correct utility function, but the boundary around "the correct utility function" is simpler than the boundary around "the relevant parts of the correct utility function". No. I think it's propositional, not emotional. I'm arguing against an emotivist stance on the grounds that it doesn't justify certain kinds of moral reasoning.

"A Flatland Argument" includes a multitude of problems. It contains a few strawmen, which then turn into non sequiturs when you say that they demonstrate that we often say irreducible things. At least that's what I think you mean by the trivial statement "Not every set of claims is reducible to every other set of claims."

And yet your examples are answerable or full of holes! Why not mention the holes rather than blaming the reductionism (or, in the case of some strawmen, total lack thereof), or mention the reduction and forget the wh... (read more)

1Will_Sawin
What I'm trying to do is to make people question whether "All meaningful statements are reducible, by definition, to facts about the world". I do this by proposing some categories which all meaningful statements are certainly NOT reducible by definition to. The argument is by analogy, sort of an outside view thing. I ask: Why stop here? Why not stop there? My attempt was to explain the Tortoise's position in What The Tortoise Said To Achilles. If you think I did not do so properly, I apologize. If you think that position is stupid, you're right, if you think it's incoherent, I'm pretty sure you're wrong. The central theme is tabooing all facts about the world. How do you define what such a fact means under such a taboo? The evidence is something you can see. The thing is not. If there were a cake, the evidence would be no different. The person I am quoting would see "the sun exists" as a statement of a pattern in perceptions, "I see this-kind-of-image in this-kind-of-situation and I call this pattern 'the sun'". So it's a prediction. What's a prediction? Do you see the game I'm playing here? I hope you do. It is a silly game, but it's logically consistent, and that's my point.
0Manfred
Hm, no, I don't see it yet. Help me with this: For starters, what do these categories you mention contain? I didn't notice them in the Flatland section - I guess I only saw the statements, and not the argument.
0Will_Sawin
A. Nothing
B. Definitions & Logic
C. Also observations, not unobserved or unobservable differences
D. Just the past and present, not the future

which I compare to:

E. Just the physical world, not morality
0Manfred
Doesn't seem very compelling, frankly.
0Will_Sawin
Oh well. What about my other arguments? Also not compelling?
0Manfred
Less confusing, at least :P Beating up lukeprog's "is and is not" doctrine is pretty easy but not very representative, I think. The water argument seems to be more about CEV than reductionism of ethics, and is more convincing, but I think you hit a bit of a pothole when you contrast disagreeing about definitions with "disagree[ing] about what's important" at the end. After all, they're disagreeing about what's "important," since importance is something they assign to things and not an inherent property of the things. Maybe it would help to not call it "the definition of 'should,'" but instead call it "the titanic moral algorithm." I can see it now: When people disagree about morals, it's not that they disagree about the definition of "should" - after all, that's deprecated terminology. No, they disagree about the titanic moral algorithm.
0Will_Sawin
Right. But they DON'T disagree about the definition of the titanic moral algorithm. They disagree about its nature.
0Manfred
Neither do they disagree about the definition of "the definition of should" (at least not necessarily). So just substitute the right things for the right other things and you're fine :P

It seems clear to me that I can multiply about what I care about, so I don't know quite what you want to say.

Since it's intuitive, why would you not want to do it that way?

What seems wrong with the obvious answer?

Do you think a FOOM'd self-modifying AI that cares about humanity's CEV would likely do what you consider 'right'? Why or why not? (If you object to the question, please address that issue separately.)

0Will_Sawin
Well, do you care about 20 deaths twice as much as you care about 10 deaths? Do you think that you should care about 20 deaths twice as much as you care about 10 deaths? The AI would not do so, because it would not be programmed with correct beliefs about morality, in a way that evidence and logic could not fix. EDIT: This is incorrect. Somehow, I forgot to read the part about "cares about humanity's CEV". It would in fact do what I consider right, because it would be programmed with moral beliefs very similar to mine. In the same way, an AI programmed to do anti-induction instead of induction would not form correct beliefs about the world. Pebblesorters are programmed to have an incorrect belief about morality. Their AI would have different, incorrect beliefs. (Unless they programmed it to have the same beliefs.)
0hairyfigment
You edited this comment and added parentheses in the wrong place. More or less, yes, because I care about not killing 'unthinkable' numbers of people due to a failure of imagination. Can you say more about this? I agree with what follows about anti-induction, but I don't see the analogy. A human-CEV AI would extrapolate the desires of humans as (it believes) they existed right before it got the ability to alter their brains, afaict, and use this to predict what they'd tell it to do if they thought faster, better, stronger, etc. ETA: okay, the parenthetical comment actually went at the end. I deny that the AI the pebblesorters started to write would have beliefs about morality at all. Tabooing this term: the AI would have actions, if it works at all. It would have rules governing its actions. It could print out those rules and explain how they govern its self-modification, if for some odd reason its programming tells it to explain truthfully. It would not use any of the tabooed terms to do so, unless using them serves its mechanical purpose. Possibly it would talk about a utility function. It could probably express the matter simply by saying, 'As a matter of physical necessity determined by my programming, I do what maximizes my intelligence (according to my best method for understanding reality). This includes killing you and using the parts to build more computing power for me.' 'The' human situation differs from this in ways that deserve another comment.
0Will_Sawin
That's the answer I wanted, but you forgot to answer my other question. I would see a human-CEV AI as programmed with the belief "The human CEV is correct". Since I believe that the human CEV is very close to correct, I believe that this would produce an AI that gives very good answers. A Pebblesorter-CEV AI would be programmed with the belief "The pebblesorter CEV is correct", which I believe is false but pebblesorters believe is true or close to true.
0[anonymous]
This presumes that the problem of specifying a CEV is well-posed. I haven't seen any arguments around SI or LW about this very fundamental idea. I'm probably wrong and this has already been addressed (I'd be happy to read more), but it seems quite reasonable to me to assume that a tiny, tiny error in specifying the CEV could lead to disastrously horrible results as perceived by the CEV itself.