Today's post, The Bedrock of Morality: Arbitrary? was originally published on 14 August 2008. A summary (taken from the LW wiki):

 

Humans are built in such a way as to do what is right. Other optimization processes may not. So what?


Discuss the post here (rather than in the comments to the original post).

This post is part of the Rerunning the Sequences series, where we'll be going through Eliezer Yudkowsky's old posts in order so that people who are interested can (re-)read and discuss them. The previous post was Is Fairness Arbitrary?, and you can use the sequence_reruns tag or rss feed to follow the rest of the series.

Sequence reruns are a community-driven effort. You can participate by re-reading the sequence post, discussing it here, posting the next day's sequence reruns post, or summarizing forthcoming articles on the wiki. Go here for more details, or to have meta discussions about the Rerunning the Sequences series.

24 comments

So, can someone summarize why EY thinks that p-morality is inferior (not just h-inferior) to h-morality, which he seems to call a one-place function "morality"? The OP and the following discussion did not make it clear to me.

He doesn't. He only thinks that p-morality is h-inferior. He doesn't believe that there's such a thing as "inferior".

EDIT: Hmmm... I don't really mean that EY doesn't believe that there's such a thing as "inferior". I just mean that when he uses the word "inferior" he means "h-inferior". He doesn't think that there's some universal "inferior" by which we can judge p-morality against h-morality, but of course p-morality is h-inferior to h-morality.

Can you expand on your reasons for believing this? It seems very unlikely to me.

Does my edit help? I can't see how it's very unlikely; it's how I've understood the whole of the meta-ethics sequence.

Well, it helps, in that it clarifies your reasoning. Thanks.

That said, I continue to think that EY would reject a claim like "p-morality is h-inferior to h-morality" to the extent that its symmetrical counterpart, "h-morality is p-inferior to p-morality", is considered equivalent; I expect he would reply with some version of "No, p-morality is inferior to h-morality, which is right."

IOW, my own understanding of EY's position is similar to shminux's, here: that human morality is right, and other moralities (supposing they exist at all) are not right. It seems to follow that other moralities are inferior.

But I don't claim to be any sort of expert on the subject of EY's beliefs, and it's ultimately not a very important question; I'm content to agree to disagree here.

Oh, I think I get it now.

He's saying that he uses "right" to mean the same thing everyone else does — because the "everyone else" he cares about are human and share human values. Words like "right" (and "inferior") don't point to something outside of human experience; they point to something within it. We are having this conversation within human experience, not outside it, so words have their human meanings — which are the only meanings we can actually refer to.

Saying "h-right" is like saying "h-Boston". The meaning of "Boston" is already defined by humanity; you don't have to put "h-" in front of it.

It's just a fact about us that we do not respond to p-rightness in the same way that we respond to h-rightness, and our word "right" refers to the latter. You wouldn't go out and do things because of those things' p-rightness, after all. Rightness, not p-rightness, is what motivates us.

It's part of what we are — just as we (usually) have particular priors. We don't say "h-evidence" for "the sort of evidence that we find convincing" and contrast this with "y-evidence" which is the sort of evidence that a being who always believes statements written in yellow would find convincing. "h-evidence" is just what "evidence" means.

I think I agree with you, which is strange because it looks like TheOtherDave also agrees with you, but disagrees with me.

In general, it's not strange at all for A and B to both agree completely with C, but disagree with each other. For example, if C says "Pie is yummy!", B says "Pie is yummy and blueberry is the best!" and A says "Pie is yummy and cherry is the best!"

In this case, I disagree with your assertion that EY does not believe that Pebblesorter morality is inferior to human morality, an assertion fubarobfusco does not make.

I do think Eliezer is saying that Pebblesorter morality is inferior to human morality, specifically insofar as the only thing that "inferior" can refer to in this sense is also "h-inferior" — all the inferiorness that we know how to talk about is inferiorness from a human perspective, because hey, that's what perspective we use.

I do think Eliezer is saying that Pebblesorter morality is inferior to human morality

(nods) I agree. If Oscar_Cunningham agrees as well, then we all agree.

I also agree. Yay!

Oh, I think I get it now.

I think so too. I really like the way you explained this.

He's saying that he uses "right" to mean the same thing everyone else does — because the "everyone else" he cares about are human and share human values.

Well, again, I suspect he would instead say that he uses "right" the right way, which is unsurprisingly the way all the other people who are right use it. But that bit of nomenclature aside, yes, that's my understanding of the position.

My impression of it:

H-morality approximates certain (objective, mathematical) truths about things such as achieving well-being and cooperation among agents, just as human counting and adding ability approximates certain truths about natural numbers. P-morality does not approximate truths about well-being and cooperation among agents.

A creature that watches sheep passing into a sheepfold and recites, "One, two, seventeen, six, one, two ..." (and imagines the actual numbers that these words refer to) is not doing counting, and a creature whose highest value is prime-numbered pebble piles is not doing morality.

Morality, in the sense of "approximating mathematical truths about things such as achieving well-being and cooperation among agents", is not just an arbitrary provincial value; it is a Good Move. And it is a self-catalyzing Good Move: getting prime-numbered piles of pebbles does not make you more able to make more of them, but achieving well-being and cooperation among agents does make you more able to make more of it.

(EDIT: I no longer believe the above is the point of the article. Not using the retract button on account of making it hard to read is just silly.)

P-morality has a different view about well-being of agents. P-well-being consists solely of the universe having more piles of properly sorted pebbles. Hunger of agents is p-irrelevant, except that it might indirectly affect the sorting of pebbles. If a properly sorted pile of pebbles can be scattered to prevent the suffering of an agent, it p-should not be.

Conversely, h-morality considers suffering of agents to be directly h-relevant, and the sorting of piles of pebbles is only indirectly h-relevant. An agent h-should not be tortured to prevent the scattering of any pile of pebbles.

None of this provides a reason why torturing agents is objectively o-worse than scattering pebbles, so it does not validate any claim to objective morality. To appeal to objective morality, we first have to accept that everything that is h-right and/or p-right may or may not be o-right. Frankly, I'm scared enough that this is the case that I would rather remain h-right and be ignorant of what is o-right than take the risk that o-right differs significantly from what is h-right. From the subjective point of view, that is even the h-right decision to make. The pebblesorters also agree: it is p-wrong to try to change to o-morality, just as it is p-wrong to change to h-morality.

If I haven't misunderstood this comment, this is not Eliezer's view at all. See the stuff about no universally compelling arguments; though you don't seem to be suggesting that such arguments exist, I think you are making a similar error. A paperclip maximizer would not agree that achieving well-being and cooperation are inherently Good Moves. We would not inherently value well-being and cooperation if we had not evolved to do so. (For the sake of completeness, the fact that I phrased the previous sentence as a counterfactual should not be taken to indicate that I find it excessively likely that we did, in fact, evolve to value such things.)

I'm >.9 confident that EY would agree with you that, supposing we do inherently value well-being and cooperation, we would not if we had not evolved to do so.
I'm >.8 confident that EY would also say that valuing well-being and cooperation (in addition to other things, some of which might be more important) is right, or perhaps right, and not just "h-right".

For my own part, I think "inherently" is a problematic word here. A sufficiently sophisticated paperclip maximizer would agree that cooperation is a Good Move, in that it can be used to increase the rate of paperclip production. I agree that cooperation is a Good Move in roughly the same way.

I agree that EY would say both those things. I did not mean to contradict either in my comment.

A sufficiently sophisticated paperclip maximizer would agree that cooperation is a Good Move, in that it can be used to increase the rate of paperclip production. I agree that cooperation is a Good Move in roughly the same way.

That is part of what I was trying to convey with the word 'inherently'. The other part is that I think EY would say that humans do value some forms of cooperation, such as friendship, inherently, in addition to their instrumental value. I am, however, a bit less confident of that than of the things I have said about EY's metaethical views.

Most variants of h-morality inherently value those things. Many other moralities also value those things. That does not make them objectively better than their absence. Note that the presence of values in a specified morality is a factual question, not a moral one.

Whether or not h-morality h-should value cooperation and friendship inherently is a null question. H-moralities h-should be whatever they are, by definition. Whether or not h-morality o-should do so is a question that requires understanding o-morality to answer.

If so, I've badly slipped a meta-level.

I take some issue with Eliezer naming his function for computing morality "right" and calling the pebble sorters' function "p-right." Shouldn't Eliezer call his morality function "EY-right"? Human moralities have many similarities, but they are by no means identical. If anything we should attempt to separate the human morality functions into equivalence classes and then start talking about how to bring all of them into compatibility with each other.

In fact, this relates to something that bugged me about the dust speck/torture post. Suppose we had the choice of torturing a pebble sorter for 50 years versus preventing 3^^^3 pebble sorters from getting a speck of dust in their eyes. Does that change the outcome for anyone? Suppose the choice was between torturing a simple paperclip maximizer or making 3^^^3 paperclip maximizers get a dust speck in one of their sensors? Now the interesting question: would anyone torture a pebble sorter for 50 years to prevent 3^^^3 humans from getting a speck of dust in their eyes? Torture a paperclip maximizer? Torture a human to save paperclip maximizers?

I think that we intuitively value the morality of the subjects we're computing maximum utility for based on their moral similarity to ourselves, and further that doing this is right. For instance, I find torture to be pretty abhorrent, to the point that I hope a theoretical metaphysicist would choose to give me and 3^^^3-1 other humans a dust speck instead of torturing a single human for any length of time. I would even precommit to feeling bad with at least negative utility (dust_speck_in_the_eye - epsilon) if any rational agent could conceivably choose torture over dust specks. If I had to choose whether to torture a human for 50 years or give 3^^^3 people dust specks I would calculate the total utility with torture as lower than with dust specks because I would project my morality onto those 3^^^3 people, all of whom would then be slightly sadder if I were to choose torture than if I were to choose dust specks.

If that is not a rational way to calculate the utility functions of other beings then why shouldn't I choose to torture a human for 50 years to maximize the utility of 3^^^3 paperclip maximizers who might be distracted from perfectly producing paperclips by a dust speck?

I would even precommit to feeling bad with at least negative utility (dust_speck_in_the_eye - epsilon) if any rational agent could conceivably choose torture over dust specks. If I had to choose whether to torture a human for 50 years or give 3^^^3 people dust specks I would calculate the total utility with torture as lower than with dust specks because I would project my morality onto those 3^^^3 people, all of whom would then be slightly sadder if I were to choose torture than if I were to choose dust specks.

If your objective is to make it right to choose 3^^^3 dust specks over torture, this doesn't work. If you count the negative utility of one person knowing that another is suffering, you should still choose torture over 3^^^3 dust specks, because you also have to count the negative utility of 3^^^3 people knowing about 3^^^3 dust specks.

Specifically, suppose the universe contains N+1 people, and we can either torture one person or give N people dust specks. If the disutility of the sadness of one person knowing that another is being tortured is x, then torturing one person has disutility Nx (plus a constant from the bare fact that a person is being tortured, which is irrelevant if N is large enough). Meanwhile, if the disutility of the sadness of one person knowing that another person has a dust speck in their eye is y, then giving one person a dust speck has disutility Ny, and giving N people dust specks has disutility N^2 y (plus an order-N quantity from the bare fact that N people get dust specks, which is irrelevant if N is large enough). No matter what x and y are, N^2 y > Nx if N is large enough. 3^^^3 is gigantically huge, and thus certainly large enough. Thus if N is large enough, the disutility of 3^^^3 dust specks is greater than that of torture.
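For concreteness, here is a minimal numeric sketch of that comparison (in Python, with made-up values for x, y, and the two bare-fact terms; nothing here is anyone's actual utility assignment, and 3^^^3 is replaced by merely large values of N):

```python
# Toy model of the comparison above. x = one person's disutility from knowing
# someone is tortured, y = one person's disutility from knowing someone gets a
# dust speck, T = direct disutility of the torture itself, s = direct disutility
# of one speck. All numbers are invented purely for illustration.

def torture_total(N, x, T):
    # N onlookers are each saddened by the one torture, plus the torture itself.
    return N * x + T

def specks_total(N, y, s):
    # Each of the N people is saddened by all N specks, plus the N specks themselves.
    return N * N * y + N * s

x, y, T, s = 1e6, 1e-9, 1e12, 1e-6   # x vastly larger than y, T vastly larger than s
for N in (10**3, 10**12, 10**20):
    print(N, specks_total(N, y, s) > torture_total(N, x, T))
```

With these particular numbers the N^2 y term only overtakes Nx somewhere around N = x/y = 10^15, so the first two comparisons print False and the last prints True; 3^^^3 is unimaginably larger than any such crossover point.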

To get around this you could try to make x infinite (this seems to be something you are dancing around) but if x is really infinite then you should be doing nothing in your life that is not devoted to preventing torture, or even the slightest possibility that someone might someday be tortured. Everything nonessential should be sacrificed to this end, because a literal infinity of utility is on the line. Or you could say that y = 0, if you literally couldn't care less whether other people get dust specks, in which case you should substitute in place of dust specks some other minor discomfort that you would rather spare other people from experiencing.

Maybe a better attempt is to argue that one person's sadness about other people's dust specks does not increase linearly with the number of dust specks. If there is an upper limit to how sad one person can feel about N dust specks that cannot be exceeded no matter how large N is, then your utility function may recommend dust specks over torture. But I suspect this position has problems. I am going to think about it some.

If that is not a rational way to calculate the utility functions of other beings then why shouldn't I choose to torture a human for 50 years to maximize the utility of 3^^^3 paperclip maximizers who might be distracted from perfectly producing paperclips by a dust speck?

You can rationally give dust specks to 3^^^3 paperclip maximizers instead of torturing one human if your disutility for giving one paperclip maximizer a dust speck is less than or equal to zero. This seems quite compatible with standard human morality, especially if these paperclip maximizers inspire zero empathy or if their paperclip-maximizing is dangerous to humans. But if you would feel even the slightest bit of remorse about giving a paperclip maximizer a dust speck, torture one human instead of multiplying that remorse by a stupefyingly huge number like 3^^^3.

No matter what x and y are, N^2 y > Nx if N is large enough. 3^^^3 is gigantically huge, and thus certainly large enough. Thus if N is large enough, the disutility of 3^^^3 dust specks is greater than that of torture.

Am I not allowed to set my disutility for knowing that another person gets a dust speck to zero? I admit that before I thought about this problem I had no opinion about other people getting dust specks in their eyes or not, but I had a strong opinion against torture. Additionally, even now I would say that the margin for error on my estimation of the disutility of a dust speck in someone else's eye is far greater in magnitude than my estimation of the disutility itself, and therefore zero may be a reasonable choice to avoid unintuitive conclusions. Maybe occasional dust specks would help all those folks staring at computer screens and forgetting to blink, and this would outweigh any annoyance for the rest of humanity, for instance.

There's also the next level of evaluation: z, the utility of knowing that another person would willingly receive a dust speck in their eye to collectively prevent a single person from being tortured. If that utility outweighs the disutility of knowing that another person received a dust speck in their eye, then N^2 z outweighs both of the others. I think I would be happier knowing I was in a society that completely agreed with me about not torturing one person to prevent dust specks from getting in their eyes than I would be if no one received that particular dust speck in their eye. Even if only half of the population thought that way, the ratio between (N/2)^2 y and (N/2)^2 z would be a constant factor, and these terms would still dominate Nx.

This may create a paradox, because now the utility of getting a dust speck in the eye is zero or positive in the world of N individuals. It may not be a paradox if the utility only becomes non-negative as a condition of dealing with specific ethical meta-questions, but that is already playing with the limits of another sort of insanity: if agents are concerned about every possible dilemma of this nature, they may precommit to taking on a lot (potentially unbounded) of disutility without the actual existence of a situation requiring them to experience that disutility to save a real person from torture.
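One way to write that comparison down explicitly (a sketch reusing x, y, and z from above, with s and T standing for the direct disutility of one speck and of the torture; these are the same hypothetical quantities discussed in the parent comments, nothing more official):

```latex
% Sketch using the symbols above: y = disutility of knowing about one speck,
% z = utility of knowing someone willingly accepts a speck to prevent torture,
% x = disutility of knowing about the torture, s = direct disutility of one
% speck, T = direct disutility of the torture itself.
U_{\text{specks}}(N) = N^{2}(z - y) - N s ,
\qquad
U_{\text{torture}}(N) = -(N x + T) .
% If z > y, the N^2 (z - y) term is positive and grows fastest, so the specks
% option comes out ahead of the torture option once N is large enough.
```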

Maybe a better attempt is to argue that one person's sadness about other people's dust specks does not increase linearly with the number of dust specks. If there is an upper limit to how sad one person can feel about N dust specks that cannot be exceeded no matter how large N is, then your utility function may recommend dust specks over torture. But I suspect this position has problems. I am going to think about it some.

The marginal utility of goods would suggest that marginal disutility exists too. Once everyone experiences something, it becomes much more normal than if only one person suffers it. This is, of course, also the reason that not many people are adamant about preventing death from old age, so it's not necessarily good reasoning in general. Equalizing disutility may sound fair, but it is probably not always right.

You can rationally give dust specks to 3^^^3 paperclip maximizers instead of torturing one human if your disutility for giving one paperclip maximizer a dust speck is less than or equal to zero. This seems quite compatible with standard human morality, especially if these paperclip maximizers inspire zero empathy or if their paperclip-maximizing is dangerous to humans. But if you would feel even the slightest bit of remorse about giving a paperclip maximizer a dust speck, torture one human instead of multiplying that remorse by a stupefyingly huge number like 3^^^3.

I think you are correct, and it becomes more apparent if you just form the question as a limit. As N tends toward infinity, do you apply -x utility to N individuals or apply -y utility to 1 individual? The only way to ever choose -x utility to N individuals is if you weight all but a finite number of the N individuals' utility at zero. This may mean we have to weight every individual who doesn't share our exact morality at zero to avoid making our own immoral decisions.
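To make that limit explicit, here is one way to write it down (a sketch, with hypothetical per-individual weights w_i that are not part of the original comment):

```latex
% Sketch of the limit argument. w_i is a hypothetical weight placed on the i-th
% individual's disutility; x and y are the per-individual disutilities from the
% comment above (-x to each of N individuals versus -y to one individual).
D_{\text{many}}(N) = \sum_{i=1}^{N} w_i \, x ,
\qquad
D_{\text{one}} = y .
% If x > 0 and infinitely many of the w_i stay above some \varepsilon > 0, then
% D_{\text{many}}(N) \to \infty as N \to \infty and eventually exceeds D_{\text{one}}.
% Preferring the "many" option for every N therefore requires the weighted sum to
% stay bounded, e.g. by setting w_i = 0 for all but finitely many i.
```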

But why should what Eliezer says is e-right converge to what I think is m-right? (given that both e-right and m-right are permitted to change over time, and that both of us prefer a universe in which the other is right)

A key feature of 'rightness' in societies is that societies which follow a concept of 'rightness' that results in their continued existence and increased influence have continued existence and increased influence. Therefore the existing and influential societies likely have moral values which encourage their existence and influence. That seems circular, but it is also the explanation for why p-morality is not exemplified; p-morality does not make a society more influential or more likely to continue to exist. Similarly, values which promote the survival and proliferation of those who hold them become more popular over time.

Should continued existence and increased influence be the (?)-right measure of the (?)-quality of x-morality?