I'll admit that I'm using the LessWrong board to try to figure out flaws in my own philosophical ideas. I should also make a disclaimer that I do not dispute the usefulness of Eliezer's ideas for the purposes of building a Friendly AI.

My criticisms are designed for other purposes - namely, that contrary to what I am led to believe most of this site believes, Eliezer's metaethics does not work for solving ethical dilemmas except as a set of arbitrary rules, and is in no way the stand-out best choice compared to any other self-consistent deontological or consequentialist system.

I'll also admit that I have something of a bias, for those looking in - I find it an interesting intellectual challenge to look through philosophies and find weak points in them, so I may have been over-eager to find a flaw that doesn't exist. I have been attempting to find an appropriate flaw for some time, as some of my posts may have foreshadowed.

Finally, I will note that I am attempting to confine my attacks to Eliezer's ethics, despite its connections to Eliezer's epistemology.

---------------------------------------

1: My Basic Argument

Typically, people ask two things of ethics - a reason to be ethical in the first place, and a way to resolve ethical dilemmas. Eliezer gets around the former by, effectively, appealing to the fact that people want to be moral even if there is no universally compelling argument.

The problem with Eliezer's metaethics is based around what I call the A-case, after the character I invented for it the first time I thought up this idea. A has two options: Option 1 is the best choice from a Consequentialist perspective, and A is smart enough to figure that out. However, following Option 1 would make A feel very guilty for some reason (a feeling which A cannot overcome merely by thinking about it), whereas Option 2 would feel morally right on an emotive level.

This, of course, implies that A is not greatly influenced by consequentialism - but that's quite plausible. Perhaps you have to be irrational to be an intelligent non-consequentialist, but an irrational non-consequentialist smart enough to perform a utility calculation as a theoretical exercise is plausible.

How can we say that the right thing for A to do is Option 1, in such a way as to be both rational and in any way convincing to A? From the premises, it is likely that any possible argument will be rejected by A in such a manner that you can't claim A is being irrational.

This can also be used against any particular deontological code - in fact more effectively, due to greater plausibility - by substituting the code for Consequentialism in the scenario and claiming that, according to said code, Option 1 is A's moral duty. You can define "should" all you like, but A is using a different definition of "should" (not part of the opening scenario, but a safe inference except for a few unusual philosophers). You are talking about two different things.

-----------------------------------------------------------------

2: Addressing Counterarguments

i:

It could be argued that A has a rightness function which, on reflection, will lead A to embrace consequentialism as best for humanity as a whole. This is, however, not necessarily correct - to use an extreme case, what if A is being asked to kill A's own innocent lover, or her own baby? ("Her" because it's likely a much stronger intuition that way.) Some people in A's position have said rightness functions - it is easily possible that A does not.

In addition, a follower of LessWrong morality in its standard form has a dilemma here. If you say that A is still morally obliged to kill her own baby, then Eliezer's own arguments can be turned against you - still pulling a child off the train tracks regardless of any 'objective' right. If you say she isn't, you've conceded the case.

A deontological theory is either founded on intuitions or not. If not, Hume's is-ought distinction refutes it. If it is, then it faces similar dilemmas in scenarios like this. Intuitions, however, do not add up to a logically consistent philosophy - "moral luck" (the idea that a person can be more or less morally responsible based on factors outside their control) feels like an oxymoron at first, but many intuitions depend on it.

ii:

One possible counterargument is that A wants to do things in the world, and that merely following A's feelings turns A into a morality pump, taking actions which don't make sense. However, there are several problems with this.

i- A's actions probably make sense from the perspective of "Make A feel morally justified". A can't self-modify (at least not directly), after all.

ii- Depending on the strengths of the emotions, A does not necessarily care even if A is aware of the inconsistencies in their actions. There are plenty of possible cases - a person dealing with those with whom they have close emotional ties, biases related to race or physical attractiveness, condemning large numbers of innocents to death, etc.

iii:

A final counterargument would be that the way to solve this is through a Coherentist-style Reflective Equilibrium. Even if Coherentism is not epistemically true, by treating intuitions as if it were true and following the Coherentist philosophy, the result could feel satisfying. The problem is - what if it doesn't? If a person's emotions are strong enough, no amount of Reflective Equilibrium is strong enough to contradict them.

If you take an emotivist position, however, you have the problem that Emotivism has no solution when feelings contradict each other.

------------------------------------------------------------------

3: Conclusions

My contention here is that we have a serious problem. The concept of right and wrong is like the concept of personal identity - merely something to be abolished for a more accurate view of what exists. It can be replaced with "Wants" (for people who lack a unified moral system and simply have various feelings), "Moralities" (systematic moral codes which are internally coherent), and "Pseudo-Moralities", with no objective morality existing even in the Yudkowskyite sense.

A delusion of morality exists in most human minds, of course - just as a delusion of personal identity exists in most if not all human minds. "Moralities" can still exist in terms of groups of entities who all want similar things or agree on basic moral rules, which can be taken to their logical conclusions.

Why can that not lead to morality? It can, but if you accept a morality on that basis, it implies that rational argument (as opposed to emotional argument, which is a different matter) is in many cases entirely impossible with humans with different moralities, just as it is with aliens.

This leaves two types of rational argument possible about ethical questions:

-Demonstrating that a person would want something different if they knew all the facts - whether facts such as "God doesn't exist", facts such as "This action won't have the consequences you think it will", or facts about the human psyche.

-Showing that a person's Morality has internal inconsistencies, which in most people will mean they discard it. (With mere moral Wants this is more debatable.)

Arguably it also leads to a third - demonstrating to a person that they do not really want what they think they want. However, this is a philosophical can of worms which I don't want to open up (metaphorically speaking), because it is highly complicated (I can think of plenty of arguments against the possibility of such, even if I am not so convinced they are true as to assert it) and because solving it does not contribute much to the main issue.

Eliezer's morality cannot even work out on that basis, however. In any scenario where an individual B:

i- Acts against Eliezer's moral code

ii- Feels morally right about doing so, and would have felt guilty for following Eliezer's ideas

Then B can argue against somebody trying to use Eliezer's ideas against them by pointing out that, regardless of any Objective Morality, Eliezer himself still makes a good case for dragging children off train tracks.

I will not delve into what proportion of humans can be said to make up a single Morality due to having basically similar premises and intuitions. Although there are reasons to doubt it is as large as you'd think (take the A case), I'm not sure if it would work.

In conclusion - there is no Universally Compelling Argument amongst humans, or even amongst rational humans.

------------------------------------------------------------------

Eliezer's metaethics does not work for solving ethical dilemmas...

Nobody's metaethics works for solving ethical dilemmas. That's not the purpose of metaethics. Metaethics is about the status of moral claims, whether they're objectively true, knowable, etc. But for actually figuring out what to do, you want normative / applied ethics.

To the best of my knowledge, Eliezer has never tried to give a complete account of his views on normative ethics. In fact, I suspect he doesn't have a completely worked-out theory of normative ethics, since the point of Coherent Extrapolated Volition is that it attempts to give you a way of programming ethics into an AI without having a complete theory of normative ethics.

While Eliezer calls himself a "consequentialist," he often seems not to be a stereotypical consequentialist a la Peter Singer. The closest thing to a detailed account of his normative ethics I've seen him give is here. Notably, he rejects the claim "that virtuous actions always correspond to maximizing expected utility under some utility function."

The answers to ethical dilemmas are factual questions (in case you see a problem here, I'm a moral error theorist about conventional morality). Metaethics represents the reason to believe that there are factual answers at all. Therefore no means of solving ethical dilemmas can work without metaethics, and any means that does work must be based on a metaethic.

In my post, I argue that Eliezer's metaethics cannot lead to a system of normative ethics that actually solves ethical dilemmas, as it falls apart whenever confronted by a scenario where philosophical reasoning and emotive conclusions contradict each other. The A example was intended to show this.

Eliezer has minor exceptions to consequentialism, but is a consequentialist for most practical purposes. Hence:

"Where music is concerned, I care about the journey.

When lives are at stake, I shut up and multiply."

However, the arguments he uses in "The Moral Void" can be used against his own consequentialism whenever an actor will feel much more guilty for acting in the consequentialist manner than against it, because his basic metaethics involves humans being moral not because of an objective reason in the universe but because they want to be.

If the previous link didn't convince you, I'd also recommend Can't Unbirth A Child, as an example of Eliezer not fitting the stereotype of a consequentialist.

I may have to adjust my position on the question of whether Eliezer is in minor or major ways a non-Consequentialist, but it still isn't relevant, as my initial argument against him still applies.

Your argument seems to boil down to:

"There are possible agents with moralities that prohibit them from taking actions which lead to futures they morally prefer"

Am I understanding you?

I fail to see the relation between this argument and the metaethics sequence. I agree that this is an annoying thing to discover about your own ethics, and that rational agents should strive to be self-modifying. Further, I agree that it's possible to design an agent with a morality that has no stable reflective equilibrium, so that even self-modification will not allow the agent to satisfy itself. Such an agent would be eternally frustrated. Let's hope that human morality isn't pathological.

If our morality is unsatisfiable, we'll never be satisfied. That seems tautological to me. Perhaps I don't understand the point you're trying to make.

It's broader than that - different people have moral instincts which lead towards different conclusions, to the point that simplifying them down to any form of consequentialism or even a common code of deontological morality is not tenable.

As for your claim that rational agents should strive to be self-modifying in a case such as this, consider an individual who has the choice to wirehead or not but who is otherwise normal. Wireheading will get them a far greater feeling of satisfaction along with the other benefits, but would require them to violate one of their own values. Should they self-modify to accept wireheading?

As a means to construct a friendly AI, Eliezer's metaethics does its job as well as can be expected. As a guide to how a human can behave morally, it fails for reasons I was trying to demonstrate. I don't know if human morality is pathological or not, but that was not the point I was intending to discuss as my argument works either way.

different people have moral instincts which lead towards different conclusions, to the point that simplifying them down to any form of consequentialism or even a common code of deontological morality is not tenable.

Perhaps, but this has nothing to do with metaethics. The argument that humans have similar (though not identical) ethics is made elsewhere (some of the evolution posts might touch upon it?), and is by no means conclusive.

Wireheading will get them a far greater feeling of satisfaction along with the other benefits, but would require them to violate one of their own values. Should they self-modify to accept wireheading?

Probably not. But this has nothing to do with metaethics. The arguments you are making are not meta-ethical.

The metaethics sequence, if I may torture it down to a few bullets, makes the following claims:

  • Morality is not objective; it is not written in the stars nor handed down by the gods.
  • Morality is not subjective; it is not "that which you decide is right".
  • Your morality is given to you. It is built into you by your history, your environment, your construction.
  • You don't get to choose what ethics you have -- you choose only how to adhere to them.

If these claims all seem somewhat obvious to you, then the metaethics sequence has served its purpose. For many people, these ideas are quite foreign. You're arguing something else entirely -- you're wondering how to choose to adhere to your given ethics, and whether all humans have similar ethics. These discussions could be interesting, but they have nothing to do with metaethics.

We can have those talks if you want, but you're going to meet some resistance if you try to make non-metaethical arguments in a post claiming to critique the metaethics sequence.

Out of curiosity, take the original, basic A case on its own. What rational argument, metaethical or otherwise, can you see for making A act in a consequentialist manner? Assume for the sake of argument that whilst getting A to act in such a manner is realistic, making him not feel guilty isn't, so you're trying to get him to act despite his guilt.

A- The fact that different human beings have different values at a core level implies that a single, unified human ethical theory is impossible. Eliezer, at the very least, has provided no reasonable argument for why the differences between human feelings should be skipped over. Even if he has, I don't see how he can reconcile this with "The Moral Void".

I'll repeat again- Eliezer has a problem with any case where he wants a person to act against their moral feelings of right and wrong.

B- The wireheading argument is a refutation of your argument that humans should self-modify to eliminate what you see as problematic parts of their metaethics. I am comparing such things to wireheading.


As for your claims on the metaethics, Point 1 is so obvious that, although some people don't know it, it is hardly worth mentioning in serious philosophical discussion. Point 3 I agree with. Point 4 is technically correct, but if we define "choose" in the sense used on LessWrong instead of the sense used by those who believe in free will, then it is safe to say that humans can choose to a small extent. Point 2 is correct in that we don't spontaneously decide to have an ethics and then start following it, but you may mean something else by this.

However, what you miss is that part of Eliezer's system of metaethics is the implicit assumption that "ethics" is a field in which ethics for all human beings can be talked about without trouble. Both this assumption and the assumption that humans want to be moral must hold for his metaethics to work. I demonstrate that no deontological, consequentialist, or virtue-ethical system (although virtue ethics is far more bunk than deontology or consequentialism for a variety of reasons, so I only mention it in case of nitpicking) is compatible with the basic reason to be moral that Eliezer has at the core of his system - which amounts to the idea that humans WANT to be moral - if he is also going to assume that there is a universal ethics at the same time.

This is why I contend my meta-ethical system is better. As a description of how humans see ethical claims (which is generally as a simplistic, unreflective idea of Right or Wrong) it doesn't work, but it fulfills the more important role of describing how the prescriptive parts of ethics actually work, to the extent that a prescriptive theory of ethics as a question of fact can be coherent.

We seem to be talking past each other. I'm not entirely sure where the misunderstanding lies, but I'll give it one more shot.

Nobody's arguing for consequentialism. Nobody's saying that agent A "should" do the thing that makes A feel guilty. Nobody's saying that A should self-modify to remove the guilt.

You seem to have misconstrued my claim that rational agents should strive to be self-modifying. I made no claim that agents "should" self-modify to eliminate "problematic parts of their metaethics". Rather, I point out that many agents will find themselves inconsistent and can often benefit from making themselves consistent. Note that I explicitly acknowledge the existence of agents whose values prevent them from making themselves consistent, and acknowledge that such agents will be frustrated.

All of this seems obvious. Nobody's trying to convince you otherwise. It's still not metaethics.

what you miss is that part of Eliezer's system of metaethics is the implicit assumption that "ethics" is a field in which ethics for all human beings can be talked about without trouble

Perhaps this is the root of the misunderstanding. I posit that the metaethics sequence makes no such assumption, and that you are fighting a phantom.

Other people on this website seem to think I'm not fighting a phantom and that the Metaethics sequence really does talk of an ethics universal to almost all humans, with psychopaths being a rare exception.

One of my attacks on Eliezer is for inconsistency - he argues for consequentialism and a Metaethics of which the logical conclusion is deontology.

How can you describe somebody as "benefiting" unless you define the set of values from whose perspective they benefit? If it is their own, this is probably not correct. Besides, inconsistency is a kind of problematic metaethics.

And how is it not metaethics?

Other people on this website seem to think I'm not fighting a phantom

Feel free to take it up with them :-)

And how is it not metaethics?

Metaethics is about the status of moral claims. It's about where "should", "good", "right", "wrong" etc. come from, their validity, and so on. What a person should do in any given scenario (as in your questions above) is pure ethics.

The absence of a universally compelling argument is not a unique feature of ethics. Among what are nearly universally considered objective matters (though what's objective is itself something people seem unable to convince one another of), there are, to be sure, some that nearly everybody will agree about. But there are many that people disagree about, and where they show great resistance to being convinced by any rational argument. And, of course, to get universal agreement on anything we need to exclude some people as insane or irrational; why is it suddenly unacceptable when the defender of objective ethics excludes some disagreements as involving irrationality? I don't see that you've given any compelling reason here to classify ethics as non-objective in preference to the alternative of grouping it with the more difficult and controversial of objective matters. I'm not saying that there are no such reasons, but if there are any, they aren't in this post.

If you wish to redefine sanity or rationality to include a set of goals or moral principles, it is possible. However, there is the question of why this should be done - an agent can be highly rational in achieving their goals and simply not share the same set of morals. Why should this be called insane or irrational?

An agent which disagrees on questions of fact (assuming they're wrong) will constantly face contradictions between the facts and their beliefs whenever the issue becomes relevant.

An agent which disagrees on questions of morals, by contrast, can be fully aware of any "contradictory" fact and just not care. The prototypical example would be a highly rational psychopath - they know they're causing massive harm with their actions, but consider this irrelevant.

"Whenever the issue becomes relevant." What if the person with the factual disagreement just considers the conflicting facts irrelevant? How is that different from what the psychopath does? Why should we say the psychopath is any more rational than the one who rejects what are less controversially facts?

I realize that it's not much of an argument for classifying the psychopath as irrational that we do the same to some other classes of people who stubbornly disagree with us. But so far as I can see, you have given no argument at all for doing otherwise; you've merely pointed out that doing otherwise is an option. Why should we consider it the better option?

Because when we test the issue, the person with conflicting facts can be found to be wrong. Eliezer puts it nicely (for this sort of thing - not for other things) when he says that thinking I can fly won't save me from falling off a cliff.

You could attempt to parallel this issue by claiming that a psychopath thinking their actions are right or wrong doesn't change the facts of the matter. If so, however, your definition of "rational" conflicts with the ordinary conception of what the word means, which was my point to begin with.

Pragmatically, the reason I tend to use is that "rational" represents effective thinking for determining the true state of affairs or attaining goals and thus should not be confused with "moral", whatever that means. "Rational" by my definition is an important category of thought.

From an epistemic perspective, the problem is how 'ought' facts can be established from 'is' facts in such a way as to have an answer to those who cite Eliezer's argument in The Moral Void for following their intuitions instead. If you reduce them to, say, facts about the well-being of conscious creatures, you face the dilemma of those who don't care about that and instead care about something else. If you drop the prescriptive element of 'ought', why even use the word?

But most factual matters aren't like thinking you can fly. It is not in dispute that ethical issues are not among the easy, obvious matters of objective fact.

As for your appeal to the ordinary conception of what the word means, you certainly will not find anything close to universal agreement that psychopaths are sane, or that immoral behavior is rational. You also make a passing reference to your definition of "rational;" you have certainly made it clear that on your definition the norms of rationality do not include any of what are normally classified as the moral norms, but what you persistently fail to do is give us any reason for thinking your way of selecting the category of the rational is superior to the alternatives. Plato and Kant thought immoral behavior was irrational, and they've both had a good deal of influence on what people mean by both "moral" and "rational." Are you so confident that more people agree with you than with them? And, more importantly, regardless of who wins the popularity contest, what's wrong with their way of categorizing things? Why is yours preferable?

The dispute between Einstein's physics and Newton's physics is an example of a difficult factual matter. Said dispute has had drastic implications for modern technology.

I am not discussing the ordinary definition of what "sane" means but the ordinary definition of what "rational" means. I then appealed to the case of the psychopath who discerns matters of fact better than an ordinary person to argue that such a person better fits the ordinary sense of "rational".

On matters of particulars, I am a Nominalist, which is why I discuss the popularity issue. The example I think Yvain gave, when he considered an economics lecture and concluded by stating it was his theory of the atom, illustrates why you should not stray too far from ordinary definitions of a word.

I've already given reasons for why it's pragmatic to classify "rational" and "moral" separately for most purposes, and an epistemic argument. Perhaps I edited them in too late and you missed them, in which case I apologise.

I'm not sure I can extract anything from your post that looks like a pragmatic argument. An epistemic argument is presumably one which maintains that we should not believe in objective morality because we know of no knowledge-producing mechanism which would give us access to objective moral facts. I may have misinterpreted you in thinking that your epistemic argument was based on the argument from disagreement, that you thought there couldn't be any such mechanism because if there were it wouldn't produce such conflicting results in different people. If you intended that, I think I've explained above why I think that's unconvincing. The Einstein/Newton case is once again not an especially helpful analogy; one thing which stands out about that case is just how much evidence is relevant. Again, even among uncontroversially objective facts, many are not like that.

But I can see signs that you might have instead, or perhaps additionally, wished to argue that there is no means of generating moral knowledge just because we don't know the details of how any such means would operate. In that case again worrisome analogies are plentiful; there are few cases of objective knowledge where we have a really detailed story to tell about how that knowledge is acquired, and the cases where we don't seem to have much to say at all include logic and mathematics. So the fact that we don't know how we could acquire a certain kind of knowledge does not seem a decisive reason for denying that we have it.

By pragmatic, I meant pragmatic in the ordinary sense of the word. Since the meaning of a word is not set in stone, it should be made to serve a purpose effectively - hence my appeal to that sort of pragmatism.

As for my epistemic argument, see The Moral Void. Any rational argument to demonstrate that something is the "right" thing to do is comparable to Eliezer's argument for killing babies, whenever doing the "right" thing feels like a moral wrong on the emotive level. This is a new clarification of what I was trying to say earlier.

For the Einstein/Newton case you can substitute any case where there is a scientific test which could, in principle, determine a result one way or the other. This is not true in ethics - although Moore's Open Question argument is flawed, it does demonstrate that determining a proper philosophical criterion for what "should" means is necessary before discussing it. Any means of doing so must be philosophical by nature.

Just as the absence of evidence means that we assume unicorns don't exist by default, so the absence of evidence means we assume a way to establish 'ought' from 'is' does not exist by default.

Couple editorial notes on the intro:

My criticisms are designed for other purposes- namely, ...

A dash should either have spaces on both sides (like - this) or be twice the length and have spaces on neither side (like--this).

... contrary to what I am led to believe most of this site believes ...

If you mean this to say what I think you mean, then you need a comma after "believe." Otherwise it sounds like you're saying that you were led to have wrong beliefs about the beliefs of other members of the site. Also I sort of bristle at beliefs attributed to the site itself (rather than the people who participate on the site), but that may just be me.

Elizier's ethics

*Eliezer's

There is a private messaging feature. It seems ideally suited to this sort of thing.

It seems like you claim Eliezer would advocate CDT but Eliezer advocates TDT.

In TDT it can certainly be plausible that it's good to be an agent that wouldn't kill her own child.

Actually, I was merely simplifying. The original A-case was designed to illustrate the problem, whilst the "Kill your own baby" case was designed to illustrate a case where the consequences would be emotively irrelevant to feelings of guilt or justification.

I understand it as: "If you had the knowledge of all facts in the universe, and unlimited intelligence and rationality to see all the connections, and you would have the same moral feelings as you have now... after reflection, what would you consider right?"

This can be attacked as:

a) Speaking about "the same moral feelings" of a person with universal knowledge and unlimited intelligence somehow does not make sense, because it's incoherent for some reasons. I am not sure how specifically; just leaving it here as an option.

b) The results of the reflection are ambiguously defined, for example the result may strongly depend on the order of resolving conflicts. If some values are in mutual conflict, there are multiple ways to choose an internally consistent subset; it may not be obvious which of the subsets fits the original set better. (And this is why different humans would choose different subsets.)

c) Different humans could get very different results because their initial small differences could be hugely amplified by the process of finding a reflective equilibrium. Even if there is an algorithm for choosing among values A and B which is not sensitive to the order of resolving conflicts, the values may be in almost perfect balance, so that for different people a different one would win.

In short: x-rational morality is a) ill-defined; or b) possible but ambiguous; or c) very different for different people.
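As a minimal sketch of the order-dependence in point (b), purely for illustration (the value names, the conflict pairs, and the greedy procedure are all hypothetical, not anything taken from the sequences): a rule that keeps each value only if it doesn't conflict with the values already kept can settle into different internally consistent subsets depending on the order in which the values are considered.

    # Toy illustration only: hypothetical value names and conflict pairs,
    # not anything taken from the original discussion.
    from itertools import permutations

    values = ["loyalty", "honesty", "mercy", "justice"]
    conflicts = {("loyalty", "honesty"), ("mercy", "justice")}  # pairs that cannot both be kept

    def in_conflict(a, b):
        return (a, b) in conflicts or (b, a) in conflicts

    def greedy_consistent_subset(ordering):
        # Keep each value only if it does not conflict with anything already kept.
        kept = []
        for v in ordering:
            if all(not in_conflict(v, k) for k in kept):
                kept.append(v)
        return frozenset(kept)

    # The same values, considered in different orders, settle into different
    # internally consistent subsets - the ambiguity described in point (b).
    outcomes = {greedy_consistent_subset(p) for p in permutations(values)}
    for subset in sorted(map(sorted, outcomes)):
        print(subset)
    # e.g. ['honesty', 'justice'], ['honesty', 'mercy'], ['loyalty', 'justice'], ['loyalty', 'mercy']

Nothing in the procedure privileges one outcome over another, which is the sense in which "which subset fits the original set better" is left ambiguous.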

Seems to me like you use a variant of the first option (and then somehow change it to the third one at the end of the article), saying more or less that a morality based on extrapolation and omniscience may feel completely immoral, or in other words that our morality objects to being extrapolated too far; that there is a contradiction between "human-like morality" and "reflectively consistent morality".

Although I knew that Eliezer considered CEV to be an important part of his morality, I was dodging that aspect and focusing on the practical recommendations he makes. The application of CEV to the argument does not really change the facts of my argument - not only could a post-CEV A still have the same problems I describe, but a pre-CEV A could discern a post-CEV A's conclusions well enough in a simple case and not care.

However, your summary at the end is close enough to work with. I don't mind working with that as "my argument" and going from there.

Typically, people ask two things of ethics - a reason to be ethical in the first place, and a way to resolve ethical dilemmas.

A biological perspective on ethics considers it to be:

  1. A personal guide regarding how to behave;
  2. A guide to others' expectations of you;
  3. A set of tools for manipulating others;
  4. A means of signalling goodness and affiliations;
  5. A set of memes that propagate at their hosts' expense.

Ethical philosophers tend to be especially hot on point 4.

I'm discussing prescriptive ethics here, not descriptive ethics.

Ethical behaviour is part of the subject matter of biology. If you exclude the science involved, there's not much left that's worth discussing.

In his Metaethics, Eliezer is not looking for a description of how human ethics works (although he doubtless has views on the matter), but for an argument for why individuals "should" follow ethical behaviour, for some value of "should". Hence, my argument against him revolves around such matters.

Reducing "a biological perspective on ethics" to "a description of how human ethics works" doesn't seem quite right to me. Naturalistic ethics isn't just concerned with the "how" of human morality. Things like "why" questions, shared other-oriented behaviours, social insect cooperation and chimpanzees are absolutely on the table.

Your original description was one which only makes sense regarding descriptive ethics - a topic on which I accept the validity of your description. Prescriptive ethics, by contrast, is best described by my original description.