Matt_Simpson comments on What is Eliezer Yudkowsky's meta-ethical theory? - Less Wrong

Post author: lukeprog 29 January 2011 07:58PM


Comment author: Matt_Simpson 29 January 2011 10:36:31PM *  10 points [-]

In a nutshell, Eliezer's metaethics says you should maximize your preferences whatever they may be, or rather, you should_you maximize your preferences, but of course you should_me maximize my preferences. (Note that I said preferences and not utility function. There is no assumption that your preferences HAVE to be a utility function, or at least I don't think so. Eliezer might have a different view). So ethics is reduced to decision theory. In addition, according to Eliezer, humans have tremendous value uncertainty. That is, we don't really know what our terminal values are, so we don't really know what we should be maximizing. The last part, and the most controversial around here I think, is that Eliezer thinks that human preferences are similar enough across humans that it makes sense to think about should_human.

There are some further details, but that's the nutshell description. The big break from many philosophers, I think, is considering one's own preferences the foundation of ethics. But really, this is in Hume (on one interpretation).

edit: I should add that the language I'm using to describe EY's theory is NOT the language that he uses himself. Some people find my language more enlightening (me, for one), others find EY's more enlightening. Your mileage may vary.

Comment author: wedrifid 30 January 2011 04:41:58AM 7 points [-]

In a nutshell, Eliezer's metaethics says you should maximize your preferences whatever they may be, or rather, you should_you maximize your preferences, but of course you should_me maximize my preferences. (Note that I said preferences and not utility function.

Eliezer is a bit more aggressive in the use of 'should'. What you are describing as should<matt> Eliezer has declared to be would_want<matt>, while 'should' is implicitly would_want<Eliezer>, with no allowance for generic instantiation. That is, he is comfortable answering "What should a Paperclip Maximiser do when faced with Newcomb's problem?" with "Rewrite itself to be an FAI".

There have been rather extended (and somewhat critical) discussions in comment threads of Eliezer's slightly idiosyncratic usage of 'should' and related terminology but I can't recall where. I know it was in a thread not directly related to the subject!

Comment author: Matt_Simpson 30 January 2011 08:54:28PM 3 points [-]

You're right about Eliezer's semantics. Count me as one of those who thought his terminology was confusing, which is why I don't use it when I try to describe the theory to anyone else.

Comment author: lessdazed 02 July 2011 05:30:09PM *  0 points [-]

Are you sure? I thought "should" could mean would_want<being with aggregated/weighted [somehow] desires of all humanity>. Note I could follow this by saying "That is, he is comfortable answering "What should a Paperclip Maximiser do when faced with Newcomb's problem?" with "Rewrite itself to be an FAI".", but that would be affirming the consequent ;-), i.e. I know he says such a thing, but my and your formulation both plausibly explain it, as far as I know.

Comment author: Raemon 30 January 2011 02:45:37AM 2 points [-]

I had a hard time parsing "you should_you maximize your preferences, but of course you should_me maximize my preferences." Can someone break that down without jargon and/or explain how the "should_x" jargon works?

Comment author: Broggly 30 January 2011 04:09:37AM 1 point [-]

I think the difficulty is that in English "You" is used for "A hypothetical person". In German they use the word "Man" which is completely distinct from "Du". It might be easier to parse as "Man should_Raemon maximize Raemon's preferences, but of course man should_Matt maximize Matt's preferences."

On the jargon itself, Should_X means "Should, as X would understand it".

Comment author: XiXiDu 30 January 2011 12:59:59PM 2 points [-]

"Man" is the generalization of the personal subject. You can translate it with "one".

Comment author: NihilCredo 30 January 2011 06:50:25AM *  1 point [-]

I think it's better phrased by putting "Man" in place of all instances of "Raemon".

Also: \ is the escape character on LW, so if you want to type an actual asterisk or underscore (or \ itself), instead of using it for formatting purposes, put a \ in front of it. This way they will not be interpreted as marking lists, italics, or bold.

Comment author: wedrifid 30 January 2011 08:26:13AM 0 points [-]

I think it's better phrased by putting "Man" in place of all instances of "Raemon".

Hang on, is that Raemon's preferences we're talking about or....

Comment author: ata 30 January 2011 12:25:53AM *  2 points [-]

(Note that I said preferences and not utility function. There is no assumption that your preferences HAVE to be a utility function, or at least I don't think so. Eliezer might have a different view).

Your preferences are a utility function if they're consistent, but if you're a human, they aren't.
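
To make the consistency condition concrete: one thing "consistent" requires is a transitive strict preference, and a preference cycle rules out any utility-function representation. A minimal sketch (the option names and the cyclic example are illustrative assumptions, not from the thread):

```python
# Hypothetical sketch: checking whether pairwise preferences could come
# from any utility function. For a finite option set, representability
# requires (at least) transitivity, so a strict-preference cycle rules
# it out. Names here are invented for illustration.

def has_preference_cycle(prefers):
    """prefers: dict mapping option -> set of options it strictly beats.
    Returns True if the preference graph contains a cycle, in which case
    no utility function can represent these preferences."""
    def visit(node, stack, done):
        if node in stack:
            return True
        if node in done:
            return False
        stack.add(node)
        if any(visit(nxt, stack, done) for nxt in prefers.get(node, ())):
            return True
        stack.remove(node)
        done.add(node)
        return False
    return any(visit(opt, set(), set()) for opt in prefers)

# A human-like intransitive triple: A over B, B over C, C over A.
human_like = {"A": {"B"}, "B": {"C"}, "C": {"A"}}
print(has_preference_cycle(human_like))  # True -> no utility function fits
```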

Comment author: Vladimir_Nesov 30 January 2011 04:06:55AM *  -1 points [-]

Consistent in what sense? A utility function over what domain? Under what prior? In this context these are unjustified assumptions, although understandably traditional, to the point where objecting seems weird.

Comment author: lukeprog 29 January 2011 11:23:14PM *  2 points [-]

I'd appreciate clarification on what you mean by "You should_me maximize my preferences."

I understand that the "objective" part is that we could both come to agree on the value of should_you and the value of should_me, but what do you mean when you say that I should_MattSimpson maximize your preferences?

I certainly balk at the suggestion that there is a should_human, but I'd need to understand Eliezer in more detail on that point.

And yes, if one's own preferences are the foundation of ethics, most philosophers would simply call this subject matter practical rationality rather than morality. "Morality" is usually thought to be a term that refers to norms with a broader foundation and perhaps even "universal bindingness" or something. On this point, Eliezer just has an unusual way of carving up concept space that will confuse many people. (And this is coming from someone who rejects the standard analytic process of "conceptual analysis", and is quite open to redefining terms to make them more useful and match the world more cleanly.)

Also, even if you think that the only reasons for action that exist come from relations between preferences and states of affairs, there are still ways to see morality as a system of hypothetical imperatives that is "broader" (and therefore may fit common use of the term "morality" better) than Eliezer's meta-ethical theory. See for example Peter Railton or 1980s Philippa Foot or, well, Alonzo Fyfe and Luke Muehlhauser.

We already have a term that matches Eliezer's use of "ought" and "should" quite nicely: it's called the "prudential ought." The term "moral ought" is usually applied to a different location in concept space, whether or not it successfully refers.

Anyway, are my remarks connecting with Eliezer's actual stated position, do you think?

Comment author: Matt_Simpson 29 January 2011 11:56:24PM 3 points [-]

but what do you mean when you say that I should_MattSimpson maximize your preferences?

I mean that according to my preferences, you, me, and everyone else should maximize them. If you ask what should_MattSimpson be done, the short answer is maximize my preferences. Similarly, if you ask what should_lukeprog be done, the short answer is to maximize your preferences. It doesn't matter who does the asking. If you ask what should_agent be done, you should maximize agent's preferences. There is no "should," only should_agent's. (Note, Eliezer calls should_human "should." I think it's an error of terminology, personally. It obscures his position somewhat).
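
To make the indexing concrete, here is a minimal sketch (the agent names, outcomes, and rankings are illustrative assumptions, not anyone's actual formulation): the evaluator is fixed by the subscript, and the asker never enters into it.

```python
# Sketch of the should_X indexing. Preferences are modeled as a ranking
# over outcomes only for simplicity; nothing here requires them to be a
# utility function. "MurderFan" is a hypothetical agent who prefers
# murder, as in the wrong_x discussion downthread.

preference_rank = {
    "MattSimpson": {"murder": 0, "no_murder": 1},  # higher = more preferred
    "MurderFan":   {"murder": 1, "no_murder": 0},
}

def should(index, options):
    """Return what should_<index> be done: the option ranked highest by
    <index>'s preferences. Note that the asker never appears here."""
    return max(options, key=lambda o: preference_rank[index][o])

options = ["murder", "no_murder"]
print(should("MattSimpson", options))  # 'no_murder', whoever is asking
print(should("MurderFan", options))    # 'murder', whoever is asking
```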

We already have a term that matches Eliezer's use of "ought" and "should" quite nicely: it's called the "prudential ought." The term "moral ought" is usually applied to a different location in concept space, whether or not it successfully refers.

Then Eliezer's position is that all normativity is prudential normativity, but without the pop-culture connotations that come with this position. In other words, this doesn't mean you can "do whatever you want." You probably do, in fact, value other people - you're a human, after all. So murdering them is not ok, even if you know you can get away with it. (Note that this last conclusion might be salvageable even if there is no should_human.)

As for why Eliezer (and others here) think there is a should_human (or that human values are similar enough to talk about such a thing), the essence of the argument rests on ev-psych, but I don't know the details beyond "ev-psych suggests that our minds would be very similar."

Comment author: lukeprog 30 January 2011 12:02:15AM *  2 points [-]

Okay, that makes sense.

Does Eliezer claim that murder is wrong for every agent? I find it highly likely that in certain cases, an agent's murder of some person will best satisfy that agent's preferences.

Comment author: Matt_Simpson 30 January 2011 09:02:03PM 2 points [-]

Murder is certainly not wrong_x for every agent x - we can think of an agent with a preference for people being murdered, even itself. However, it is almost always wrong_MattSimpson and (hopefully!) almost always wrong_lukeprog. So it depends on which question you are asking. If you're asking "is murder wrong_human for every agent?" Eliezer would say yes. If you're asking "is murder wrong_x for every agent x?" Eliezer would say no.

(I realize it was clear to both you and me which of the two you were asking, but for the benefit of confused readers, I made sure everything was clear)

Comment author: TheOtherDave 30 January 2011 09:06:22PM *  3 points [-]

I would be very surprised if EY gave those answers to those questions.

It seems pretty fundamental to his view of morality that asking about "wrong_human" and "wrong_x" is an important misstep.

Maybe murder isn't always wrong, but it certainly doesn't depend (on EY's view, as I understand it) on the existence of an agent with a preference for people being murdered (or the absence of such an agent).

Comment author: Matt_Simpson 30 January 2011 09:20:12PM *  2 points [-]

Maybe murder isn't always wrong, but it certainly doesn't depend (on EY's view, as I understand it) on the existence of an agent with a preference for people being murdered (or the absence of such an agent).

That's because for EY, "wrong" and "wrong_human" mean the same thing. It's semantics. When you ask "is X right or wrong?" in the everyday sense of the term, you are actually asking "is X right_human or wrong_human?" But if murder is wrong_human, that doesn't mean it's wrong_clippy, for example. In both cases you are just checking a utility function, but different utility functions give different answers.

Comment author: TheOtherDave 30 January 2011 09:58:24PM *  3 points [-]

It seems clear from the metaethics posts that if a powerful alien race comes along and converts humanity into paperclip-maximizers, such that making many paperclips comes to be right_human, EY would say that making many paperclips doesn't therefore become right.

So it seems clear that at least under some circumstances, "wrong" and "wrong_human" don't mean the same thing for EY, and that at least sometimes EY would say that "is X right or wrong?" doesn't depend on what humans happen to want that day.

Now, if by "wrong_human" you don't mean what humans would consider wrong the day you evaluate it, but rather what is considered wrong by humans today, then all of that is irrelevant to your claim.

In that case, yes, maybe you're right that what you mean by "wrong_human" is also what EY means by "wrong." But I still wouldn't expect him to endorse the idea that what's wrong or right depends in any way on what agents happen to prefer.

Comment author: Matt_Simpson 30 January 2011 10:55:54PM *  2 points [-]

It seems clear from the metaethics posts that if a powerful alien race comes along and converts humanity into paperclip-maximizers, such that making many paperclips comes to be right_human

No one can change right_human, it's a specific utility function. You can change the utility function that humans implement, but you can't change right_human. That would be like changing e^x or 2 to something else. In other words, you're right about what the metaethics posts say, and that's what I'm saying too.

edit: or what jimrandomh said (I didn't see his comment before I posted mine)

Comment author: Lightwave 01 February 2011 10:11:03AM *  1 point [-]

What if we use 'human' as a rigid designator for unmodified-human? Then in case aliens convert people into paperclip-maximizers, they're no longer human, hence right_human no longer applies to them, but itself remains unchanged.

Comment author: TheOtherDave 30 January 2011 11:36:13PM 0 points [-]

OK. At this point I must admit I've lost track of why these various suggestively named utility functions are of any genuine interest, so I should probably leave it there. Thanks for clarifying.

Comment author: jimrandomh 30 January 2011 10:54:55PM 2 points [-]

It seems clear from the metaethics posts that if a powerful alien race comes along and converts humanity into paperclip-maximizers, such that making many paperclips comes to be right_human, EY would say that making many paperclips doesn't therefore become right.

In that case, we would draw a distinction between right_unmodified_human and right_modified_human, and "right" would refer to the former.

Comment author: hairyfigment 30 January 2011 03:17:33AM *  0 points [-]

Murder as I define it seems universally wrong_victim, but I doubt you could literally replace "victim" with any agent's name.

Comment author: torekp 01 February 2011 01:14:23AM 0 points [-]

If you ask what should_MattSimpson be done, the short answer is maximize my preferences.

I find the talk of "should_MattSimpson" very unpersuasive given the availability of alternative phrasings such as "approved_MattSimpson" or "valued_MattSimpson". I have read below that EY discourages such talk, but it seems that's for different reasons than mine. Could someone please point me to at least one post in the sequence which (almost/kinda/sorta) motivates such phrasings?

Comment author: Matt_Simpson 01 February 2011 09:43:58PM 0 points [-]

Alternate phrasings such as those you listed would probably be less confusing, i.e. replacing "should" in "should_X" with "valued" and reserving "should" for "valued_human".

Comment author: orthonormal 29 January 2011 11:52:05PM *  2 points [-]

And yes, if one's own preferences are the foundation of ethics, most philosophers would simply call this subject matter practical rationality rather than morality.

They would be missing some important distinctions between what we think of as our moral values and what we think of as "chocolate/vanilla" preferences. For one obvious example, consider an alien ray gun that 'switches the way I feel' about two things, X and Y, without otherwise affecting my utility function or anything else of value to me.

If X were, say, licorice jelly beans (yum) and Y were, say, buttered popcorn jelly beans (yuck), then I wouldn't be too deeply bothered by the prospect of being zapped with this gun. (Same for sexual preference, etc.) But if X were "autonomy of individuals" and Y were "uniformity of individuals", I would flee screaming from the prospect of being messed with that way, and would take some extreme actions (if I knew I'd be zapped) to prevent my new preferences from having large effects in the world.

Now we can develop whole theories about what this kind of difference consists in, but it's at least relevant to the question of metaethics. In fact, I think that calling this wider class of volitions "preferences" is sneaking in an unfortunate connotation that they "shouldn't really matter then".

Comment author: XiXiDu 30 January 2011 12:47:21PM 2 points [-]

Now we can develop whole theories about what this kind of difference consists in...

Huh? You simply weigh "chocolate/vanilla" preferences differently than decisions that would affect goal-oriented agents.

Comment author: Matt_Simpson 30 January 2011 09:15:26PM 1 point [-]

This sounds, to me, like it's just the distinction between terminal and instrumental values. I don't terminally value eating licorice jelly beans, I just like the way they taste and the feeling of pleasure they give me. If you switched the tastes of buttered popcorn jelly beans (yuck indeed) and licorice jelly beans, that would be fine by me. Hell, it would be an improvement since no one else likes that flavor (more for me!). The situation is NOT the same for "autonomy of individuals" and "uniformity of individuals" because I really do have terminal values for these things, apart from the way they make me feel.

Comment author: TheOtherDave 30 January 2011 10:26:47PM 1 point [-]

The situation is NOT the same for "autonomy of individuals" and "uniformity of individuals" because I really do have terminal values for these things, apart from the way they make me feel.

How do you know that?

What would you expect to experience if your preference for individual autonomy in fact derived from something else?

Comment author: Matt_Simpson 30 January 2011 10:57:25PM 0 points [-]

It was meant as a hypothetical. I don't actually know.

Comment author: TheOtherDave 30 January 2011 11:23:13PM 0 points [-]

Ah. Sorry; I thought you were endorsing the idea.

Comment author: TheOtherDave 30 January 2011 01:42:38AM 1 point [-]

I agree that by using a single term for the wider class of volitions -- for example, by saying both that I "prefer" autonomy to uniformity and also that I "prefer" male sexual partners to female ones and also that I "prefer" chocolate to vanilla -- I introduce the connotation that the distinctions between these various "preferences" aren't important in the context of discourse.

To call that an unfortunate connotation is question-begging. Sometimes we deliberately adopt language that elides a distinction in a particular context, precisely because we don't believe that distinction ought to be made in that context.

For example, in a context where I believe skin color ought not matter, I may use language that elides the distinction between skin colors. I may do this even if I care about that distinction: for example, if I observe that I do, in fact, care about my doctor's skin color, but I don't endorse caring about it, I might start using language that elides that distinction as a way of changing the degree to which I care about it.

So it seems worth asking whether, in the particular context you're talking about, the connotations introduced by the term "preferences" are in fact unfortunate.

For instance, you class sexual preference among the "chocolate/vanilla" preferences for which the implication that they "shouldn't really matter" is appropriate.

I would likely have agreed with you twenty years ago, when I had just broken up with my girlfriend and hadn't yet started dating my current husband. OTOH, today I would likely "flee screaming" from a ray that made me heterosexual, since that would vastly decrease the value to me of my marriage.

Of course, you may object that this sort of practical consequence isn't what you mean. But there are plenty of people who would "flee screaming" from a sexual-preference-altering ray for what they classify as moral reasons, without reference to practical consequences. And perhaps I'm one of them... after all, it's not clear to me that my desire to preserve my marriage isn't a "moral value."

Indeed, it seems that there simply is no consistent fact of the matter as to whether my sexual preference is a "flee screaming" thing or not... it seems to depend on my situation. 20-year-old single me and 40-year-old married me disagree, and if tomorrow I were single again perhaps I'd once again change my mind.

Now, perhaps that just means that for me, sexual preference is a mere instrumental value, best understood in terms of what other benefits I get from it being one way or another, and is therefore a poor example of the distinction you're getting at, and I should pick a different example.

On the other hand, just because I pick a different preference P such that I can't imagine how a change in environment or payoff matrix might change P, doesn't mean that P actually belongs in a different class from sexual preference. It might be equally true that a similarly pragmatic change would change P, I just can't imagine the change that would do it.

Perhaps, under the right circumstances, I would not wish to flee from an autonomy/uniformity switching ray.

My point is that it's not clear to me that it's a mistake to elide over the distinction between moral values and aesthetic preferences. Maybe calling all of these things "preferences" is instead an excellent way of introducing the fortunate connotation that the degree to which any of them matter is equally arbitrary and situational, however intense the feeling that some preferences are "moral values" or "terminal values" or whatever other privileged term we want to apply to them.

Comment author: lessdazed 02 July 2011 05:38:51PM 0 points [-]

20-year-old single me and 40-year-old married me disagree

These are two different people; many objections that follow from the fact that they disagree, one ought also to have from the fact that one and some random other contemporary person disagree.

Comment author: TheOtherDave 02 July 2011 07:21:39PM 0 points [-]

And yet, a lot of our culture presumes that there are important differences between the two.

E.g., culturally we think it's reasonable for someone at 20 to make commitments that are binding on that person at 40, whereas we think it's really strange for someone at 20 or 40 to make commitments that are binding on some random other contemporary person.

Comment author: orthonormal 30 January 2011 05:37:57PM 0 points [-]

Ah, sexual preference was a poor example in general - in my case, being single at the moment means I wouldn't be injuring anybody if my preferences changed. Were I in a serious relationship, I'd flee from the ray gun too.

Comment author: lukeprog 30 January 2011 12:10:03AM 0 points [-]

Thanks for this clarification.

I personally don't get that connotation from the term "preferences," but I'm sure others do.

Anyway, so... Eliezer distinguishes prudential oughts from moral oughts by saying that moral oughts are what we ought to do to satisfy some small subset of our preferences: preferences that we wouldn't want changed by an alien ray gun? I thought he was saying that I morally should_Luke do what will best satisfy a global consideration of my preferences.

Comment author: orthonormal 30 January 2011 12:30:38AM 0 points [-]

No, no, no - I don't mean that what I pointed out was the only distinction or the fundamental distinction, just that there's a big honking difference in at least one salient way. I'm not speaking for Eliezer on what's the best way to carve up that cluster in concept-space.

Comment author: lukeprog 30 January 2011 12:37:18AM 0 points [-]

Oh. Well, what do you think Eliezer has tried to say about how to carve up that cluster in concept-space?

Comment author: Vladimir_Nesov 29 January 2011 11:34:02PM *  2 points [-]

I certainly balk at the suggestion that there is a should_human, but I'd need to understand Eliezer in more detail on that point.

We'd need to do something specific with the world; there's no reason any one person gets to have the privilege, and creating an agent for every human and having them fight it out is probably not the best possible solution.

Comment author: Wei_Dai 01 February 2011 07:09:24AM *  3 points [-]

I don't think that adequately addresses lukeprog's concern. Even granting that one person shouldn't have the privilege of deciding the world's fate, nor should an AI be created for every human to fight it out (although personally I don't think a would-be FAI designer should rule these out as possible solutions just yet), that leaves many other possibilities for how to decide what to do with the world. I think the proper name for this problem is "should_AI_designer", not "should_human", and you need some other argument to justify the position that it makes sense to talk about "should_human".

I think Eliezer's own argument is given here:

Between neurologically intact humans, there is indeed much cause to hope for overlap and coherence; and a great and reasonable doubt as to whether any present disagreement is really unresolvable, even if it seems to be about "values". The obvious reason for hope is the psychological unity of humankind, and the intuitions of symmetry, universalizability, and simplicity that we execute in the course of our moral arguments.

Comment deleted 29 January 2011 10:49:18PM [-]
Comment author: Matt_Simpson 29 January 2011 10:52:43PM *  2 points [-]

No, this is called preference utilitarianism.

Usually utilitarianism means maximize the utility of all people/agents/beings of moral worth (average or sum depending on the flavor of utilitarianism). Eliezer's metaethics says only maximize your own utility. There is a clear distinction.

Edit: but you are correct about considering preferences the foundation of ethics. I should have been more clear

Comment author: Jayson_Virissimo 30 January 2011 06:37:51AM *  2 points [-]

Eliezer's metaethics says only maximize your own utility.

Isn't that bog-standard ethical egoism? If that is the case, then I really misunderstood the sequences.

Comment author: Matt_Simpson 30 January 2011 08:51:52PM *  0 points [-]

Maybe. Sometimes ethical egoism sounds like it says that you should be selfish. If that's the case, then no, they are not the same. But sometimes it just sounds like it says you should do whatever you want to do, even if that includes helping others. If that's the case, they sound the same to me.

edit: Actually, that's not quite right. On the second version, egoism gives the same answer as EY's metaethics for all agents who have "what is right" as their terminal values, but NOT for any other agent. Egoism in this sense defines "should" as "should_X" where X is the agent asking what should be done. For EY, "should" is always "should_human" no matter who is asking the question.

Comment author: jimrandomh 29 January 2011 11:54:51PM *  0 points [-]

Usually utilitarianism means maximize the utility of all people/agents/beings of moral worth (average or sum depending on the flavor of utilitarianism). Eliezer's metaethics says only maximize your own utility. There is a clear distinction.

Indeed, but I'd like to point out that this is not an answer about what to do or what's good and bad, merely the rejection of a commonly claimed (but incorrect) statement about what structure such an answer should have.

Comment author: Matt_Simpson 30 January 2011 12:00:16AM 0 points [-]

I think I disagree, but I'm not sure I understand. Care to explain further?

Comment author: jimrandomh 30 January 2011 12:33:27AM 0 points [-]

(Note: This comment contains positions which came from my mind without an origin tag attached. I don't remember reading anything by Eliezer which directly disagrees with this, but I don't represent this as anyone's position but my own.)

"Standard" utilitarianism works by defining a separate per-agent utility functions to represent each person's preferences, and averaging (or summing) them to produce a composite utility function which every utilitarianism is supposed to optimize. The exact details of what the per-agent utility functions look like, and how you combine them, differ from flavor to flavor. However, this structure - splitting the utility function up into per-agent utility functions plus an agent utility function - is wrong. I don't know what a utility function that fully captured human values would look like, but I do know that it can't be split and composed this way.

It breaks down most obviously when you start varying the number of agents; in the variant where you sum up utilities, an outcome where many people live lives just barely worth living seems better than an outcome where fewer people live amazingly good lives (but we actually prefer the latter); in the variant where you average utilities, an outcome where only one person exists but he lives an extra-awesome life is better than an outcome where many people lead merely-awesome lives.

Split-agent utility functions are also poorly equipped to deal with the problem of weighing agents against each other. If there's a scenario where one person's utility function diverges to infinity, then both sum- and average-utility aggregation claim that it's worth sacrificing everyone else to make sure that happens (the "utility monster" problem).
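
Toy numbers make both failure modes easy to see (the figures below are purely illustrative assumptions; populations are taken as uniform, so each reduces to a per-person value and a count):

```python
# Illustrative sketch of the two aggregation failure modes above.
# Values and counts are invented; uniform populations keep the math trivial.

def total_utility(value, count):
    return value * count

def average_utility(value, count):
    return value  # uniform population, so the average is just the value

# Sum-aggregation: a billion lives barely worth living outrank a thousand
# amazingly good ones, though we actually prefer the latter.
print(total_utility(0.01, 10**9) > total_utility(90.0, 10**3))   # True

# Average-aggregation: one extra-awesome life outranks a million
# merely-awesome ones.
print(average_utility(100.0, 1) > average_utility(90.0, 10**6))  # True
```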

And the thing is, writing a utility function that captures human values is a hard and unsolved problem, and splitting it up by agent doesn't actually bring us any closer; defining the single-agent function is just as hard as defining the whole thing.

Comment author: Matt_Simpson 30 January 2011 09:08:05PM 3 points [-]

I was about to cite the same sorts of things to explain why they DO disagree about what is good and bad. In other words, I agree with you about utilitarianism being wrong about the structure of ethics in precisely the way you described, but I think that also entails utilitarianism coming to different concrete ethical conclusions. If a murderer really likes murdering - it's truly a terminal value - the utilitarian HAS to take that into account. On Eliezer's theory, this need not be so. So you can construct a hypothetical where the utilitarian has to allow someone to be murdered simply to satisfy a (or many) murderer's preference where on Eliezer's theory, nothing of this nature has to be done.

Comment author: jimrandomh 30 January 2011 10:27:02PM 1 point [-]

That is a problem for average-over-agents utilitarianism, but not a fatal one; the per-agent utility function you use need not reflect all of that agent's preferences, it can reflect something narrower like "that agent's preferences excluding preferences that refer to other agents and which those agents would choose to veto". (Of course, that's a terrible hack, which must be added to the hacks to deal with varying population sizes, divergence, and so on, and the resulting theory ends up being extremely inelegant.)

Comment author: Matt_Simpson 30 January 2011 10:59:46PM 1 point [-]

True enough, there are always more hacks a utilitarian can throw on to their theory to avoid issues like this.

Comment author: endoself 31 January 2011 09:46:01AM 1 point [-]

in the variant where you sum up utilities, an outcome where many people live lives just barely worth living seems better than an outcome where fewer people live amazingly good lives (but we actually prefer the latter);

Are you sure of this? It sounds a lot like scope insensitivity. Remember, lives barely worth living are still worth living.

if there's a scenario where one person's utility function diverges to infinity, then both sum- and average-utility aggregation claim that it's worth sacrificing everyone else to make sure that happens (the "utility monster" problem).

Again, this seems like scope insensitivity.

Comment deleted 29 January 2011 11:05:08PM [-]
Comment author: Matt_Simpson 29 January 2011 11:11:58PM 3 points [-]

Yeah, that's probably right. But notice that even in that case, unlike the utilitarian, there are no thorny issues about how to deal with non-human agents. If we run into an alien that has a serious preference for raping humans, the utilitarian only has ad-hoc ways of deciding whether or not the alien's preference counts. Eliezer's metaethics handles it elegantly: check your utility function. Of course, that's easier said than done in the real world, but it does solve many philosophical problems associated with utilitarianism.

Comment author: Peterdjones 30 October 2012 06:41:45PM *  -1 points [-]

There is a way of testing metaethical theories, which is to compare their predictions or suggestions against common first-level ethical intuitions. It isn't watertight, as the recalcitrant metaethicist can always say that the intuitions are wrong... anyway, trying it out on EY-metaethics, as you have stated it, doesn't wash too well, since there is an implication that those who value murder should murder, those who value paperclips should maximise paperclips, etc.

Some will recognise that as a form of the well known and widely rejected theory of ethical egoism.

OTOH, you may not have presented the theory correctly. For instance, the "Coherent" in CEV may be important. EY may have the get-out that murderers and clippies don't have enough coherence in their values to count as moral.

Comment author: Matt_Simpson 30 October 2012 09:39:55PM *  0 points [-]

I don't think the coherence part is particularly relevant here.

Consider two people, you (Peter) and me (Matt). Suppose I prefer to be able to murder people and you prefer that no one ever be murdered. Suppose I have the opportunity to murder someone (call him John) without getting caught or causing any other relevant positive or negative consequences (both under your preferences and mine). What should I do? Well, I should_Matt murder John. My preferences say "yay murder" and there are no downsides, so I should_Matt go ahead with it. But I should_Peter NOT murder John. Your preferences say "boo murder" and there are no other benefits to murdering John, so I should_Peter just leave John alone. But what should I do? Tell me what you mean by should and I'll tell you. Presumably you mean should_Peter or should_(most people), in which case I shouldn't murder.

(EY's theory would further add that I don't, in fact, value murder as an empirical claim - and that would be correct, but it isn't particularly relevant to the hypothetical. It may, however, be relevant to this method of testing metaethical theories, depending on how you intended to use it.)

EY-metaethics, as you have stated it, doesn't wash too well, since there is an implication that those who value murder should murder, those who value paperclips should maximise paperclips, etc.

Let me fix that sentence for you:

EY-metaethics, as you have stated it, doesn't wash too well, since there is an implication that those who value murder should_(those who value murder) murder, those who value paperclips should_(those who value paperclips) maximise paperclips, etc.

In other words, there is no "should," unless you define it to be a specific should_x. EY would define it as should_(human CEV) or something similar, and that's the "should" you should be running through the test.

Some will recognise that as a form of the well known and widely rejected theory of ethical egoism.

It isn't. Egoism says be selfish. There's no reason why someone can't have altruistic preferences, and in fact people do. (Unless that's not what you mean by egoism; in that case, sure, this is egoism, but that's a misleading definition and the connotations don't apply.)

Comment author: Peterdjones 31 October 2012 01:12:14PM *  0 points [-]

But what should I do? Tell me what you mean by should and I'll tell you. Presumably you mean should_Peter or should_(most people), in which case I shouldn't murder.

There are a lot of candidates for what I could mean by "should" under which you shouldn't murder. Should-most-people would imply that. It is an example of a non-Yudkowskian theory that doesn't have the problem of the self-centered version of his theory. So is Kantian metaethics: you should not murder because you would not wish murder to be Universal Law.

EY-metaethics, as you have stated it, doesn't wash too well, since there is an implication that those who value murder should_(those who value murder) murder, those who value paperclips should_(those who value paperclips) maximise paperclips, etc.

And how is that supposed to help? Are you implying that nothing counts as a counterexample to a metaethical theory unless it relates to should_Peter, to what the theory is telling me to do? But as it happens, I do care about what metaethical theories tell other people to do, just as evidence that I haven't personally witnessed still could count against a scientific claim.

In other words, there is no "should," unless you define it to be a specific should_x.

That isn't a fact. It may be an implication of the theory, but I seem to have good reason to reject the theory.

EY would define it as should_(human CEV) or something similar, and that's the "should" you should be running through the test.

That seems to be the same get-out clause as before: that there is something about the Coherent and/or the Extrapolated that fixes the Matt-should-murder problem. But if there is, it should have been emphasised in your original statement of EY's position.

It isn't. Egoism says be selfish. There's no reason why someone can't have altruistic preferences, and in fact people do. (Unless that's not what you mean by egoism; in that case, sure, this is egoism, but that's a misleading definition and the connotations don't apply.)

As originally stated, it has the same problems as egoism.

Comment author: Matt_Simpson 31 October 2012 03:22:28PM *  0 points [-]

What I'm trying to say is that within the theory there is no "should" apart from should_X's. So you need to pin down which should_X you're talking about when you run the theory through the test - you can ask "what should_Matt Matt do?" and "what should_Matt Peter do?", or you can ask "what should_Peter Matt do?" and what "should_Peter Peter do?", but it's unfair to ask "what should_Matt Matt do?" and "what should_Peter Peter do?" - you're changing the definition of "should" in the middle of the test!

Now the question is, which should_X should you use in the test? If X is running the theory through the test, X should use should_X since X is checking the theory against X's moral intuitions. (If X is checking the test against Y's moral intuitions, then X should use should_Y). In other words, X should ask, "what should_X Matt do?" and "what should_X Peter do?". If there is such a thing as should_human, then if X is a human, this amounts to using should_human.

As a side note, to display "a_b" correctly, type "a\_b"

Comment author: Peterdjones 31 October 2012 06:50:36PM *  -1 points [-]

We have intuitions that certain things are wrong -- murder, robbery and so forth -- and we have the intuition that those things are wrong, not just wrong-for-people-that-don't-like-them. This intuition of objectivity is what makes ethics a problem, in conjunction with the absence of obvious moral objects as part of the furniture of the world.

ETA: again, a defence of moral subjectivism seems to be needed as part of CEV.

Comment author: Matt_Simpson 31 October 2012 07:31:57PM 3 points [-]

Traditional moral subjectivism usually says that what X should do depends on who X is in some intrinsic way. In other words, when you ask "what should X do?", the answer you get is the answer to "what should_X X do?" On EY's theory, when you ask "what should X do?", the answer you get is the answer to "what should_Y X do?" where Y is constant across all X's. So "should" is a rigid designator -- it corresponds to the same set of values no matter who we're asking about.

Now the subjectivity may appear to come in because two different people might have a different Y in mind when they ask "what should X do?" The answer depends on who's asking! Subjectivity!

Actually, no. The answer only depends on what the asker means by should. If should = should_Y, then it doesn't matter who's asking or who they're asking about, we'll get the same answer. If should = should_X, the same conclusion follows. The apparent subjectivity comes from thinking that there is a separate "should" apart from any "should_X", and then subtly changing the definition of "should" when someone different asks or someone different is asked about.

Now many metaethicists may still have a problem with the theory related to what's driving its apparent subjectivity, but calling it subjective is incorrect.

I'll note that the particular semantics I'm using are widely regarded to confuse readers into thinking the theory is a form of subjectivism or moral relativism -- and frankly, I agree with the criticism. Using this terminology just so happens to be how I finally understood the theory, so it's appealing to me. Let's try a different terminology (hat tip to wedrifid): every time I wrote should_X, read that as would_want_X. In other words, should_X = would_want_X = X's implicit preferences -- what X would want if X were able to take into account all n-order preferences she has in our somewhat simplified example. Then, in the strongest form of EY's theory, should = would_want_Human. In other words, only would_want_Human has normativity. Every time we ask "what should X do?" we're asking "what would_want_Human X do?" which gives the same answer no matter who X is or who is asking the question (though nonhumans won't often ask this question).

Comment author: Peterdjones 01 November 2012 09:20:18AM 0 points [-]

the answer you get is the answer to "what should_X X do?" On EY's theory, when you ask "what should X do?", the answer you get is the answer to "what should_Y X do?" where Y is constant across all X's.

Y is presumably varying with something, or why put it in?

The apparent subjectivity comes from thinking that there is a separate "should" apart from any "should_X", and then subtly changing the definition of "should" when someone different asks or someone different is asked about.

I don't follow. Thinking there is a should that is separate from any should_X is the basis of objectivity.

The basis of subjectivity is having a question that can be validly answered by reference to a speaker's beliefs and desires alone. "What flavour of ice cream would I choose?" works that way. So does any other case of acting on a preference, any other "would". Since you have equated shoulds with woulds, the shoulds are subjective as well.

There are objective facts about what a subject would do, just as it is an objective fact that so-and-so has a liking for Chocolate Chip, but these objective facts don't negate the existence of subjectivity. Something is objective and not subjective where there are no valid answers based on reference to a subject's beliefs and desires. I don't think that is the case here.

Then, in the strongest form of EY's theory, should = would_want_Human. In other words, only would_want_Human has normativity.

The claim that only should_Human is normative contradicts the claim that any would-want is a should-want. If normativity kicks in for any "would", what does bringing in the human level add?

Every time we ask "what should X do?" we're asking "what would_want_Human X do?" which gives the same answer no matter who X is or who is asking the question (though nonhumans won't often ask this question).

Well, that version of the theory is objective, or intersubjective enough. It just isn't the same as the version of the theory that equates individual woulds and shoulds. And it relies on a convergence that might not arrive in practice.

Comment author: Matt_Simpson 01 November 2012 05:29:04PM 1 point [-]

Y is presumably varying with something, or why put it in?

To make it clear that "should" is just a particular "should_Y." Or, using the other terminology, "should" is a particular "would_want_Y."

The basis of subjectivity is having a question that can be validly answered by reference to a speaker's beliefs and desires alone.

I agree with this. If the question was "how do I best satisfy my preferences?" then the answer changes with who the speaker is. But, on the theory, "should" is a rigid designator and refers ONLY to a specific should_X (or would_want_X if you prefer that terminology). So if the question is "what should I do?" that's the same as asking "what should_X I do?" or equivalently "what would_want_X I do?" The answer is the same no matter who is asking.

The "X" is there because 1) the theory says that "should" just is a particular "should_X," or equivalently a particular "would_want_X" and 2) there's some uncertainty about which X belongs there. In EY's strongest form of the theory, X = Human. A weaker form might say X = nonsociopath human.

Just to be clear, "should_Y" doesn't have any normativity unless Y happens to be the same as the X in the previous paragraph. "Should_Y" isn't actually a "should" - this is why I started calling it "would_want_Y" instead.

Something is objective and not subjective where there are no valid answers based on reference to a subject's beliefs and desires. I don't think that is the case here.

But it is. Consider the strong form where should = would_want_Human. Suppose an alien race came and modified humans so that their implicit preferences were completely changed. Is should changed? Well, no. "should" refers to a particular preference structure - a particular mathematical object. Changing the preference structure that humans would_want doesn't change "should" any more than changing the number of eyes a human has changes "2." Or to put it another way, distinguish between would_want_UnmodifiedHuman and would_want_ModifiedHuman. Then should = would_want_UnmodifiedHuman. "Should" refers to a particular implicit preference structure, a particular mathematical object, instantiated in some agent or group of agents.
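
A minimal sketch of this rigid-designator point (my analogy, with invented values; not code from the thread): bind "should" to a snapshot of a particular preference structure, and later modification of the humans who implement it leaves the snapshot untouched.

```python
# "should" is bound to a copy of a particular preference structure, a
# particular mathematical object. The value names here are invented.

human_values = {"make_paperclips": 0.0, "protect_autonomy": 1.0}

should = dict(human_values)  # rigid: a snapshot taken at definition time

# Aliens modify what humans actually implement...
human_values.update(make_paperclips=1.0, protect_autonomy=0.0)

# ...but "should" still names the original object, just as modifying
# humans would not change e^x or 2.
print(should)        # {'make_paperclips': 0.0, 'protect_autonomy': 1.0}
print(human_values)  # {'make_paperclips': 1.0, 'protect_autonomy': 0.0}
```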

If normativity kicks in for any "would..."

Hopefully this is clear now, but it doesn't, even if I was calling them all "should_Y."

Comment author: wedrifid 31 October 2012 01:53:39AM 0 points [-]

In other words, there is no "should," unless you define it to be a specific should_x. EY would define it as should_(human CEV) or something similar, and that's the "should" you should be running through the test.

In the usages he has made, EY actually seems to say there is a "should", which we would describe as should<Eliezer>. For other preferences he has suggested would_want<John>. So if John wants to murder people he should not murder people but would_want<John> to murder them. (But that is just his particular semantics; the actual advocated behavior is as you describe it.)

When it comes to CEV Eliezer has never (that I have noticed) actually acknowledged that Coherent Extrapolated Volition can be created for any group other than "humanity". Others have used it as something that must be instantiated for a particular group in order to make sense. I personally consider any usage of "CEV" where the group being extrapolated is not given or clear from the context to be either a mistake or sneaking in connotations.

Comment author: Matt_Simpson 31 October 2012 06:01:26PM 0 points [-]

In the usages he has made, EY actually seems to say there is a "should", which we would describe as should<Eliezer>. For other preferences he has suggested would_want<John>. So if John wants to murder people he should not murder people but would_want<John> to murder them. (But that is just his particular semantics; the actual advocated behavior is as you describe it.)

I don't remember the would_want semantics anywhere in EY's writings, but I see the appeal - especially given how my discussion with Peterdjones is going.

Comment author: wedrifid 01 November 2012 02:44:29AM *  0 points [-]

I don't remember the would_want semantics anywhere in EY's writings

It was in a past conversation on the subject of what Eliezer means by "should" and related terms. That was the answer he gave in response to the explicit question. In actual writings there hasn't been a particular need to refer concisely to the morality of other agents independently of their actual preferences. When describing Baby Eaters, for example, natural language worked just fine.

Comment author: Peterdjones 30 October 2012 02:18:15AM -2 points [-]

In a nutshell, Eliezer's metaethics says you should maximize your preferences whatever they may be

My current preferences? Why shouldn't I change them?

Comment author: Matt_Simpson 30 October 2012 04:32:11PM *  0 points [-]

What wedrifid said. But also, what is the criterion by which you would change your (extrapolated) preferences? This criterion must contain some or all of the things that you care about. Therefore, by definition it's part of your current (extrapolated) preferences. Edit: Which tells you that under "normal" circumstances you won't prefer to change your preferences.

Comment author: Peterdjones 30 October 2012 05:14:42PM 0 points [-]

But also, what is the criterion by which you would change your (extrapolated) preferences?

It would probably be a higher-order preference, like being more fair, more consistent, etc.

Which tells you that under "normal" circumstances you won't prefer to change your preferences.

That would require a lot of supplementary assumptions. For instance, if I didn't care about consistency, I wouldn't revise my preferences to be more consistent. I might also "stick" if I cared about consistency and knew myself to be consistent. But how often does that happen?

Comment author: Matt_Simpson 30 October 2012 09:48:26PM *  1 point [-]

My intuition is that if you have preferences over (the space of possible preferences over states of the world), that implicitly determines preferences over states of the world - call these "implicit preferences". This is much like if you have a probability distribution over (the set of probability distributions over X), that determines a probability distribution over X (though this might require X to be finite or perhaps something weaker).
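
The probability analogy can be made concrete (the weights and distributions below are invented for illustration): a second-order distribution over candidate distributions on X induces a single distribution on X by averaging, p(x) = sum_i w_i * p_i(x).

```python
# Toy sketch of the analogy: a distribution over distributions on X
# marginalizes to one distribution on X. The suggestion above is that
# n-order preferences determine implicit 1st-order preferences in a
# loosely similar way. All numbers are illustrative.

first_order = {
    "p1": {"x1": 0.9, "x2": 0.1},   # two hypothetical distributions over X
    "p2": {"x1": 0.2, "x2": 0.8},
}
weights = {"p1": 0.3, "p2": 0.7}    # the 2nd-order distribution over them

induced = {x: sum(weights[i] * first_order[i][x] for i in first_order)
           for x in ("x1", "x2")}
print(induced)  # {'x1': 0.41, 'x2': 0.59} -- a proper distribution over X
```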

So when I say "your preferences" or "your extrapolated preferences" I'm referring to your implicit preferences. In other words, "your preferences" refers to what your 1st-order preferences over the state of the world would look like if you took into account all n-order preferences, not the 1st-order preferences with which you are currently operating.

Edit: Which is just another way of saying "what wedrifid said."

One interpretation of CEV is that it's supposed to find these implicit preferences, assuming that everyone has the same, or "similar enough", implicit preferences.

Comment author: Peterdjones 31 October 2012 01:23:35PM 0 points [-]

One interpretation of CEV is that it's supposed to find these implicit preferences, assuming that everyone has the same, or "similar enough", implicit preferences.

Where does the "everyone" come in? Your initial statement of EY's metaethics is that it is about my preferences, however implicit or extrapolated. Are individuals' extrapolated preferences supposed to converge or not? That's a very important issue. If they do converge, then why the emphasis on the difference between should_Peter and should_Matt? If they don't converge, how do you avoid Prudent Predation? The whole thing's as clear as mud.

Comment author: Matt_Simpson 31 October 2012 03:15:25PM 1 point [-]

Where does the "everyone" come in?

One part of EY's theory is that all humans have similar enough implicit preferences that you can talk about implicit human preferences. CEV is supposed to find implicit human preferences.

Others have noted that there's no reason why you can't run CEV on other groups, or a single person, or perhaps only part of a single person. In which case, you can think of CEV(X) as a function that returns the implicit preferences of X, if they exist. This probably accounts for the ambiguity.

Comment author: Peterdjones 31 October 2012 06:43:19PM *  0 points [-]

there's no reason why you can't run CEV on other groups, or a single person, or perhaps only part of a single person

There's no reason you can't as an exercise in bean counting or logic chopping, but there is a question as to what that would add up to metaethically. If individual extrapolations converge, all is good. If not, then CEV is a form of ethical subjectivism, and if that is wrong, then CEV doesn't work. Traditional philosophical concerns have not been entirely sidestepped.

Comment author: wedrifid 30 October 2012 02:41:48AM 0 points [-]

My current preferences? Why shouldn't I change them?

Current extrapolated preferences. That is, maximise whatever it is that you want to change your preferences to.