
What is Eliezer Yudkowsky's meta-ethical theory?

33 points | Post author: lukeprog | 29 January 2011 07:58PM

In You Provably Can't Trust Yourself, Eliezer tried to figure out why his audience didn't understand his meta-ethics sequence even after they had followed him through philosophy of language and quantum physics. Meta-ethics is my specialty, and I can't figure out what Eliezer's meta-ethical position is. And at least at one point, professionals like Robin Hanson and Toby Ord couldn't figure it out, either.

Part of the problem is that because Eliezer has gotten little value from professional philosophy, he writes about morality in a highly idiosyncratic way, using terms that would require reading hundreds of posts to understand. I might understand Eliezer's meta-ethics better if he would just cough up his positions on standard meta-ethical debates like cognitivism, motivation, the sources of normativity, moral epistemology, and so on. Nick Beckstead recently told me he thinks Eliezer's meta-ethical views are similar to those of Michael Smith, but I'm not seeing it.

If you think you can help me (and others) understand Eliezer's meta-ethical theory, please leave a comment!

Update: This comment by Richard Chappell made sense of Eliezer's meta-ethics for me.

Comments (369)

Comment author: RichardChappell 30 January 2011 08:40:28PM *  40 points [-]

Eliezer's metaethics might be clarified in terms of the distinctions between sense, reference, and reference-fixing descriptions. I take it that Eliezer wants to use 'right' as a rigid designator to denote some particular set of terminal values, but this reference fact is fixed by means of a seemingly 'relative' procedure (namely, whatever terminal values the speaker happens to hold, on some appropriate [if somewhat mysterious] idealization). Confusions arise when people mistakenly read this metasemantic subjectivism into the first-order semantics or meaning of 'right'.

In summary:

(i) 'Right' means, roughly, 'promotes external goods X, Y and Z'

(ii) Claim (i) above is true because I desire X, Y and Z.

Note that Speakers Use Their Actual Language, so murder would still be wrong even if I had the desires of a serial killer. But if I had those violent terminal values, I would speak a slightly different language than I do right now, so that when KillerRichard asserts "Murder is right!" what he says is true. We don't really disagree, but are instead merely talking past each other.

Virtues of the theory:

(a) By rigidifying on our actual, current desires (or idealizations thereof), it avoids Inducing Desire Satisfactions.

(b) Shifting the subjectivity out to the metasemantic level leaves us with a first-order semantic proposal that at least does a better job than simple subjectivism at 'saving the phenomena'. (It has echoes of Mark Schroeder's desire-based view of reasons, according to which the facts that give us reasons are the propositional contents of our desires, rather than the desires themselves. Or something like that.)

(c) It's naturalistic, if you find moral non-naturalism 'spooky'. (Though I'd sooner recommend Mackie-style error theory for naturalists, since I don't think (b) above is enough to save the phenomena.)

Objections

(1) It's incompatible with the datum that substantive, fundamental normative disagreement is in fact possible. People may share the concept of a normative reason, even if they fundamentally disagree about which features of actions are the ones that give us reasons.

(2) The semantic tricks merely shift the lump under the rug, they don't get rid of it. Standard worries about relativism re-emerge, e.g. an agent can know a priori that their own fundamental values are right, given how the meaning of the word 'right' is determined. This kind of (even merely 'fundamental') infallibility seems implausible.

(3) Just as simple subjectivism is an implausible theory of what 'right' means, so Eliezer's meta-semantic subjectivism is an implausible theory of why 'right' means promoting external goods X, Y, Z. An adequately objective metaethics shouldn't even give preferences a reference-fixing role.

Comment author: komponisto 31 January 2011 04:57:59AM *  19 points [-]

I think this is an excellent summary. I would make the following comments:

Confusions arise when people mistakenly read this metasemantic subjectivism into the first-order semantics or meaning of 'right'.

Yes, but I think Eliezer was mistaken in identifying this kind of confusion as the fundamental source of the objections to his theory (as in the Löb's theorem discussion). Sophisticated readers of LW (or OB, at the time) are surely capable of distinguishing between logical levels. At least, I am -- but nevertheless, I still didn't feel that his theory was adequately "non-relativist" to satisfy the kinds of people who worry about "relativism". What I had in mind, in other words, was your objections (2) and (3).

The answer to those objections, by the way, is that an "adequately objective" metaethics is impossible: the minds of complex agents (such as humans) are the only place in the universe where information about morality is to be found, and there are plenty of possible minds in mind-design space (paperclippers, pebblesorters, etc.) from which it is impossible to extract the same information. This directly answers (3), anyway; as for (2), "fallibility" is rescued (on the object level) by means of imperfect introspective knowledge: an agent could be mistaken about what its own terminal values are.

Comment author: Matt_Simpson 31 January 2011 06:43:14PM *  3 points [-]

Note that your answer to (2) also answers (1): value uncertainty makes it seem as if there is substantive, fundamental normative disagreement even if there isn't. (Or maybe there is if you don't buy that particular element of EY's theory)

Comment author: Yosarian2 30 August 2017 09:44:22AM 0 points [-]

The answer to those objections, by the way, is that an "adequately objective" metaethics is impossible: the minds of complex agents (such as humans) are the only place in the universe where information about morality is to be found, and there are plenty of possible minds in mind-design space (paperclippers, pebblesorters, etc.) from which it is impossible to extract the same information.

Eliezer attempted to deal with that problem by defining a certain set of things as "h-right", that is, morally right from the frame of reference of the human mind. He made clear that alien entities probably would not care about what is h-right, but that humans do, and that's good enough.

Comment author: lukeprog 01 February 2011 01:02:07PM 13 points [-]

Richard,

You're speaking my language, thanks! I hope this is EY's view, because I know what this means. Maybe now I can go back and read EY's sequence in light of this interpretation and it will make more sense to me.

EY's theory as presented above makes me suspicious that making basic evaluative moral terms rigid designators is a kind of 'trick' which, though perhaps not intended, very easily has the effect of carrying along some common absolutist connotations of those terms where they no longer apply in EY's use of those terms.

At the moment, I'm not so worried about objection (1), but objections (2) and (3) are close to what bother me about EY's theory, especially if this is foundational for EY's thinking about how we ought to be designing a Friendly AI. If we're working on a project as important as Friendly AI, it becomes an urgent problem to get our meta-ethics right, and I'm not sure Eliezer has done it yet. Which is why we need more minds working on this problem. I hope to be one of those minds, even if my current meta-ethics turns out to be wrong (I've held my current meta-ethics for under 2 years, anyway, and it has shifted slightly since adoption).

But, at the moment it remains plausible to me that Eliezer is right, and I just don't see why right now. Eliezer is a very smart guy who has invested a lot of energy into training himself to think straight about things and respond to criticism either with adequate counterargument or by dropping the criticized belief.

Comment author: RichardChappell 01 February 2011 05:12:32PM 11 points [-]

invested a lot of energy into training himself to think straight about things and respond to criticism either with adequate counterargument or by dropping the criticized belief

Maybe; I can't say I've noticed that so much myself -- e.g. he just disappeared from this discussion when I refuted his assumptions about philosophy of language (that underpin his objection to zombies), but I haven't seen him retract his claim that zombies are demonstrably incoherent.

Comment author: Vladimir_Nesov 01 February 2011 08:12:32PM *  6 points [-]

e.g. he just disappeared from this discussion when I refuted his assumptions about philosophy of language (that underpin his objection to zombies), but I haven't seen him retract his claim that zombies are demonstrably incoherent.

Clearly, from his standpoint a lot of things you believed were confused, and he decided against continuing to argue. This is a statement about his willingness to engage situations where someone's wrong on the Internet and disagreement is present, not external evidence about correctness (as distinct from your own estimate of the correctness of your opponent's position).

Comment author: lukeprog 01 February 2011 10:49:38PM 4 points [-]

You think that "clearly" Eliezer believed many of Richard's beliefs were confused. Which beliefs, do you think?

Comment author: Vladimir_Nesov 01 February 2011 11:50:57PM 11 points [-]

I won't actually argue, just list some things that seem to be points where Richard talks past the intended meaning of the posts (irrespective of the technical accuracy of the statements in themselves, read with the meaning Richard intended). Link to the post for convenience.

  • "premise that words refer to whatever generally causes us to utter them": There is a particular sense of "refer" in which we can trace the causal history of words being uttered.
  • "It's worth highlighting that this premise can't be right, for we can talk about things that do not causally affect us. ": Yes, we can consider other senses of "refer", make the discussion less precise, but those are not the senses used.
  • "We know perfectly well what we mean by the term 'phenomenal consciousness'.": Far from "perfectly well".
  • "We most certainly do not just mean 'whatever fills the role of causing me to make such-and-such utterances'" Maybe we don't reason so, but it's one tool to see what we actually mean, even if it explores this meaning in a different sense from what's informally used (as a way of dissolving a potentially wrong question).
  • "No, the example of unicorns is merely to show that we can talk about non-causally related things.": We can think/talk about ideas that cause us to think/talk about them in certain ways, and in this way the meaning of the idea (as set of properties which our minds see in it) causally influences uttering of words about it. Whether what the idea refers to causally influences us in other ways is irrelevant. On the other hand, if it's claimed that the idea talks about the world (and is not an abstract logical fact unrelated to the world), there must be a pattern (event) of past observations that causes the idea to be evaluated as "correct", and alternative observations that cause it to be evaluated as "wrong" (or a quantitative version of that). If that's not possible, then it can't be about our world.
Comment author: XiXiDu 01 February 2011 06:50:59PM 1 point [-]

This is the first time I've seen anyone tell EY that what he wrote is plainly false.

Comment author: timtyler 02 February 2011 12:41:04AM 2 points [-]
Comment author: Kaj_Sotala 02 February 2011 12:04:52PM 8 points [-]

I agree that the first one of those is bad.

Yes, if you're talking about corporations, you cannot use exactly the same math as you do if you're talking about evolutionary biology. But there are still some similarities that make it useful to know things about how selection works in evolutionary biology. Eliezer seems to be saying that if you want to call something "evolution", then it has to meet these strictly-chosen criteria that he'll tell you. But pretty much the only justification he offers is "if it doesn't meet these criteria, then Price's equation doesn't apply", and I don't see why "evolution" would need to be strictly defined as "those processes which behave in a way specified by Price's equation". It can still be a useful analogy.
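
For reference (my gloss, not part of Kaj_Sotala's comment): the Price equation at issue decomposes the per-generation change in the mean value of a trait z into a selection term and a transmission term,

    \Delta \bar{z} = \frac{\operatorname{Cov}(w_i, z_i)}{\bar{w}} + \frac{\operatorname{E}(w_i \, \Delta z_i)}{\bar{w}},

where w_i is the fitness of individual i and \bar{w} is the mean fitness. The dispute above is over whether processes that don't support this exact bookkeeping still deserve to be called "evolution".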

The rest are fine in my eyes, though the argument in The Psychological Unity of Humankind seems rather overstated for several reasons.

Comment author: timtyler 02 February 2011 02:50:08PM 2 points [-]

FWIW, cultural evolution is not an analogy. Culture literally evolves - via differential reproductive success of memes...

Comment author: Will_Newsome 15 May 2011 08:34:36AM 6 points [-]

Do you have recommendations for people/books that take this perspective seriously and then go on to explore interesting things with it? I haven't seen anyone include the memetic perspective as part of their everyday worldview besides some folk at SIAI and yourself, which I find pretty sad.

Also, I get the impression you have off-kilter-compared-to-LW views on evolutionary biology, though I don't remember any concrete examples. Do you have links to somewhere where I could learn more about what phenomena/perspectives you think aren't emphasized or what not?

Comment author: timtyler 15 May 2011 08:46:20AM *  9 points [-]

My current project is a book on memetics. I also have a blog on memetics.

Probably the best existing book on the topic is The Meme Machine by Susan Blackmore.

I also maintain some memetics links, some memetics references, a memetics glossary - and I have a bunch of memetics videos.

In academia, memetics is typically called "cultural evolution". Probably the best book on that is "Not by Genes Alone".

Your "evolutionary biology" question is rather vague. The nearest thing that springs to mind is this. Common views on that topic around here are more along the lines expressed in the The Robot's Rebellion. If I am in a good mood, I describe such views as "lacking family values" - and if I am not, they get likened to a "culture of death".

Comment author: Will_Newsome 15 May 2011 09:04:20AM 3 points [-]

Wow, thanks! Glad I asked. I will start a tab explosion.

Comment author: lukeprog 01 February 2011 08:44:53PM 1 point [-]

Really? That's kind of scary...

Comment author: Desrtopa 01 February 2011 08:49:08PM *  1 point [-]

His response to it, or that it's done so infrequently?

I for one am less worried the less often he writes things that are plainly false, so his being called out rarely doesn't strike me as a cause for concern.

Comment author: lukeprog 01 February 2011 09:28:46PM 11 points [-]

What scares me is that people say EY's position is "plainly false" so rarely. Even if EY is almost always right, you would still expect a huge number of people to say that his positions are plainly false, especially when talking about such difficult and debated questions as those of philosophy and predicting the future.

Comment author: wedrifid 03 February 2011 08:05:19AM 15 points [-]

What scares me is that people say EY's position is "plainly false" so rarely.

What scares me is how often people express this concern relative to how often people actually agree with EY. Eliezer's beliefs and assertions take an absolute hammering. I agree with him fairly often - no surprise: he is intelligent, has a cognitive style similar to mine, and has spent a whole lot of time thinking. But I disagree with him vocally whenever he seems wrong. I am far from the only person who does so.

Comment author: Desrtopa 01 February 2011 09:34:50PM 9 points [-]

If the topics are genuinely difficult, I don't think it's likely that many people who understand them would argue that Eliezer's points are plainly false. Occasionally people drop in to argue such who clearly don't have a very good understanding of rationality or the subject material. People do disagree with Eliezer for more substantive reasons with some frequency, but I don't find the fact that they rarely pronounce him to be obviously wrong particularly worrying.

Comment author: Kaj_Sotala 02 February 2011 11:49:58AM 6 points [-]

Most of the people who are most likely to think that EY's positions on things are plainly false probably don't bother registering here to say so.

There's one IRC channel populated with smart CS / math majors, where I drop LW links every now and then. Pretty frequently they're met with a rather critical reception, but while those people are happy to tear them apart on IRC, they have little reason to bother to come to LW and explain in detail why they disagree.

(Of the things they disagree on, I mainly recall that they consider Eliezer's treatment of frequentism / Bayesianism as something of a strawman and that there's no particular reason to paint them as two drastically differing camps when real statisticians are happy with using methods drawn from both.)

Comment author: lessdazed 29 March 2011 05:03:10AM *  4 points [-]

they consider Eliezer's treatment of frequentism / Bayesianism as something of a strawman and that there's no particular reason to paint them as two drastically differing camps when real statisticians are happy with using methods drawn from both.

In that case, we got very different impressions about how Eliezer described the two camps; here is what I heard: <channel righteous fury of Eliezer's pure Bayesian soul>

It's not Bayesian users on the one hand and Frequentists on the other, each despising the others' methods. Rather, it's the small group of epistemic statisticians and a large majority of instrumentalist ones.

The epistemics are the small band of AI researchers using statistical models to represent probability so as to design intelligence, learning, and autonomy. The idea is that ideal models are provably Bayesian, and the task undertaken is to understand and implement close approximations of them.

The instrumentalist mainstream doesn't always claim that it's representing probability and doesn't feel lost without that kind of philosophical underpinning. Instrumentalists hound whatever problem is at hand with all the statistical models and variables they can muster to get the curve or isolated variable etc. they're looking for and think is best. The most important part of an instrumentalist model is the statistician him or herself, who does the Bayesian updating adequately and without the need for understanding. </channel righteous fury of Eliezer's pure Bayesian soul>
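
As background (a standard textbook statement, not something from the thread), the "Bayesian updating" referred to here is the rule for moving from a prior P(H) to a posterior P(H | D) after seeing data D:

    P(H \mid D) = \frac{P(D \mid H) \, P(H)}{P(D)}.

The "epistemic" camp treats approximating this update as the point of statistical method; the "instrumentalist" camp leaves any such updating implicit in the statistician's judgment.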

Saying that the division is a straw man because most statisticians use all methods misses the point.

Edit: see for example here and here.

Comment author: lukeprog 02 February 2011 04:29:33PM 4 points [-]

Most of the people who are most likely to think that EY's positions on things are plainly false probably don't bother registering here to say so.

True, but I still wouldn't expect sharp disagreement with Eliezer to be so rare. One contributing factor may be that Eliezer at least appears to be so confident in so many of his positions, and does not put many words of uncertainty into his writing about theoretical issues.

Comment author: TheOtherDave 02 February 2011 05:06:38PM 11 points [-]

When I first found this site, I read through all the OB posts chronologically, rather than reading the Sequences as sequences. So I got to see the history of several commenters, many of whom disagreed sharply with EY, with their disagreement evolving over several posts.

They tend to wander off after a while. Which is not surprising, as there is very little reward for it.

So I guess I'd ask this a different way: if you were an ethical philosopher whose positions disagreed with EY, what in this community would encourage you to post (or comment) about your disagreements?

Comment author: orthonormal 01 February 2011 05:18:59PM 5 points [-]

It seems to me that EY himself addressed all three of the objections you list (though of course this doesn't imply he addressed them adequately).

(1) It's incompatible with the datum that substantive, fundamental normative disagreement is in fact possible. People may share the concept of a normative reason, even if they fundamentally disagree about which features of actions are the ones that give us reasons.

Moral Error and Moral Disagreement confronts this.

My own thinking is that humans tend to have the same underlying (evolved) structures behind our hard-to-articulate meta-ethical heuristics, even when we disagree broadly on object-level ethical issues (and of course hand-pick our articulations of the meta-criteria to support our object-level beliefs- the whole machinery of bias applies here).

This implies both that my object-level beliefs can be at odds with the meta-level criteria (if this becomes too obvious for me to rationalize away, I'm more likely to change one or other object-level belief than to change the meta-level heuristic), and that you and I can disagree fundamentally on the object level while still believing that there's something in common which makes argumentation relevant to our disagreement.

Comment author: RichardChappell 01 February 2011 05:24:28PM 8 points [-]

Moral Error and Moral Disagreement confronts this

Yeah, I'm the "Richard4" in the comments thread there :-)

Comment author: orthonormal 01 February 2011 06:08:49PM *  11 points [-]

OK. I'll reply here because if I reply there, you won't get the notifications.

The crux of your argument, it seems to me, is the following intuition:

Rather, it is essential to the concept of morality that it involves shared standards common to all fully reasonable agents.

This is certainly a property we would want morality to have, and one which human beings naturally assume it must have – but is that the central property of it? Should it turn out that nothing which looks like morality has this property, does it logically follow that all morality is dead, or is that reaction just a human impulse?

(I will note, with all the usual caveats, that believing one's moral sentiments to be universal in scope and not based on preference is a big advantage in object-level moral arguments, and that we happen to be descended from the winners of arguments about tribal politics and morality.)

If a certain set of moral impulses involves shared standards common to, say, every sane human being, then moral arguments would still work among those human beings, in exactly the way you would want them to work across all intelligent beings. Frankly, that's good enough for me. Why give baby-eating aliens in another universe veto powers over every moral intuition of yours?

Comment author: RichardChappell 01 February 2011 06:35:50PM *  5 points [-]

Thanks for the reply -- I find this a very interesting topic. One thing I should clarify is that my view doesn't entail giving aliens "veto powers", as you put it; an alternative response is to take them to be unreasonable to intrinsically desire the eating of babies. That isn't an intrinsically desirable outcome (I take it), i.e. there is no reason to desire such a thing. Stronger still, we may think it intrinsically undesirable, so that insofar as an agent has such desires they are contrary to reason. (This requires a substantive notion of reason that goes beyond mere instrumental rationality, of course.)

In any case, I'd put the crux of my argument slightly differently. The core intuition is just that it's possible to have irresolvable moral disagreements. We can imagine a case where Bob is stubbornly opposed to abortion, and Jane is just as stubbornly in favour of it, and neither agent is disposed to change their mind in light of any additional information. EY's view would seem to imply that the two agents mustn't really disagree. And that just seems a mistake: it's part of our concept of morality that this very concept could be shared by someone who fundamentally (and irresolvably) disagrees with us about what the substantive moral facts are. This is because we're aspiring to conform our judgments to a standard that is outside of ourselves. (If you don't think there are any such objective standards, then that's just to say that there are no normative facts, given my concept of normativity.)

Comment author: cousin_it 01 February 2011 09:14:51PM *  7 points [-]

Richard, hello.

Human beings are analogous to computers. Morality and other aspects of behavior and cognition are analogous to programs. It is a type error to ask whether a program "really exists" somewhere outside a computer, or is "intrinsic" to a computer, or is "contingent", or something like that. Such questions don't correspond to observations within the world that could turn out one way or the other. You see a computer running a certain program and that's the end of the story.

Your mind is a program too, and your moral intuitions are how your algorithm feels from inside, not a direct perception of external reality (human beings are physically incapable of that kind of thing, though they may feel otherwise). I know for a fact that you have no astral gate in your head to pull answers from the mysterious source of morality. But this doesn't imply that your moral intuitions "should" be worthless to you and you "should" seek external authority! There's nothing wrong with mankind living by its internal moral lights.

Yes, it's possible that different computers will have different programs. Our world contains billions of similar "moist robots" running similar programs, perhaps because we were all created from design documents that are 99% identical for historical reasons, and also because we influence each other a lot. Your intuition that all "possible" sentient agents must share a common morality is unlikely to survive an encounter with any sentient agent that's substantially different from a human. We can imagine such agents easily, e.g. a machine that will search for proofs to Goldbach's conjecture and turn surrounding matter and energy into computing machinery to that end. Such a machine may be more ingenious than any human in creating other machines, discovering new physics, etc., but will never gravitate toward your intuition that one shouldn't kill babies. Most possible "intelligent agents" (aka algorithms that can hit small targets in large search spaces) aren't humans in funny suits.

Comment author: Vladimir_Nesov 01 February 2011 09:27:30PM *  6 points [-]

I expect Richard's memeset already includes an understanding of all your points, one that doesn't move his current position. You're probably exposing him to arguments he has already encountered, so there's little point in expecting a different result. I'm not saying that Richard can't be moved by argument, just not by a standard argument that is already known to have failed to move him. He probably even "agrees" with a lot of your points, just with a different and more sophisticated understanding than yours.

On the other hand, it might work for the benefit of more naive onlookers.

Comment author: cousin_it 01 February 2011 09:56:38PM *  10 points [-]

The intent of my comment wasn't to convince Richard (I never do that), but to sharpen our points and make him clarify whatever genuine insight he possesses and we don't.

Comment author: Vladimir_Nesov 01 February 2011 10:00:41PM 2 points [-]

That's a motivation I didn't consider. (Agreed.)

Comment author: RichardChappell 01 February 2011 10:08:49PM *  4 points [-]

Yeah, as Vladimir guessed, this is all familiar.

Your last paragraph suggests that you've misunderstood my view. I'm not making an empirical claim to the effect that all agents will eventually converge to our values -- I agree that that's obviously false. I don't even think that all formally intelligent agents are guaranteed to have normative concepts like 'ought', 'reason', or 'morality'. The claim is just that such a radically different agent could share our normative concepts (in particular, our aspiration to a mind-independent standard), even if they would radically disagree with us about which things fall under the concept. We could both have full empirical knowledge about our own and each other's desires/dispositions, and yet one (or both) of us might be wrong about what we really have reason to want and to do.

(Aside: the further claim about "reasons" in your last sentence presupposes a subjectivist view about reasons that I reject.)

Comment author: cousin_it 01 February 2011 10:32:45PM *  5 points [-]

What use is this concept of "reasonability"? Let's say I build an agent that wants to write the first 1000 Fibonacci numbers in mile-high digits on the Moon, except skipping the 137th one. When you start explaining to the agent that it's an "arbitrary omission" and it "should" amend its desires for greater "consistency", the agent just waves you off because listening to you isn't likely to further its current goals. Listening to you is not rational for the agent in the sense that most people on LW use the term: it doesn't increase expected utility. If by "rational" you mean something else, I'd like to understand what exactly.
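
A minimal sketch of the point (mine, not cousin_it's; the scenario and all names are hypothetical): an agent that ranks actions purely by expected utility under its own arbitrary goal has no slot where an appeal to "reasonability" can get a grip, because being lectured changes nothing it values.

    from dataclasses import dataclass

    # Indices of the Fibonacci numbers the agent wants inscribed (all but the 137th).
    WANTED = frozenset(n for n in range(1, 1001) if n != 137)

    @dataclass(frozen=True)
    class WorldState:
        inscribed: frozenset  # indices already written in mile-high digits on the Moon

    def utility(state):
        """The agent's terminal values: how many wanted inscriptions exist."""
        return len(WANTED & state.inscribed)

    def expected_utility(action, state):
        """Rank an action purely by the probability-weighted utility of its outcomes."""
        return sum(p * utility(s) for p, s in action(state))

    def inscribe_next(state):
        """Attempt the next wanted inscription; assume it succeeds with probability 0.9."""
        remaining = WANTED - state.inscribed
        if not remaining:
            return [(1.0, state)]
        nxt = min(remaining)
        return [(0.9, WorldState(state.inscribed | {nxt})), (0.1, state)]

    def listen_to_philosopher(state):
        """Hear that skipping #137 is an 'arbitrary omission'; nothing the agent values changes."""
        return [(1.0, state)]

    start = WorldState(frozenset())
    best = max((inscribe_next, listen_to_philosopher),
               key=lambda a: expected_utility(a, start))
    print(best.__name__)  # -> inscribe_next

"Maximizes expected utility" here is purely descriptive, which is the sense of "rational" cousin_it says most people on LW intend.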

Comment author: RichardChappell 02 February 2011 01:13:52AM *  1 point [-]

I mean 'rational' in the ordinary, indefinable sense, whereby calling a decision 'irrational' expresses a distinctive kind of criticism -- similar to that expressed by the words 'crazy', 'foolish', 'unwise', etc. (By contrast, you can just say "maximizes expected utility" if you really mean nothing more than maximizes expected utility -- but note that that's a merely descriptive concept, not a normative one.)

If you don't possess this concept -- if you never have thoughts about what's rational, over and above just what maximizes expected utility -- then I can't help you.

Comment author: cousin_it 02 February 2011 10:43:37AM *  1 point [-]

I don't think we can make progress with such imprecise thinking. Eliezer has a nice post about that.

Comment author: Vladimir_Nesov 01 February 2011 08:36:29PM *  3 points [-]

The core intuition is just that it's possible to have irresolvable moral disagreements.

What is the difference between an in-principle irresolvable disagreement (moral or otherwise), and talking past each other (i.e. talking of different subject matters, or from different argument-processing frameworks)?

Comment author: orthonormal 01 February 2011 08:38:36PM 4 points [-]

First, EY makes it abundantly clear that two agents can have a fundamental disagreement on values – it's just not the best (or most helpful) assumption when you're talking about two sane human beings with a vast sea of common frameworks and heuristics.

Secondly, I'm worried about what you're trying to do with words when you suggest we "take them to be unreasonable to intrinsically desire the eating of babies".

If you're making an empirical claim that an alien with fundamentally different terminal values will (say) be uninterested in negotiating mutually beneficial deals, or will make patently suboptimal decisions by its own criteria, or exhibit some other characteristic of what we mean by "unreasonable", then you'd need some strong evidence for that claim.

If instead you openly redefine "reasonable" to include "shares our fundamental moral standards", then the property

it is essential to the concept of morality that it involves shared standards common to all fully reasonable agents

becomes a tautology which no longer excludes "meta-semantic subjectivism", as you put it. So I'm puzzled what you mean.

Comment author: RichardChappell 01 February 2011 10:26:23PM 4 points [-]

Talking past each other a bit here. Let me try again.

EY makes it abundantly clear that two agents can have a fundamental disagreement on values

EY allows for disagreement in attitude: you might want one thing, while the babyeaters want something different. Of course I'm not charging him with being unable to accommodate this. The objection is instead that he's unable to accommodate disagreement in moral judgment (at the fundamental level). Normativity as mere semantics, and all that.

Your second point rests on a false dichotomy. I'm not making an empirical claim, but nor am I merely defining the word "reasonable". Rather, I'm making a substantive normative (non-empirical) hypothesis about which things are reasonable. If you can't make sense of the idea of a substantive non-empirical issue, you may have fallen victim to scientism.

Comment author: Vladimir_Nesov 01 February 2011 08:33:12PM 2 points [-]

an alternative response is to take them to be unreasonable to intrinsically desire the eating of babies

What fact have you established by manipulating the definition of a word in this manner? I want a meta-ethical theory that at least describes baby-eaters, because I don't expect to have an object-level understanding of human morality that is substantially more accurate than what you'd get if you add baby-eating impulses to it.

Comment author: orthonormal 01 February 2011 05:42:39PM 1 point [-]

Ah! Sorry for carrying coals to Newcastle, then. Let me catch up in that thread.

Comment author: utilitymonster 31 January 2011 01:39:04PM 5 points [-]

Yes, this is what I thought EY's theory was. EY? Is this your view?

Comment author: Wei_Dai 27 June 2011 05:50:20PM *  4 points [-]

This summary of Eliezer's position seems to ignore the central part about computation. That is, Eliezer does not say that 'Right' means 'promotes external goods X, Y and Z' but rather that it means a specific computation that can be roughly characterized as 'renormalizing intuition'

I see the project of morality as a project of renormalizing intuition. We have intuitions about things that seem desirable or undesirable, intuitions about actions that are right or wrong, intuitions about how to resolve conflicting intuitions, intuitions about how to systematize specific intuitions into general principles.

which eventually outputs something like 'promotes external goods X, Y and Z'. I think Eliezer would argue that at least some of the objections listed here are not valid if we add the part about computation. (Specifically, disagreements and fallibility can result from lack of logical omniscience regarding the output of the 'morality' computation.)

Is the reason for skipping over this part of Eliezer's idea that standard (Montague) semantic theory treats all logically equivalent language as having the same intension? (I believe this is known as "the logical omniscience problem" in linguistics and philosophy of language.)

Comment author: RichardChappell 27 June 2011 08:55:41PM *  3 points [-]

The part about computation doesn't change the fundamental structure of the theory. It's true that it creates more room for superficial disagreement and fallibility (of similar status to disagreements and fallibility regarding the effective means to some shared terminal values), but I see this as an improvement in degree and not in kind. It still doesn't allow for fundamental disagreement and fallibility, e.g. amongst logically omniscient agents.

(I take it to be a metaethical datum that even people with different terminal values, or different Eliezerian "computations", can share the concept of a normative reason, and sincerely disagree about which (if either) of their values/computations is correctly tracking the normative reasons. Similarly, we can coherently doubt whether even our coherently-extrapolated volitions would be on the right track or not.)

Comment author: Wei_Dai 28 June 2011 04:01:05AM 5 points [-]

It still doesn't allow for fundamental disagreement and fallibility, e.g. amongst logically omniscient agents.

It's not clear to me why there must be fundamental disagreement and fallibility, e.g. amongst logically omniscient agents. Can you refer me to an argument or intuition pump that explains why you think that?

Comment author: RichardChappell 30 June 2011 03:55:55PM 3 points [-]

One related argument is the Open Question Argument: for any natural property F that an action might have, be it promotes my terminal values, or is the output of an Eliezerian computation that models my coherent extrapolated volition, or whatever the details might be, it's always coherent to ask: "I agree that this action is F, but is it good?"

But the intuitions that any metaethics worthy of the name must allow for fundamental disagreement and fallibility are perhaps more basic than this. I'd say they're just the criteria that we (at least, many of us) have in mind when insisting that any morality worthy of the name must be "objective", in a certain sense. These two criteria are proposed as capturing that sense of objectivity that we have in mind. (Again, don't you find something bizarrely subjectivist about the idea that we're fundamentally morally infallible -- that we can't even question whether our fundamental values / CEV are really on the right track?)

Comment author: Wei_Dai 02 July 2011 01:57:47AM *  9 points [-]

I'd say they're just the criteria that we (at least, many of us) have in mind when insisting that any morality worthy of the name must be "objective", in a certain sense.

What would you say to someone who does not share your intuition that such "objective" morality likely exists?

My main problem with objective morality is that while it's hard to deny that there seem to be mind-independent moral facts like "pain is morally bad", there doesn't seem to be enough such facts to build an ethical system out of them. What natural phenomena count as pain, exactly? How do we trade off between pain and pleasure? How do we trade off between pain in one person, and annoyance in many others? How do we trade off pain across time (i.e., should we discount future pain, if so how)? Across possible worlds? How do we morally treat identical copies? It seems really hard, perhaps impossible, to answer these questions without using subjective preferences or intuitions that vary from person to person, or worse, just picking arbitrary answers when we don't even have any relevant preferences or intuitions. If it turns out that such subjectivity and/or arbitrariness can't be avoided, that would be hard to square with objective morality actually existing.

(Again, don't you find something bizarrely subjectivist about the idea that we're fundamentally morally infallible -- that we can't even question whether our fundamental values / CEV are really on the right track?)

I do think there's something wrong with saying that we can't question whether CEV is really on the right track. But I wouldn't use the words "bizarrely subjectivist". To me the problem is just that I clearly can and do question whether CEV is really on the right track. Fixing this seems to require retreating quite a bit from Eliezer's metaethical position (but perhaps there is some other solution that I'm not thinking of). At this point I would personally take the following (minimalist) position:

  1. At least some people, at least some of the time, refer to the same concept by "morality" as me and they have substantive disagreements over its nature and content.
  2. I'm not confident about any of its properties.
  3. Running CEV (if it were practical to) seems like a good way to learn more about the nature and content of morality, but there may be (probably are) better ways.
Comment author: Vladimir_Nesov 02 July 2011 10:13:20AM 3 points [-]

If it turns out that such subjectivity and/or arbitrariness can't be avoided, that would be hard to square with objective morality actually existing.

Compare with formal systems giving first-order theories of the standard model of the natural numbers. You can't specify the whole thing, and at some point you run into (independent of what comes before) statements for which it's hard to decide whether they hold for the standard naturals, and so you could add to the theory either those statements or their negation. Does this break the intuition that there is some intended structure corresponding to the natural numbers, or, more pragmatically, that we can still usefully seek better theories that capture it? For me, it doesn't in any obvious way.
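
A concrete instance of the kind of independent statement gestured at here (standard material, added for illustration): assuming Peano arithmetic is consistent and sound, Gödel's second incompleteness theorem gives

    \mathrm{PA} \nvdash \mathrm{Con}(\mathrm{PA}), \qquad \mathrm{PA} \nvdash \neg\mathrm{Con}(\mathrm{PA}), \qquad \text{yet } \mathbb{N} \models \mathrm{Con}(\mathrm{PA}),

so the consistency statement is undecided by the theory even though it has a determinate truth value in the intended structure, which is exactly the situation Vladimir_Nesov is comparing morality to.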

Comment author: Wei_Dai 02 July 2011 04:37:59PM 2 points [-]

It seems to be an argument in favor of arithmetic being objective that almost everyone agrees that a certain set of axioms correctly characterizes what natural numbers are (even if incompletely), and from that set of axioms we can derive much (even if not all) of what we want to know about the properties of natural numbers. If arithmetic were in the same situation as morality is today, it would be much harder (i.e., more counterintuitive) to claim that (1) everyone is referring to the same thing by "arithmetic" and "natural numbers" and (2) arithmetic truths are mind-independent.

To put it another way, conditional on objective morality existing, you'd expect the situation to be closer to that of arithmetic. Conditional on it not existing, you'd expect the situation to be closer to what it actually is.

Comment author: RichardChappell 02 July 2011 03:59:52PM 2 points [-]

What would you say to someone who does not share your intuition that such "objective" morality likely exists?

I'd say: be an error theorist! If you don't think objective morality exists, then you don't think that morality exists. That's a perfectly respectable position. You can still agree with me about what it would take for morality to really exist. You just don't think that our world actually has what it takes.

Comment author: Wei_Dai 03 July 2011 12:30:05AM 4 points [-]

Yes, that makes sense, except that my intuition that objective morality does not exist is not particularly strong either. I guess what I was really asking was, do you have any arguments to the effect that objective morality exists?

Comment author: orthonormal 01 February 2011 05:29:21PM *  3 points [-]

(2) The semantic tricks merely shift the lump under the rug, they don't get rid of it. Standard worries about relativism re-emerge, e.g. an agent can know a priori that their own fundamental values are right, given how the meaning of the word 'right' is determined. This kind of (even merely 'fundamental') infallibility seems implausible.

EY bites this bullet in the abstract, but notes that it does not apply to humans. An AI with a simple utility function and full ability to analyze its own source code can be quite sure that maximizing that function is the meaning of "that-AI-right" in the sense EY is talking about.

But there is no analogue to that situation in human psychology, given how much we now know about self-deception, our conscious and unconscious mental machinery, and the increasing complexity of our values the more we think on them. We can, it's true, say that "the correct extrapolation of my fundamental values is what's right for me to do", but this doesn't settle whether value X is or is not a member of that set. The actual work of extrapolating human values (through moral arguments and other methods) still has to be done.

So practical objections to this sort of bullet-biting don't apply to this metaethics; are there any important theoretical objections?

EDIT: Changed "right" to "that-AI-right". Important clarification.

Comment author: TheOtherDave 01 February 2011 06:14:02PM 0 points [-]

Agreed that on EY's view (and my own), human "fundamental values" (1) have not yet been fully articulated/extrapolated; that we can't say with confidence whether X is in that set.

But AFAICT, EY rejects the idea (which you seem here to claim that he endorses?) that an AI with a simple utility function can be sure that maximizing that function is the right thing to do. It might believe that maximizing that function is the right thing to do, but it would be wrong. (2)

AFAICT this is precisely what RichardChappell considers implausible: the idea that unlike the AI, humans can correctly believe that maximizing their utility function is the right thing to do.

==

(1) Supposing there exist any such things, of which I am not convinced.

(2) Necessarily wrong, in fact, since on EY's view as I understand it there's one and only one right set of values, and humans currently implement it, and the set of values humans implement is irreducibly complex and therefore cannot be captured by a simple utility function. Therefore, an AI maximizing a simple utility function is necessarily not doing the right thing on EY's view.

Comment author: orthonormal 01 February 2011 08:14:33PM *  0 points [-]

Sorry, I meant to use the two-place version; it wouldn't be what's right; what I meant is that the completely analogous concept of "that-AI-right" would consist simply of that utility function.

Comment author: TheOtherDave 01 February 2011 09:18:40PM 1 point [-]

To the extent that you are still talking about EY's views, I still don't think that's correct... I think he would reject the idea that "that-AI-right" is analogous to right, or that "right" is a 2-place predicate.

That said, given that this question has come up elsethread and I'm apparently in the minority, and given that I don't understand what all this talk of right adds to the discussion in the first place, it becomes increasingly likely that I've just misunderstood something.

In any case, I suspect we all agree that the AI's decisions are motivated by its simple utility function in a manner analogous to how human decisions are motivated by our (far more complex) utility function. What disagreement exists, if any, involves the talk of "right" that I'm happy to discard altogether.

Comment author: Kutta 31 January 2011 10:49:06AM *  3 points [-]

(1): I think it's a prominent naturalistic feature; as EY said above, in a physical universe there are only quantum amplitudes, and if two agents have sufficiently accurate knowledge about the physical configuration of something, including their respective minds, they have to agree about that configuration, regardless of the fact that they may have different values.

(2): I'm personally a bit confused about Eliezer's constant promotion of a language that de-subjectivizes morality. In most debates "objective" and "subjective" may entail a confusion when viewed in a naturalistic light; however, as I understand it, Eliezer's stance does boil down to a traditionally subjective viewpoint in the sense that it opposes the religious notion of morality as light shining down from the skies (and the notion of universally compelling arguments).

In regards to infallibility, an agent at most times has imperfect knowledge of right; I can't see how subjectivity entails infallibility. I don't even have perfect access to my current values, and there is also a huge set of moral arguments that would compel me to modify my current values if I heard them.

(3) The "why right means promoting X and Y" question is addressed by a recursive justification as discussed here and very specifically in the last paragraphs of Meaning of Right. If I ask "why should I do what is right?", that roughly means "why should I do what I should do?" or "why is right what is right?". I happen to be a mind that is compelled by a certain class of moral arguments, and I can reflect on this fact using my current mind, and, naturally, find that I'm compelled by a certain class of moral arguments.

EDIT: see also komponisto's comment.

Comment author: RichardChappell 01 February 2011 05:00:54PM 1 point [-]

re: infallibility -- right, the objection is not that you could infallibly know that XYZ is right. Rather, the problem is that you could infallibly know that your fundamental values are right (though you might not know what your fundamental values are).

Comment author: Kutta 02 February 2011 11:31:48AM *  3 points [-]

Rephrased, this knowledge is just the notion that you instantiate some computation instead of not doing (or being) anything. This way, my confidence in its truth is very high, although of course not 1.

Comment author: RichardChappell 02 February 2011 11:21:52PM 3 points [-]

We know we instantiate some computation. But it's a pre-theoretic datum that we don't know that our fundamental values are right. So EY's theory misdescribes the concept of rightness.

(This is basically a variation on Moore's Open Question Argument.)

Comment author: cousin_it 03 February 2011 12:16:36PM *  4 points [-]

are right

Huh?

I'd be okay with a strong AI that correctly followed my values, regardless of whether they're "right" by any other criterion.

If you think you wouldn't be okay with such an AI, I suspect the most likely explanation is that you're confused about the concept of "your values". Namely, if you yearn to discover some simple external formula like the categorical imperative and then enact the outcomes prescribed by that formula, then that's just another fact about your personal makeup that has to be taken into account by the AI.

And if you agree that you would be okay with such an AI, that means Eliezer's metaethics is adequate for its stated goal (creating friendly AI), whatever other theoretical drawbacks it might have.

Comment author: lukeprog 31 January 2011 01:15:55AM 2 points [-]

Thanks, Richard, for putting so much effort into your comment! When I find the time to parse this, I'll come back here to comment.

Comment author: lukeprog 09 March 2011 03:04:14AM 1 point [-]

Thinking more about this, it may have been better if Eliezer had not framed his meta-ethics sequence around "the meaning of right."

If we play rationalist's taboo with our moral terms and thus avoid moral terms altogether, what Eliezer seems to be arguing is that what we really care about is not (a) that whatever states of affairs our brains are wired to respond to with reward signals be realized, but (b) that we experience peace and love and harmony and discovery and so on.

His motivation for thinking this way is a thought experiment - which might become real in the relatively near future - about what would happen if a superintelligent machine could rewire our brains. If what we really care about is (a), then we shouldn't object if the superintelligent machine rewires our brains to send reward signals only when we are sitting in a jar. But we would object to that scenario. Thus, what we care about seems not to be (a) but (b).

In a meta-ethicist's terms, we could interpret Eliezer not as making an argument about the meaning of moral terms, but instead as making an argument that (b) is what gives us Reasons, not (a).

Now, all this meta-babble might not matter much. I'm pretty sure even if I was persuaded that the correct meta-ethical theory states that I should be okay with releasing a superintelligence that would rewire me to enjoy sitting in a jar, I would do whatever I could to prevent such a scenario and instead promote a superintelligence that would bring peace and joy and harmony and discovery and so on.

Comment author: Nisan 09 March 2011 03:25:19AM 1 point [-]

I thought being persuaded of a metaethical theory entails that whenever the theory tells you you should do X, you would feel compelled to do X.

Comment author: lukeprog 09 March 2011 03:56:42AM 2 points [-]

Only if motivational internalism is true. But motivational internalism is false.

Comment author: [deleted] 09 March 2011 04:00:07AM 0 points [-]

What's that?

Comment author: lukeprog 09 March 2011 04:41:07AM 0 points [-]
Comment author: [deleted] 09 March 2011 04:54:07AM 1 point [-]

I could get into how much I hate this kind of rejoinder if you bait me some more. I wasn't asking you for the number of acres in a square mile. Let me just rephrase:

I hadn't heard of motivational internalism before, could you expand your comment?

Comment author: [deleted] 09 March 2011 03:48:38AM 1 point [-]

This is a cool formulation. It's interesting that there are other things that can happen to you not similar to "being persuaded of a metaethical theory" that entail that whenever you are told to do X you're compelled to do X. (Voodoo or whatever.)

Comment author: Vladimir_Nesov 09 March 2011 11:53:08AM *  0 points [-]

what Eliezer seems to be arguing is that what we really care about is not (a) that whatever states of affairs our brains are wired to respond to with reward signals be realized, but (b) that we experience peace and love and harmony and discovery and so on.

His motivation for thinking this way is a thought experiment - which might become real in the relatively near future - about what would happen if a superintelligent machine could rewire our brains. If what we really care about is (a), then we shouldn't object if the superintelligent machine rewires our brains to send reward signals only when we are sitting in a jar.

I don't see what plausible reasoning process could lead you to infer this unlikely statement (about motivation, given how many details would need to be just right for the statement to happen to be true).

Also, even if you forbid modifying the definition of the human brain, things that initiate high-reward signals in our brains (or that we actually classify as "harmony" or "love") are very far from what we care about, just as whatever a calculator actually computes is not the same kind of consideration as the logically correct answer, even if you use a good calculator and aren't allowed to sabotage it. There are many reasons (and contexts) for reward in the human brain not to be treated as indicative of the goodness of a situation.

Comment author: lukeprog 09 March 2011 12:44:26PM *  0 points [-]

I don't understand your second paragraph. It sounds like you are agreeing with me, but your tone suggests you think you are disagreeing with me.

Comment author: Vladimir_Nesov 09 March 2011 04:02:49PM 1 point [-]

It was an explanation for why your thought experiment provides a bad motivation: we can just forbid modification of human brains to stop the thought experiment from getting through, but that would still leave a lot of problems, which shows that just this thought experiment is not sufficient motivation.

Comment author: lukeprog 09 March 2011 07:35:02PM 2 points [-]

Sure, the superintelligence thought experiment is not the full story.

One problem with the suggestion of writing a rule to not alter human brains comes in specifying how the machine is not allowed to alter human brains. I'm skeptical about our ability to specify that rule in a way that does not lead to disastrous consequences. After all, our brains are being modified all the time by the environment, by causes that are on a wide spectrum of 'direct' and 'indirect.'

Other problems with adding such a rule are given here.

Comment author: Vladimir_Nesov 09 March 2011 08:22:03PM 2 points [-]

(I meant that the subjective experience that evaluates situations should be specified using unaltered brains, not that brains shouldn't be altered.)

Comment author: lukeprog 09 March 2011 09:18:57PM 0 points [-]

You've got my curiosity. What does this mean? How would you realize that process in the real world?

Comment author: Vladimir_Nesov 10 March 2011 10:55:54AM 1 point [-]

Come on, this tiny detail isn't worth the discussion. The classical solution to wireheading: asking the original and not the one under the influence; referring to you-at-a-certain-time and not just a you-concept that resolves to something unpredicted at any given future time in any given possible world; rigid-designator-in-time.

Comment author: lessdazed 02 July 2011 01:08:15PM 1 point [-]

What is objection (1) saying? That asserting there are moral facts is incompatible with the fact that people disagree about what they are? Specifically, when people agree that there is such a thing as a reason that applies to both of them, they disagree about how the reason is caused by reality?

Do we not then say they are both wrong about there being one "reason"?

I speak English(LD). You speak English(RC). The difference between our languages is of the same character as that between a speaker of Spanish and a speaker of French. I say "I" and you correctly read it as referring to lessdazed. You say "I" and I correctly read it as referring to RichardChappell. I have reasons(LD). You have reasons(RC). Do you think that were we perfect at monitoring what we each meant when we said anything and knew the relevant consequences of actions, the two of us would be capable of disagreeing when one of us asserted something in a sentence using the word "moral"? Why?

Or have I misread things?

Comment author: RichardChappell 02 July 2011 04:09:13PM *  3 points [-]

That asserting there are moral facts is incompatible with the fact that people disagree about what they are?

No, I think there are moral facts and that people disagree about what they are. But such substantive disagreement is incompatible with Eliezer's reductive view on which the very meaning of 'morality' differs from person to person. It treats 'morality' like an indexical (e.g. "I", "here", "now"), which obviously doesn't allow for real disagreement.

Compare: "I am tall." "No, I am not tall!" Such an exchange would be absurd -- the people are clearly just talking past each other, since there is no common referent for 'I'. But moral language doesn't plausibly function like this. It's perfectly sensible for one person to say, "I ought to have an abortion", and another to disagree: "No, you ought not to have an abortion". (Even if both are logically omniscient.) They aren't talking past each other. Rather, they're disagreeing about the morality of abortion.
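A minimal sketch in Python of how an indexical term blocks genuine disagreement; the speakers, heights, and threshold below are invented for illustration and are not from the thread:

```python
# Toy semantics for an indexical: "I am tall" is evaluated relative to the
# speaker, so two speakers saying "I am tall" / "I am not tall" can both speak
# truly -- there is no single proposition for them to disagree about.

heights_cm = {"Alice": 185, "Bob": 160}

def i_am_tall(speaker, threshold_cm=175):
    # "I" resolves to whoever is speaking, so the proposition expressed
    # differs from speaker to speaker.
    return heights_cm[speaker] >= threshold_cm

print(i_am_tall("Alice"))       # True  ("I am tall", said by Alice)
print(not i_am_tall("Bob"))     # True  ("I am not tall", said by Bob)
# Both utterances come out true; they were never about the same claim.
```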

Comment author: lessdazed 02 July 2011 04:59:41PM 2 points [-]

But moral language doesn't plausibly function like this.

It's not plausible(RC, 7/1/2011 4:25 GMT), but it is plausible(LD, 7/1/2011 4:25 GMT).

Compare: "I am tall." "No, I am not tall!" Such an exchange would be absurd -- the people are clearly just talking past each other, since there is no common referent for 'I'.

It's not impossible for people to be confused in exactly such a way.

It's perfectly sensible for one person to say, "I ought to have an abortion", and another to disagree: "No, you ought not to have an abortion". (Even if both are logically omniscient.)

That's begging the question.

That intuition pump imagines intelligent people disagreeing, finds it plausible, notices that intelligent people disagreeing proves nothing, then replaces the label "intelligent" with "omniscient" (since that, if proven, would prove something) without showing the work that would make the replacement valid. If the work could be shown, the intuition pump wouldn't be very valuable, as one could just use the shown work for persuasion rather than the thought experiment with the disagreeing people. I strongly suspect that the reason the shown work is unavailable is that it does not exist.

Eliezer's reductive view on which the very meaning of 'morality' differs from person to person.

Forget morality for one second. Doesn't the meaning of the word "hat" differ from person to person?

It's perfectly sensible for one person to say, "I ought to have an abortion"

It's only sensible to say if/because context forestalls equivocation (or tries to, anyway). Retroactively removing the context by coming into the conversation with a different meaning of "ought" (even if the first meaning of "ought" was "objective values, as I think they are, as I think I want them to be, that are universally binding on all possible minds, and that I would maintain under any coherent extrapolation of my values", where the first person is wrong about those facts, and the second meaning of "ought" is the first person's extrapolated volition) introduces equivocation. It's really analogous to saying "No, I am not tall".

Where the first person says "X would make me happy, I want to feel like doing X, and others will be better off according to balancing equation Y if I do X, and the word "ought" encompasses when those things coincide according to objective English, so I ought to do X", and the second person says "X would make you happy, you want to feel like doing X, and others will not be better off according to balancing equation Z if you do X, and the word "ought" encompasses when those things coincide according to objective English, so you ought not do X", they are talking past each other. Purported debates about the true meaning of "ought" reveal that everyone has their own balancing equation, and the average person thinks all others are morally obliged by objective morality to follow his or her equation. In truth, the terms "make happy" and "want to feel like doing" are rolled up into the balancing equation, but within it (for Westerners) terms for self and others seem as if they are of a different kind.

Comment author: RichardChappell 03 July 2011 04:05:03PM 2 points [-]

Purported debates about the true meaning of "ought" reveal that everyone has their own balancing equation, and the average person thinks all others are morally obliged by objective morality to follow his or her equation.

You're confusing metaethics and first-order ethics. Ordinary moral debates aren't about the meaning of "ought". They're about the first-order question of which actions have the property of being what we ought to do. People disagree about which actions have this property. They posit different systematic theories (or 'balancing equations', as you put it) as hypotheses about which actions have the property. They aren't stipulatively defining the meaning of 'ought', or else their claim that "You ought to follow the prescriptions of balancing equation Y" would be tautological, rather than the substantive claim it is obviously meant to be.

Comment author: orthonormal 01 February 2011 05:40:24PM 1 point [-]

(3) Just as simple subjectivism is an implausible theory of what 'right' means, so Eliezer's meta-semantic subjectivism is an implausible theory of why 'right' means promoting external goods X, Y, Z. An adequately objective metaethics shouldn't even give preferences a reference-fixing role.

This seems to me like begging the question. Can you expand on this?

Comment author: Matt_Simpson 29 January 2011 10:36:31PM *  10 points [-]

In a nutshell, Eliezer's metaethics says you should maximize your preferences whatever they may be, or rather, you should_you maximize your preferences, but of course you should_me maximize my preferences. (Note that I said preferences and not utility function. There is no assumption that your preferences HAVE to be a utility function, or at least I don't think so. Eliezer might have a different view.) So ethics is reduced to decision theory. In addition, according to Eliezer, humans have tremendous value uncertainty. That is, we don't really know what our terminal values are, so we don't really know what we should be maximizing. The last part, and the most controversial around here I think, is that Eliezer thinks that human preferences are similar enough across humans that it makes sense to think about should_human.
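A minimal Python sketch of the should_X notation as relativized evaluation; the agents, actions, and utility numbers below are invented for illustration and are not from the thread:

```python
# Toy rendering of should_X: each agent evaluates any action (no matter whose)
# against its *own* preferences. There is one evaluation function per agent,
# not one universal "should".

preferences = {
    "Matt":   {"donate": 10, "steal": -50},
    "Luke":   {"donate": 7,  "steal": -40},
    "Clippy": {"donate": -1, "steal": 3},
}

def should(evaluator, action):
    """should_<evaluator>: does <evaluator>'s preference ordering recommend this action?"""
    utilities = preferences[evaluator]
    return utilities[action] == max(utilities.values())

# should_Matt and should_Luke happen to agree here; should_Clippy does not.
for agent in preferences:
    print(agent, should(agent, "donate"))
```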

There are some further details, but that's the nutshell description. The big break from many philosophers, I think, is considering one's own preferences the foundation of ethics. But really, this is in Hume (on one interpretation).

edit: I should add that the language I'm using to describe EY's theory is NOT the language that he uses himself. Some people find my language more enlightening (me, for one), others find EY's more enlightening. Your mileage may vary.

Comment author: wedrifid 30 January 2011 04:41:58AM 7 points [-]

In a nutshell, Eliezer's metaethics says you should maximize your preferences whatever they may be, or rather, you should_you maximize your preferences, but of course you should_me maximize my preferences. (Note that I said preferences and not utility function.

Eliezer is a bit more aggressive in the use of 'should'. What you are describing as should<matt> Eliezer has declared to be would_want<matt>, while 'should' is implicitly would_want<Eliezer>, with no allowance for generic instantiation. That is, he is comfortable answering "What should a Paperclip Maximiser do when faced with Newcomb's problem?" with "Rewrite itself to be an FAI".

There have been rather extended (and somewhat critical) discussions in comment threads of Eliezer's slightly idiosyncratic usage of 'should' and related terminology but I can't recall where. I know it was in a thread not directly related to the subject!

Comment author: Matt_Simpson 30 January 2011 08:54:28PM 3 points [-]

You're right about Eliezer's semantics. Count me as one of those who thought his terminology was confusing, which is why I don't use it when I try to describe the theory to anyone else.

Comment author: lessdazed 02 July 2011 05:30:09PM *  0 points [-]

Are you sure? I thought "should" could mean would_want<being with aggregated/weighted [somehow] desires of all humanity>. Note that I could follow this by saying "That is, he is comfortable answering "What should a Paperclip Maximiser do when faced with Newcomb's problem?" with "Rewrite itself to be an FAI".", but that would be affirming the consequent ;-), i.e. I know he says such a thing, but my and your formulation both plausibly explain it, as far as I know.

Comment author: Raemon 30 January 2011 02:45:37AM 2 points [-]

I had a hard time parsing "you should_you maximize your preferences, but of course you should_me maximize my preferences." Can someone break that down without jargon and/or explain how the "should_x" jargon works?

Comment author: Broggly 30 January 2011 04:09:37AM 1 point [-]

I think the difficulty is that in English "You" is used for "A hypothetical person". In German they use the word "Man" which is completely distinct from "Du". It might be easier to parse as "Man should_Raemon maximize Raemon's preferences, but of course man should_Matt maximize Matt's preferences."

On the jargon itself, Should_X means "Should, as X would understand it".

Comment author: XiXiDu 30 January 2011 12:59:59PM 2 points [-]

"Man" is the generalization of the personal subject. You can translate it with "one".

Comment author: NihilCredo 30 January 2011 06:50:25AM *  1 point [-]

I think it's better phrased by putting Man in all instances of Raemon.

Also: \ is the escape character on LW, so if you want to type an actual asterisk or underscore (or \ itself), instead of using it for formatting purposes, put a \ in front of it. This way they will not be interpreted as marking lists, italics, or bold.

Comment author: ata 30 January 2011 12:25:53AM *  2 points [-]

(Note that I said preferences and not utility function. There is no assumption that your preferences HAVE to be a utility function, or at least I don't think so. Eliezer might have a different view).

Your preferences are a utility function if they're consistent, but if you're a human, they aren't.

Comment author: lukeprog 29 January 2011 11:23:14PM *  2 points [-]

I'd appreciate clarification on what you mean by "You should_me maximize my preferences."

I understand that the "objective" part is that we could both come to agree on the value of should_you and the value of should_me, but what do you mean when you say that I should_MattSimpson maximize your preferences?

I certainly balk at the suggestion that there is a should_human, but I'd need to understand Eliezer in more detail on that point.

And yes, if one's own preferences are the foundation of ethics, most philosophers would simply call this subject matter practical rationality rather than morality. "Morality" is usually thought to be a term that refers to norms with a broader foundation and perhaps even "universal bindingness" or something. On this point, Eliezer just has an unusual way of carving up concept space that will confuse many people. (And this is coming from someone who rejects the standard analytic process of "conceptual analysis", and is quite open to redefining terms to make them more useful and match the world more cleanly.)

Also, even if you think that the only reasons for action that exist come from relations between preferences and states of affairs, there are still ways to see morality as a system of hypothetical imperatives that is "broader" (and therefore may fit common use of the term "morality" better) than Eliezer's meta-ethical theory. See for example Peter Railton or 1980s Philippa Foot or, well, Alonzo Fyfe and Luke Muehlhauser.

We already have a term that matches Eliezer's use of "ought" and "should" quite nicely: it's called the "prudential ought." The term "moral ought" is usually applied to a different location in concept space, whether or not it successfully refers.

Anyway, are my remarks connecting with Eliezer's actual stated position, do you think?

Comment author: Matt_Simpson 29 January 2011 11:56:24PM 3 points [-]

but what do you mean when you say that I should_MattSimpson maximize your preferences?

I mean that according to my preferences, you, me, and everyone else should maximize them. If you ask what should_MattSimpson be done, the short answer is maximize my preferences. Similarly, if you ask what should_lukeprog be done, the short answer is to maximize your preferences. It doesn't matter who does the asking. If you ask what should_agent be done, the answer is to maximize that agent's preferences. There is no "should," only should_agent's. (Note: Eliezer calls should_human "should." I think it's an error of terminology, personally. It obscures his position somewhat.)

We already have a term that matches Eliezer's use of "ought" and "should" quite nicely: it's called the "prudential ought." The term "moral ought" is usually applied to a different location in concept space, whether or not it successfully refers.

Then Eliezer's position is that all normativity is prudential normativity, but without the pop-culture connotations that come with this position. In other words, this doesn't mean you can "do whatever you want." You probably do, in fact, value other people; you're a human, after all. So murdering them is not ok, even if you know you can get away with it. (Note that this last conclusion might be salvageable even if there is no should_human.)

As for why Eliezer (and others here) think there is a should_human (or that human values are similar enough to talk about such a thing), the essence of the argument rests on ev-psych, but I don't know the details beyond "ev-psych suggests that our minds would be very similar."

Comment author: lukeprog 30 January 2011 12:02:15AM *  2 points [-]

Okay, that makes sense.

Does Eliezer claim that murder is wrong for every agent? I find it highly likely that in certain cases, an agent's murder of some person will best satisfy that agent's preferences.

Comment author: Matt_Simpson 30 January 2011 09:02:03PM 2 points [-]

Murder is certainly not wrong_x for every agent x - we can think of an agent with a preference for people being murdered, even itself. However, it is almost always wrong_MattSimpson and (hopefully!) almost always wrong_lukeprog. So it depends on which question you are asking. If you're asking "is murder wrong_human for every agent?" Eliezer would say yes. If you're asking "is murder wrong_x for every agent x?" Eliezer would say no.

(I realize it was clear to both you and me which of the two you were asking, but for the benefit of confused readers, I made sure everything was clear)

Comment author: TheOtherDave 30 January 2011 09:06:22PM *  3 points [-]

I would be very surprised if EY gave those answers to those questions.

It seems pretty fundamental to his view of morality that asking about "wrong_human" and "wrong_x" is an important mis-step.

Maybe murder isn't always wrong, but it certainly doesn't depend (on EY's view, as I understand it) on the existence of an agent with a preference for people being murdered (or the absence of such an agent).

Comment author: Matt_Simpson 30 January 2011 09:20:12PM *  2 points [-]

Maybe murder isn't always wrong, but it certainly doesn't depend (on EY's view, as I understand it) on the existence of an agent with a preference for people being murdered (or the absence of such an agent).

That's because for EY, "wrong" and "wrong_human" mean the same thing. It's semantics. When you ask "is X right or wrong?" in the everyday sense of the term, you are actually asking "is X right_human or wrong_human?" But if murder is wrong_human, that doesn't mean it's wrong_clippy, for example. In both cases you are just checking a utility function, but different utility functions give different answers.

Comment author: TheOtherDave 30 January 2011 09:58:24PM *  3 points [-]

It seems clear from the metaethics posts that if a powerful alien race comes along and converts humanity into paperclip-maximizers, such that making many paperclips comes to be right_human, EY would say that making many paperclips doesn't therefore become right.

So it seems clear that at least under some circumstances, "wrong" and "wrong_human" don't mean the same thing for EY, and that at least sometimes EY would say that "is X right or wrong?" doesn't depend on what humans happen to want that day.

Now, if by "wrong_human" you don't mean what humans would consider wrong the day you evaluate it, but rather what is considered wrong by humans today, then all of that is irrelevant to your claim.

In that case, yes, maybe you're right that what you mean by "wrong_human" is also what EY means by "wrong." But I still wouldn't expect him to endorse the idea that what's wrong or right depends in any way on what agents happen to prefer.

Comment author: Matt_Simpson 30 January 2011 10:55:54PM *  2 points [-]

It seems clear from the metaethics posts that if a powerful alien race comes along and converts humanity into paperclip-maximizers, such that making many paperclips comes to be right_human

No one can change right_human; it's a specific utility function. You can change the utility function that humans implement, but you can't change right_human. That would be like changing e^x or 2 to something else. In other words, you're right about what the metaethics posts say, and that's what I'm saying too.

edit: or what jimrandomh said (I didn't see his comment before I posted mine)
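A minimal Python sketch of the rigid-designator point being made here: "right" is pinned to the values humans actually hold when the concept is fixed, so later tampering changes which function the modified humans implement, not which function "right" names. The value weights and outcome below are invented for illustration:

```python
import copy

# The values humans happen to implement (toy weights, purely illustrative).
human_values = {"fun": 1.0, "fairness": 1.0, "paperclips": 0.0}

# Rigidify: "right" now names this particular function, frozen by value.
right_human = copy.deepcopy(human_values)

def score(outcome, values):
    return sum(values.get(k, 0.0) * v for k, v in outcome.items())

# Aliens convert humanity into paperclip maximizers...
human_values.update({"fun": 0.0, "fairness": 0.0, "paperclips": 1.0})

outcome = {"paperclips": 1000}
print(score(outcome, human_values))  # 1000.0 by the modified humans' lights
print(score(outcome, right_human))   # 0.0 -- what "right" names did not change
```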

Comment author: Lightwave 01 February 2011 10:11:03AM *  1 point [-]

What if we use 'human' as a rigid designator for unmodified-human? Then in case aliens convert people into paperclip-maximizers, they're no longer human, hence right_human no longer applies to them, but itself remains unchanged.

Comment author: jimrandomh 30 January 2011 10:54:55PM 2 points [-]

It seems clear from the metaethics posts that if a powerful alien race comes along and converts humanity into paperclip-maximizers, such that making many paperclips comes to be right_human, EY would say that making many paperclips doesn't therefore become right.

In that case, we would draw a distinction between right_unmodifiedhuman and right_modifiedhuman, and "right" would refer to the former.

Comment author: torekp 01 February 2011 01:14:23AM 0 points [-]

If you ask what should_MattSimpson be done, the short answer is maximize my preferences.

I find the talk of "should_MattSimpson" very unpersuasive given the availability of alternative phrasings such as "approved_MattSimpson" or "valued_MattSimpson". I have read below that EY discourages such talk, but it seems that's for different reasons than mine. Could someone please point me to at least one post in the sequence which (almost/kinda/sorta) motivates such phrasings?

Comment author: Matt_Simpson 01 February 2011 09:43:58PM 0 points [-]

Alternate phrasings such as those you listed would probably be less confusing, i.e. replacing "should" in "should_X" with "valued" and reserving "should" for "valued_human".

Comment author: orthonormal 29 January 2011 11:52:05PM *  2 points [-]

And yes, if one's own preferences are the foundation of ethics, most philosophers would simply call this subject matter practical rationality rather than morality.

They would be missing some important distinctions between what we think of as our moral values and what we think of as "chocolate/vanilla" preferences. For one obvious example, consider an alien ray gun that 'switches the way I feel' about two things, X and Y, without otherwise affecting my utility function or anything else of value to me.

If X were, say, licorice jelly beans (yum) and Y were, say, buttered popcorn jelly beans (yuck), then I wouldn't be too deeply bothered by the prospect of being zapped with this gun. (Same for sexual preference, etc.) But if X were "autonomy of individuals" and Y were "uniformity of individuals", I would flee screaming from the prospect of being messed with that way, and would take some extreme actions (if I knew I'd be zapped) to prevent my new preferences from having large effects in the world.

Now we can develop whole theories about what this kind of difference consists in, but it's at least relevant to the question of metaethics. In fact, I think that calling this wider class of volitions "preferences" is sneaking in an unfortunate connotation that they "shouldn't really matter then".

Comment author: XiXiDu 30 January 2011 12:47:21PM 2 points [-]

Now we can develop whole theories about what this kind of difference consists in...

Huh? You simply weigh "chocolate/vanilla" preferences differently than decisions that would affect goal-oriented agents.

Comment author: Matt_Simpson 30 January 2011 09:15:26PM 1 point [-]

This sounds, to me, like it's just the distinction between terminal and instrumental values. I don't terminally value eating licorice jelly beans, I just like the way they taste and the feeling of pleasure they give me. If you switched the tastes of buttered popcorn jelly beans (yuck indeed) and licorice jelly beans, that would be fine by me. Hell, it would be an improvement since no one else likes that flavor (more for me!). The situation is NOT the same for "autonomy of individuals" and "uniformity of individuals" because I really do have terminal values for these things, apart from the way they make me feel.

Comment author: TheOtherDave 30 January 2011 10:26:47PM 1 point [-]

The situation is NOT the same for "autonomy of individuals" and "uniformity of individuals" because I really do have terminal values for these things, apart from the way they make me feel.

How do you know that?

What would you expect to experience if your preference for individual autonomy in fact derived from something else?

Comment author: TheOtherDave 30 January 2011 01:42:38AM 1 point [-]

I agree that by using a single term for the wider class of volitions -- for example, by saying both that I "prefer" autonomy to uniformity and also that I "prefer" male sexual partners to female ones and also that I "prefer" chocolate to vanilla -- I introduce the connotation that the distinctions between these various "preferences" aren't important in the context of discourse.

To call that an unfortunate connotation is question-begging. Sometimes we deliberately adopt language that elides a distinction in a particular context, precisely because we don't believe that distinction ought to be made in that context.

For example, in a context where I believe skin color ought not matter, I may use language that elides the distinction between skin colors. I may do this even if I care about that distinction: for example, if I observe that I do, in fact, care about my doctor's skin color, but I don't endorse caring about it, I might start using language that elides that distinction as a way of changing the degree to which I care about it.

So it seems worth asking whether, in the particular context you're talking about, the connotations introduced by the term "preferences" are in fact unfortunate.

For instance, you class sexual preference among the "chocolate/vanilla" preferences for which the implication that they "shouldn't really matter" is appropriate.

I would likely have agreed with you twenty years ago, when I had just broken up with my girlfriend and hadn't yet started dating my current husband. OTOH, today I would likely "flee screaming" from a ray that made me heterosexual, since that would vastly decrease the value to me of my marriage.

Of course, you may object that this sort of practical consequence isn't what you mean. But there are plenty of people who would "flee screaming" from a sexual-preference-altering ray for what they classify as moral reasons, without reference to practical consequences. And perhaps I'm one of them... after all, it's not clear to me that my desire to preserve my marriage isn't a "moral value."

Indeed, it seems that there simply is no consistent fact of the matter as to whether my sexual preference is a "flee screaming" thing or not... it seems to depend on my situation. 20-year-old single me and 40-year-old married me disagree, and if tomorrow I were single again perhaps I'd once again change my mind.

Now, perhaps that just means that for me, sexual preference is a mere instrumental value, best understood in terms of what other benefits I get from it being one way or another, and is therefore a poor example of the distinction you're getting at, and I should pick a different example.

On the other hand, just because I pick a different preference P such that I can't imagine how a change in environment or payoff matrix might change P, doesn't mean that P actually belongs in a different class from sexual preference. It might be equally true that a similarly pragmatic change would change P; I just can't imagine the change that would do it.

Perhaps, under the right circumstances, I would not wish to flee from an autonomy/uniformity switching ray.

My point is that it's not clear to me that it's a mistake to elide over the distinction between moral values and aesthetic preferences. Maybe calling all of these things "preferences" is instead an excellent way of introducing the fortunate connotation that the degree to which any of them matter is equally arbitrary and situational, however intense the feeling that some preferences are "moral values" or "terminal values" or whatever other privileged term we want to apply to them.

Comment author: lessdazed 02 July 2011 05:38:51PM 0 points [-]

20-year-old single me and 40-year-old married me disagree

These are two different people; many of the objections one ought to have from the fact that they disagree are the same objections one ought to have from the fact that one person and some random other contemporary person disagree.

Comment author: TheOtherDave 02 July 2011 07:21:39PM 0 points [-]

And yet, a lot of our culture presumes that there are important differences between the two.

E.g., culturally we think it's reasonable for someone at 20 to make commitments that are binding on that person at 40, whereas we think it's really strange for someone at 20 or 40 to make commitments that are binding on some random other contemporary person.

Comment author: Vladimir_Nesov 29 January 2011 11:34:02PM *  2 points [-]

I certainly balk at the suggestion that there is a should_human, but I'd need to understand Eliezer in more detail on that point.

We'd need to do something specific with the world; there's no reason any one person gets to have the privilege, and creating an agent for every human and having them fight it out is probably not the best possible solution.

Comment author: Wei_Dai 01 February 2011 07:09:24AM *  3 points [-]

I don't think that adequately addresses lukeprog's concern. Even granting that one person shouldn't have the privilege of deciding the world's fate, nor should an AI be created for every human to fight it out (although personally I don't think a would-be FAI designer should rule these out as possible solutions just yet), that leaves many other possibilities for how to decide what to do with the world. I think the proper name for this problem is "should_AI_designer", not "should_human", and you need some other argument to justify the position that it makes sense to talk about "should_human".

I think Eliezer's own argument is given here:

Between neurologically intact humans, there is indeed much cause to hope for overlap and coherence; and a great and reasonable doubt as to whether any present disagreement is really unresolvable, even if it seems to be about "values". The obvious reason for hope is the psychological unity of humankind, and the intuitions of symmetry, universalizability, and simplicity that we execute in the course of our moral arguments.

Comment deleted 29 January 2011 10:49:18PM [-]
Comment author: Matt_Simpson 29 January 2011 10:52:43PM *  2 points [-]

No, this is called preference utilitarianism.

Usually utilitarianism means maximize the utility of all people/agents/beings of moral worth (average or sum depending on the flavor of utilitarianism). Eliezer's metaethics says only maximize your own utility. There is a clear distinction.

Edit: but you are correct about considering preferences the foundation of ethics. I should have been more clear

Comment author: Jayson_Virissimo 30 January 2011 06:37:51AM *  2 points [-]

Eliezer's metaethics says only maximize your own utility.

Isn't that bog-standard ethical egoism? If that is the case, then I really misunderstood the sequences.

Comment author: Eliezer_Yudkowsky 30 January 2011 02:49:14AM 20 points [-]

The closest point I've found to my metaethics in standard philosophy was called "moral functionalism" or "analytical descriptivism".

Cognitivism: Yes, moral propositions have truth-value, but not all people are talking about the same facts when they use words like "should", thus creating the illusion of disagreement.

Motivation: You're constructed so that you find some particular set of logical facts and physical facts impel you to action, and these facts are what you are talking about when you are talking about morality: for example, faced with the problem of dividing a pie among 3 people who all worked equally to obtain it and are all equally hungry, you find the mathematical fact that 1/3, 1/3, 1/3 is an equal division compelling - and more generally you name the compelling logical facts associated with this issue as "fairness", for example.

(Or as it was written in Harry Potter and the Methods of Rationality:

"Mr. Potter, in the end people all do what they want to do. Sometimes people give names like 'right' to things they want to do, but how could we possibly act on anything but our own desires?"

"Well, obviously I couldn't act on moral considerations if they lacked the power to move me. But that doesn't mean my wanting to hurt those Slytherins has the power to move me more than moral considerations!")

Moral epistemology: Statements can be true only when there is something they are about which makes them true, something that fits into the Tarskian schema "'X' is true iff X". I know of only two sorts of bearers of truth-value, two sorts of things that sentences can be about: physical facts (chains of cause and effect; physical reality is made out of causes a la Judea Pearl) and logical validities (which conclusions follow from which premises). Moral facts are a mixture of both; if you throw mud on a painting it becomes physically less beautiful, but for a fixed painting its "beauty" is a logical fact, the result of running the logical "beauty" function on it.
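A small Python sketch of the "physical facts plus logical function" picture in the mud-on-a-painting example; the beauty function and its numbers are invented stand-ins, not anything from the post:

```python
# The painting's state is a physical fact that can change; the "beauty"
# function is a fixed logical object that is merely evaluated on that state.

def beauty(painting):
    # A fixed (toy) logical function from painting-states to a beauty score.
    return painting["color_harmony"] - 5 * painting["mud_coverage"]

painting = {"color_harmony": 8.0, "mud_coverage": 0.0}
print(beauty(painting))          # 8.0

painting["mud_coverage"] = 1.0   # throwing mud changes the physical facts...
print(beauty(painting))          # 3.0 -- the output changes, but the beauty
                                 # function itself is exactly what it was
```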

Comment author: lukeprog 30 January 2011 04:59:42AM *  7 points [-]

Eliezer,

Thanks for your reply! Hopefully you'll have time to answer a few questions...

  1. Can anything besides Gary's preferences provide a justification for saying that "Gary should_gary X"? (My own answer would be "No.")

  2. By saying "Gary should_gary X", do you mean that "Gary would X if Gary was fully informed and had reached a state of reflective equilibrium with regard to terminal values, moral arguments, and what Gary considers to be a moral argument"? (This makes should-statements "subjectively objective" even if they are computationally intractable, and seems to capture what you're saying in the paragraph here that begins "But the key notion is the idea that...")

  3. Or, perhaps you are saying that one cannot give a concise definition of "should," as Larry D'Anna interprets you to be saying?

Comment author: Eliezer_Yudkowsky 30 January 2011 04:06:19PM 16 points [-]

Can anything besides Gary's preferences provide a justification for saying that "Gary should_gary X"? (My own answer would be "No.")

This strikes me as an ill-formed question for reasons I tried to get at in No License To Be Human. When Gary asks "What is right?" he is asking the question e.g. "What state of affairs will help people have more fun?" and not "What state of affairs will match up with the current preferences of Gary's brain?" and the proof of this is that if you offer Gary a pill to change his preferences, Gary won't take it because this won't change what is right. Gary's preferences are about things like fairness, not about Gary's preferences. Asking what justifies should_Gary to Gary is either answered by having should_Gary wrap around and judge itself ("Why, yes, it does seem better to care about fairness than about one's own desires") or else is a malformed question implying that there is some floating detachable ontologically basic property of rightness, apart from particular right things, which could be ripped loose of happiness and applied to pain instead and make it good to do evil.

By saying "Gary should_gary X", do you mean

Shouldness does incorporate a concept of reflective equilibrium (people recognize apparent changes in their own preferences as cases of being "mistaken"), but should_Gary makes no mention of Gary (except insofar as Gary's welfare is one of Gary's terminal values) and instead is about a large logical function which explicitly mentions things like fairness and beauty. This large function is rightness, which is why Gary knows that you can't change what is right by messing with Gary's brain structures or making Gary want to do something else.

Or, perhaps you are saying that one cannot give a concise definition of "should"

You can arrive at a concise metaethical understanding of what sort of thing shouldness is. It is not possible to concisely write out the large function that any particular human refers to by "should", which is why all attempts at definition seem to fall short; and since for any particular definition it always seems like "should" is detachable from that definition, this reinforces the false impression that "should" is an undefinable extra supernatural property a la Moore's Open Question.

By far the hardest part of naturalistic metaethics is getting people to realize that it changes absolutely nothing about morals or emotions, just like the fact of a deterministic physical universe never had any implications for the freeness of our will to begin with.

I also note that although morality is certainly not written down anywhere in the universe except human brains, what is written is not about human brains, it is about things like fairness; nor is it written that "being written in a human brain" grants any sort of normative status. So the more you talk about "fulfilling preferences", the less the subject matter of what you are discussing resembles the subject matter that other people are talking about when they talk about morality, which is about how to achieve things like fairness. But if you built a Friendly AI, you'd build it to copy "morality" out of the brains where that morality is written down, not try to manually program in things like fairness (except insofar as you were offering a temporary approximation explicitly defined as temporary). It is likewise extremely hard to get people to realize that this level of indirection, what Bostrom terms "indirect normativity", is as close as you can get to getting any piece of physical matter to compute what is right.

If you want to talk about the same thing other people are talking about when they talk about what's right, I suggest consulting William Frankena's wonderful list of some components of the large function:

"Life, consciousness, and activity; health and strength; pleasures and satisfactions of all or certain kinds; happiness, beatitude, contentment, etc.; truth; knowledge and true opinions of various kinds, understanding, wisdom; beauty, harmony, proportion in objects contemplated; aesthetic experience; morally good dispositions or virtues; mutual affection, love, friendship, cooperation; just distribution of goods and evils; harmony and proportion in one's own life; power and experiences of achievement; self-expression; freedom; peace, security; adventure and novelty; and good reputation, honor, esteem, etc."

(Just wanted to quote that so that I didn't entirely fail to talk about morality in between all this stuff about preferences and metaethics.)

Comment author: lukeprog 30 January 2011 07:57:21PM 10 points [-]

Damn. I still haven't had my "Aha!" moment on this. I'm glad that ata, at least, appears to have it, but unfortunately I don't understand ata's explanation, either.

I'll understand if you run out of patience with this exercise, but I'm hoping you won't, because if I can come to understand your meta-ethical theory, then perhaps I will be able to explain it to all the other people on Less Wrong who don't yet understand it, either.

Let me start by listing what I think I do understand about your views.

1. Human values are complex. As a result of evolution and memetic history, we humans value/desire/want many things, and our values cannot be compressed to any simple function. Certainly, we do not only value happiness or pleasure. I agree with this, and the neuroscience supporting your position is nicely summarized in Tim Schroeder's Three Faces of Desire. We can value damn near anything. There is no need to design an artificial agent to value only one thing, either.

2. Changing one's meta-ethics need not change one's daily moral behavior. You write about this here, and I know it to be true from personal experience. When deconverting from Christianity, I went from divine command theory to error theory in the course of about 6 months. About a year after that, I transitioned from error theory to what was then called "desire utilitarianism" (now called "desirism"). My meta-ethical views have shifted in small ways since then, and I wouldn't mind another radical transition if I can be persuaded. But I'm not sure yet that desirism and your own meta-ethical theory are in conflict.

3. Onlookers can agree that Jenny has 5 units of Fred::Sexiness, which can be specified in terms of curves, skin texture, etc. This specification need not mention Fred at all. As explained here.

4. Recursive justification can't "hit bottom" in "an ideal philosophy student of perfect emptiness"; all I can do is reflect on my mind's trustworthiness, using my current mind, in a process of something like reflective equilibrium, even though reflective coherence isn't specified as the goal.

5. Nothing is fundamentally moral. There is nothing that would have value if it existed in an isolated universe all by itself that contained no valuers.

Before I go on... do I have this right so far?

Comment author: Eliezer_Yudkowsky 30 January 2011 08:25:52PM 10 points [-]

1-4 yes.

5 is questionable. When you say "Nothing is fundamentally moral" can you explain what it would be like if something was fundamentally moral? If not, the term "fundamentally moral" is confused rather than untrue; it's not that we looked in the closet of fundamental morality and found it empty, but that we were confused and looking in the wrong closet.

Indeed my utility function is generally indifferent to the exact state of universes that have no observers, but this is a contingent fact about me rather than a necessary truth of metaethics, for indifference is also a value. A paperclip maximizer would very much care that these uninhabited universes contained as many paperclips as possible - even if the paperclip maximizer were outside that universe and powerless to affect its state, in which case it might not bother to cognitively process the preference.

You seem to be angling for a theory of metaethics in which objects pick up a charge of value when some valuer values them, but this is not what I think, because I don't think it makes any moral difference whether a paperclip maximizer likes paperclips. What makes moral differences are things like, y'know, life, consciousness, activity, blah blah.

Comment author: lukeprog 30 January 2011 11:08:26PM 3 points [-]

Eliezer,

In Setting Up Metaethics, you wrote:

And if you've been reading along this whole time, you know the answer isn't going to be, "Look at this fundamentally moral stuff!"

I didn't know what "fundamentally moral" meant, so I translated it to the nearest term with which I'm more familiar, what Mackie called "intrinsic prescriptivity." Or, perhaps more clearly, "intrinsic goodness," following Korsgaard:

Objects, activities, or whatever have an instrumental value if they are valued for the sake of something else - tools, money, and chores would be standard examples. A common explanation of the supposedly contrasting kind, intrinsic goodness, is to say that a thing is intrinsically good if it is valued for its own sake, that being the obvious alternative to a thing's being valued for the sake of something else. This is not, however, what the words "intrinsic value" mean. To say that something is intrinsically good is not by definition to say that it is valued for its own sake: it is to say that it has goodness in itself. It refers, one might say, to the location or source of the goodness rather than the way we value the thing. The contrast between instrumental and intrinsic value is therefore misleading, a false contrast. The natural contrast to intrinsic goodness - the value a thing has "in itself" - is extrinsic goodness - the value a thing gets from some other source. The natural contrast to a thing that is valued instrumentally or as a means is a thing that is valued for its own sake or as an end.

So what I mean to say in (5) is that nothing is intrinsically good (in Korsgaard's sense). That is, nothing has value in itself. Things only have value in relation to something else.

I'm not sure whether this notion of intrinsic value is genuinely confused or merely not-understood-by-Luke-Muehlhauser, but I'm betting it is either confused or false. ("Untrue" is the term usually used to capture a statement's being either incoherent or meaningful-and-false: see for example Richard Joyce on error theory.)

But now, I'm not sure you agree with (5) as I intended it. Do you think life, consciousness, activity, and some other things have value-in-themselves? Do these things have intrinsic value?

Thanks again for your reply. I'm going to read Chappell's comment on this thread, too.

Comment author: Eliezer_Yudkowsky 31 January 2011 05:17:43AM 11 points [-]

Do you think a heap of five pebbles is intrinsically prime, or does it get its primeness from some extrinsic thing that attaches a tag with the five English letters "PRIME" and could in principle be made to attach the same tag to composite heaps instead? If you consider "beauty" as the logical function your brain's beauty-detectors compute, then is a screensaver intrinsically beautiful?

Does the word "intrinsic" even help, considering that it invokes bad metaphysics all by itself? In the physical universe there are only quantum amplitudes. Moral facts are logical facts, but not all minds are compelled by that-subject-matter-which-we-name-"morality"; one could as easily build a mind to be compelled by the primality of a heap of pebbles.

Comment author: wedrifid 31 January 2011 07:23:19AM 0 points [-]

Good answer!

Comment author: XiXiDu 31 January 2011 11:24:53AM *  1 point [-]

So the short answer is that there are different functions that use the same labels to designate different relations while we believe that the same labels designate the same functions?

Comment author: XiXiDu 31 January 2011 10:58:28AM 1 point [-]

I wonder if Max Tegmark would have written a similar comment. I'm not sure if there is a meaningful difference regarding Luke's question to say that there are only quantum amplitudes versus there are only relations.

Comment author: Eliezer_Yudkowsky 31 January 2011 01:46:01PM 5 points [-]

What I'm saying is that in the physical world there are only causes and effects, and the primeness of a heap of pebbles is not an ontologically basic fact operating as a separate and additional element of physical reality, but it is nonetheless about as "intrinsic" to the heap of pebbles as anything.

Once morality stops being mysterious and you start cashing it out as a logical function, the moral awfulness of a murder is exactly as intrinsic as the primeness of a heap of pebbles. Just as we don't care whether pebble heaps are prime or experience any affect associated with their primeness, the Pebblesorters don't care or compute whether a murder is morally awful; and this doesn't mean that a heap of five pebbles isn't really prime or that primeness is arbitrary, nor yet that on the "moral Twin Earth" murder could be a good thing. And there are no little physical primons associated with the pebble-heap that could be replaced by compositons to make it composite without changing the number of pebbles; and no physical stone tablet on which morality is written that could be rechiseled to make murder good without changing the circumstances of the murder; but if you're looking for those you're looking in the wrong closet.
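A minimal Python sketch of the pebble-heap analogy: the primality of five is a logical fact, separate from the physical fact of whether any mind computes or cares about it. The code is an editorial illustration, not from the post:

```python
# The primality of a five-pebble heap is a fact about the number 5; whether
# any mind runs this computation or cares about its output is a separate fact.

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

heap = ["pebble"] * 5
print(is_prime(len(heap)))   # True, whether or not any Pebblesorter is watching

# A mind built to care about primality acts on this fact; a mind that is not
# simply never runs the function. Neither attitude changes the fact itself.
```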

Comment author: XiXiDu 31 January 2011 02:53:10PM *  -2 points [-]

Are you arguing that the world is basically a cellular automaton and that therefore beauty is logically implied to be a property of some instance of the universe? If some agent does perceive beauty then that is a logically implied fact about the circumstances. Asking if another agent would perceive the same beauty could be rephrased as asking about the equality of the expressions of an equation?

I think a lot of people are arguing about the ambiguity of the string "beauty" as it is multiply realized.

Comment author: Eugine_Nier 30 January 2011 08:33:39PM 1 point [-]

When you say "Nothing is fundamentally moral" can you explain what it would be like if something was fundamentally moral? If not, the term "fundamentally moral" is confused rather than untrue; it's not that we looked in the closet of fundamental morality and found it empty, but that we were confused and looking in the wrong closet.

BTW, in your post Are Your Enemies Innately Evil?, I think you are making a similar mistake about the concept of evil.

Comment author: ata 30 January 2011 09:45:04PM *  4 points [-]

"Innately" is being used in that post in the sense of being a fundamental personality trait or a strong predisposition (as in "Correspondance Bias", to which that post is a followup). And fundamental personality traits and predispositions do exist — including some that actually do predispose people toward being evil (e.g. sociopathy) — so, although the phrase "innately evil" is a bit dramatic, I find its meaning clear enough in that post's context that I don't think it's a mistake similar to "fundamentally moral". It's not arguing about whether there's a ghostly detachable property called "evil" that's independent of any normal facts about a person's mind and history.

Comment author: torekp 01 February 2011 01:05:45AM 0 points [-]

When you say "Nothing is fundamentally moral" can you explain what it would be like if something was fundamentally moral?

He did, by implication, in describing what it's like if nothing is:

There is nothing that would have value if it existed in an isolated universe all by itself that contained no valuers.

Clearly, many of the items on EY's list, such as fun, humor, and justice, require the existence of valuers. The question above then amounts to whether all items of moral goodness require the existence of valuers. I think the question merits an answer, even if (see below) it might not be the one lukeprog is most curious about.

Or, perhaps more clearly, "intrinsic goodness," following Korsgaard [...]

Unfortunately, lukeprog changed the terms in the middle of the discussion. Not that there is anything wrong with the new question (and I like EY's answer).

Comment author: XiXiDu 30 January 2011 04:44:17PM *  7 points [-]

After trying to read No License To Be Human I officially give up reading the sequences for now and postpone it until I've learnt a lot more. I think it is wrong to suggest that anyone can read the sequences. Either you have to be a prodigy or a post-graduate. The second comment on that post expresses my own feelings: can people actually follow Yudkowsky's posts? It's over my head.

Comment author: Dr_Manhattan 30 January 2011 05:08:28PM 5 points [-]

I agree with your sentiment, but I suggest not giving up so easily. I have the same feeling after many sequence posts, but some of the ones I grokked were real gems and seriously affected my thinking.

Also, borrowing some advice on reading hard papers, it's re-reading that makes a difference.

Also, as my coach put it "the best stretching for doing sidekicks is actually doing sidekicks".

Comment author: wedrifid 30 January 2011 05:19:50PM *  5 points [-]

When Gary asks "What is right?" he is asking the question e.g. "What state of affairs will help people have more fun?" and not "What state of affairs will match up with the current preferences of Gary's brain?"

I do not necessarily disagree with this, but the following:

and the proof of this is that if you offer Gary a pill to change his preferences, Gary won't take it because this won't change what is right.

... does not prove the claim. Gary would still not take the pill if the question he was asking was "What state of affairs will match up with the current preferences of Gary's brain?". A reference to the current preferences of Gary's brain is different to asking the question "What is a state of affairs in which there is a high satisfaction of the preferences in the brain of Gary?".

Comment author: XiXiDu 30 January 2011 06:00:19PM *  2 points [-]

I do not necessarily disagree with this...

It seems so utterly wrong to me that I concluded it must be me who simply doesn't understand it. Why would it be right to help people to have more fun if helping people to have more fun does not match up with your current preferences? The main reason why I was able to abandon religion was realizing that what I want implies what is right. That still feels intuitively right. I didn't expect to see many people on LW arguing that there exist preference/(agent/mind)-independent moral statements like 'it is right to help people' or 'killing is generally wrong'. I got a similar reply from Alicorn. Fascinating. This makes me doubt my own intelligence more than anything I've so far come across. If I parse this right it would mean that a Paperclip Maximizer is morally bankrupt?

Comment author: Eugine_Nier 30 January 2011 06:29:37PM 4 points [-]

The main reason why I was able to abandon religion was realizing that what I want implies what is right. That still feels intuitively right. I didn't expect to see many people on LW arguing that there exist preference/(agent/mind)-independent moral statements like 'it is right to help people' or 'killing is generally wrong'.

Well, something I've been noticing is that in their "tell your rationalist origin story" accounts, the reasons a lot of people give for why they left their religion aren't actually valid arguments. Make of that what you will.

If I parse this right it would mean that a Paperclip Maximizer is morally bankrupt?

Yes. It is morally bankrupt. (or would you not mind turning into paperclips if that's what the Paperclip Maximizer wanted?)

BTW, your current position is more-or-less what theists mean when they say atheists are amoral.

Comment author: XiXiDu 30 January 2011 06:45:59PM *  1 point [-]

Yes. It is morally bankrupt. (or would you not mind turning into paperclips if that's what the Paperclip Maximizer wanted?)

Yes, but that is a matter of taste.

BTW, your current position is more-or-less what theists mean when they say atheists are amoral.

Why would I ever change my current position? If Yudkowsky told me there were some moral laws written into the fabric of reality, what difference would that make? Either such laws are imperative, so that I am unable to escape them, or I simply ignore them if they oppose my preferences.

Assume all I wanted to do was kill puppies. Now Yudkowsky tells me that this is prohibited and that I will suffer disutility because of it. The crucial question would be: does the disutility outweigh the utility I assign to killing puppies? If it doesn't, why should I care?

Comment author: TheOtherDave 30 January 2011 09:38:18PM *  4 points [-]

Perhaps you assign net utility to killing puppies. If you do, you do. What EY tells you, what I tell you, what is prohibited, etc., has nothing to do with it. Nothing forces you to care about any of that.

If I understand EY's position, it's that it cuts both ways: whether killing puppies is right or wrong doesn't force you to care, but whether or not you care doesn't change whether it's right or wrong.

If I understand your position, it's that what's right and wrong depends on the agent's preferences: if you prefer killing puppies, then killing puppies is right; if you don't, it isn't.

My own response to EY's claim is "How do you know that? What would you expect to observe if it weren't true?" I'm not clear what his answer to that is.

My response to your claim is "If that's true, so what? Why is right and wrong worth caring about, on that model... why not just say you feel like killing puppies?"

Comment author: Matt_Simpson 31 January 2011 12:41:52AM *  3 points [-]

Why would it be right to help people to have more fun if helping people to have more fun does not match up with your current preferences

Because right is a rigid designator. It refers to a specific set of terminal values. If your terminal values don't match up with this specific set of values, then they are wrong, i.e. not right. Not that you would particularly care, of course. From your perspective, you only want to maximize your own values and no others. If your values don't match up with the values defined as moral, so much for morality. But you still should be moral because should, as it's defined here, refers to a specific set of terminal values - the one we labeled "right."

(Note: I'm using the term should exactly as EY uses it, unlike in my previous comments in these threads. In my terms, should=should_human and on the assumption that you, XiXiDu, don't care about the terminal values defined as right, should_XiXiDu =/= should)

Comment author: XiXiDu 31 January 2011 09:35:30AM *  3 points [-]

I'm getting the impression that nobody here actually disagrees but that some people are expressing themselves in a very complicated way.

I parse your comment to mean that the definition of moral is a set of terminal values of some agents and should is the term that they use to designate instrumental actions that do serve that goal?

Comment author: endoself 31 January 2011 10:00:54AM 1 point [-]

Your second paragraph looks correct. 'Some agents' refers to humanity rather than any group of agents. Technically, should is the term anything should use when discussing humanity's goals, at least when speaking Eliezer.

Your first paragraph is less clear. You definitely disagree with others. There are also some other disagreements.

Comment author: hairyfigment 30 January 2011 10:11:50PM 1 point [-]

The main reason why I was able to abandon religion was realizing that what I want implies what is right.

And if you modify this to say a certain subset of what you want -- the subset you'd still call "right" given omniscience, I think -- then it seems correct, as far as it goes. It just doesn't get you any closer to a more detailed answer, specifying the subset in question.

Or not much closer. At best it tells you not to worry that you 'are' fundamentally evil and that no amount of information would change that.

Comment author: Pfft 01 February 2011 03:38:55AM 1 point [-]

Perhaps a better thought experiment, then, is to offer Gary the chance to travel back in time and feed his 2-year-old self the pill. Or, if you dislike time machines in your thought experiments, we can simply ask Gary whether or not he now would have wanted his parents to have given him the pill when he was a child. Presumably the answer will still be no.

Comment author: wedrifid 01 February 2011 03:57:08AM *  1 point [-]

If time travel is to be considered, then we must emphasize that when we say 'current preferences' we do not mean "preferences at time Time.now, whatever we can make those preferences be" but rather "I want things X, Y, Z to happen, regardless of the state of the atoms that make up me at this or any other time." Changing yourself to not want X, Y or Z will make X, Y and Z less likely to happen, so you don't want to do that.

Comment author: Vladimir_Nesov 30 January 2011 10:41:06AM *  2 points [-]

Gary's preference is not itself a justification; rather, it recognizes moral arguments, and not because it's Gary's preference, but for its own specific reasons. Saying that "Gary's preference states that X is Gary_right" is roughly the same as "Gary should_Gary X".

(This should_T terminology was discouraged by Eliezer in the sequences, perhaps since it invites incorrect moral-relativistic thinking, as if any decision problem could be adopted as one's own by any other agent, and also makes you think of ways of referring to morality, while seeing it as a black box, instead of looking inside morality. And you have to look inside even to refer to it, but won't notice that until you stop referring and try looking.)

By saying "Gary should_gary X", do you mean that "Gary would X if Gary was fully informed and had reached a state of reflective equilibrium with regard to terminal values, moral arguments, and what Gary considers to be a moral argument"?

To a first approximation, but not quite, since it might be impossible to know what is right, for any computation, not to speak of a mere human; one can only make right guesses.

This makes should-statements "subjectively objective"

Every well-defined question has in a sense a "subjectively objective" answer: there's "subjectivity" in the way the question has to be interpreted by an agent that takes on a task of answering it, and "objectivity" in the rules of reasoning established by such interpretation, that makes some possible answers incorrect with respect to that abstract standard.

Or, perhaps you are saying that one cannot give a concise definition of "should,"

I don't quite see how this is opposed to the other points of your comment. If you actually start unpacking the notion, you'll find that it's a very long list. Alternatively, you might try referring to that list by mentioning it, but that's a tricky task for various reasons, including the need to use morality to locate (and precisely describe the location of) the list. Perhaps we can refer to morality concisely, but it's not clear how.

Comment author: Matt_Simpson 31 January 2011 12:13:27AM *  2 points [-]

(This should_T terminology was discouraged by Eliezer in the sequences, perhaps since it invites incorrect moral-relativistic thinking, as if any decision problem could be adopted as one's own by any other agent, and also makes you think of ways of referring to morality, while seeing it as a black box, instead of looking inside morality. And you have to look inside even to refer to it, but won't notice that until you stop referring and try looking.)

I had no idea what Eliezer was talking about originally until I started thinking in terms of should_T. Based on that and the general level of confusion among people trying to understand his metaethics, I concluded that EY was wrong - more people would understand if he talked in terms of should_T. Based on some of the back and forth here, I'm revising that opinion somewhat. Apparently this stuff is just confusing and I may just be atypical in being able to initially understand it better in those terms.

Comment author: Vladimir_Nesov 30 January 2011 04:10:09AM *  3 points [-]

Why consider physical facts separately? Can't they be thought of as logical facts, in the context of an agent's epistemology? (You'll have lots of logical uncertainty about them, and even normative structures will look more like models of uncertainty, but still.) Is it just a matter of useful heuristic separation of the different kinds of data? (I expect not, in your theory, in some sense.)

Comment author: XiXiDu 30 January 2011 01:17:53PM *  1 point [-]

Yes, moral propositions have truth-value...

But are those truth-values intersubjectively recognizable?

The average person believes morality to be about imperative terminal goals. You ought to want that which is objectively right and good. But there exists no terminal goal that is objectively desirable. You can assign infinite utility to any action and thereby outweigh any consequences. What is objectively verifiable is how to maximize the efficiency of reaching a discrete terminal goal.

Comment author: wedrifid 30 January 2011 01:46:49PM 1 point [-]

But there exists no terminal goal that is objectively (intersubjectively) desirable.

If you mean intersubjectively say it. Objectively has a slightly different meaning. In particular, see 'objectively subjective'.

Comment author: XiXiDu 30 January 2011 02:23:21PM 1 point [-]

I changed it.

Comment author: syllogism 08 February 2011 12:32:02PM *  7 points [-]

When I read the meta-ethics sequence I mostly wondered why he made it so complicated and convoluted. My own take just seems a lot simpler --- which might mean it's wrong for a simple reason, too. I'm hoping someone can help.

I see ethics as about adopting some set of axioms that define which universes are morally preferable to others, and then reasoning from those axioms to decide whether an action, given the information available, has positive expected utility.
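
A minimal sketch of that framing in Python, with made-up numbers; the utility function here just stands in for whatever the chosen axioms say about which universes are preferable:

    # Hedged sketch: moral axioms fix a utility function over universes; the
    # decision rule is expected utility given your uncertainty about outcomes.
    # All numbers and outcome descriptions below are invented.

    def expected_utility(action_outcomes, utility):
        """action_outcomes: list of (probability, universe) pairs."""
        return sum(p * utility(universe) for p, universe in action_outcomes)

    # Toy stand-in for "which universes are morally preferable to others".
    def utility(universe):
        return universe["people_helped"] - 2 * universe["people_harmed"]

    donate = [(0.9, {"people_helped": 10, "people_harmed": 0}),   # the charity might work...
              (0.1, {"people_helped": 0,  "people_harmed": 0})]   # ...or achieve nothing
    do_nothing = [(1.0, {"people_helped": 0, "people_harmed": 0})]

    print(expected_utility(donate, utility))       # 9.0
    print(expected_utility(do_nothing, utility))   # 0.0, so donating has positive expected utility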

So which axioms should I adopt? Well, one simple, coherent answer is "none": be entirely nihilist. I would still prefer some universes over others, as I'd still have all my normal non-moral preferences, such as appetites etc. But it'd be all about me, and other people's interests would only count so far as they were instrumental to my own.

The problem is that the typical human mind has needs that are incompatible with nihilism. Nihilism thus becomes anti-strategic: it's an unlikely path to happiness. I feel the need to care about other people, and it doesn't help me to pretend I don't.[1]

So, nihilism is an anti-strategic ethical system for me to adopt, because it goes against my adapted and culturally learned intuitions about morality --- what I'll call my Emotional Moral Compass. My emotional moral compass defines my knee jerk reactions to what's right and what's not. Unfortunately, these knee jerk reactions are hopelessly contradictory. The strength of my emotional reaction to an injustice is heavily influenced by my mood, and can be primed easily. It doesn't scale properly. It's dominated by the connection I feel to the people involved, not by what's happening. And I know that if I took my emotional moral compass back in time, I'd almost certainly get the wrong result to questions that now seem obvious, such as slavery.

I can't in full reflection agree to define "rightness" with the results of my emotional moral compass, because I also have an emotional need for my beliefs to be internally consistent. I know that my emotional moral compass does not produce consistent judgments. It also does not reliably produce judgments that I would want other people to make. This is problematic because I have a need to believe that I'm the sort of person I would approve of if I were not me.

I really did try on nihilism and discard it, before trying to just follow my emotional moral compass, and discarded that too. Now I'm roughly a preference utilitarian. I'm working on trying to codify my ideas into axioms, but it's difficult. Should I prefer universes that maximise mean weighted preferences? But then what about population differences? How do I include the future? Is there a discounting rate? The details are surprisingly tricky, which may suggest I'm on the wrong track.
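
To illustrate why the details get tricky, here is a toy comparison (invented numbers only): total and mean preference satisfaction already rank the same pair of futures differently, and a discount rate changes things again.

    # Hedged illustration of the aggregation questions above. Nothing here is
    # a real proposal; the numbers are made up to show the rankings flip.

    small_happy = [9, 9, 9]                  # few people, each well satisfied
    large_meh   = [4, 4, 4, 4, 4, 4, 4, 4]   # many people, each modestly satisfied

    mean = lambda xs: sum(xs) / len(xs)

    print(sum(small_happy) < sum(large_meh))    # True: total-style aggregation prefers the large world
    print(mean(small_happy) > mean(large_meh))  # True: mean-style aggregation prefers the small one

    # A discount rate over future generations is yet another free parameter.
    def discounted_total(per_generation_totals, rate):
        return sum(u * rate ** t for t, u in enumerate(per_generation_totals))

    print(discounted_total([10, 10, 10], rate=1.0))   # 30.0: the future counts fully
    print(discounted_total([10, 10, 10], rate=0.5))   # 17.5: the future counts for much less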

Adopting different ethical axioms hasn't been an entirely hand-waving sort of gesture. When I was in my "emotional moral compass" stage, I became convinced that a great many animals suffered a great deal in the meat industry. My answer to this was that eating meat still felt costless --- I have no real empathy with chickens, cows or pigs, and the magnitude of the problem left me cold (since my EMC can't do multiplication). I didn't feel guilty, so my EMC didn't compel me to do anything differently.

This dissonance got uncomfortable enough that I adopted Peter Singer's version of preference utilitarianism as an ethical system, and began to act more ethically. I set myself a deadline of six months to become vegetarian, and resolved to tithe to the charity I determined to have maximum utility once I got a post-degree job.

If ethics are based on reasoning from axioms, how do I deal with people who have different axioms from me? Well, one convenient thing is that few people adopt terrible axioms that have them preferring universes paved in paperclips or something. Usually people's ethics are just inconsistent.

A less convenient universe would present me with someone who had entirely consistent ethics based on completely different axioms that led to different judgments from mine, and maximising the resulting utility function would make the person feel happy and fulfilled. Ethical debate with this person would be fruitless, and I would have to regard them as On the Wrong Team. We want irreconcilably different things. But I couldn't say I was more "right" than they, except with special reference to my definition of "right" in preference to theirs.

[1] Would I change my psychology so that I could be satisfied with nihilism, instead of preference utilitarianism? No, but I'm making that decision based on my current values. Switching utilitarian-me for nihilist-me would just put another person On the Wrong Team, which is a negative utility move based on my present utility function. I can't want to not care while currently caring, because my current caring ensures that I care about caring.

There's also no reason to believe that it would be easier to be satisfied with this alternate psychology. Sure, satisfying ethics requires me to eat partially against my taste preferences, and my material standard of living takes an inconsequential hit. But I gain this whole other dimension of satisfaction. In other words, I get an itch that it costs a lot to scratch, but having scratched it I'm better off. A similar question would be, would I choose to have zero sexual or romantic interest, if I could? I emphatically answer no.

Comment author: cousin_it 08 February 2011 01:07:02PM *  3 points [-]

I think your take is pretty much completely correct. You don't fall into the trap of arguing whether "moral facts are out there" or the trap of quibbling over definitions of "right", and you very clearly delineate the things you understand from the things you don't.

Comment author: lessdazed 02 July 2011 09:11:29AM 1 point [-]

So which axioms should I adopt?

Isn't it a bit late for that question for any human, by the time a human can formulate the question?

So, nihilism is an anti-strategic ethical system for me to adopt

You don't really have the option of adopting it, just espousing it (including to yourself). No?

But I couldn't say I was more "right" than they, except with special reference to my definition of "right" in preference to theirs.

You really could, all else equal, because all the (other) humans have, as you said, very similar axioms rather than terrible ones.

Comment author: TimFreeman 16 April 2011 08:54:56AM 1 point [-]

Your argument against nihilism is fundamentally "I feel the need to care about other people, and it doesn't help me to pretend I don't".

(I'll accept for the purpose of this conversation that the empty ethical system deserves to be called "nihilism". I would have guessed the word had a different meaning, but let's not quibble over definitions.)

That's not an argument against nihilism. If I want to eat three meals a day, and I want other people not to starve, and I want my wife and kids to have a good life, that's all stuff I want. Caring for other people is entirely consistent with nihilism, it's just another thing you want.

Utilitarianism doesn't solve the problem of having a bunch of contradictory desires. It just leaves you trying to satisfy other people's contradictory desires instead of your own. However, I am unfamiliar with Peter Singer's version. Does it solve this problem?

Comment author: syllogism 17 April 2011 02:07:10AM 1 point [-]

I think the term nihilism is getting in the way here. Let's instead talk about "the zero axiom system". This is where you don't say that any universes are morally preferable to any others. They may be appetite-preferable, love-for-people-close-to-you preferable, etc.

If no universes are morally preferable, one strategy is to be as ruthlessly self-serving as possible. I predict this would fail to make most people happy, however, because most people have a desire to help others as well as themselves.

So a second strategy is to just "go with the flow" and let yourself give as much as your knee-jerk guilt or sympathy-driven reactions tell you to. You don't research charities and you still eat meat, but maybe you give to a disaster relief appeal when the people suffering are rich enough or similar enough to you to make you sympathetic.

All I'm really saying is that this second approach is also anti-strategic once you get to a certain level of self-consistency, and desire for further self-consistency becomes strong enough to over-rule desire for some other comforts.

I find myself in a bind where I can't care nothing, and I can't just follow my emotional moral compass. I must instead adopt making the world a better place as a top-level goal, and work strategically to make that happen. That requires me to adopt some definition of what constitutes a better universe that isn't rooted in my self-interest. In other words, my self-interest depends on having goals that don't themselves refer to my self-interest. And those goals have to do that in entirely good-faith. I can't fake this, because that contradicts my need for self-consistency.

In other words, I'm saying that someone becomes vegetarian when their need for a consistent self-image about whether they behave morally starts to over-rule the sensory, health and social benefits of eating meat. Someone starts to tithe to charity when their need for moral consistency starts to over-rule their need for an extra 10% of their income.

So you can always do the calculations about why someone did something, and take it back to their self-interest, and what strategies they're using to achieve that self-interest. Utilitarianism is just the strategy of adopting self-external goals as a way to meet your need for some self-image or guilt-reassurance. But it's powerful because it's difficult to fake: if you adopt this goal of making the world a better place, you can then start calculating.

There are some people who see the fact that this is all derivable from self-interest, and think that it means it isn't moral. They say "well okay, you just have these needs that make you do x, y or z, and those things just happen to help other people. You're still being selfish!".

This is just arguing about the meaning of "moral", and defining it in a way that I believe is actually impossible. What matters is that the people are helped. What matters is the actual outcomes of your actions. If someone doesn't care what happens to other people at all, they are amoral. If someone cares only enough to give $2 to a backpacker in a koala suit once every six months, they are a very little bit moral. Someone who cares enough to sincerely try to solve problems and gets things done is very moral. What matters is what's likely to happen.

Comment author: TimFreeman 17 April 2011 02:29:00AM 1 point [-]

I can't interpret your post as a reply to my post. Did you perhaps mean to post it somewhere else?

My fundamental question was, how is a desire to help others fundamentally different from a desire to eat pizza?

You seem to be defining a broken version of the zero ethical system that arbitrarily disregards the former. That's a strawman.

If you want to say that the zero ethical system is broken, you have to say that something breaks when people try to enact their desires, including the desires to help others.

What matters is that the people are helped.

Sorry, that's incoherent. Someone is helped if they get things they desire. If your entire set of desires is to help others, then the solution is that your desires (such as eating pizza) don't matter and theirs do. I don't think you can really do that. If you can do that, then I hope that few people do that, since somebody has to actually want something for themselves in order for this concept of helping others to make any sense.

(I do believe that this morality-is-selfless statement probably lets you get positive regard from some in-group you desire. Apparently I don't desire to have that in-group.)

Comment author: syllogism 17 April 2011 04:16:24AM *  2 points [-]

I can't interpret your post as a reply to my post. Did you perhaps mean to post it somewhere else?

I did intend to reply to you, but I can see I was ineffective. I'll try harder.

My fundamental question was, how is a desire to help others fundamentally different from a desire to eat pizza?

Fundamentally, it's not.

You seem to be defining a broken version of the zero ethical system that arbitrarily disregards the former. That's a strawman.

I'm saying that there's three versions here:

  1. The strawman where there's no desire to help others. Does not describe people's actual desires, but is a self-consistent and coherent approach. It's just that it wouldn't work for most people.

  2. Has a desire to help others, but this manifests in behaviour more compatible with guilt-aversion than actually helping people. This is not self-consistent. If the aim is actually guilt-aversion, this collapses back to position 1), because the person must admit to themselves that other people's desires are only a correlate of what they want (which is to not feel guilty).

  3. Has a desire to help others, and pursues it in good faith, using some definition of which universes are preferable that does not weight their own desires over the desires of others. There's self-reference here, because the person's desires do refer to other people's desires. But you can still maximise the measure even with the self-reference.

If your entire set of desires is to help others, then the solution is that your desires (such as eating pizza) don't matter and theirs do.

But you do have other desires. You've got a desire for pizza, but you've also got a desire to help others. So if a 10% income sacrifice meant you get 10% less pizza, but someone else gets 300% more pizza, maybe that works out. But you don't give up 100% of your income and live in a sack-cloth.

Comment author: TimFreeman 17 April 2011 04:55:34PM *  0 points [-]

Thanks, I think I understand better. We have some progress here:

  • We agree that the naive model of a selfish person who doesn't have any interest in helping others hardly ever describes real people.

  • We seem to agree that guilt-aversion as a desire doesn't make sense, but maybe for different reasons.

I think it doesn't make sense because when I say someone desires X, I mean that they prefer worlds with property X over worlds lacking that property, and I'm only interested in X's that describe the part of the world outside of their own thought process. For the purposes of figuring out what someone desires, I don't care if they want it because of guilt aversion or because they're hungry or some other motive; all I care is that I expect them to make some effort to make it happen, given the opportunity, and taking into account their (perhaps false) model of how the world works.

Maybe I do agree with you enough on this that the difference is unimportant. You said:

If the aim is actually guilt-aversion, this collapses back to position 1), because the person must admit to themselves that other people's desires are only a correlate of what they want (which is to not feel guilty).

I think you're assuming here that people who claim a desire to help people and are really motivated by guilt-aversion are ineffective. I'm not sure that's always true. Certainly, if they're ineffective at helping people due to their own internal process, in practice they don't really want to help people.

Has a desire to help others, and pursues it in good faith, using some definition of which universes are preferable that does not weight their own desires over the desires of others.

I don't know what it means to "weight their own desires over the desires of others". If I'm willing to donate a kidney but not donate my only liver, and the potential liver recipient desires to have a better liver, have I weighted my own desires over the desires of others? Maybe you meant "weight their own desires to the exclusion of the desires of others".

We might disagree about what it means to help others. Personally, I don't care much about what people want. For example, I have a friend who is alcoholic. He desires alcohol. I care about him and have provided him with room and board in the past when he needed it, but I don't want him to get alcohol. So my compassion for him is about me wanting to move the world (including him) to the place I want it to go, not some compromise between my desires and his desires.

So I want what I want, and my actions are based on what I want. Some of the things I want give other people some of the things they want. Should it be some other way?

Now a Friendly AI is different. When we're setting up its utility function, it has no built-in desires of its own, so the only reasonable thing for it to desire is some average of the desires of whoever it's being Friendly toward. But you and I are human, so we're not like that -- we come into this with our own desires. Let's not confuse the two and try to act like a machine.

Comment author: syllogism 17 April 2011 06:59:21PM *  2 points [-]

Yes, I think we're converging onto the interesting disagreements.

I think you're assuming here that people who claim a desire to help people and are really motivated by guilt-aversion are ineffective. I'm not sure that's always true. Certainly, if they're ineffective at helping people due to their own internal process, in practice they don't really want to help people.

This is largely an empirical point, but I think we differ on it substantially.

I think if people don't think analytically, and even a little ruthlessly, they're very ineffective at helping people. The list of failure modes is long. People prefer to help people they can see at the expense of those out of sight who could be helped more cheaply. They're irrationally intolerant of uncertainty of outcome. They're not properly sensitive to scale. I haven't cited these points, but hopefully you agree. If not we can dig a little deeper into them.

I don't know what it means to "weight their own desires over the desires of others". If I'm willing to donate a kidney but not donate my only liver, and the potential liver recipient desires to have a better liver, have I weighted my own desires over the desires of others? Maybe you meant "weight their own desires to the exclusion of the desires of others".

I just meant that self-utility doesn't get a huge multiplier when compared against others-utility. In the transplant donation example, you get just as much out of your liver as whoever you might give it to. So you'd be going down N utilons and they'd be going up N utilons, and there would be a substantial transaction cost of M utilons. So liver donation wouldn't be a useful thing to do.

In another example, imagine your organs could save, say, 10 lives. I wouldn't do that. There are two angles here.

The first is about strategy. You don't improve the world by being a sucker who can be taken advantage of. You do have to fight your corner, too, otherwise you just promote free-riding. If all the do-gooders get organ harvested, the world is probably not better off.

But even if extremes of altruism were not anti-strategic, I can't say I'd do them either. There are lots of actions which I would have to admit result in extreme loss of self-utility and extreme gain in net utility that I don't carry out. These actions are still moral, it's just that they're more than I'm willing to do. Some people are excessively uncomfortable about this, and so give up on the idea of trying to be more moral altogether. This is to make the perfect the enemy of the good. Others are uncomfortable about it and try to twist their definition of morality into knots to conform to what they're willing to do.

The moral ideal is to have a self-utility weight of 1.0: ie, you're completely impartial to whether the utility is going to you as opposed to someone else. I don't achieve this, and I don't expect many other people do either.

But being able to set this selfishness constant isn't a get-out-of-jail-free card. I have to think about the equation, and how selfish the action would imply I really am. For instance, as an empirical point, I believe that eating meat given the current practices of animal husbandry demands a very high selfishness constant. I can't reconcile being that selfish with my self-image, and my self-image is more important to me than eating meat. So, vegetarianism, with an attempt to minimise dairy consumption, but not strict veganism, even though veganism is more moral.
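
One way to read the "selfishness constant" arithmetic above is as a weight on self-utility in the sum being maximised. A hedged sketch with invented numbers (the weight s and all the utility figures are illustrative only):

    # Hedged sketch of the selfishness-constant idea: s = 1.0 means a utilon to
    # me counts the same as a utilon to anyone else; larger s means I weight
    # myself more. All numbers are invented for illustration.

    def worth_doing(self_change, others_change, s):
        """Take the action iff the s-weighted sum of utility changes is positive."""
        return s * self_change + others_change > 0

    # Liver donation: my loss roughly equals the recipient's gain, minus
    # transaction costs, so even a perfectly impartial agent (s = 1.0) declines.
    print(worth_doing(self_change=-10, others_change=10 - 3, s=1.0))   # False

    # Tithing: a modest self-cost buys a much larger gain for others.
    self_cost, others_gain = -2, 20
    print(worth_doing(self_cost, others_gain, s=1.0))   # True
    # Refusing would only be consistent with a selfishness constant of at least:
    print(others_gain / -self_cost)                     # 10.0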

We might disagree about what it means to help others. Personally, I don't care much about what people want. For example, I have a friend who is alcoholic. He desires alcohol. I care about him and have provided him with room and board in the past when he needed it, but I don't want him to get alcohol. So my compassion for him is about me wanting to move the world (including him) to the place I want it to go, not some compromise between my desires and his desires.

Yes, there are problems with preference utilitarianism. I think some people try to get around the alcoholic example by saying something like "if their desires weren't being modified by their alcoholism they would want x, and would want you to act as though they wanted x, so those are the true preferences." As I write this it seems that has to be some kind of strawman, as the idea of some Platonic "true preferences" is quite visibly flawed. There's no way to distinguish the class of preference-modifiers that includes things like alcoholism from the other preference-modifiers that together constitute a person.

I use preferences because it works well enough most of the time, and I don't have a good alternate formulation. I don't actually think the specifics of the metric being maximised are usually that important. I think it would be better to agree on desiderata for the measure --- properties that it ought to exhibit.

Anyway. What I'm trying to say is a little clearer to me now. I don't think the key idea is really about meta-ethics at all. The idea is just that almost everyone follows a biased, heuristic-based strategy for satisfying their moral desires, and that this strategy isn't actually very productive. It satisfies the heuristics like "I am feeling guilty, which means I need to help someone now", but it doesn't scratch the deeper itch to believe you genuinely make a difference very well.

So the idea is just that morality is another area where many people would benefit from deploying rationality. But this one's counter-intuitive, because it takes a rather cold and ruthless mindset to carry it through.

Comment author: TimFreeman 18 April 2011 03:06:35AM 0 points [-]

Okay, I agree that what you want to do works most of the time, and we seem to agree that you don't have good solution to the alcoholism problem, and we also seem to agree that acting from a mishmash of heuristics without any reflection or attempts to make a rational whole will very likely flounder around uselessly.

Not to imply that our conversation was muddled by the following, but: we can reformulate the alcoholism problem to eliminate the addiction. Suppose my friend heard about that reality show guy who was killed by a stingray and wanted to spend his free time killing stingrays to get revenge. (I heard there are such people, but I have never met one.) I wouldn't want to help him with that, either.

Comment author: syllogism 18 April 2011 09:39:17AM 2 points [-]

There's a strip of an incredibly over-the-top vulgar comic called Space Moose that gets at the same idea. These acts of kindness aren't positive utility, even if the utility metric is based on desires, because they conflict with the desires of the stingrays or other victims. Preferences also need to be weighted somehow in preference utilitarianism, I suppose by importance to the person. But then hmm, anyone gets to be a utility monster by just really really really really wanting to kill the stingrays. So yeah, there's a problem there.

I think I need to update, and abandon preference utilitarianism even as a useful correlate of whatever the right measure would be.
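
A toy sketch of that failure mode (everything here is invented): if aggregation weights preferences by self-reported intensity, a single agent who reports an enormous number swamps everyone else.

    # Hedged sketch of the utility-monster problem with naive preference
    # aggregation. Agents, outcomes, and intensities are all made up.

    def winning_outcome(preferences):
        """preferences: {agent: {outcome: reported_intensity}}. Sums intensities per outcome."""
        totals = {}
        for prefs in preferences.values():
            for outcome, intensity in prefs.items():
                totals[outcome] = totals.get(outcome, 0) + intensity
        return max(totals, key=totals.get)

    stingrays = {f"stingray_{i}": {"no_hunt": 10} for i in range(1000)}
    monster = {"avenger": {"hunt": 10 ** 9}}   # "really really really really" wanting it

    print(winning_outcome(stingrays))                  # no_hunt
    print(winning_outcome({**stingrays, **monster}))   # hunt: one loud preference outweighs a thousand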

Comment author: TimFreeman 18 April 2011 11:18:42AM *  0 points [-]

While it's gratifying to win an argument, I'd rather not do it under false pretenses:

But then hmm, anyone gets to be a utility monster by just really really really really wanting to kill the stingrays.

We need a solution to the utility monster problem if we're going to have a Friendly AI that cares about people's desires, so it's better to solve the utility monster problem than to give up on preference utilitarianism in part because you don't know how to solve the utility monster problem. I've sketched proposed solutions to two types of utility monsters, one that has one entity with large utility and one that has a large number of entities with modest utility. If these putative solutions seem wrong to you, please post bugs, fixes, or alternatives as replies to those comments.

I agree that preference utilitarianism has the problem that it doesn't free you from choosing how to weight the preferences. It also has the problem that you have to separate yourself into two parts, the part that gets to have its preference included in the weighted sum, and the part that has a preference that is the weighted sum. In reality there's only one of you, so that distinction is artificial.

Comment author: endoself 17 February 2011 07:02:57AM 0 points [-]

Why distinguish between moral and non-moral preferences? Why are moral preferences more mutable than non-moral ones?

Also, a lot of this applies to your specific situation, so it is more morality than metaethics.

Comment author: syllogism 17 February 2011 08:18:18AM *  1 point [-]

Why distinguish between moral and non-moral preferences? Why are moral preferences more mutable than non-moral ones?

The basic drive to adopt some sort of ethical system is essentially the same as other preferences, and is non-mutable. It's a preference to believe that you are making the world a better place, rather than a worse place. This introduces a definitional question of what constitutes a good world and what constitutes a bad world, which is something I think people can change their minds about.

Having written that, one question that occurs to me now is, is the basic preference to believe that you're making the world a better place, or is it to simply believe you're a good person? I prefer people who make the world a better place, so the two produce the same outcomes for me. But other people might not. If you instead had a preference for people who followed good principles or exhibited certain virtues, you wouldn't feel it necessary to make the world a better place. I shouldn't assume that such people don't exist.

So maybe instead of talking about adopting a definition of which universes are good and bad, I should talk about adopting a definition of good and bad people. If you define a good person by the consequences of their actions, then you'd go on to define which universes are good and bad. But otherwise you might instead define which principles are good, or which virtues.

Comment author: wedrifid 30 January 2011 05:01:56AM 4 points [-]

In You Provably Can't Trust Yourself, Eliezer tried to figured out why his audience didn't understand his meta-ethics sequence even after they had followed him through philosophy of language and quantum physics. Meta-ethics is my specialty, and I can't figure out what Eliezer's meta-ethical position is.

Is your difficulty in understanding how Eliezer thinks about ethics or in working out what side he fights for in various standardised intellectual battles? The first task seems fairly easy. He thinks like one would expect an intelligent reductionist programmer-type to think. Translating that into philosopher speak is somewhat more challenging.

Comment author: lukeprog 30 January 2011 05:10:13AM 3 points [-]

I'm okay with Eliezer dismissing lots of standard philosophical categories as unhelpful and misleading. I have much the same attitude toward Anglophone philosophy. But anything he or someone else can do to help me understand what he is saying will be appreciated.

Comment author: komponisto 30 January 2011 06:27:26AM *  5 points [-]

I have much the same attitude toward Anglophone philosophy

Non-anglophone philosophy is worse. (Phenomenology, deconstructionism,...)

Comment author: lukeprog 30 January 2011 06:42:35AM 3 points [-]
Comment author: Psy-Kosh 31 January 2011 08:33:55PM 3 points [-]

My super summarized summary would be something like this: There's a certain set of values (well, a certain sort of computation to judge the value of some state of affairs, including updates in the way we compute it, and the things that it approves of are what we are concerned with) that we call "morality".

We humans simply happen to be the sorts of beings that care about this morality stuff as opposed to caring about, say, maximizing paperclips.

Further, it is better (by which I mean "more moral") to be moral than to be paperclipish. We should (where by "should", I more or less am just referring to the morality criterion) indeed be moral.

Morality consists of multiple criteria including happiness, love, life (well, consciousness), creativity, novelty, self determination, growth, discovery, compassion, fairness, etc...

It's an objective criterion, just as "What is 2+3?" is a clear objective question with an objective answer. It simply happens to be that we're the sorts of beings that are, metaphorically speaking, concerned with "what is 2+3?" and not at all concerned with "what is 6*7?"
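
A tiny sketch of that analogy (my own gloss, with toy functions): the outputs are fixed no matter who runs the computations or what answer they hoped for; what differs between kinds of minds is only which computation they are built to care about.

    # Hedged sketch of "an objective question we happen to care about".
    # Neither function is anyone's actual morality; they're metaphorical stand-ins.

    def the_question_humans_care_about():
        return 2 + 3      # metaphorically, "morality"

    def the_question_pebblesorters_care_about():
        return 6 * 7      # metaphorically, "which heaps are prime"

    # The answers don't depend on who asks, or on what they wanted to hear.
    print(the_question_humans_care_about())         # 5, for every mind that computes it
    print(the_question_pebblesorters_care_about())  # 42, likewise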

Comment author: Peterdjones 30 October 2012 02:02:50AM 0 points [-]

If we can't say why we morally-should care about our particular values, why should we deem them moral?

Comment author: ata 29 January 2011 09:01:53PM *  3 points [-]

Are you looking to have it summarized in the terminology of standard moral philosophy?

Are there any specific questions you could ask about it?

(The main thing I found to be insufficiently unpacked is the notion of moral arguments — it's not clear to me exactly what types of arguments would qualify, as he sees it — but other than that, I think I understand it well enough to answer questions about it.)

Comment author: lukeprog 29 January 2011 11:05:56PM *  7 points [-]

Sure, let me try some specific questions.

I'll start with what I think is clear to me about Eliezer's views:

(1) Whatever moral facts exist, they must be part of the natural world. (Moral naturalism.)

(2) Moral facts are not written into the "book" of the universe - values must be derived from a consideration of preferences. (In philosophical parlance, this would be something like the claim that "The only sources of normativity are relations between preferences and states of affairs.")

I'll propose a third claim that I'm not so sure Eliezer would endorse:

(3) What I "should" do is determined by what actions would best fulfill my preferences. (This is just a shorter way of saying that I "should" do "what I would do to satisfy my terminal values if I had correct and complete knowledge of what actions would satisfy my terminal values.")

In this sense, morality is both "subjective" and "objective". It is subjective in the sense that what is "right" for me to do at any given time is determined in part by my own brain states (my preferences, which result from my terminal values). But it is objective in the sense that there are objectively correct answers about what actions will or will not best satisfy my preferences. I could even be wrong about what will best satisfy my preferences.

Have I interpreted Eliezer correctly so far?

Comment author: ata 30 January 2011 12:10:15AM *  10 points [-]

(1) Whatever moral facts exist, they must be part of the natural world. (Moral naturalism.)

In a manner of speaking, yes. Moral facts are facts about the output of a particular computation under particular conditions, so they are "part of the natural world" essentially to whatever extent you'd say the same thing about mathematical deductions. (See Math is Subjunctively Objective, Morality as Fixed Computation, and Abstracted Idealized Dynamics.)

(2) Moral facts are not written into the "book" of the universe - values must be derived from a consideration of preferences. (In philosophical parlance, this would be something like the claim that "The only sources of normativity are relations between preferences and states of affairs.")

No. Caring about people's preferences is part of morality, and an important part, I think, but it is not the entirety of morality, or the source of morality. (I'm not sure what a "source of normativity" is; does that refer to the causal history behind someone being moved by a moral argument, or something else?)

(The "Moral facts are not written into the 'book' of the universe" bit is correct.)

(3) What I "should" do is determined by what actions would best fulfill my preferences. (This is just a shorter way of saying that I "should" do "what I would do to satisfy my terminal values if I had correct and complete knowledge of what actions would satisfy my terminal values.")

See Inseparably Right and No License To Be Human. "Should" is not defined by your terminal values or preferences; although human minds (and things causally entangled with human minds) are the only places we can expect to find information about morality, morality is not defined by being found in human minds. It's the other way around: you happen to care about(/prefer/terminally value) being moral. If we defined "should" such that an agent "should" do whatever satisfies its terminal values (such that pebblesorters should sort pebbles into prime heaps, etc.), then morality would be a Type 2 calculator; it would have no content, it could say anything and still be correct about the question it's being asked. I suppose you could define "should" that way, but it's not an adequate unpacking of what humans are actually thinking about when they talk about morality.

Comment author: Eliezer_Yudkowsky 30 January 2011 04:23:54PM 4 points [-]

I endorse the above.

Comment author: lukeprog 30 January 2011 12:21:46AM 4 points [-]

Thanks for this!

Concerning preferences, what else is part of morality besides preferences?

A "source of normativity" is just anything that can justify a should or ought statement. The uncontroversial example is that goals/desires/preferences can justify hypothetical ought statements (hypothetical imperatives). So Eliezer is on solid footing there.

What is debated is whether anything else can justify should or ought statements. Can categorical imperatives justify ought statements? Can divine commands do so? Can non-natural moral facts? Can intrinsic value? And if so, why is it that these things are sources of normativity but not, say, facts about which arrangements of marbles resemble Penelope Cruz when viewed from afar?

My own position is that only goals/desires/preferences provide normativity, because the other proposed sources of normativity either don't provide normativity or don't exist. But if Eliezer thinks that something besides goals/desires/preferences can provide normativity, I'd like to know what that is.

I'll do some reading and see if I can figure out what your last paragraph means; thanks for the link.

Comment author: Vladimir_Nesov 30 January 2011 01:46:11AM 2 points [-]

Concerning preferences, what else is part of morality besides preferences?

"Preference" is used interchangeably with "morality" in a lot of discussion, but here Adam referred to an aspect of preference/morality where you care about what other people care about, and stated that you care about that but other things as well.

What is debated is whether anything else can justify should or ought statements. Can categorical imperatives justify ought statements? Can divine commands do so? Can non-natural moral facts? Can intrinsic value? And if so, why is it that these things are sources of normativity but not, say, facts about which arrangements of marbles resemble Penelope Cruz when viewed from afar?

I don't think introducing categories like this is helpful. There are moral arguments that move you, and a framework that responds to the right moral arguments which we term "morality", things that should move you. The arguments are allowed to be anything (before you test them with the framework), and real humans clearly fail to be ideal implementations of the framework.

(Here, the focus is on acceptance/rejection of moral arguments; decision theory would have you generate these yourself in the way they should be considered, or even self-improve these concepts out of the system if that will make it better.)

Comment author: lukeprog 30 January 2011 01:59:45AM *  2 points [-]

"Preference" is used interchangeably with "morality" in a lot of discussion, but here Adam referred to an aspect of preference/morality where you care about what other people care about, and stated that you care about that but other things as well.

Oh, right, but it's still all preferences. I can have a preference to fulfill others' preferences, and I can have preferences for other things, too. Is that what you're saying?

It seems to me that the method of reflective equilibrium has a partial role in Eliezer's meta-ethical thought, but that's another thing I'm not clear on. The meta-ethics sequence is something like 300 pages long and very dense, and I can't keep it all in my head at the same time. I have serious reservations about reflective equilibrium (à la Brandt, Stich, and others). Do you have any thoughts on the role of reflective equilibrium in Eliezer's meta-ethics?

Comment author: Vladimir_Nesov 30 January 2011 02:07:06AM 3 points [-]

Oh, right, but it's still all preferences. I can have a preference to fulfill others' preferences, and I can have preferences for other things, too. Is that what you're saying?

Possibly, but you've said that opaquely enough that I can imagine you intending a meaning I'd disagree with. For example, you refer to "other preferences", while there is only one morality (preference) in the context of any given decision problem (agent), and the way you care about other agents doesn't necessarily reference their "preference" in the same sense we are talking about our agent's preference.

It seems to me that the method of reflective equilibrium has a partial role in Eliezer's meta-ethical thought, but that's another thing I'm not clear on.

This is reflected in the ideas of morality being an abstract computation (something you won't see a final answer to), and the need for morality being found on a sufficiently meta level, so that the particular baggage of contemporary beliefs doesn't distort the picture. You don't want to revise the beliefs about morality yourself, because you might do it in a human way, instead of doing that in the right way.

Comment author: ata 30 January 2011 12:30:05AM *  1 point [-]

I'll do some reading and see if I can figure out what your last paragraph means; thanks for the link.

Ah, have you not actually read through the whole sequence yet? I don't recommend reading it out of order, and I do recommend reading the whole thing. Mainly because some people in this thread (and elsewhere) are giving completely wrong summaries of it, so you would probably get a much clearer picture of it from the original source.

Comment author: lukeprog 30 January 2011 12:38:40AM *  2 points [-]

I've read the series all the way through, twice, but large parts of it didn't make sense to me. By reading the linked post again, I'm hoping to combine what you've said with what it says and come to some understanding.

Comment author: XiXiDu 30 January 2011 12:31:40PM 1 point [-]

I read your last paragraph 5 times now and still can't make sense of it.

One should drink water if one wants to satisfy one's thirst. Here should is loosely used to mean that it is the optimal instrumental action to reach one's terminal goal. "One should not kill" is, however, a psychological projection of one's utility function. Here should means that one doesn't want others to engage in killing. The term should is ambiguous and vague; that's all there is to it, that's the whole problem.

Comment author: David_Gerard 29 January 2011 09:39:08PM 4 points [-]

QM appears to be the sequence that even the people who say they've read the sequences didn't read (judging by low votes and few commenters).

Comment author: Normal_Anomaly 29 January 2011 10:52:15PM 1 point [-]

That's too bad; it may have been my favorite.

Comment author: jimrandomh 29 January 2011 09:30:38PM 4 points [-]

As I understand it, Eliezer has taken the position that human values are too complex for humans to reliably formalize, and that all formalizations presented so far are or probably are incorrect. This may explain some of your difficulty in trying to find Eliezer's preferred formalization.

Comment deleted 29 January 2011 10:08:45PM [-]
Comment author: lukeprog 29 January 2011 10:53:19PM 1 point [-]

One project is the descriptive one of moral psychology and moral anthropology. Because Coherent Extrapolated Volition begins with data from moral psychology and moral anthropology, that descriptive project is important for Eliezer's design of Friendly AI. Certainly, I agree with Eliezer that human values are too complex to easily formalize, because our terminal values are the product of millions of years of messy biological and cultural evolution.

"Morality" is a term usually used in speech acts to refer to a set of normative questions about what we ought to do, or what we ought to value. Even if you're an ethical reductionist as I am, and reduce 'ought' such that it is a particular species of 'is', there are lots of ways to do that, and I'm not clear on how Eliezer does it.

Comment author: Dorikka 30 January 2011 05:24:30PM 2 points [-]

An unusual amount of the comments here are feeling unnecessary to me, so let me see if I understand this.

I have a utility function which assigns an amount of utility (positive or negative) to different qualities of world-states. (Just to be clear, ‘me being exhausted’ is a quality of a world-state, and so is ‘humans have mastered Fun Theory and apply it in a fun-maximizing fashion to humankind.’) Other humans have their own utility functions, so they may assign a different amount of utility to different qualities of world-states.

I have a place in my utility function for the utility functions of other people. As a result, if enough other people (of sufficient significance to me) attach high utility to X being a quality of a future world-state, I may work to make X a quality of the future world-state even if my utility function attaches a higher utility to Y than X before considering other people's utility functions (given that X and Y are mutually exclusive). Other rational agents will do something similar, depending on the weight that other people's utility functions are likely to have on their own. Of course, if I gain new information, I may act differently in order to maximize my utility function and/or, more relevantly to this discussion, I may change my utility function itself because I feel differently.
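
A minimal sketch of that structure, with invented qualities, weights, and people (none of these numbers mean anything beyond the example):

    # Hedged sketch: my utility over a world-state = a sum over qualities I care
    # about directly, plus weighted terms for how well other people's utility
    # functions are satisfied by that state. Everything here is made up.

    def my_utility(world, my_terms, others, care_weights):
        own = sum(w * world.get(quality, 0) for quality, w in my_terms.items())
        social = sum(care_weights[name] * u(world) for name, u in others.items())
        return own + social

    my_terms = {"not_exhausted": 1.0, "fun_theory_mastered": 5.0}
    others = {"alice": lambda w: 3 * w.get("X", 0),   # Alice and Bob both attach
              "bob":   lambda w: 3 * w.get("X", 0)}   # high utility to quality X
    weights = {"alice": 1.0, "bob": 1.0}

    world_X = {"X": 1, "not_exhausted": 1}
    world_Y = {"Y": 1, "not_exhausted": 1, "fun_theory_mastered": 0.5}

    # Ignoring other people, I mildly prefer Y; counting them, X wins.
    print(my_utility(world_Y, my_terms, {}, {}) > my_utility(world_X, my_terms, {}, {}))                    # True
    print(my_utility(world_X, my_terms, others, weights) > my_utility(world_Y, my_terms, others, weights))  # True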

Do I grok? If so, I don’t really understand why people are trying to formulate a definition of the word ‘should’ because it doesn’t seem to have any use in maximizing a utility function. Saying ‘Gary should Y’ or ‘Gary should_Gary Y’ seems to be statements that people would make before learning to reduce desires to parts of a utility function.

Comment author: lukeprog 30 January 2011 06:23:33PM 2 points [-]

Dorikka,

If that's what Eliezer means, then this looks like standard practical rationality theory. You have reasons to act (preferences) so as to maximize your utility function (except that it may not be right to call it a "utility function", because there's no guarantee that each person's preference set is logically consistent). The fact that you want other people to satisfy their preferences, too, means that if enough other people want world-state X, your utility function will assign higher utility to world-state X than to world-state Y, even if Y would come out higher before counting the term your utility function assigns to other people's utility functions.

But I don't think that's all of what Eliezer is saying because, for example, he keeps talking about the significance of a test showing that you would be okay being hit with an alien ray gun that changed your ice cream preference from chocolate to vanilla, but you wouldn't be okay being hit with an alien ray gun that changed your preferences from not-wanting-to-rape-people to wanting-to-rape-people.

He also writes about the importance of a process of reflective equilibrium, though I'm not sure to what end.

Comment author: Matt_Simpson 30 January 2011 11:06:27PM 1 point [-]

He also writes about the importance of a process of reflective equilibrium, though I'm not sure to what end.

To handle value uncertainty. If you don't know your terminal values, you have to discover them somehow.

Comment author: lukeprog 30 January 2011 11:14:19PM 1 point [-]

Is that it? Eliezer employs reflective equilibrium as an epistemological method for figuring out what your terminal values are?

Comment author: Eugine_Nier 30 January 2011 11:35:40PM 2 points [-]

As I understand it, yes.

Comment author: XiXiDu 30 January 2011 07:27:52PM *  1 point [-]

...an alien ray gun that changed your ice cream preference from chocolate to vanilla, but you wouldn't be okay being hit with an alien ray gun that changed your preferences from not-wanting-to-rape-people to wanting-to-rape-people.

I'm completely lost about that. I don't see how vanilla preferences differ from rape preferences. We just happen to weigh them differently. But that is solely a fact about our evolutionary history.

Comment author: Dorikka 30 January 2011 07:05:46PM *  1 point [-]

Hm. I can say truthfully that I don't care whether I like vanilla or chocolate ice cream more. I suppose that the statement of my utility with regard to eating vanilla vs. chocolate ice cream would be 'I assign higher utility to eating the flavor of ice cream which tastes better to me.' That is, I only care about a state of my mind. So, if the circumstances changed so I could procure that state of mind by other means (ex: eating vanilla instead of chocolate ice cream), I would have no problem with that. The action that I would take after being hit by the alien ray gun does not give me any less utility after being hit by the alien ray gun than the action that I take now gives me in the present. So I don't care whether I get hit by the ray gun.

But my statement of utility with regard to people being raped would be "I assign much lower utility to someone being raped than to them not being raped." Here, I care about a state of the world outside of my mind. The action that I would take after being hit by the alien ray gun (rape) has less utility under my current utility function than (~rape), so my current utility function would assign negative utility to being hit by the ray gun.

This much makes sense to me.

I don't know what 'reflective equilibrium' means; this may be because I didn't really make it through the metaethics sequence. After I formulated what I've said in this comment and the above one, I wasn't getting much out of it.

Edit: Inserted some italics for the main difference between the two scenarios and removed a set of italics. No content changes.

Comment author: XiXiDu 30 January 2011 02:37:22PM 2 points [-]

An off-topic question:

In a sense should always implies if. Can anyone point me to a "should" assertion without an implied if? If humans implicitly assume an if whenever they say should then the term is never used to propose a moral imperative but to indicate an instrumental goal.

You shall not kill if:

  • You want to follow God's law.
  • You don't want to be punished.
  • You want to please me.

It seems nobody would suggest there to be an imperative that killing is generally wrong. So where does moral realism come from?

Comment author: Alicorn 30 January 2011 02:38:42PM 6 points [-]

It seems nobody would suggest there to be an imperative that killing is generally wrong.

Um, I'll suggest that. Killing: generally wrong.

Comment author: XiXiDu 31 January 2011 07:39:03PM *  0 points [-]

Killing: generally wrong.

Do you agree with EY on Torture vs Dust Specks? If you agree, would killing one person be justified to save 3^^^3 from being killed? If so, would you call killing right in that case?

Comment author: Alicorn 31 January 2011 07:40:25PM 3 points [-]

I say bring on the specks.

Comment author: XiXiDu 31 January 2011 07:48:13PM 0 points [-]

I find that topic troubling. I find it comforting to know how others would decide here. So please allow me to ask another question. Would you personally die to save 3^^^3 from being killed? I thought about it myself and I would probably do it. But what is the lower bound here? Can I find an answer to such a question if I read the sequences, or at least learn how to come up with my own answer?

Comment author: wedrifid 30 January 2011 04:52:14PM 1 point [-]

In a sense should always implies if. Can anyone point me to a "should" assertion without an implied if? If humans implicitly assume an if whenever they say should then the term is never used to propose a moral imperative but to indicate an instrumental goal.

That is a way you can translate the use of should into a convenient logical model. But it isn't the way humans instinctively use the verbal symbol.

Comment author: jimrandomh 30 January 2011 01:07:21AM 2 points [-]

In my studies of philosophy, I've mostly just tried to figure out what's correct, and not bothered to learn who came up with and believes what or to keep track of the controversies.

It occurs to me that you're doing the opposite - thinking about what Eliezer believes, rather than about what's correct. And that seems to have translated into taking a list of standard controversies, and expecting one of a list of standard responses to each. And the really interesting thing is, you don't seem to have found them. It seems that, for each of those questions, there are three possibilities: either he hasn't taken a position, he took a position but it wasn't recognizable because he came at it from a different angle or used unusual terminology, or he skipped it because it was a wrong question in the first place. I read those posts a long time ago, but I think the answer is mostly #3.

Comment author: lukeprog 30 January 2011 01:26:28AM 10 points [-]

jimrandomh,

No, I have my own thoughts on what is correct, and have written hundreds of pages about what I think is correct. Check my blog if you're curious.

But for right now, I just want to at least understand what Eliezer's positions are.

Comment author: Vladimir_Nesov 29 January 2011 10:25:27PM *  1 point [-]

The standard debates ask the wrong questions, so there's little point answering them; you'd spend all the time explaining your preferred ways of disambiguating the hopelessly convoluted standard words. Unsurprisingly, Eliezer's metaethics doesn't actually solve all of decision theory: it makes a lot of steps in the right direction, while still necessarily leaving you confused even if you understood every step. You'd need to ask more specific questions, request clarification for specific claims. I agree that regurgitating a body of knowledge usually helps it compost, but a mere summary probably won't do the trick.