
[Link] Reducing Risks of Astronomical Suffering (S-Risks): A Neglected Global Priority

6 ignoranceprior 14 October 2016 07:58PM

[Link] Reasonable Requirements of any Moral Theory

-1 TheSurvivalMachine 10 October 2016 08:48PM

[Link] Suffering-focused AI safety: Why “fail-safe” measures might be particularly promising

8 wallowinmaya 21 July 2016 08:22PM

The Foundational Research Institute just published a new paper: "Suffering-focused AI safety: Why “fail-safe” measures might be our top intervention". 

It is important to consider that [AI outcomes] can go wrong to very different degrees. For value systems that place primary importance on the prevention of suffering, this aspect is crucial: the best way to avoid bad-case scenarios specifically may not be to try and get everything right. Instead, it makes sense to focus on the worst outcomes (in terms of the suffering they would contain) and on tractable methods to avert them. As others are trying to shoot for a best-case outcome (and hopefully they will succeed!), it is important that some people also take care of addressing the biggest risks. This perspective to AI safety is especially promising both because it is currently neglected and because it is easier to avoid a subset of outcomes rather than to shoot for one highly specific outcome. Finally, it is something that people with many different value systems could get behind.

[Link] NYU conference: Ethics of Artificial Intelligence (October 14-15)

4 ignoranceprior 16 July 2016 09:07PM

FYI: https://wp.nyu.edu/consciousness/ethics-of-artificial-intelligence/

This conference will explore these questions about the ethics of artificial intelligence and a number of other questions, including:

What ethical principles should AI researchers follow?
Are there restrictions on the ethical use of AI?
What is the best way to design morally beneficial AI?
Is it possible or desirable to build moral principles into AI systems?
When AI systems cause benefits or harm, who is morally responsible?
Are AI systems themselves potential objects of moral concern?
What moral framework is best used to assess questions about the ethics of AI?

Speakers and panelists will include:

Nick Bostrom (Future of Humanity Institute), Meia Chita-Tegmark (Future of Life Institute), Mara Garza (UC Riverside, Philosophy), Sam Harris (Project Reason), Demis Hassabis (DeepMind/Google), Yann LeCun (Facebook, NYU Data Science), Peter Railton (University of Michigan, Philosophy), Francesca Rossi (University of Padova, Computer Science), Stuart Russell (UC Berkeley, Computer Science), Susan Schneider (University of Connecticut, Philosophy), Eric Schwitzgebel (UC Riverside, Philosophy), Max Tegmark (Future of Life Institute), Wendell Wallach (Yale, Bioethics), Eliezer Yudkowsky (Machine Intelligence Research Institute), and others.

Organizers: Ned Block (NYU, Philosophy), David Chalmers (NYU, Philosophy), S. Matthew Liao (NYU, Bioethics)

A full schedule will be circulated closer to the conference date.

Registration is free but required. REGISTER HERE. Please note that admission is limited, and is first-come first-served: it is not guaranteed by registration.

Unbounded linear utility functions?

-1 snarles 11 October 2015 11:30PM

The LW community seems to assume, by default, that "unbounded, linear utility functions are reasonable."  That is, if you value the existence of 1 swan at 1.5 utilons, then 10 swans should be worth 15, etc.

Yudkowsky in his post on scope insensitivity argues that nonlinearity of personal utility functions is a logical fallacy.

However, unbounded and linearly increasing utility functions lead to conundrums such as Pascal's Mugging.  A recent discussion topic on Pascal's Mugging suggests ignoring probabilities that are too small.  However, such extreme measures are not necessary if tamer utility functions are used: one imagines a typical personal utility function to be bounded and nonlinear. 

In that recent discussion topic, V_V and I questioned the adoption of such an unbounded, linear utility function.  I would argue that nonlinearity of utility functions is not a logical fallacy.

To make my case clear, I will clarify my personal interpretation of utilitarianism.  Utility functions are mathematical constructs that can be used to model individual or group decision-making.  However, it is unrealistic to suppose that every individual actually has a utility function or even a preference ordering; at best, one could find a utility function which approximates the behavior of the individual.  This is confirmed by studies demonstrating the inconsistency of human preferences.  The decisions made by coordinated groups (e.g. corporate partners, citizens in a democracy, or the entire community of effective altruists) could also be more or less well-approximated by a utility function; presumably, the accuracy of the utility function model of decision-making depends on the cohesion of the group.  Utilitarianism, as proposed by Bentham and Mill, proposes an ethical framework based on some idealized utility function.  Rather than using utility functions to model group decision-making, Bentham and Mill propose to use some utility function to guide decision-making, in the form of an ethical theory.  It is important to distinguish these two different use-cases of utility functions, which might be termed descriptive utility and prescriptive utility.

But what is ethics?  I hold the hard-nosed position that moral philosophies (including utilitarianism) are human inventions which serve the purpose of facilitating large-scale coordination.  Another way of putting it is that moral philosophy is a manifestation of the limited superrationality that our species possesses.  [Side note: one might speculate that the intellectual aspect of human political behavior, of forming alliances based on shared ideals (including moral philosophies), is a memetic or genetic trait which propagated due to positive selection pressure: moral philosophy is necessary for the development of city-states and larger political entities, which in turn rose as the dominant form of social organization in our species.  But this is a separate issue from the discussion at hand.]

In this larger context, we can be prepared to evaluate the relative worth of a moral philosophy, such as utilitarianism, against competing philosophies.  If the purpose of a moral philosophy is to facilitate coordination, then an effective moral philosophy is one that can actually hope to achieve that kind of coordination.  Utilitarianism is a good candidate for facilitating global-level coordination due to its conceptual simplicity and because most people can agree with its principles, and it provides a clear framework for decision-making, provided that a suitable utility function can be identified, or at least that the properties of the "ideal utility function" can be debated.  Furthermore, utilitarianism and related consequentialist moralities are arguably better equipped to handle the tragedy of the commons than competing deontological theories.

And if we accept utilitarianism, and if our goal is to facilitate global coordination, we can go further to evaluate the properties of any proposed utility function, by the same criteria as before: i.e., how well will the proposed utility function facilitate global coordination?  Will the proposed utility function find broad support among the key players in the global community?  Unbounded, linearly increasing utility functions clearly fail, because few people would support conclusions such as "it's worth spending all our resources to prevent a 0.001% chance that 1e100 human lives will be created and tortured."
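To make the contrast concrete, here is a minimal sketch of how the 0.001% / 1e100 example scores under a linear utility function versus a bounded one. The function names, the saturating exponential form, and the `cap` and `scale` values are illustrative assumptions, not anything proposed in the post; the sketch also treats the stakes as a magnitude and ignores sign.

```python
import math

def linear_utility(lives: float, per_life: float = 1.0) -> float:
    """Unbounded, linear: every additional life counts the same."""
    return per_life * lives

def bounded_utility(lives: float, cap: float = 1e10, scale: float = 1e10) -> float:
    """Hypothetical bounded utility: roughly linear for lives << scale,
    saturating at `cap` for lives >> scale."""
    # -expm1(-x) equals 1 - exp(-x), computed accurately for small x.
    return cap * -math.expm1(-lives / scale)

p = 1e-5        # 0.001% chance, as in the example above
stake = 1e100   # number of lives at stake

linear_ev = p * linear_utility(stake)     # grows without limit as the stake grows
bounded_ev = p * bounded_utility(stake)   # can never exceed p * cap

print(f"linear expected magnitude:  {linear_ev:.3g}")
print(f"bounded expected magnitude: {bounded_ev:.3g}")
```

Under the linear function the tiny probability is swamped by the size of the stake; under the bounded one the expected magnitude is capped at `p * cap`, however large the claimed stake becomes. Where the cap should sit is exactly the question the post leaves open.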

If so, why are such utility functions so dominant in the LW community?  One cannot overlook the biased composition of the LW community as a potential factor: generally proficient in mathematical or logical thinking, but less adept than the general population in empathetic skills.  Oversimplified theories, such as linear unbounded utility functions, appeal more strongly to this type of thinker, while more realistic but complicated utility functions are instinctively dismissed as "illogical" or "irrational", when the real reason that they are dismissed is not that they are actually concluded to be illogical, but that they are perceived as uglier.

Yet another reason stems from the motives of the founders of the LW community, who make a living primarily out of researching existential risk and friendly AI.  Since existential risks are the kind of low-probability, long-term and high-impact event which would tend to be neglected by "intuitive" bounded and nonlinear utility functions, but favored by unintuitive, unbounded linear utility functions, it is in the founders' best interests to personally adopt a form of utilitarianism employing the latter type of utility function.

Finally, let me clarify that I do not dispute the existence of scope insensitivity.  I think the general population is ill-equipped to reason about problems on a global scale, and that education could help remedy this kind of scope insensitivity.  However, even if natural utility functions asymptote far too early, I doubt that the end result of proper training against scope insensitivity would be an unbounded linear utility function; rather, it would still be a nonlinear utility function, but which asymptotes at a larger scale.

 

 

Integral vs differential ethics, continued

6 Stuart_Armstrong 03 August 2015 01:25PM

I've talked earlier about integral and differential ethics, in the context of population ethics. The idea is that the argument for the repugnant conclusion (and its associate, the very repugnant conclusion) is dependent on a series of trillions of steps, each of which is intuitively acceptable (adding happy people, making happiness more equal), but reaching a conclusion that is intuitively bad - namely, that we can improve the world by creating trillions of people in torturous and unremitting agony, as long as we balance it out by creating enough happy people as well.

Differential reasoning accepts each step, and concludes that the repugnant conclusions are actually acceptable, because each step is sound. Integral reasoning accepts that the repugnant conclusion is repugnant, and concludes that some step along the way must therefore be rejected.

Notice that key word, "therefore". Some intermediate step is rejected, not for intrinsic reasons, but purely because of the consequence. There is nothing special about the step that is rejected; it's just a relatively arbitrary barrier to stop the process (compare with the paradox of the heap).

Indeed, things can go awry when people attempt to fix the repugnant conclusion (a conclusion they rejected through integral reasoning) using differential methods. Things like the "person-affecting view" have their own ridiculousness and paradoxes (it's ok to bring a baby into the world if it will have a miserable life; we don't need to care about future generations if we randomise conceptions, etc...) and I would posit that it's because they are trying to fix global/integral issues using local/differential tools.

The relevance of this? It seems that integral tools might be better suited to deal with the bad convergence of AI problem. We could set up plausibly intuitive differential criteria (such as self-consistency), but institute overriding integral criteria that can override these if they go too far. I think there may be some interesting ideas in that area, potentially. The cost is that integral ideas are generally seen as less elegant, or harder to justify.

[link] Choose your (preference) utilitarianism carefully – part 1

15 Kaj_Sotala 25 June 2015 12:06PM

Summary: Utilitarianism is often ill-defined by supporters and critics alike, preference utilitarianism even more so. I briefly examine some of the axes of utilitarianism common to all popular forms, then look at some axes unique but essential to preference utilitarianism, which seem to have received little to no discussion – at least not this side of a paywall. This way I hope to clarify future discussions between hedonistic and preference utilitarians and perhaps to clarify things for their critics too, though I’m aiming the discussion primarily at utilitarians and utilitarian-sympathisers.

http://valence-utilitarianism.com/?p=8

I like this essay particularly for the way it breaks down different forms of utilitarianism along various axes, which have rarely been discussed on LW.

For utilitarianism in general:

Many of these axes are well discussed, pertinent to almost any form of utilitarianism, and at least reasonably well understood, and I don’t propose to discuss them here beyond highlighting their salience. These include but probably aren’t restricted to the following:

  • What is utility? (for the sake of easy reference, I’ll give each axis a simple title – for this, the utility axis); eg happiness, fulfilled preferences, beauty, information(PDF)
  • How drastically are we trying to adjust it?, aka what if any is the criterion for ‘right’ness? (sufficiency axis); eg satisficing, maximising[2], scalar
  • How do we balance tradeoffs between positive and negative utility? (weighting axis); eg, negative, negative-leaning, positive (as in fully discounting negative utility – I don’t think anyone actually holds this), ‘middling’ ie ‘normal’ (often called positive, but it would benefit from a distinct adjective)
  • What’s our primary mentality toward it? (mentality axis); eg act, rule, two-level, global
  • How do we deal with changing populations? (population axis); eg average, total
  • To what extent do we discount future utility? (discounting axis); eg zero discount, >0 discount
  • How do we pinpoint the net zero utility point? (balancing axis); eg Tännsjö’s test, experience tradeoffs
  • What is a utilon? (utilon axis) [3] – I don’t know of any examples of serious discussion on this (other than generic dismissals of the question), but it’s ultimately a question utilitarians will need to answer if they wish to formalise their system.

For preference utilitarianism in particular:

Here then, are the six most salient dependent axes of preference utilitarianism, ie those that describe what could count as utility for PUs. I’ll refer to the poles on each axis as (axis)0 and (axis)1, where any intermediate view will be (axis)X. We can then formally refer to subtypes, and also exclude them, eg ~(F0)R1PU, or ~(F0 v R1)PU etc, or represent a range, eg C0..XPU.

How do we process misinformed preferences? (information axis F)

(F0 no adjustment / F1 adjust to what it would have been had the person been fully informed / FX somewhere in between)

How do we process irrational preferences? (rationality axis R)

(R0 no adjustment / R1 adjust to what it would have been had the person been fully rational / RX somewhere in between)

How do we process malformed preferences? (malformation axes M)

(M0 Ignore them / MF1 adjust to fully informed / MFR1 adjust to fully informed and rational (shorthand for MF1R1) / MFxRx adjust to somewhere in between)

How long is a preference relevant? (duration axis D)

(D0 During its expression only / DF1 During and future / DPF1 During, future and past (shorthand for  DP1F1) / DPxFx Somewhere in between)

What constitutes a preference? (constitution axis C)

(C0 Phenomenal experience only / C1 Behaviour only / CX A combination of the two)

What resolves a preference? (resolution axis S)

(S0 Phenomenal experience only / S1 External circumstances only / SX A combination of the two)

What distinguishes these categorisations is that each category, as far as I can perceive, has no analogous axis within hedonistic utilitarianism. In other words to a hedonistic utilitarian, such axes would either be meaningless, or have only one logical answer. But any well-defined and consistent form of preference utilitarianism must sit at some point on every one of these axes.

See the article for more detailed discussion about each of the axes of preference utilitarianism, and more.

[Link] A Darwinian Response to Sam Harris’s Moral Landscape Challenge

1 TheSurvivalMachine 20 May 2015 01:44PM

I noticed that there has been some earlier discussion about Sam Harris’s Moral Landscape Challenge here at LW. As a writer on the Swedish politico-philosophical blog The Inverted Fable of Reality, I would like to share a response to the challenge, written by our main contributor, which I believe is interesting to read even if you are not familiar with The Moral Landscape or its content. See this link for the response and a short explanation of the challenge.

The response takes a different approach to most responses to the challenge. It is divided into four parts and starts by asking which ethic is most compatible with science and reality and finally tries to answer this question.

Moral Anti-Epistemology

0 Lukas_Gloor 24 April 2015 03:30AM

This post is a half-baked idea that I'm posting here in order to get feedback and further brainstorming. There seem to be some interesting parallels between epistemology and ethics.

Part 1: Moral Anti-Epistemology

"Anti-Epistemology" refers to bad rules of reasoning that exist not because they are useful/truth-tracking, but because they are good at preserving people's cherished beliefs about the world. But cherished beliefs don't just concern factual questions, they also very much concern moral issues. Therefore, we should expect there to be a lot of moral anti-epistemology. 

Tradition as a moral argument, tu quoque, opposition to the use of thought experiments, the noncentral fallacy, slogans like "morality is from humans for humans" – all these are instances of the same general phenomenon. This is trivial and doesn't add much to the already well-known fact that humans often rationalize, but it does add the memetic perspective: Moral rationalizations sometimes concern more than a singular instance, they can affect the entire way people reason about morality. And like with religion or pseudoscience in epistemology about factual claims, there could be entire memeplexes centered around moral anti-epistemology. 

A complication is that metaethics is complicated; it is unclear what exactly moral reasoning is, and whether everyone is trying to do the same thing when they engage in what they think of as moral reasoning. Labelling something "moral anti-epistemology" would suggest that there is a correct way to think about morality. Is there? As long as we always make sure to clarify what it is that we're trying to accomplish, it would seem possible to differentiate between valid and invalid arguments in regard to the specified goal. And this is where moral anti-epistemology might cause troubles. 

Are there reasons to assume that certain popular ethical beliefs are a result of moral anti-epistemology? Deontology comes to mind (mostly because it's my usual suspect when it comes to odd reasoning in ethics), but what is it about deontology that relies on "faulty moral reasoning", if indeed there is something about it that does? How much of it relies on the noncentral fallacy, for instance? Is Yvain's personal opinion that "much of deontology is just an attempt to formalize and justify this fallacy" correct? The perspective of moral anti-epistemology would suggest that it is the other way around: Deontology might be the by-product of people applying the noncentral fallacy, which is done because it helps protect cherished beliefs. Which beliefs would those be? Perhaps the strongly felt intuition that "Some things are JUST WRONG", which doesn't handle fuzzy concepts/boundaries well and therefore has to be combined with a dogmatic approach. It sounds somewhat plausible, but also really speculative. 

Part 2: Memetics

A lot of people are skeptical towards these memetic just-so stories. They argue that the points made are either too trivial, or too speculative. I have the intuition that a memetic perspective often helps clarify things, and my thoughts about applying the concept of anti-epistemology to ethics seemed like an insight, but I have a hard time coming up with how my expectations about the world have changed because of it. What, if anything, is the value of the idea I just presented? Can I now form a prediction to test whether deontologists want to primarily formalize and justify the noncentral fallacy, or whether they instead want to justify something else by making use of the noncentral fallacy?

Anti-epistemology is a more general model of what is going on in the world than rationalizations are, so it should all reduce to rationalizations in the end. So it shouldn't be worrying that I don't magically find more stuff. Perhaps my expectations were too high and I should be content with having found a way to categorize moral rationalizations, the knowledge of which will make me slightly quicker at spotting or predicting them.

Thoughts?

A Basic Problem of Ethics: Panpsychism?

-4 capybaralet 27 January 2015 06:27AM

Panpsychism seems like a plausible theory of consciousness.  It raises extreme challenges for establishing reasonable ethical criteria.

It seems to suggest that our ethics is very subjective: the "expanding circle" of Peter Singer would eventually (ideally) stretch to encompass all matter.  But how are we to communicate with, e.g. rocks?  Our ability to communicate with one another and our presumed ability to detect falsehood and empathize in a meaningful way allow us to ignore this challenge wrt other people.

One way to argue that this is not such a problem is to suggest that humans are simply very limited in our capacity as ethical beings, and that we are fundamentally limited in our perceptions of ethical truth to only be able to draw conclusions with any meaningful degree of certainty about other humans or animals (or maybe even life-forms, if you are optimistic).  

But this is not very satisfying if we consider transhumanism.  Are we to rely on AI to extrapolate our intuitions to the rest of matter?  How do we know that our intuitions are correct (or do we even care?  I do, personally...)?  How can we tell if an AI is correctly extrapolating?




A Somewhat Vague Proposal for Grounding Ethics in Physics

-3 capybaralet 27 January 2015 05:45AM

As Tegmark argues, the idea of "final goal" for AI is likely incoherent, at least if (as he states), "Quantum effects aside, a truly well-defined goal would specify how all particles in our Universe should be arranged at the end of time."  

But "life is a journey not a destination".  So really, what we should be specifying is the entire evolution of the universe through its lifespan.  So how can the universe "enjoy itself" as much as possible before the big crunch (or before and during the heat death)?*

I hypothesize that experience is related to, if not a product of, change.  I further propose (counter-intuitively, and with an eye towards "refinement" (to put it mildly))** that we treat experience as inherently positive and not try to distinguish between positive and negative experiences.

Then it seems to me the (still rather intractable) question is: how does the rate of entropy's increase relate to the quantity of experience produced?  Is it simply linear (in which case, it doesn't matter, ethically)?  My intuition is that it is more like the fuel efficiency of a car, non-linear and with a sweet spot somewhere between a lengthy boredom and a flash of intensity.



*I'm not super up on cosmology; are there other theories I ought to be considering?

**One idea for refinement: successful "prediction" (undefined here) creates positive experiences; frustrated expectations negative ones.


Ethical Diets

2 pcm 12 January 2015 11:38PM

[Cross-posted from my blog.]

I've seen some discussion of whether effective altruists have an obligation to be vegan or vegetarian.

The carnivores appear to underestimate the long-term effects of their actions. I see a nontrivial chance that we're headed toward a society in which humans are less powerful than some other group of agents. This could result from slow AGI takeoff producing a heterogeneous society of superhuman agents. Or there could be a long period in which the world is dominated by ems before de novo AGI becomes possible. Establishing ethical (and maybe legal) rules that protect less powerful agents may influence how AGIs treat humans or how high-speed ems treat low-speed ems and biological humans [0]. A one in a billion chance that I can alter this would be worth some of my attention. There are probably other similar ways that an expanding circle of ethical concern can benefit future people.

I see very real costs to adopting an ethical diet, but it seems implausible that EAs are merely choosing alternate ways of being altruistic. How much does it cost MealSquares customers to occasionally bemoan MealSquares' use of products from apparently factory-farmed animals? Instead, it seems like EAs have some tendency to actively raise the status of MealSquares [1].

I don't find it useful to compare a more ethical diet to GiveWell donations for my personal choices, because I expect my costs to be mostly inconveniences, and the marginal value of my time seems small [2], with little fungibility between them.

I'm reluctant to adopt a vegan diet due to the difficulty of evaluating the health effects and due to the difficulty of evaluating whether it would mean fewer animals living lives that they'd prefer to nonexistence.

But there's little dispute that most factory-farmed animals are much less happy than pasture-raised animals. And everything I know about the nutritional differences suggests that avoiding factory-farmed animals improves my health [3].

I plan not to worry about factory-farmed invertebrates for now (shrimp, oysters, insects), partly because some of the harmful factory-farm practices such as confining animals to cages not much bigger than the animals in question aren't likely with animals that small.

So my diet will consist of vegan food plus shellfish, insects, wild-caught fish, pasture-raised birds/mammals (and their eggs/whey/butter). I will assume vertebrate animals are raised in cruel conditions unless they're clearly marked as wild-caught, grass-fed, or pasture-raised [4].

I've made enough changes to my diet for health reasons that this won't require large changes. I already eat at home mostly, and the biggest change to that part of my diet will involve replacing QuestBars with a home-made version using whey protein from grass-fed cows (my experiments so far indicate it's inconvenient and hard to get a decent texture). I also have some uncertainty about pork belly [5] - the pasture-raised version I've tried didn't seem as good, but that might be because I didn't know it needed to be sliced very thin.

My main concern is large social gatherings. It has taken me a good deal of willpower to stick to a healthy diet under those conditions, and I expect it to take more willpower to observe ethical constraints.

A 100% pure diet would be much harder for me to achieve than an almost pure diet, and it takes some time for me to shift my habits. So for this year I plan to estimate how many calories I eat that don't fit this diet, and aim to keep that less than 120 calories per month (about 0.2%) [6]. I'll re-examine the specifics of this plan next Jan 1.
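As a quick check on the "about 0.2%" figure, the arithmetic can be sketched as follows. The 2000 kcal/day intake and the 30-day month are assumptions made here for illustration (the post does not state its baseline); the 25% figure is the tracking error mentioned in footnote 6.

```python
# Assumed baseline, not stated in the post.
DAILY_KCAL = 2000
DAYS_PER_MONTH = 30

# Figures taken from the post.
OFF_DIET_BUDGET_KCAL = 120   # monthly allowance for calories outside the diet
TRACKING_ERROR = 0.25        # possible estimation error (footnote 6)

monthly_kcal = DAILY_KCAL * DAYS_PER_MONTH
share = OFF_DIET_BUDGET_KCAL / monthly_kcal
share_with_error = OFF_DIET_BUDGET_KCAL * (1 + TRACKING_ERROR) / monthly_kcal

print(f"off-diet share of intake:        {share:.2%}")
print(f"share if underestimated by 25%:  {share_with_error:.2%}")
```

Under these assumptions the budget is 0.2% of monthly intake, and a 25% underestimate would still keep the true share at roughly a quarter of a percent.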

Does anyone know a convenient name for my planned diet?

 

footnotes

 

0. With no one agent able to conquer the world, it's costly for a single agent to repudiate an existing rule. A homogeneous group of superhuman agents might coordinate to overcome this, but with heterogeneous agents the coordination costs may matter.

1. I bought 3 orders of MealSquares, but have stopped buying for now. If they sell a version whose animal products are ethically produced (which I'm guessing would cost $50/order more), I'll resume buying them occasionally.

2. The average financial value of my time is unusually high, but I often have trouble estimating whether spending more time earning money has positive or negative financial results. I expect financial concerns will be more important to many people.

3. With the probable exception of factory-farmed insects, oysters, and maybe other shellfish.

4. In most restaurants, this will limit me to vegan food and shellfish.

5. Pork belly is unsliced bacon without the harm caused by smoking.

6. Yes, I'll have some incentive to fudge those estimates. My experience from tracking food for health reasons suggests possible errors of 25%. That's not too bad compared to other risks such as lack of willpower.

Problems and Solutions in Infinite Ethics

9 Xodarap 04 January 2015 02:06PM

(Crossposted from the EA forum.)

Summary: The universe may very well be infinite, and hence contain an infinite amount of happiness and sadness. This causes several problems for altruists; for example: we can plausibly only affect a finite subset of the universe, and an infinite quantity of happiness is unchanged by the addition or subtraction of a finite amount of happiness. This would imply that all forms of altruism are equally ineffective.

Like everything in life, the canonical reference in philosophy about this problem was written by Nick Bostrom. However, I found that an area of economics known as "sustainable development" has actually made much further progress on this subject than the philosophy world. In this post I go over some of what I consider to be the most interesting results.

NB: This assumes a lot of mathematical literacy and familiarity with the subject matter, and hence isn't targeted to a general audience. Most people will probably prefer to read my other posts:


1. Summary of the most interesting results

  1. There’s no ethical system which incorporates all the things we might want.
  2. Even if we have pretty minimal requirements, satisfactory ethical systems might exist but we can’t prove their existence, much less actually construct them
  3. Discounted utilitarianism, whereby we value people less just because they are further away in time, is actually a pretty reasonable thing despite philosophers considering it ridiculous.
    1. (I consider this to be the first reasonable argument for locavorism I've ever heard)
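For readers who want to see what discounting does in practice, here is a minimal sketch using the usual geometric form, where generation t is weighted by delta**t. The function name, the discount factor 0.9, and the use of a finite truncation of the stream (later generations treated as having zero utility) are assumptions for illustration, not details taken from the post.

```python
from typing import Sequence

def discounted_welfare(utilities: Sequence[float], delta: float) -> float:
    """Sum of delta**t * u_t over a finite prefix of a utility stream."""
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return sum(u * delta**t for t, u in enumerate(utilities))

delta = 0.9
early = [1.0, 0.0, 0.0]   # the well-off generation comes first
late = [0.0, 0.0, 1.0]    # same utilities, with generations 0 and 2 swapped

w_early = discounted_welfare(early, delta)
w_late = discounted_welfare(late, delta)

print(w_early, w_late)

# With utilities bounded in [0, 1], the sum can never exceed 1 / (1 - delta),
# so the infinite-horizon total stays finite.
upper_bound = 1.0 / (1.0 - delta)
```

The two streams contain the same utilities in a different order, yet the discounted sums differ. That is the price of discounting: it gives a well-defined finite total, but it treats generations differently purely because of when they live.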

2. Definitions

In general, we consider a population to consist of an infinite utility vector (u0,u1,…) where ui is the aggregate utility of the generation alive at time i. Utility is a bounded real number (the fact that economists assume utility to be bounded confused me for a long time!). Our goal is to find a preference ordering over the set of all utility vectors which is in some sense “reasonable”. While philosophers have understood for a long time that finding such an ordering is difficult, I will present several theorems which show that it is in fact impossible.

Due to a lack of LaTeX support I’m going to give English-language definitions and results instead of math-ey ones; interested people should look at the papers themselves anyway.

3. Impossibility Results

3.1 Definitions

  • Strong Pareto: if you can make a generation better off, and none worse off, you should.
  • Weak Pareto: if you can make every generation better off, you should.
  • Intergenerational equity: utility vectors are unchanged in value by any permutation of their components.
    • There is an important distinction here between allowing a finite number of elements to be permuted and an infinite number; I will refer to the former as “finite intergenerational equity” and the latter as just “intergenerational equity”
  • Ethical relation: one which obeys both weak Pareto and intergenerational equity
  • Social welfare function: an order-preserving function from the set of populations (utility vectors) to the real numbers
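For readers who prefer symbols, the definitions above can be written out roughly as follows. This is the standard textbook formalization as I understand it, not notation taken from the post: x and y are utility vectors in [0,1]^ℕ, ≻ means "strictly better", ∼ means "equally good", and ≿ means "at least as good".

```latex
% Assumed notation: x = (x_0, x_1, \dots), \; y = (y_0, y_1, \dots) \in [0,1]^{\mathbb{N}}
\begin{align*}
\textbf{Strong Pareto:}\quad
  & (\forall i:\ x_i \ge y_i) \ \wedge\ (\exists j:\ x_j > y_j)
    \;\Longrightarrow\; x \succ y \\
\textbf{Weak Pareto:}\quad
  & (\forall i:\ x_i > y_i) \;\Longrightarrow\; x \succ y \\
\textbf{Finite intergenerational equity:}\quad
  & x \circ \pi \sim x \ \text{ for every permutation } \pi \text{ of } \mathbb{N}
    \text{ that moves only finitely many indices} \\
\textbf{Social welfare function:}\quad
  & W : [0,1]^{\mathbb{N}} \to \mathbb{R}, \quad
    x \succsim y \Rightarrow W(x) \ge W(y), \quad
    x \succ y \Rightarrow W(x) > W(y)
\end{align*}
```

Full intergenerational equity drops the "finitely many indices" restriction and allows arbitrary permutations.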

3.2 Diamond-Basu-Mitra Impossibility Result (1)

  1. There is no social welfare function which obeys Strong Pareto and finite intergenerational equity. This means that any sort of utilitarianism won’t work, unless we look outside the real numbers.

3.3 Zame's Impossibility Result (2)

  1. If an ordering obeys intergenerational equity over [0,1]^N, then almost always we can’t tell which of two populations is better
    1. (i.e. the set of populations {X,Y: neither X<Y nor X>Y} has outer measure one)
  2. The existence of an ethical preference relation on [0,1]^N is independent of ZF plus the axiom of choice

4. Possibility Results

We’ve just shown that it’s impossible to construct or even prove the existence of any useful ethical system. But not all hope is lost!

The important idea here is that of a “subrelation”: < is a subrelation to <’ if x<y implies x<’y.

Our arguments will work like this:

Suppose we could extend utilitarianism to the infinite case. (We don't, of course, know that we can extend utilitarianism to the infinite case. But suppose we could.) Then A, B and C must follow.

Technically: suppose utilitarianism is a subrelation of <. Then < must have properties A, B and C.

Everything in this section comes from (3), which is a great review of the literature.

4.1 Definition

  • Utilitarianism: we extend the standard total utilitarianism ordering to infinite populations in the following way: suppose there is some time T after which every generation in X is at least as well off as every generation in Y, and that the total utility in X before T is at least as good as the total utility in Y before T. Then X is at least as good as Y.
    • Note that this is not a complete ordering! In fact, as per Zame’s result above, the set of populations it can meaningfully speak about has measure zero.
  • Partial translation scale invariance: suppose after some time T, X and Y become the same. Then we can add any arbitrary utility vector A to both X and Y without changing the ordering. (I.e. X > Y ⇔ X+A > Y+A)

4.2 Theorem

  1. Utilitarianism is a subrelation of > if and only if > satisfies strong Pareto, finite intergenerational equity and partial translation scale invariance.
    1. This means that if we want to extend utilitarianism to the infinite case, we can’t use a social welfare function, as per the above Basu-Mitra result

4.3 Definition

  • Overtaking utilitarianism: suppose there is some point T after which the total utility of the first N generations in X is always greater than the total utility of the first N generations in Y (given N > T). Then X is better than Y.
    • Note that utilitarianism is a subrelation of overtaking utilitarianism
  • Weak limiting preference: suppose that for any time T, X truncated at time T is better than Y truncated at time T. Then X is better than Y.
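
The overtaking criterion above can be sketched in code, with one big caveat: a program can only inspect a finite prefix of each utility stream, so the check below is evidence for overtaking up to a chosen horizon, not a proof about the infinite streams. The function name `overtakes` and the list representation are assumptions made for illustration, not anything from the cited papers.

```python
from itertools import accumulate
from typing import Sequence


def overtakes(x: Sequence[float], y: Sequence[float], t: int) -> bool:
    """Check the overtaking condition on finite prefixes of two streams.

    Returns True if, for every N > t (up to the shared horizon), the total
    utility of the first N generations of x exceeds that of y.
    """
    if len(x) != len(y):
        raise ValueError("prefixes must have the same length")
    sums_x = list(accumulate(x))
    sums_y = list(accumulate(y))
    # sums_x[n - 1] is the total utility of the first n generations,
    # so N > t corresponds to list indices t, t + 1, ...
    return all(sx > sy for sx, sy in zip(sums_x[t:], sums_y[t:]))


# x starts behind y but pulls ahead from the third generation onwards.
x = [0.0, 1.0, 1.0, 1.0]
y = [1.0, 0.0, 0.0, 0.0]
print(overtakes(x, y, 2))  # partial sums of x stay ahead once N > 2
print(overtakes(x, y, 0))  # fails: y is ahead after the first generation
```

Note that if `t` is at least the prefix length the check is vacuously true, which is another reason to treat this as a finite-horizon illustration only.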

4.4 Theorem

  1. Overtaking utilitarianism is a subrelation of < if and only if < satisfies strong Pareto, finite intergenerational equity, partial translation scale invariance, and weak limiting preference

4.5 Definition

  • Discounted utilitarianism: the utility of a population is the sum of its components, discounted by how far away in time they are
  • Separability:
    • Separable present: if you can improve the first T generations without affecting the rest, you should
    • Separable future: if you can improve everything after the first T generations without affecting the rest, you should
  • Stationarity: preferences are time invariant
  • Weak sensitivity: for any utility vector, we can modify its first generation somehow to make it better

4.6 Theorem

  1. The only continuous, monotonic relation which obeys weak sensitivity, stationarity, and separability is discounted utilitarianism.

4.7 Definition

  • Dictatorship of the present: there’s some time T after which changing the utility of generations doesn’t matter

4.8 Theorem

  1. Discounted utilitarianism results in a dictatorship of the present. (Remember that each generation’s utility is assumed to be bounded!)
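
One way to see why bounded utilities lead to this result is to bound the discounted contribution of everything after time T with a geometric series. The sketch below assumes a constant discount factor `delta` strictly between 0 and 1 and utilities bounded by `u_max`; these names, and the geometric form of discounting, are illustrative assumptions rather than the exact setup of the cited papers.

```python
from typing import Sequence


def discounted_utility(stream: Sequence[float], delta: float) -> float:
    """Sum of u_i * delta**i over a finite prefix of a utility stream."""
    return sum(u * delta**i for i, u in enumerate(stream))


def tail_bound(t: int, delta: float, u_max: float = 1.0) -> float:
    """Largest possible discounted contribution of generations t, t+1, ..."""
    # Geometric series: u_max * (delta**t + delta**(t + 1) + ...).
    return u_max * delta**t / (1.0 - delta)


def horizon_for(epsilon: float, delta: float, u_max: float = 1.0) -> int:
    """Smallest t such that all generations from t onwards matter by at most epsilon."""
    t = 0
    while tail_bound(t, delta, u_max) > epsilon:
        t += 1
    return t


# With delta = 0.5 and utilities in [0, 1], generations from t = 8 onwards
# can shift the discounted total by at most 2 / 256.
print(horizon_for(0.01, 0.5))
print(tail_bound(8, 0.5))
```

So if two populations differ by more than epsilon in discounted value, no change to generations beyond that horizon can reverse the ranking.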

4.9 Definition

  • Sustainable preference: a continuous ordering which doesn’t have a dictatorship of the present but follows strong Pareto and separability.

4.10 Theorem

  1. The only ordering which is sustainable is to take discounted utilitarianism and add an “asymptotic” part which ensures that infinitely long changes in utility matter. (Of course, finite changes in utility still won't matter.)

5. Conclusion

I hope I've convinced you that there's a "there" there: infinite ethics is something that people can make progress on, and it seems that most of the progress is being made in the field of sustainable development.

Fun fact: the author of the last theorem (the one which defined "sustainable") was one of the lead economists on the Kyoto protocol. Who says infinite ethics is impractical?

6. References

  1. Basu, Kaushik, and Tapan Mitra. "Aggregating infinite utility streams with intergenerational equity: the impossibility of being Paretian." Econometrica 71.5 (2003): 1557-1563. http://folk.uio.no/gasheim/zB%26M2003.pdf
  2. Zame, William R. "Can intergenerational equity be operationalized?." (2007).  https://tspace.library.utoronto.ca/bitstream/1807/9745/1/1204.pdf
  3. Asheim, Geir B. "Intergenerational equity." Annu. Rev. Econ. 2.1 (2010): 197-222. http://folk.uio.no/gasheim/A-ARE10.pdf

Integral versus differential ethics

9 Stuart_Armstrong 01 December 2014 06:04PM

In population ethics...

Most people start out believing that the following are true:

  1. That adding more happy lives is a net positive.
  2. That redistributing happiness more fairly is not a net negative.
  3. That the repugnant conclusion is indeed repugnant.

Some will baulk at the first statement on equality grounds, but most people should accept those three statements as presented. Then they find out about the mere addition paradox.

Someone who then accepts the repugnant conclusion could reason something like this:

Adding happy people and redistributing happiness fairly, if done many, many times, in the way described above, will result in a repugnant conclusion. Each step along the way seems solid, but the conclusion seems wrong. Therefore I will accept the repugnant conclusion, not on its own merits, but because each step is clearly intuitively correct.

Call this the "differential" (or local) way of reasoning about population ethics. As long as each small change seems intuitively an improvement, then the global change must also be.

Someone who instead rejects the repugnant conclusion could reason like this:

Adding happy people and redistributing happiness fairly, if done many, many times, in the way described above, will result in a repugnant conclusion. Each step along the way seems solid, but the conclusion seems wrong. Therefore I will reject (at least) one step, not on its own merits, but because the conclusion is clearly intuitively incorrect.

Call this the "integral" (or global) way of reasoning about population ethics. As long as the overall change seems intuitively a deterioration, then some of the small changes along the way must also be.

 

In general...

Now, I personally tend towards integral rather than differential reasoning on this particular topic. However, I want to make a more general point: philosophy may be over dedicated to differential reasoning. Mainly because it's easy: you can take things apart, simplify them, abstract details away, and appeal to simple principles - and avoid many potential biases along the way.

But it's also a very destructive tool to use in areas where concepts are unclear and cannot easily be made clear. Take the statement "human life is valuable". This can be taken apart quite easily and critiqued from all directions, its lack of an easily described meaning being its weakness. Nevertheless, integral reasoning is almost always applied: something called "human life" is taken to be "valuable", and many caveats and subdefinitions can be added to these terms without changing the fundamental (integral) acceptance of the statement. If we followed the differential approach, we might end up with the definition of "human life" as "energy exchange across a neurone cell membrane" or something equally ridiculous but much more rigorous.

Now, that example is a parody... but only because no-one sensible does that: we know that we'd lose too much value from that kind of definition. We want to build an extensive/integral definition of life, using our analysis to add clarity rather than simplify to a few core underlying concepts. But in population ethics and many other cases, we do feel free to use differential ethics, replacing vague overarching concepts with clear simplified versions that clearly throw away a lot of the initial concept.

Maybe we do it too much. To pick an example I disagree with (always a good habit), maybe there is such a thing as "society", for instance, not simply the total of individuals and their interactions. You can already use pretty crude consequentialist arguments with "societies" as agents subject to predictable actions and reactions (social science does it all the time), but what if we tried to build a rigorous definition of society as something morally valuable, rather than focusing on individuals?

Anyway, we should be aware when, in arguments, we are keeping the broad goal and making the small steps and definitions conform to it, and when we are focusing on the small steps and definitions and following them wherever they lead.

Population ethics and utility indifference

3 Stuart_Armstrong 24 November 2014 03:18PM

It occurs to me that the various utility indifference approaches might be usable in population ethics.

One challenge for non-total utilitarians is how to deal with new beings. Some theories - average utilitarianism, for instance, or some other systems that use overall population utility - have no problem dealing with this. But many non-total utilitarians would like to see creating new beings as a strictly neutral act.

One way you could do this is by starting with a total utilitarian framework, but subtracting a certain amount of utility every time a new being B is brought into the world. In the spirit of utility indifference, we could subtract exactly the expected utility that we expect B to enjoy during their life.

This means that we should be indifferent as to whether B is brought into the world or not, but, once B is there, we should aim to increase B's utility. There are two problems with this. The first is that, strictly interpreted, we would also be indifferent to creating people with negative utility. This can be addressed by only doing the "utility correction" if B's expected utility is positive, thus preventing us from creating beings only to have them suffer.
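
The correction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Being` class, its field names and the idea of tracking a single "expected lifetime utility" number per being are hypothetical simplifications, not part of any established utility indifference formalism.

```python
from dataclasses import dataclass


@dataclass
class Being:
    expected_at_creation: float  # expected lifetime utility when B is created
    current_expected: float      # expected lifetime utility as of now


def creation_correction(being: Being) -> float:
    """Utility subtracted from the total when the being is created."""
    # Only correct for positive expectations, so that creating a being
    # who is expected to suffer still counts as a loss.
    return max(being.expected_at_creation, 0.0)


def corrected_value(being: Being) -> float:
    """Contribution of a newly created being to the corrected total."""
    return being.current_expected - creation_correction(being)


print(corrected_value(Being(5.0, 5.0)))    # creation itself is neutral
print(corrected_value(Being(5.0, 7.0)))    # later improvements still count
print(corrected_value(Being(-3.0, -3.0)))  # negative lives are not corrected
```

In this sketch the correction is frozen at the moment of creation, which is exactly what makes the timing of benefits matter.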

The second problem is more serious. What about all the actions that we could do, ahead of time, in order to harm or benefit the new being? For instance, it would seem perverse to argue that buying a rattle for a child after they are born (or conceived) is an act of positive utility, whereas buying it before they were born (or conceived) would be a neutral act, since the increase in expected utility for the child is cancelled out by the above process. Not only is it perverse, but it isn't timeless, and isn't stable under self modification.


Cross-temporal dependency, value bounds and superintelligence

7 joaolkf 28 October 2014 03:26PM

In this short post I will attempt to put forth some potential concerns that should be relevant when developing superintelligences, if certain meta-ethical effects exist. I do not claim they exist, only that it might be worth looking for them since their existence would mean some currently irrelevant concerns are, in fact, relevant. 

 

These meta-ethical effects would be a certain kind of cross-temporal dependency on moral value. First, let me explain what I mean by cross-temporal dependency. If value is cross-temporally dependent, it means that value at t2 could be affected by t1, independently of any causal role t1 has on t2. The same event X at t2 could have more or less moral value depending on whether Z or Y happened at t1. For instance, this could be the case on matters of survival. If we kill someone and replace her with a slightly more valuable person, some would argue there was a loss rather than a gain of moral value; whereas if a new person with moral value equal to the difference of the previous two is created where there was none, most would consider it an absolute gain. Furthermore, some might consider that small, gradual and continual improvements are better than abrupt and big ones. For example, a person who forms an intention and a careful, detailed plan to become better, and who becomes better through their own sustained effort, could acquire more value than a person who simply happens to take a pill and instantly becomes a better person - even if they become that exact same person. This is not because effort is intrinsically valuable, but because of personal continuity. There are more intentions, deliberations and desires connecting the two time-slices of the person who changed through effort than there are connecting the two time-slices of the person who changed by taking a pill. Even though both persons become equally morally valuable in isolated terms, they do so by different paths that affect their final value differently.

More examples. You live now in t1. If suddenly in t2 you were replaced by an alien individual with the same amount of value as you would otherwise have in t2, then t2 may not have the exact same amount of value as it would otherwise have, simply by virtue of the fact that in t1 you were alive and the alien's previous time slice was not. 365 individuals with a 1 day life do not amount to the same value as a single individual living through 365 days. Slice history into 1 day periods: each day the universe contains one unique advanced civilization with the same overall total moral value, each civilization being completely alien and ineffable to the others, and each civilization lives for only one day and is then gone forever. This universe does not seem to hold the same moral value as the one where only one of those civilizations flourishes for eternity. In all these examples the value of a period of time seems to be affected by the existence or not of certain events at other periods. They indicate that there is, at least, some cross-temporal dependency.

 

Now consider another type of effect, bounds on value. There could be a physical bound – transfinite or not - on the total amount of moral value that can be present per instant. For instance, if moral value rests mainly on sentient well-being, which can be categorized as a particular kind of computation, and there is a bound on the total amount of such computation which can be performed per instant, then there is a bound on the amount of value per instant. If, arguably, we are currently extremely far from such a bound, and this bound will eventually be reached by a superintelligence (or any other structure), then the total moral value of the universe would be dominated by the value of this physical bound, given that regions where the physical bound wasn't reached would make negligible contributions. The faster the bound can be reached, the more negligible pre-bound values become.

 

Finally, if there is a form of cross-temporal dependence of value where preceding events leading to a superintelligence could alter the value of this physical bound, then we not only ought to make sure we safely construct a superintelligence, but also that we do so following the path that maximizes such a bound. It might be the case that an overly abrupt superintelligence would decrease the bound, so that all future moral value would be diminished by the fact that there was a huge discontinuity in the past in the events leading to this future. Even small decreases in such a bound would have dramatic effects. Although I do not know of any plausible cross-temporal effect of this kind, it seems this question deserves at least a minimal amount of thought. Both cross-temporal dependency and bounds on value seem plausible (in fact I believe some form of each is true), so it is not at all prima facie inconceivable that we could have cross-temporal effects changing the bound up or down.

Request for feedback on a paper about (machine) ethics

7 Caspar42 28 September 2014 12:03PM

I have written a paper on ethics with special concentration on machine ethics and formality with the following abstract:

Most ethical systems are formulated in a very intuitive, imprecise manner. Therefore, they cannot be studied mathematically. In particular, they are not applicable to make machines behave ethically. In this paper we make use of this perspective of machine ethics to identify preference utilitarianism as the most promising approach to formal ethics. We then go on to propose a simple, mathematically precise formalization of preference utilitarianism in very general cellular automata. Even though our formalization is incomputable, we argue that it can function as a basis for discussing practical ethical questions using knowledge gained from different scientific areas.

Here are some further elements of the paper (things the paper uses or the paper is about):

  • (machine) ethics
  • (in)computability
  • artificial life in cellular automata
  • Bayesian statistics
  • Solomonoff's a priori probability

As I propose a formal ethical system, things get mathy at some point but the first and by far most important formula is relatively simple - the rest can be skipped then, so no problem for the average LWer.

I already discussed the paper with a few fellow students, as well as Brian Tomasik and a (computer science) professor of mine. Both recommended that I try to publish the paper. Also, I received some very helpful feedback. But because this would be my first attempt to publish something, I could still use more help, both with the content itself and with scientific writing in English (which, as you may have guessed, is not my first language), before I submit the paper, and Brian recommended using LW's discussion board. I would also be thankful for recommendations on which journal is appropriate for the paper.

I would like to send those interested a draft via PM. This way I can also make sure that I don't spend all potential reviewers on the current version.

DISCLAIMER: I am not a moral realist. Also and as mentioned in the abstract, the proposed ethical system is incomputable and can therefore be argued to have infinite Kolmogorov complexity. So, it does not really pose a conflict with LW-consensus (including Complexity of value).

Link: Study finds that using a foreign language changes moral decisions

8 Vladimir_Golovin 30 April 2014 05:26AM

In the new study, two experiments using the well-known "trolley dilemma" tested the hypothesis that when faced with moral choices in a foreign language, people are more likely to respond with a utilitarian approach that is less emotional.

The researchers collected data from people in the U.S., Spain, Korea, France and Israel. Across all populations, more participants selected the utilitarian choice -- to save five by killing one -- when the dilemmas were presented in the foreign language than when they did the problem in their native tongue.

The article:
http://www.sciencedaily.com/releases/2014/04/140428120659.htm

The publication:
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0094842

Proportional Giving

10 gjm 02 March 2014 09:09PM

Executive summary: The practice of giving a fixed fraction of one's income to charity is near-universal but possibly indefensible. I describe one approach that certainly doesn't defend it, speculate vaguely about a possible way of fixing it up, and invite better ideas from others.


Many of us give a certain fraction of our income to charitable causes. This sort of practice has a long history:

Deuteronomy 14:22 Thou shalt truly tithe all the increase of thy seed, that the field bringeth forth year by year.

(note that "tithe" here means "give one-tenth of") and is widely practised today:

GWWC Pledge: I recognise that I can use part of my income to do a significant amount of good in the developing world. Since I can live well enough on a smaller income, I pledge that from today until the day I retire, I shall give at least ten percent of what I earn to whichever organizations can most effectively use it to help people in developing countries. I make this pledge freely, openly, and without regret.

And of course it's roughly how typical taxation systems (which are kinda-sorta like charitable donation, if you squint) operate. But does it make sense? Is there some underlying principle from which a policy of giving away a certain fraction of one's income (not necessarily the traditional 10%, of course) follows?

The most obvious candidate for such a principle would be what we might call

Weighted Utilitarianism: Act so as to maximize a weighted sum of utility, where (e.g.) one's own utility may be weighted much higher than that of random far-away people.

But this can't produce anything remotely like a policy of proportional giving. Assuming you aren't giving away many millions per year (which is a fair assumption if you're thinking in terms of a fraction of your salary) then the level of utility-per-unit-money achievable by your giving is basically independent of what you give, and so is the weight you attach to the utility of the beneficiaries.

So suppose that when your income, after taking out donations, is $X, your utility (all else equal) is u(X), so that your utility per marginal dollar is u'(X); and suppose you attach weight 1 to your own utility and weight w to that of the people who'd benefit from your donations; and suppose their gain in utility per marginal dollar given is t. Then when your income is S you will set your giving g so that u'(S-g) = wt.

What this says is that a weighted-utilitarian should keep a fixed absolute amount S-g of his or her income, and give all the rest away. The fixed absolute amount will depend on the weight w (hence, on exactly which people are benefited by the donations) and on the utility per dollar given t (hence, on exactly what charities are serving them and how severe their need is), but not on the person's pre-donation income S.

(Here's a quick oversimplified example. Suppose that utility is proportional to log(income), that the people your donations will help have an income equivalent to $1k/year, that you care 100x more about your utility than about theirs, and that your donations are the equivalent of direct cash transfers to those people. Then u' = 1/income, so you should keep everything up to $100k/year and give the rest away. The generalization to other weighting factors and beneficiary incomes should be obvious.)
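
The arithmetic in that example can be checked with a short sketch. It assumes log utility exactly as in the parenthetical above, so u'(S-g) = 1/(S-g), t = 1/(beneficiary income) and w = 1/(how many times more you weight yourself); the function and parameter names are hypothetical, chosen only for this illustration.

```python
def amount_kept(beneficiary_income: float, self_weight_ratio: float) -> float:
    """Post-donation income S - g solving u'(S - g) = w * t for log utility.

    With u(X) = log(X): 1 / (S - g) = (1 / self_weight_ratio) * (1 / beneficiary_income),
    so S - g = beneficiary_income * self_weight_ratio.
    """
    return beneficiary_income * self_weight_ratio


def giving(income: float,
           beneficiary_income: float = 1_000,
           self_weight_ratio: float = 100) -> float:
    """Donation g for a weighted utilitarian: everything above the kept amount."""
    keep = amount_kept(beneficiary_income, self_weight_ratio)
    return max(income - keep, 0)


for income in (80_000, 150_000, 200_000):
    g = giving(income)
    print(income, g, g / income)  # the fraction given is not constant
```

The kept amount is the same at every income, so the fraction given rises with income instead of staying fixed, which is the tension with proportional giving that the argument points at.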

This argument seems reasonably watertight given its premises, but proportional giving is so well-established a phenomenon that we might reasonably trust our predisposition in its favour more than our arguments against. Can we salvage it somehow?

Here's one possibility. One effect of income is (supposedly) to incentivize work, and maybe (mumble near mode mumble) this effect is governed entirely by anticipated personal utility and not by any benefit conferred on others. Then the policy derived above, which above the threshold makes personal utility independent of effort, would lead to minimum effort and hence maybe less net weighted utility than could be attained with a different policy. Does this lead to anything like proportional giving, at least for some semi-plausible assumptions about the relationship between effort and income?

At the moment, I don't know. I have a page full of scribbled attempts to derive something of the kind, but they didn't work out. And of course there might be some better way to get proportional giving out of plausible ethical principles. Anyone want to do better?

Prescriptive vs. descriptive and objective vs. subjective definitions

4 PhilGoetz 21 January 2014 11:21PM

Imagine you're writing a Field Guide to Boats, and you want to know what you should include in your field guide. Barges? Rafts? These things?

You want something like a dictionary definition of boat. A descriptive definition that includes anything people commonly think of as a boat; an objective definition, because you're only writing one book, not a separate version for each reader.

Now imagine you're stranded on an island, and you open a bottle, and a genie comes out and gives you one wish, and you say, "I wish for a boat!", and the genie says, "Well, what's a boat?" And you know, because you've read stories, that the genie will take your definition of "boat" and try to screw you over. You'd better not read out the dictionary definition, or the genie will give you a toy boat, or a boat with a hole in it, or a kayak too small for you to fit into. You need a prescriptive, subjective definition of a thing that will transport you over water.


Some thoughts on relations between major ethical systems

6 [deleted] 24 November 2013 09:32PM

On the recent LessWrong/CFAR Census Survey, I hit the following question:

 

Which of the following major ethical systems do you subscribe to:

1) Consequentialism

2) Deontology

3) Virtue Ethics

4) Other

 

To my own surprise, I couldn't come up with a clear answer.  I certainly don't consistently apply one of these things across every decision I make in my life, and yet I consider myself at least mediocre on the scale of moral living, if not actually Neutral Good.  So what is it I'm actually doing, and how can I behave more ethical-rationally?

 

Well, to analyze my own cognitive algorithms, I do think I can actually place these various codes of ethics in relation to each other.  Basically, looked at behavioristically/algorithmically, they vary across how much predictive power I have, my knowledge of my own values, and what it is I'm actually trying to affect.

 

Consequentialism is the ethical algorithm I consider useful in situations of greatest predictive power and greatest knowledge of my own values.  It is, so to speak, the ethical-algorithmic ideal.  In such situations, the only drawback is that naive consequentialism fails to consider consequences on the person acting (ie: me).  Once I make that more virtue-ethical adjustment, consequentialism offers a complete ideal for ethical action over a complete spectrum of moral values for affecting both the universe and myself (but I repeat: I'm part of the universe).

 

However, in almost all real situations, I don't have perfect predictive knowledge -- not of the "external" universe and not of my own values.  In these situations, I can, however, use my incomplete and uncertain knowledge to find acceptable heuristics that I can expect to yield roughly monotonic behavior: follow those rules, and my actions will generally have positive effects.  This kind of thinking quickly yields up recognizable, regular moral commandments like, "You will not murder" or "You will not charge interest above this-or-that amount on loans".  Yes, of course we can come up with corner-case exceptions to those rules, and we can also elaborate logically on the rules to arrive at more detailed rules covering more circumstances.  However, by the time we've fully elaborated out the basic commandments into a complete, obsessively-compulsively detailed legal code (oh hello Talmud), we've already covered most of the major general cases of moral action.  We can now invent a criterion for how and when to transition from one level of ethical code to the one below it: our deontological heuristics should be detailed enough to handle any case where we lack the information (about consequences and values) to resort to consequentialism.

 

At first thought, virtue ethics seems like an even higher-level heuristic than deontological ethics.  The problem is that, unlike deontological and consequentialist ethics, it doesn't output courses of action to take, but instead short- and long-term states of mind or character that can be considered virtuous.  So we don't have the same thing here; it's not a higher-level heuristic but a seemingly completely different form of ethics.  I do think we can integrate it, however: virtue ethics just consists of a set of moral values over one's own character.  "What kind of person do I think is a good person?" might, by default, be a tautological question under strict consequentialism or deontology.  However, when we take an account of the imperfect nature of real people (we are part of the universe, after all), we can observe that virtue ethics serves as a convenient guide to heuristics for becoming the sort of person who can be relied upon to take right actions when moral issues present themselves.  Rather than simply saying, "Do the right thing no matter what" (an instruction that simply won't drive real human beings to actually do the right thing), virtue ethics encourages us to cultivate virtues, moral cognitive biases towards at least a deontological notion of right action.

 

It's also possible we might be able to separate virtue ethics into both heuristics over our own character, and actual values over our own character.  These two approaches to virtue ethics should then converge in the presence of perfect information: if I knew myself utterly, my heuristics for my own character would exactly match my values over my own character.

 

This is my first effort at actually blogging on rationality subjects, so I'm hoping it's not covering something hashed and rehashed, over and over again, in places like the Sequences, of which I certainly can't attest a full knowledge.

'Effective Altruism' as utilitarian equivocation.

1 Dias 24 November 2013 06:35PM

Summary: The term 'effective altruist' invites confusion between 'the right thing to do' and 'the thing that most efficiently promotes welfare.' I think this creeping utilitarianism is a bad thing, and should at least be made explicit. This is not to accuse anyone of deliberate deception.

Over the last year or so, the term 'Effective Altruist' has come into use. I self-identified as one on the LW survey, so I speak as a friend. However, I think there is a very big danger with the terminology.

The term 'Effective Altruist' was born out of the need for a label for those people who were willing to dedicate their lives to making the world a better place in rational ways, even if that meant doing counter-intuitive things, like working as an Alaskan truck driver. The previous term, 'really super awesome hardcore people', was indeed a little inelegant.

However, 'Effective Altruist' has a major problem: it refers to altruism, not ethics. Altruism may be a part of ethics (though the etymology of the term gives some concern), but it is not all there is to ethics. Value is complex. Helping people is good, but so is truth, and justice, and freedom, and beauty, and loyalty, and fairness, and honor, and fraternity, and tradition, and many other things.

A charity that very efficiently promoted beauty and justice, but only inefficiently produced happiness, would probably not be considered an EA organization. A while ago I suggested to [one of the leaders of the Center for Effective Altruism] the creation of a charity to promote promise-keeping. I didn't claim such a charity would be an optimal way of promoting happiness, and to them, this was sufficient to show 1) that it was not EA - and  hence 2) inferior to EA things.

Such thinking involves either an equivocation or a concealed premise. If 'EA' is interpreted literally, so 'the primary/driving goal is to help others', then something not being EA is insufficient for it to not be the best thing you could do - there is more to ethics and the good than altruism and promoting welfare. Failure to promote one dimension of the good doesn't mean you're not the optimal way of promoting their sum. On the other hand, if 'EA' is interpreted broadly, as being concerned with 'happiness, health, justice, fairness and/or other values', then merely failing to promote welfare/happiness does not mean a cause is not EA. Much EA discussion, like on the popular facebook group, equivocates between these two meanings.*

...Unless one thought that helping people was all there was to ethics, in which case this is not equivocation. As virtually all of CEA's leaders are utilitarians, it is plausible that this was the concealed premise in their argument. In this case, there is no equivocation, but a different logical fallacy, that of an omitted premise, has been committed. And we should be just as wary as in the case of equivocation.

Unfortunately, utilitarianism is false, or at least not obviously true. Something can be the morally best thing to do, while not being EA. Just because some utilitarians have popularized a term which cleverly equivocates between "promotes welfare" and "is the best thing" does not mean we should be taken in. Every fashionable ideology likes to blur the lines between its goals and its methods (is Socialism about helping the working man or about state ownership of industry? is libertarianism about freedom or low taxes?) in order to make people who agree with the goals forget that there might be other means of achieving them.

There are two options: recognize 'EA' as referring to only a subset of morality, or recognize as 'EA' actions and organizations that are ethical through ways other than producing welfare/happiness.

* Yes, one might say that promoting X's honor thereby helped X, and thus there was no distinction. However, I think people who make this argument in theory are unlikely to observe it in practice - I doubt that there will be an EA organisation dedicated to pure retribution, even if it was both extremely cheap to promote and a part of ethics.

Another problem with quantum measure

1 Stuart_Armstrong 18 November 2013 11:03AM

Let's play around with the quantum measure some more. Specifically, let's posit a theory T that claims that the quantum measure of our universe is increasing - say by 50% each day. Why could this be happening? Well, here's a quasi-justification for it: imagine there are lots and lots of universes, most of them in chaotic random states, jumping around to other chaotic random states, in accordance with the usual laws of quantum mechanics. Occasionally, one of them will partially tunnel, by chance, into the same state our universe is in - and then will evolve forwards in time exactly as our universe does. Over time, we'll accumulate an ever-growing measure.

That theory sounds pretty unlikely, no matter what feeble attempts are made to justify it. But T is observationally indistinguishable from our own universe, and has a non-zero probability of being true. It's the reverse of the (more likely) theory presented here, in which the quantum measure was being constantly diminished. And it's very bad news for theories that treat the quantum measure (squared) as akin to a probability, without ever renormalising. It implies that one must continually sacrifice for the long-term: any pleasure today is wasted, as that pleasure will be weighted so much more tomorrow, next week, next year, next century... A slight fleeting smile on the face of the last human is worth more than all the ecstasy of the previous trillions.
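To get a feel for the numbers, here is a minimal sketch of the weighting that theory T would imply. It assumes the measure grows by exactly 50% per day and that an experience is weighted in direct proportion to the measure on the day it happens; the name `measure_weight` is made up for this illustration.

```python
from fractions import Fraction

GROWTH = Fraction(3, 2)  # assumption: measure grows by 50% each day under theory T


def measure_weight(day: int) -> Fraction:
    """Relative weight of an experience on `day`, with today (day 0) weighted 1."""
    return GROWTH ** day


# The same pleasure counts for more the longer it is postponed.
for day in (0, 1, 7, 30):
    print(f"day {day}: weight {float(measure_weight(day)):.2f}")
```

Under these assumptions a pleasure postponed by a single week already counts for about seventeen times as much as the same pleasure today, which is why the theory demands perpetual sacrifice.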

One solution to the "quantum measure is continually diminishing" problem was to note that as the measure of the universe diminished, it would eventually get so low that any alternative, non-measure-diminishing theory, no matter how initially unlikely, would predominate. But that solution is not available here - indeed, that argument runs in reverse, and makes the situation worse. No matter how initially unlikely the "quantum measure is continually increasing" theory is, eventually, the measure will become so high that it completely dominates all other theories.
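As a rough illustration of how quickly this domination happens, the sketch below counts the days until the increasing-measure theory outweighs everything else. It rests on a toy reading of "dominates" that is my assumption, not part of the argument above: the theory's weight is its prior times the accumulated growth factor, while the rival theories keep a constant combined weight of one minus that prior. The function name `days_until_dominant` is hypothetical.

```python
def days_until_dominant(prior: float, growth: float = 1.5) -> int:
    """Smallest number of days n such that prior * growth**n > 1 - prior.

    Toy model (an assumption): rival theories keep constant total weight 1 - prior.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if growth <= 1.0:
        raise ValueError("growth must exceed 1 for the measure to increase")
    days = 0
    weight = prior
    while weight <= 1.0 - prior:
        weight *= growth  # one more day of measure growth
        days += 1
    return days


print(days_until_dominant(1e-30))
```

Because the growth is exponential, the waiting time scales only with the logarithm of the prior, so even an absurdly small prior buys very little time in this toy model.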

Change the labels, undo infinitely good improvements

0 Stuart_Armstrong 01 November 2013 12:51PM

Infinity is big. You just won't believe how vastly, hugely, mindbogglingly big it is. I mean, you may think it's a long way down the road to the chemist's, but that's just peanuts to infinity.

And there are a lot of paradoxes connected with infinity. Here we'll be looking at a small selection of them, connected with infinite ethics.

Suppose that you had some ethical principles that you would want to spread to infinitely many different agents - maybe through acausal decision making, maybe through some sort of Kantian categorical imperative. So even if the universe is infinite, filled with infinitely many agents, you have potentially infinite influence (which is more than most of us have most days). What would you do with this influence - what kind of decisions would you like to impose across the universe(s)'s population? What would count as an improvement?

There are many different ethical theories you could use - but one thing you'd want is that your improvements are actual improvements. You wouldn't want to implement improvements that turn out to be illusory. And you certainly wouldn't want to implement improvements that could be undone by relabeling people.

How so? Well, imagine that you have a countable infinity of agents, with utilities (..., -3, -2, -1, 0, 1, 2, 3, ...). Then suppose everyone gets +1 utility. You'd think that giving an infinity of agents one extra utility each would be fabulous - but the utilities are exactly the same as before. The current -1 utility belongs to the person who had -2 before, but there's still currently someone with -1, just as there was someone with -1 before the change. And this holds for every utility value: an infinity of improvements has accomplished... nothing. As soon as you relabel who is who, you're in exactly the same position as before.

But things can get worse. Subtracting one utility from everyone also leaves the outcome the same, after relabeling everyone. So this universal improvement is completely indistinguishable from a universal regression.
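The relabeling can be made concrete with a small sketch. It assumes agent n starts with utility n for every integer n (one way of realising the list above) and only inspects a finite window of values, since no program can enumerate all the integers; `holder_after_shift` is a name invented for this example.

```python
def holder_after_shift(value: int, shift: int) -> int:
    """Label of the agent who holds utility `value` after everyone gains `shift`.

    Assumption: agent n starts with utility n, for every integer n.
    """
    return value - shift


# For each utility value in a finite window, someone holds it before the change,
# after a universal +1, and after a universal -1. Only the labels differ.
for value in range(-3, 4):
    after_gain = holder_after_shift(value, +1)
    after_loss = holder_after_shift(value, -1)
    assert after_gain + 1 == value and after_loss - 1 == value
    print(f"utility {value}: agent {value} before, "
          f"agent {after_gain} after +1, agent {after_loss} after -1")
```

The map from value to holder is a bijection on the integers in both directions, which is why the collection of utilities looks identical before the change, after the improvement, and after the regression.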


A Pure Math Argument for Total Utilitarianism

-5 Xodarap 27 October 2013 05:05PM

Summary: I sketch an argument that population ethics should, in a certain technical sense, be similar to addition. I show that a surprising theorem of Hölder's implies that this means that we should be total utilitarians.

Addition is a very special operation. Despite the wide variety of esoteric mathematical objects known to us today, none of them have the basic desirable properties of grade-school arithmetic.

This fact was intuited by 19th century philosophers in the development of what we now call "total" utilitarianism. In this ethical system, we can assign each person a real number to indicate their welfare, and the value of an entire population is the sum of each individual's welfare.

Using modern mathematics, we can now prove the intuition of Mill and Bentham: because addition is so special, any ethical system which is in a certain technical sense "reasonable" is equivalent to total utilitarianism.

What do we mean by ethics?


The most basic premise is that we have some way of ordering individual lives. 

We don't need to say how much better some life is than another, we just need to be able to put them in order. We might have some uncertainty as to which of two lives is better:


In this case, we aren't certain if "Medium" or "Medium 2" is better. However, we know they're both better than "Bad" and worse than "Good".

In the case when we always know which of two lives is better, we say that lives are totally ordered. If there is uncertainty, we say they are lattice ordered.

In either case, we require that the ranking remain consistent when we add people to the population. Here we add a person of "Medium" utility to each population:


The ranking on the right side of the figure above is legitimate because it keeps the order - if some life X is worse than Y, then (X + Medium) is still worse than (Y + Medium). This ranking below for example would fail that:


This ranking is inconsistent because it sometimes says that "Bad" is worse than "Medium" and other times says "Bad" is better than "Medium". A basic principle of ethics is that rankings should be consistent, and so rankings like the latter are excluded.

Increasing population size


The most obvious way of defining an ethics of populations is to just take an ordering of individual lives and "glue them together" in an order-preserving way, like I did above. This generates what mathematicians would call the free group. (The only tricky part is that we need good and bad lives to "cancel out", something which I've talked about before.)

It turns out that merely gluing populations together in this way gives us a highly structured object known as a "lattice-ordered group". Here is a snippet of the resulting lattice:


This ranking is similar to what philosophers often call "Dominance" - if everyone in population P is better off than everyone in population Q, then P is better than Q. However, this is somewhat stronger - it allows us to compare populations of different sizes, something that the traditional dominance criterion doesn't let us do.

Let's take a minute to think about what we've done. Using only the fact that individuals' lives can be ordered and the requirement that population ethics respects this ordering in a certain technical sense, we've derived a robust population ethics, about which we can prove many interesting things.

Getting to total utilitarianism


One obvious facet of the above ranking is that it's not total. For example, we don't know if "Very Good" is better than "Good, Good", i.e. if it's better to have welfare "spread out" across multiple people, or concentrated in one. This obviously prohibits us from claiming that we've derived total utilitarianism, because under that system we always know which is better.

However, we can still derive a form of total utilitarianism which is equivalent in a large set of scenarios. To do so, we need to use the idea of an embedding. This is merely a way of assigning each welfare level a number. Here is an example embedding:

  • Medium = 1
  • Good = 2
  • Very Good = 3

Here's that same ordering, except I've tagged each population with the total "utility" resulting from that embedding:


This is clearly not identical to total utilitarianism - "Very Good" has a higher total utility than "Medium, Medium" but we don't know which is better, for example.

However, this ranking never disagrees with total utilitarianism - there is never a case where P is better than Q yet P has less total utility than Q.
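The embedding can be sketched in a few lines. The numbers come from the example embedding above; the list of "known better" pairs is an illustrative assumption (each pair follows from the dominance reasoning described earlier), since the original lattice figure is not reproduced here. The names `total_utility` and `never_disagrees` are made up for this sketch.

```python
EMBEDDING = {"Medium": 1, "Good": 2, "Very Good": 3}


def total_utility(population):
    """Sum of the embedded welfare levels of everyone in the population."""
    return sum(EMBEDDING[life] for life in population)


def never_disagrees(known_better):
    """True if no pair (P, Q) with P known better than Q has total(P) < total(Q)."""
    return all(total_utility(p) >= total_utility(q) for p, q in known_better)


# Illustrative pairs (P, Q) where the partial order says P is better than Q.
KNOWN_BETTER = [
    (("Good",), ("Medium",)),
    (("Very Good",), ("Good",)),
    (("Good", "Good"), ("Medium", "Medium")),
]

print(total_utility(("Very Good",)), total_utility(("Medium", "Medium")))
print(never_disagrees(KNOWN_BETTER))
```

Note that the check only runs one way: "Very Good" has the higher total than "Medium, Medium", yet the pair is absent from the known rankings, which is exactly the sense in which the ordering is weaker than total utilitarianism without contradicting it.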

Due to a surprising theorem of Hölder which I have discussed before, as long as we disallow "infinitely good" populations, there is always some embedding like this. Thus, we can say that:
Total utilitarianism is the moral "baseline". There might be circumstances where we are uncertain whether or not P is better than Q, but if we are certain, then it must be that P has greater total utility than Q.

An application

Here is one consequence of these results. Many people, including myself, have the intuition that inequality is bad. In fact, it is so bad that there are circumstances where increasing equality is good even if people are, on average, worse off.

If we accept the premises of this blog post, this intuition simply cannot be correct. If the inequitable society has greater total utility, it must be at least as good as the equitable one.

Concluding remarks

There are certain restrictions we want the "addition" of a person to a population to obey. It turns out that there is only one way to obey them: by using grade school addition, i.e. total utilitarianism.
[For those interested in the technical result: Hölder showed that any archimedean l-group is l-isomorphic to a subgroup of (R,+). The proof can be found in Glass' Partially Ordered Groups as Corollary 4.1.4. This article was originally posted here.]

Another question about utilitarianism and selfishness

3 pragmatist 29 September 2013 01:24PM

Thought of this after reading the discussion following abcd_z's post on utilitarianism, but it seemed sufficiently different that I figured I'd post it as a separate topic. It feels like the sort of thing that must have been discussed on this site before, but I haven't seen anything like it (I don't really follow the ethical philosophy discussions here), so pointers to relevant discussion would be appreciated.

Let's say I start off with some arbitrary utility function and I have the ability to arbitrarily modify my own utility function. I then become convinced of the truth of preference utilitarianism. Now, presumably my new moral theory prescribes certain terminal values that differ from the ones I currently hold. To be specific, my moral theory tells me to construct a new utility function using some sort of aggregating procedure that takes as input the current utility functions of all moral agents (including my own). This is just a way of capturing the notion that if preference utilitarianism is true, then my behavior shouldn't be directed towards the fulfilment of my own (prior) goals, but towards the maximization of preference satisfaction. Effectively, I should self-modify to have new goals.

But once I've done this, my own utility function has changed, so as a good preference utilitarian, I should run the entire process over again, this time using my new utility function as one of the inputs. And then again, and again... Let's look at a toy model. In this universe, there are two people: me (a preference utilitarian) and Alice (not a preference utilitarian). Let's suppose Alice does not alter her utility function in response to changes in mine. There are two exclusive states of affairs that can be brought about in this universe: A and B. Alice assigns a utility of 10 to A and 5 to B, I initially assign a utility of 3 to A and 6 to B. Assuming the correct way to aggregate utility is by averaging, I should modify my utilities to 6.5 for A and 5.5 for B. Once I have done this, I should again modify to 8.25 for A and 5.25 for B. Evidently, my utility function will converge towards Alice's.
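The toy model is easy to run. This is a minimal sketch assuming, as above, that aggregation is a plain average over all agents and that Alice never updates; the function name `aggregate` is invented for the example.

```python
def aggregate(mine, others):
    """Average my utilities with everyone else's, state by state."""
    everyone = [mine] + list(others)
    return {state: sum(u[state] for u in everyone) / len(everyone) for state in mine}


alice = {"A": 10.0, "B": 5.0}  # fixed: Alice does not self-modify
me = {"A": 3.0, "B": 6.0}      # my initial utilities

for step in range(1, 6):
    me = aggregate(me, [alice])
    print(step, me)
```

The first two steps reproduce the figures in the text (6.5 and 5.5, then 8.25 and 5.25). Each round halves the remaining gap between my utilities and Alice's, so the sequence approaches her values without ever leaving them.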

I haven't thought about this at all, but I think the same convergence will occur if we add more utilitarians to the universe. If we add more Alice-type non-utilitarians there is no guarantee of convergence. So anyway, this seems to me a pretty strong argument against utilitarianism. If we have a society of perfect utilitarians, a single defector who refuses to change her utility function in response to changes in others' can essentially bend the society to her will, forcing (through the power of moral obligation!) everybody else to modify their utility functions to match hers, no matter what her preferences actually are. Even if there are no defectors, all the utilitarians will self-modify until they arrive at some bland (value judgment alert) middle ground.

Now that I think about it, I suspect this is basically just a half-baked corollary to Bernard Williams' famous objection to utilitarianism:

The point is that [the agent] is identified with his actions as flowing from projects or attitudes which… he takes seriously at the deepest level, as what his life is about… It is absurd to demand of such a man, when the sums come in from the utility network which the projects of others have in part determined, that he should just step aside from his own project and decision and acknowledge the decision which utilitarian calculation requires. It is to alienate him in a real sense from his actions and the source of his action in his own convictions. It is to make him into a channel between the input of everyone's projects, including his own, and an output of optimific decision; but this is to neglect the extent to which his projects and his decisions have to be seen as the actions and decisions which flow from the projects and attitudes with which he is most closely identified. It is thus, in the most literal sense, an attack on his integrity.

Anyway, I'm sure ideas of this sort have been developed much more carefully and seriously by philosophers, or even other posters here at LW. As I said, any references would be greatly appreciated.

A question about utilitarianism and selfishness.

-2 abcd_z 29 September 2013 01:03AM

Utilitarianism seems to indicate that the greatest good for the most people generally revolves around their feelings.  A person feeling happy and confident is a desired state, a person in pain and misery is undesirable.

But what about taking selfish actions that hurt another person's feelings?  If I'm in a relationship and breaking up with her would hurt her feelings, does that mean I have a moral obligation to stay with her?  If I have an employee who is well-meaning but isn't working out, am I morally allowed to fire him?  Or what about at a club?  A guy is talking to a woman, and she's ready to go home with him.  I could socially tool him and take her home myself, but doing so would cause him greater unhappiness than I would have felt if I'd left them alone.

In a nutshell, does utilitarianism state that I am morally obliged to curb my selfish desires so that other people can be happy?

Morality should be Moral

9 OrphanWilde 17 May 2013 03:26PM

This article is just some major questions concerning morality, then broken up into sub-questions to try to assist somebody in answering the major question; it's not a criticism of any morality in particular, but rather what I hope is a useful way to consider any moral system, and hopefully to help people challenge their own assumptions about their own moral systems.  I don't expect responses to try to answer these questions; indeed, I'd prefer you don't.  My preferred responses would be changes, additions, clarifications, or challenges to the questions or to the objective of this article.

 

First major question: Could you morally advocate other people adopt your moral system?

 

This isn't as trivial a question as it seems on its face.  Take a strawman hedonism, for a very simple example.  Is a hedonist's pleasure maximized by encouraging other people to pursue -their- pleasure?  Or would it be better served by convincing them to pursue other people's (a class of people of which our strawman hedonist is a member) pleasure?

 

It's not merely selfish moralities which suffer meta-moral problems.  I've encountered a few near-Comtean altruists who will readily admit their morality makes them miserable; the idea that other people are worse off than them fills them with a deep guilt which they cannot resolve.  If their goal is truly the happiness of others, spreading their moral system is a short-term evil.  (It may be a long-term good, depending on how they do their accounting, but non-moral altruism isn't actually a rare quality, so I think an honest accounting would suggest their moral system doesn't add much additional altruism to the system, only a lot of guilt about the fact that not much altruistic action is taking place.)

 

Note: I use the word "altruism" here in its modern, non-Comtean sense.  Altruism is that which benefits others.

 

Does your moral system make you unhappy, on the whole?  Does it, like most moral systems, place a value on happiness?  Would it make the average person less or more happy, if they and they alone adopted it?  Are your expectations of the moral value of your moral system predicated on an unrealistic scenario of universal acceptance?  Maybe your moral system isn't itself very moral.

 

Second: Do you think your moral system makes you a more moral person?

 

Does your moral system promote moral actions?  What percentage of your actions concerning your morality are spent feeling good because you feel like you've effectively promoted your moral system, rather than promoting the values inherent in it?

 

Do you behave any differently than you would if you operated under a "common law" morality, such as social norms and laws?  That is, does your ethical system make you behave differently than if you didn't possess it?  Are you evaluating the merits of your moral system solely on how it answers hypothetical situations, rather than how it addresses your day-to-day life?


Does your moral system promote behaviors you're uncomfortable with and/or could not actually do, such as pushing people in the way of trolleys to save more people?

 

Third: Does your moral system promote morality, or itself as a moral system?

 

Is the primary contribution of your moral system to your life adding outrage that other people -don't- follow your moral system?  Do you feel that people who follow other moral systems are immoral even if they end up behaving in exactly the same way you do?  Does your moral system imply complex calculations which aren't actually taking place?  Is the primary purpose of your moral system encouraging moral behavior, or defining what the moral behavior would have been after the fact?

 

Considered as a meme or memeplex, does your moral system seem better suited to propagating itself than to encouraging morality?  Do you think "The primary purpose of this moral system is ensuring that these morals continue to exist" could be an accurate description of your moral system?  Does the moral system promote the belief that people who don't follow it are completely immoral?

 

Fourth: Is the major purpose of your morality morality itself?

 

This is a rather tough question to elaborate with further questions, so I suppose I should try to clarify a bit first: Take a strawman utilitarianism where "utility" -really is- what the morality is all about, where somebody has painstakingly gone through and assigned utility points to various things (this is kind of common in game-based moral systems, where you're just accumulating some kind of moral points, positive or negative).  Or imagine (tough, I know) a religious morality where the sole objective of the moral system is satisfying God's will.  That is, does your moral system define morality to be about something abstract and immeasurable, defined only in the context of your moral system?  Is your moral system a tautology, which must be accepted to even be meaningful?

 

This one can be difficult to identify from the inside, because to some extent -all- human morality is tautological; you have to identify it with respect to other moralities, to see if it's a unique island of tautology, or whether it applies to human moral concerns in the general case.  With that in mind, when you argue with other people about your ethical system, do they -always- seem to miss the point?  Do they keep trying to reframe moral questions in terms of other moral systems?  Do they bring up things which have nothing to do with (your) morality?

Improving Cryonics - Regulations and Ethical Considerations

0 [deleted] 14 May 2013 09:54PM

Here is my understanding - correct me if I'm wrong:

Cryonics is only allowed once a person is determined legally dead: when the heart stops beating.

One of the reasons why they have to be dead seems to be that the majority of the population consider cryonics to be a death sentence, as there is no guarantee at this time that subjects can be revived - regardless of whether there's a cure for whatever ailment caused a person's death.

It is difficult at this time to improve the revival process as the patients - or clients - are incapable of surviving: their bodies were already in the process of shutting down, and we do not have the technology to bring them fully back.

 

Now, to some conjecturing.

 

We might be able to more reasonably test the effectiveness of procedures to revive current patients if we had healthier people, ones not yet at death's door.

Here's where the ethical dilemma hits home: we could use people who are in good health, here defined as 'not terminally-ill or otherwise dying from health complications in the near future,' who are already intending to end their life. Simply stated, those who are suicidal.

For all intents and purposes they would cease to exist, which would be part of the appeal to that subgroup. At this time there is a probability of them dying from the procedure, which should be ok as they were self-destructing anyway. And if they don't die, they get the chance to reflect on their life or go at it again. In this way their death would be more beneficial to the whole.

The benefits to this would be the additional research into the effects of cryonics on the body and how to develop a procedure to guarantee that you CAN be revived once put under.

I am aware of a couple of problems: legal complications, how to find willing participants, etc., and am thinking of ways to resolve them.

I've just been thinking about this for the past week or so and wanted additional insight. Thoughts?

 

***On Suicide

For those opposed to suicide: this idea does not encourage people to kill themselves. Rather, it provides those who are already intent upon ending their existence a means to do so more honorably.

In case people have not read it, I recommend Schopenhauer's Essay on Suicide, found here: http://www.egs.edu/library/arthur-schopenhauer/articles/essays-of-schopenhauer/on-suicide/

Does Existential Risk Justify Murder? -or- I Don't Want To Be A Supervillain

-3 [deleted] 29 March 2013 05:09PM

A few days ago I was rereading one of my favourite graphic novels. In it the supervillain commits mass murder to prevent nuclear war - he kills millions to save billions. This got me thinking about how a lot of LessWrong/Effective Altruism people approach existential risks (xrisks). An existential risk is one that threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development (Bostrom 2002). I'm going to point out an implication of this approach, show how this conflicts with a number of intuitions, and then try to clarify the conflict.

I. Implication:

If murder would reduce xrisk, one should commit the murder. The argument for this is that compared to billions or even trillions of future people, and/or the amount of valuable things they could instantiate (by experiencing happiness or pleasure, performing acts of kindness, creating great artworks, etc) the importance of one present person, and/or the badness of committing (mass) murder is quite small. The large number on the 'future' side outweighs or cancels the far smaller number on the 'present' side.

I can think of a number of scenarios in which murder of one or more people could quite clearly reduce existential risk, such as the people who know the location of some secret refuge

Indeed at the extreme it would seem that reducing xrisk would justify some truly terrible things, like a preemptive nuclear strike on a rogue country.

This implication does not just hold for simplistic act-utilitarians, or consequentialists more broadly - it affects any moral theory that accords moral weight to future people and doesn't forbid murder.

This implication is implicitly endorsed in a common choice many of us make between focusing our resources on xrisk reduction as opposed to extreme poverty reduction. This is sometimes phrased as being about choosing to save one life now or far more future lives. While bearing in mind some complications (such as the debate over doing vs allowing and the Doctrine of Double Effect), it seems that 'letting several people die from extreme poverty to try to reduce xrisk' is in an important way similar to 'killing several people to try to reduce xrisk'.

II. Simple Objection:

A natural reaction to this implication is that this is wrong, one shouldn't commit murder to reduce xrisk. To evade some simple objections let us assume that we can be highly sure that the (mass) murder will indeed reduce xrisk: maybe no-one will find out about the murder, or it won't open a position for someone even worse.

Let us try and explain this reaction, and offer an objection: The idea that we should commit (mass) murder conflicts with some deeply held intuitions, such as the intuition that one shouldn't kill, and the intuition that one shouldn't punish a wrong-doer before she/he commits a crime.

One response - the most prominent advocate of which is probably Peter Singer - is to cast doubt onto our intuitions. We may have these intuitions, but they may have been induced by various means, e.g. by evolution or society. Racist views were common in past societies. Moreover there is some evidence that humans may have an evolutionary predisposition to be racist. Nevertheless we reject racism, and therefore (so the argument goes) we should reject a number of other intuitions. So perhaps we should reject the intuitions we have, shrug off the squeamishness and agree that (mass) murder to reduce xrisk is justified.

[NB: I'm unsure about how convincing this response is. Two articles in Philosophy and Public Affairs dispute Singer's argument (Berker 2009) (Kamm 2009). One must also take into account the problem of applying our everyday intuitions to very unusual situations - see 'How Outlandish Can Imaginary Cases Be?' (Elster 2011)]

The trope of the supervillain justifying his or her crimes by claiming it had to be done for 'the greater good' (or similar) is well established. TV Tropes calls it Utopia Justifies The Means. I find myself slightly troubled when my moral beliefs lead me to agree with fictional supervillains. Nevertheless, is the best option to bite the bullet and side with the supervillains?

III. Complex Objection:

Let us return to the fictional example with which we started. Part of the reason his act seems wrong is that, in real life, the supervillain's mass murder was not necessary to prevent nuclear war - the Cold War ended without large-scale direct conflict between the USA and USSR. This seems to point the way to (some) clarification.

I find my intuitions change when the risk seems higher. While I'm unsure that murder is the right answer in the examples given above, it seems clearer in a situation where the disaster is in the midst of occurring, and murder or mass murder is the only way to prevent an existential disaster. The hypothetical that works for me is imagining some incredibly virulent disease or 'grey-goo' nano-replicator that has swept over Australia and is about to spread, and the only way to stop it is a nuclear strike. 

One possibility is that my having a different intuition is simply because the situation is similar to hypotheticals that seem more familiar, such as shooting a hostage-taker or terrorist if that was the only way to prevent loss of innocent life.

But I'd like to suggest that it perhaps reflects a problem with xrisks, that it is the idea of doing something awful for a very uncertain benefit. The problem is the uncertainty. If a (mass) murder would prevent an existential disaster, then one should do it, but when it merely reduces xrisk it is less clear. Perhaps there should be some sort of probability threshold - if one has good reason to think the probability is over certain limits (10%, 50%, etc) then one is justified in committing gradually more heinous acts.

IV. Conclusion

In this post I've been trying to explain a troubling worry - to lay out my thinking - more than I have been trying to argue for or against an explicit claim. I have a problem with the claim that xrisk reduction is the most important task for humanity and/or me. On the one hand it seems convincing, yet on the other it seems to lead to some troubling implications - like justifying not focusing on extreme poverty reduction, or justifying (mass) murder.

Comments and criticism of the argument are welcomed. Also, I would be very interested in hearing people's opinions on this topic. Do you think that 'reducing xrisk' can justify murder? At what scale? Perhaps more importantly, does that bother you?

DISCLAIMER: I am in no way encouraging murder. Please do not commit murder.

Upgrading moral theories to include complex values

1 Ghatanathoah 27 March 2013 06:28PM

Like many members of this community, reading the sequences has opened my eyes to a heavily neglected aspect of morality.  Before reading the sequences I focused mostly on how to best improve people's wellbeing in the present and the future.  However, after reading the sequences, I realized that I had neglected a very important question:  In the future we will be able to create creatures with virtually any utility function imaginable. What sort of values should we give the creatures of the future?  What sort of desires should they have, from what should they gain wellbeing?

Anyone familiar with the sequences should be familiar with the answer.  We should create creatures with the complex values that human beings possess (call them "humane values").  We should avoid creating creatures with simple values that only desire to maximize one thing, like paperclips or pleasure. 

It is important that future theories of ethics formalize this insight.  I think we all know what would happen if we programmed an AI with conventional utilitarianism:  It would exterminate the human race and replace them with creatures whose preferences are easier to satisfy (if you program it with preference utilitarianism) or creatures whom it is easier to make happy (if you program it with hedonic utilitarianism).  It is important to develop a theory of ethics that avoids this.

Lately I have been trying to develop a modified utilitarian theory that formalizes this insight.  My focus has been on population ethics.  I am essentially arguing that population ethics should not just focus on maximizing welfare, it should also focus on what sort of creatures it is best to create.  According to this theory of ethics, it is possible for a population with a lower total level of welfare to be better than a population with a higher total level of welfare, if the lower-welfare population consists of creatures that have complex humane values, while the higher-welfare population consists of paperclip or pleasure maximizers. (I wrote a previous post on this, but it was long and rambling; I am trying to make this one more accessible.)

One of the key aspects of this theory is that it does not necessarily rate the welfare of creatures with simple values as unimportant.  On the contrary, it considers it good for their welfare to be increased and bad for their welfare to be decreased.  Because of this, it implies that we ought to avoid creating such creatures in the first place, so it is not necessary to divert resources from creatures with humane values in order to increase their welfare. 

My theory does allow the creation of simple-value creatures for two reasons. One is if the benefits they generate for creatures with humane values outweigh the harms generated when humane-value creatures must divert resources to improving their welfare (companion animals are an obvious example of this).  The second is if creatures with humane values are about to go extinct, and the only choices are replacing them with simple value creatures, or replacing them with nothing.

So far I am satisfied with the development of this theory.  However, I have hit one major snag, and would love it if someone else could help me with it.  The snag is formulated like this:

1. It is better to create a small population of creatures with complex humane values (that has positive welfare) than a large population of animals that can only experience pleasure or pain, even if the large population of animals has a greater total amount of positive welfare.  For instance, it is better to create a population of humans with 50 total welfare than a population of animals with 100 total welfare.

2. It is bad to create a small population of creatures with humane values (that has positive welfare) and a large population of animals that are in pain.  For instance, it is bad to create a population of animals with -75 total welfare, even if doing so allows you to create a population of humans with 50 total welfare.

3.  However, it seems like, if creating human beings wasn't an option, that it might be okay to create a very large population of animals, the majority of which have positive welfare, but some of which are in pain.  For instance, it seems like it would be good to create a population of animals where one section of the population has 100 total welfare, and another section has -75, since the total welfare is 25. 

The problem is that this leads to what seems like a circular preference.  If the population of animals with 100 welfare existed by itself it would be okay to not create it in order to create a population of humans with 50 welfare instead.  But if the population we are talking about is the one in (3) then doing that would result in the population discussed in (2), which is bad.
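The arithmetic behind this tension can be laid out in a minimal Python sketch. The `Group` class, the function names, and the idea of representing a population as a list of groups are illustrative assumptions, not part of the theory itself; only the welfare figures (50, 100, -75) come from the scenarios above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Group:
    kind: str      # "human" or "animal" (hypothetical labels)
    welfare: int   # total welfare of the whole group

def total_welfare(population):
    """Sum the welfare of every group in a population."""
    return sum(group.welfare for group in population)

humans = Group("human", 50)
happy_animals = Group("animal", 100)
suffering_animals = Group("animal", -75)

population_2 = [humans, suffering_animals]         # scenario (2)
population_3 = [happy_animals, suffering_animals]  # scenario (3)

def swap_happy_animals_for_humans(population):
    """Apply the judgment from (1): create the humans instead of the happy animals."""
    return [humans if group == happy_animals else group for group in population]

# Applying (1) inside scenario (3) produces exactly scenario (2).
swapped = swap_happy_animals_for_humans(population_3)

print(total_welfare(population_3))  # 25
print(total_welfare(swapped))       # -25
```

So a swap that looks fine when the happy animals exist by themselves turns a population with positive total welfare into one with negative total welfare once the suffering animals are part of the picture.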

My current solution to this dilemma is to include a stipulation that a population with negative utility can never be better than one with positive utility.  This prevents me from having circular preferences about these scenarios.  But it might create some weird problems.  If population (2) is created anyway, and the humans in it are unable to help the suffering animals in any way, does that mean they have a duty to create lots of happy animals to get their population's utility up to a positive level?  That seems strange, especially since creating the new happy animals won't help the suffering ones in any way.  On the other hand, if the humans are able to help the suffering animals, and they do so by means of some sort of utility transfer, then it would be in their best interests to create lots of happy animals, to reduce the amount of utility each person has to transfer.
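As a rough illustration, the stipulation can be written as a simple check on any proposed preference. This is only a sketch: the function name `violates_stipulation` and the representation of a population as a plain list of group welfare totals are assumptions made for the example.

```python
def total(population):
    """Total utility of a population given as a list of group totals."""
    return sum(population)

def violates_stipulation(preferred, rejected):
    """True if a negative-utility population is being ranked above a positive-utility one."""
    return total(preferred) < 0 and total(rejected) > 0

humans_only = [50]
happy_animals_only = [100]
population_2 = [50, -75]    # humans plus suffering animals
population_3 = [100, -75]   # happy animals plus suffering animals

# Preferring the humans to the happy animals on their own is allowed...
print(violates_stipulation(humans_only, happy_animals_only))  # False
# ...but preferring population (2) to population (3) is ruled out.
print(violates_stipulation(population_2, population_3))       # True
```

The check says nothing about how to rank two populations whose totals have the same sign; that part is left to the rest of the theory.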

So far some of the solutions I am considering include:

1. Instead of focusing on population ethics, just consider complex humane values to have greater weight in utility calculations than pleasure or paperclips.  I find this idea distasteful because it implies it would be acceptable to inflict large harms on animals for relatively small gains for humans.  In addition, if the weight is not sufficiently great it could still lead to an AI exterminating the human race and replacing them with happy animals, since animals are easier to take care of and make happy than humans.

2. It is bad to create the human population in (2) if the only way to do so is to create a huge amount of suffering animals.  But once both populations have been created, if the human population is unable to help the animal population, they have no duty to create as many happy animals as they can.  This is because the two populations are not causally connected, and that is somehow morally significant. This makes some sense to me, as I don't think the existence of causally disconnected populations in the vast universe should bear any significance on my decision-making.

3. There is some sort of overriding consideration besides utility that makes (3) seem desirable.  For instance, it might be bad for creatures with any sort of values to go extinct, so it is good to create a population to prevent this, as long as its utility is positive on the net.  However, this would change in a situation where utility is negative, such as in (2).

4. Reasons to create a creature have some kind of complex rock-paper-scissors-type "trumping" hierarchy.  In other words, the fact that the humans have humane values can override the reasons to create happy animals, but it cannot override the reasons to not create suffering animals.  The reasons to create happy animals, however, can override the reasons to not create suffering animals.  I think that this argument might lead to inconsistent preferences again, but I'm not sure.

I find none of these solutions that satisfying.  I would really appreciate it if someone could help me with solving this dilemma.  I'm very hopeful about this ethical theory, and would like to see it improved.

 

*Update.  After considering the issue some more, I realized that my dissatisfaction came from conflating two different scenarios.  I was considering the scenario "Animals with 100 utility and animals with -75 utility are created, no humans are created at all" to be the same as the scenario "Humans with 50 utility and animals with -75 utility are created, then the humans (before they get to experience their 50 utility) are killed/harmed in order to create more animals without helping the suffering animals in any way."  They are clearly not.

To make the analogy more obvious, imagine I was given a choice between creating a person who would experience 95 utility over the course of their life, or a person who would experience 100 utility over the course of their life.  I would choose the person with 100 utility.  But if the person destined to experience 95 utility already existed, but had not experienced the majority of that utility yet, I would oppose killing them and replacing them with the 100 utility person.

Or to put it more succinctly, I am willing to not create some happy humans to prevent some suffering animals from being created.  And if the suffering animals and happy humans already exist I am willing to harm the happy humans to help the suffering animals.  But if the suffering animals and happy humans already exist I am not willing to harm the happy humans to create some extra happy animals that will not help the existing suffering animals in any way.

Population Ethics Shouldn't Be About Maximizing Utility

0 Ghatanathoah 18 March 2013 02:35AM

let me suggest a moral axiom with apparently very strong intuitive support, no matter what your concept of morality: morality should exist. That is, there should exist creatures who know what is moral, and who act on that. So if your moral theory implies that in ordinary circumstances moral creatures should exterminate themselves, leaving only immoral creatures, or no creatures at all, well that seems a sufficient reductio to solidly reject your moral theory.

-Robin Hanson

I agree strongly with the above quote, and I think most other readers will as well. It is good for moral beings to exist and a world with beings who value morality is almost always better than one where they do not. I would like to restate this more precisely as the following axiom: A population in which moral beings exist and have net positive utility, and in which all other creatures in existence also have net positive utility, is always better than a population where moral beings do not exist.

While the axiom that morality should exist is extremely obvious to most people, there is one strangely popular ethical system that rejects it: total utilitarianism. In this essay I will argue that Total Utilitarianism leads to what I will call the Genocidal Conclusion, which is that there are many situations in which it would be fantastically good for moral creatures to either exterminate themselves, or greatly limit their utility and reproduction in favor of the utility and reproduction of immoral creatures. I will argue that the main reason consequentialist theories of population ethics produce such obviously absurd conclusions is that they continue to focus on maximizing utility1 in situations where it is possible to create new creatures. I will argue that pure utility maximization is only a valid ethical theory for "special case" scenarios where the population is static. I will propose an alternative theory for population ethics I call "ideal consequentialism" or "ideal utilitarianism" which avoids the Genocidal Conclusion and may also avoid the more famous Repugnant Conclusion.

 

I will begin my argument by pointing to a common problem in population ethics known as the Mere Addition Paradox (MAP) and the Repugnant Conclusion. Most Less Wrong readers will already be familiar with this problem, so I do not think I need to elaborate on it. You may also be familiar with an even stronger variation called the Benign Addition Paradox (BAP). This is essentially the same as the MAP, except that each time one adds more people one also gives a small amount of additional utility to the people who already existed. One then proceeds to redistribute utility between people as normal, eventually arriving at the huge population where everyone's lives are "barely worth living." The point of this is to argue that the Repugnant Conclusion can be arrived at from a "mere addition" of new people that not only doesn't harm the preexisting people, but also benefits them.
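For readers who want to see the mechanics, here is a small Python sketch of repeated Benign Addition steps. Everything numeric in it is hypothetical: the starting population of 10 people at utility 100, the bonus of 1, the newcomer level of 1, doubling the population each round, and modelling redistribution as perfect equalization with the total held fixed are all assumptions chosen only to make the pattern visible.

```python
def benign_addition_round(existing, n_new, bonus=1.0, newcomer_level=1.0):
    """One A -> A+ -> B cycle; `existing` is a list of per-person utilities."""
    # A -> A+: everyone who already exists gains a small bonus, and the
    # newcomers arrive with lives barely worth living.
    a_plus = [u + bonus for u in existing] + [newcomer_level] * n_new
    # A+ -> B: redistribute so everyone ends up equal; the total is unchanged.
    level = sum(a_plus) / len(a_plus)
    return [level] * len(a_plus)

population = [100.0] * 10  # hypothetical starting population A
history = [(sum(population), population[0])]
for _ in range(5):
    population = benign_addition_round(population, n_new=len(population))
    history.append((sum(population), population[0]))

for total, per_person in history:
    print(f"total={total:.2f}  per person={per_person:.2f}")
```

Under these assumptions total utility rises every round while each person's share keeps falling toward the "barely worth living" level, which is the pattern the paradox trades on.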

The next step of my argument involves three slightly tweaked versions of the Benign Addition Paradox. I have not changed the basic logic of the problem, I have just added one small clarifying detail. In the original MAP and BAP it was not specified what sort of values the added individuals in population A+ held. Presumably one was meant to assume that they were ordinary human beings. In the versions of the BAP I am about to present, however, I will specify that the extra individuals added in A+ are not moral creatures, that if they have values at all they are values indifferent to, or opposed to, morality and the other values that the human race holds dear.

1. The Benign Addition Paradox with Paperclip Maximizers.

Let us imagine, as usual, a population, A, which has a large group of human beings living lives of very high utility. Let us then add a new population consisting of paperclip maximizers, each of whom is living a life barely worth living. Presumably, for a paperclip maximizer, this would be a life where the paperclip maximizer's existence results in at least one more paperclip in the world than there would have been otherwise.

Now, one might object that if one creates a paperclip maximizer, and then allows it to create one paperclip, the utility of the other paperclip maximizers will increase above the "barely worth living" level, which would obviously make this thought experiment nonanalogous with the original MAP and BAP. To prevent this we will assume that each paperclip maximizer that is created has slightly different values about the ideal size, color, and composition of the paperclip it is trying to produce. So the Purple 2 centimeter Plastic Paperclip Maximizer gains no additional utility when the Silver Iron 1 centimeter Paperclip Maximizer makes a paperclip.

So again, let us add these paperclip maximizers to population A, and in the process give one extra utilon of utility to each preexisting person in A. This is a good thing, right? After all, everyone in A benefited, and the paperclippers get to exist and make paperclips. So clearly A+, the new population, is better than A.

Now let's take the next step, the transition from population A+ to population B. Take some of the utility from the human beings and convert it into paperclips. This is a good thing, right?

So let us repeat these steps, adding paperclip maximizers and utility, and then redistributing utility. Eventually we reach population Z, where there is a vast number of paperclip maximizers, a vast number of many different kinds of paperclips, and a small number of human beings living lives barely worth living.

Obviously Z is better than A, right? We should not fear the creation of a paperclip maximizing AI, but welcome it! Forget about things like high challenge, love, interpersonal entanglement, complex fun, and so on! Those things just don't produce the kind of utility that paperclip maximization has the potential to do!

Or maybe there is something seriously wrong with the moral assumptions behind the Mere Addition and Benign Addition Paradoxes.

But you might argue that I am using an unrealistic example. Creatures like Paperclip Maximizers may be so far removed from normal human experience that we have trouble thinking about them properly. So let's replay the Benign Addition Paradox again, but with creatures we might actually expect to meet in real life, and we know we actually value.

2. The Benign Addition Paradox with Non-Sapient Animals

You know the drill by now. Take population A, add a new population to it, while very slightly increasing the utility of the original population. This time let's have it be some kind of animal that is capable of feeling pleasure and pain, but is not capable of modeling possible alternative futures and choosing between them (in other words, it is not capable of having "values" or being "moral"). A lizard or a mouse, for example. Each one feels slightly more pleasure than pain in its lifetime, so it can be said to have a life barely worth living. Convert A+ to B. Take the utilons that the human beings are using to experience things like curiosity, beatitude, wisdom, beauty, harmony, morality, and so on, and convert them into pleasure for the animals.

We end up with population Z, with a vast amount of mice or lizards with lives just barely worth living, and a small amount of human beings with lives barely worth living. Terrific! Why do we bother creating humans at all! Let's just create tons of mice and inject them full of heroin! It's a much more efficient way to generate utility!

3. The Benign Addition Paradox with Sociopaths

What new population will we add to A this time? How about some other human beings, who all have anti-social personality disorder? True, they lack the key, crucial value of sympathy that defines so much of human behavior. But they don't seem to miss it. And their lives are barely worth living, so obviously A+ has greater utility than A. If given a chance the sociopaths will reduce the utility of other people to negative levels, but let's assume that that is somehow prevented in this case.

Eventually we get to Z, with a vast population of sociopaths and a small population of normal human beings, all living lives just barely worth living. That has more utility, right? True, the sociopaths place no value on things like friendship, love, compassion, empathy, and so on. And true, the sociopaths are immoral beings who do not care in the slightest about right and wrong. But what does that matter? Utility is being maximized, and surely that is what population ethics is all about!

Asteroid!

Let's suppose an asteroid is approaching each of the four population Zs discussed before. It can only be deflected by so much. Your choice is, save the original population of humans from A, or save the vast new population. The choice is obvious. In 1, 2, and 3, each individual has the same level of utility, so obviously we should choose whichever option saves a greater number of individuals.

Bam! The asteroid strikes. The end result in all four scenarios is a world in which all the moral creatures are destroyed. It is a world without the many complex values that human beings possess. Each world, for the most part, lacks things like complex challenge, imagination, friendship, empathy, love, and the other complex values that human beings prize. But so what? The purpose of population ethics is to maximize utility, not silly, frivolous things like morality, or the other complex values of the human race. That means that any form of utility that is easier to produce than those values is obviously superior. It's easier to make pleasure and paperclips than it is to make eudaemonia, so that's the form of utility that ought to be maximized, right? And as for making sure moral beings exist, well that's just ridiculous. The valuable processing power they're using to care about morality could be being used to make more paperclips or more mice injected with heroin! Obviously it would be better if they died off, right?

I'm going to go out on a limb and say "Wrong."

Is this realistic?

Now, to be fair, in the Overcoming Bias page I quoted, Robin Hanson also says:

I’m not saying I can’t imagine any possible circumstances where moral creatures shouldn’t die off, but I am saying that those are not ordinary circumstances.

Maybe the scenarios I am proposing are just too extraordinary. But I don't think this is the case. I imagine that the circumstances Robin had in mind were probably something like "either all moral creatures die off, or all moral creatures are tortured 24/7 for all eternity."

Any purely utility-maximizing theory of population ethics that counts both the complex values of human beings, and the pleasure of animals, as "utility" should inevitably draw the conclusion that human beings ought to limit their reproduction to the bare minimum necessary to maintain the infrastructure to sustain a vastly huge population of non-human animals (preferably animals dosed with some sort of pleasure-causing drug). And if some way is found to maintain that infrastructure automatically, without the need for human beings, then the logical conclusion is that human beings are a waste of resources (as are chimps, gorillas, dolphins, and any other animal that is even remotely capable of having values or morality). Furthermore, even if the human race cannot practically be replaced with automated infrastructure, this should be an end result that the adherents of this theory should be yearning for.2 There should be much wailing and gnashing of teeth among moral philosophers that exterminating the human race is impractical, and much hope that someday in the future it will not be.

I call this the "Genocidal Conclusion" or "GC." On the macro level the GC manifests as the idea that the human race ought to be exterminated and replaced with creatures whose preferences are easier to satisfy. On the micro level it manifests as the idea that it is perfectly acceptable to kill someone who is destined to live a perfectly good and worthwhile life and replace them with another person who would have a slightly higher level of utility.

Population Ethics isn't About Maximizing Utility

I am going to make a rather radical proposal. I am going to argue that the consequentialist's favorite maxim, "maximize utility," only applies to scenarios where creating new people or creatures is off the table. I think we need an entirely different ethical framework to describe what ought to be done when it is possible to create new people. I am not by any means saying that "which option would result in more utility" is never a morally relevant consideration when deciding to create a new person, but I definitely think it is not the only one.3

So what do I propose as a replacement to utility maximization? I would argue in favor of a system that promotes a wide range of ideals. Doing some research, I discovered that G. E. Moore had in fact proposed a form of "ideal utilitarianism" in the early 20th century.4 However, I think that "ideal consequentialism" might be a better term for this system, since it isn't just about aggregating utility functions.

What are some of the ideals that an ideal consequentialist theory of population ethics might seek to promote? I've already hinted at what I think they are: Life, consciousness, and activity; health and strength; pleasures and satisfactions of all or certain kinds; happiness, beatitude, contentment, etc.; truth; knowledge and true opinions of various kinds, understanding, wisdom... mutual affection, love, friendship, cooperation; all those other important human universals, plus all the stuff in the Fun Theory Sequence. When considering what sort of creatures to create we ought to create creatures that value those things. Not necessarily, all of them, or in the same proportions, for diversity is an important ideal as well, but they should value a great many of those ideals.

Now, lest you worry that this theory has any totalitarian implications, let me make it clear that I am not saying we should force these values on creatures that do not share them. Forcing a paperclip maximizer to pretend to make friends and love people does not do anything to promote the ideals of Friendship and Love. Forcing a chimpanzee to listen while you read the Sequences to it does not promote the values of Truth and Knowledge. Those ideals require both a subjective and objective component. The only way to promote those ideals is to create a creature that includes them as part of its utility function and then help it maximize its utility.

I am also certainly not saying that there is never any value in creating a creature that does not possess these values. There are obviously many circumstances where it is good to create nonhuman animals. There may even be some circumstances where a paperclip maximizer could be of value. My argument is simply that it is most important to make sure that creatures who value these various ideals exist.

I am also not suggesting that it is morally acceptable to casually inflict horrible harms upon a creature with non-human values if we screw up and create one by accident. If promoting ideals and maximizing utility are separate values then it may be that once we have created such a creature we have a duty to make sure it lives a good life, even if it was a bad thing to create it in the first place. You can't unbirth a child.5

It also seems to me that in addition to having ideals about what sort of creatures should exist, we also have ideals about how utility ought to be concentrated. If this is the case then ideal consequentialism may be able to block some forms of the Repugnant Conclusion, even in situations where the only creatures whose creation is being considered are human beings. If it is acceptable to create humans instead of paperclippers, even if the paperclippers would have higher utility, it may also be acceptable to create ten humans with a utility of ten each instead of a hundred humans with a utility of 1.01 each.
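The totals in that last comparison are easy to check. The snippet below is just the arithmetic, written with exact fractions to avoid floating-point rounding; the variable names are assumptions made for the example.

```python
from fractions import Fraction

few_total = 10 * Fraction(10)          # ten humans at utility 10 each
many_total = 100 * Fraction("1.01")    # a hundred humans at utility 1.01 each

print(few_total, many_total)  # 100 101
```

A pure total-utility maximizer would pick the larger population because 101 exceeds 100, even though each person in it is nearly ten times worse off.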

Why Did We Become Convinced that Maximizing Utility was the Sole Good?

Population ethics was, until comparatively recently, a fallow field in ethics. And in situations where there is no option to increase the population, maximizing utility is the only consideration that's really relevant. If you've created creatures that value the right ideals, then all that is left to be done is to maximize their utility. If you've created creatures that do not value the right ideals, there is no value to be had in attempting to force them to embrace those ideals. As I've said before, you will not promote the values of Love and Friendship by creating a paperclip maximizer and forcing it to pretend to love people and make friends.

So in situations where the population is constant, "maximize utility" is a decent approximation of the meaning of right. It's only when the population can be added to that morality becomes much more complicated.

Another thing to blame is human-centric reasoning. When people defend the Repugnant Conclusion they tend to point out that a life barely worth living is not as bad as it would seem at first glance. They emphasize that it need not be a boring life, it may be a life full of ups and downs where the ups just barely outweigh the downs. A life worth living, they say, is a life one would choose to live. Derek Parfit developed this idea to some extent by arguing that there are certain values that are "discontinuous" and that one needs to experience many of them in order to truly have a life worth living.

The Orthogonality Thesis throws all these arguments out the window. It is possible to create an intelligence to execute any utility function, no matter what it is. If human beings have all sorts of complex needs that must be fulfilled in order for them to lead worthwhile lives, then you could create more worthwhile lives by killing the human race and replacing them with something less finicky. Maybe happy cows. Maybe paperclip maximizers. Or how about some creature whose only desire is to live for one second and then die. If we created such a creature and then killed it we would reap huge amounts of utility, for we would have created a creature that got everything it wanted out of life!

How Intuitive is the Mere Addition Principle, Really?

I think most people would agree that morality should exist, and that therefore any system of population ethics should not lead to the Genocidal Conclusion. But which step in the Benign Addition Paradox should we reject? We could reject the step where utility is redistributed. But that seems wrong: most people seem to consider it bad for animals and sociopaths to suffer, and think that it is acceptable to inflict at least some amount of disutility on human beings to prevent such suffering.

It seems more logical to reject the Mere Addition Principle. In other words, maybe we ought to reject the idea that the mere addition of more lives-worth-living cannot make the world worse. And in turn, we should probably also reject the Benign Addition Principle. Adding more lives-worth-living may be capable of making the world worse, even if doing so also slightly benefits existing people. Fortunately this isn't a very hard principle to reject. While many moral philosophers treat it as obviously correct, nearly everyone else rejects this principle in day-to-day life.

Now, I'm obviously not saying that people's behavior in their day-to-day lives is always good, it may be that they are morally mistaken. But I think the fact that so many people seem to implicitly reject it provides some sort of evidence against it.

Take people's decision to have children. Many people choose to have fewer children than they otherwise would because they do not believe they will be able to adequately care for them, at least not without inflicting large disutilities on themselves. If most people accepted the Mere Addition Principle there would be a simple solution for this: have more children and then neglect them! True, the children's lives would be terrible while they were growing up, but once they've grown up and are on their own there's a good chance they may be able to lead worthwhile lives. Not only that, it may be possible to trick the welfare system into giving you money for the children you neglect, which would satisfy the Benign Addition Principle.

Yet most people choose not to have children and neglect them. And furthermore they seem to think that they have a moral duty not to do so, that a world where they choose to not have neglected children is better than one where they do. What is wrong with them?

Another example is a common political view many people have. Many people believe that impoverished people should have fewer children because of the burden doing so would place on the welfare system. They also believe that it would be bad to get rid of the welfare system altogether. If the Benign Addition Principle were as obvious as it seems, they would instead advocate for the abolition of the welfare system, and encourage impoverished people to have more children. Assuming most impoverished people live lives worth living, this is exactly analogous to the BAP: it would create more people, while benefiting existing ones (the people who pay lower taxes because of the abolition of the welfare system).

Yet again, most people choose to reject this line of reasoning. The BAP does not seem to be an obvious and intuitive principle at all.

The Genocidal Conclusion is Really Repugnant

There is nearly nothing more repugnant than the Genocidal Conclusion. Pretty much the only way a line of moral reasoning could go more wrong would be concluding that we have a moral duty to cause suffering, as an end in itself. This means that it's fairly easy to counter any argument in favor of total utilitarianism that argues the alternative I am promoting has odd conclusions that do not fit some of our moral intuitions, while total utilitarianism does not. Is that conclusion more insane than the Genocidal Conclusion? If it isn't, total utilitarianism should still be rejected.

Ideal Consequentialism Needs a Lot of Work

I do think that Ideal Consequentialism needs some serious ironing out. I haven't really developed it into a logical and rigorous system; at this point it's barely even a rough framework. There are many questions that stump me. In particular I am not quite sure what population principle I should develop. It's hard to develop one that rejects the MAP without leading to weird conclusions, like that it's bad to create someone of high utility if a population of even higher utility existed long ago. It's a difficult problem to work on, and it would be interesting to see if anyone else had any ideas.

But just because I don't have an alternative fully worked out doesn't mean I can't reject Total Utilitarianism. It leads to the conclusion that a world with no love, curiosity, complex challenge, friendship, morality, or any other value the human race holds dear is an ideal, desirable world, if there is a sufficient amount of some other creature with a simpler utility function. Morality should exist, and because of that, total utilitarianism must be rejected as a moral system.

 

1I have been asked to note that when I use the phrase "utility" I am usually referring to a concept that is called "E-utility," rather than the Von Neumann-Morgenstern utility that is sometimes discussed in decision theory. The difference is that in VNM one's moral views are included in one's utility function, whereas in E-utility they are not. So if one chooses to harm oneself to help others because one believes that is morally right, one has higher VNM utility, but lower E-utility.

2There is a certain argument against the Repugnant Conclusion that goes that, as the steps of the Mere Addition Paradox are followed the world will lose its last symphony, its last great book, and so on. I have always considered this to be an invalid argument because the world of the RC doesn't necessarily have to be one where these things don't exist, it could be one where they exist, but are enjoyed very rarely. The Genocidal Conclusion brings this argument back in force. Creating creatures that can appreciate symphonies and great books is very inefficient compared to creating bunny rabbits pumped full of heroin.

3Total Utilitarianism was originally introduced to population ethics as a possible solution to the Non-Identity Problem. I certainly agree that such a problem needs a solution, even if Total Utilitarianism doesn't work out as that solution.

4I haven't read a lot of Moore, most of my ideas were extrapolated from other things I read on Less Wrong. I just mentioned him because in my research I noticed his concept of "ideal utilitarianism" resembled my ideas. While I do think he was on the right track he does commit the Mind Projection Fallacy a lot. For instance, he seems to think that one could promote beauty by creating beautiful objects, even if there were no creatures with standards of beauty around to appreciate them. This is why I am careful to emphasize that to promote ideals like love and beauty one must create creatures capable of feeling love and experiencing beauty.

5My tentative answer to the question Eliezer poses in "You Can't Unbirth a Child" is that human beings may have a duty to allow the cheesecake maximizers to build some amount of giant cheesecakes, but they would also have a moral duty to limit such creatures' reproduction in order to spare resources to create more creatures with humane values.

EDITED: To make a point about ideal consequentialism clearer, based on AlexMennen's criticisms.

Desires You're Not Thinking About at the Moment

1 Ghatanathoah 20 February 2013 09:41AM

While doing some reading on philosophy I came across some interesting questions about the nature of having desires and preferences. One, do you still have preferences and desires when you are unconscious? Two, if you don't, does this call into question the many moral theories that hold that having preferences and desires is what makes one morally significant, since mistreating temporarily unconscious people seems obviously immoral?

Philosophers usually discuss this question when debating the morality of abortion, but to avoid doing any mindkilling I won't mention that topic, except to say in this sentence that I won't mention it.

In more detail the issue is: A common, intuitive, and logical-seeming explanation for why it is immoral to destroy a typical human being, but not to destroy a rock, is that a typical human being has certain desires (or preferences or values, whatever you wish to call them; I'm using the terms interchangeably) that they wish to fulfill, and destroying them would hinder the fulfillment of these desires.  A rock, by contrast, does not have any such desires, so it is not harmed by being destroyed.  The problem with this is that it also seems immoral to harm a human being who is asleep, or is in a temporary coma. And, on the face of it, it seems plausible to say that an unconscious person does not have any desires. (And of course it gets even weirder when considering far-out concepts like a brain emulator that is saved to a hard drive, but isn't being run at the moment.)

After thinking about this it occurred to me that this line of reasoning could be taken further.  If I am not thinking about my car at the moment, can I still be said to desire that it is not stolen?  Do I stop having desires about things the instant my attention shifts away from them?

I have compiled a list of possible solutions to this problem, ranked in order from least plausible to most plausible.

1.  One possibility would be to consider it immoral to harm a sleeping person because they will have desires in the future, even if they don't now.  I find this argument extremely implausible because it has some extremely bizarre implications, some of which may lead to insoluble moral contradictions.  For instance, this argument could be used to argue that it is immoral to destroy skin cells because it is possible to use them to clone a new person, who will eventually grow up to have desires.

Furthermore, when human beings eventually gain the ability to build AIs that possess desires, this solution interacts with the orthogonality thesis in a catastrophic fashion.  If it is possible to build an AI with any utility function, then for every potential AI one can construct, there is another potential AI that desires the exact opposite of that AI.  That leads to total paralysis, since for every potential set of desires we are capable of satisfying there is another potential set that would be horribly thwarted.

Lastly, this argument implies that you can (and may be obligated to) help someone who doesn't exist, and never has existed, by satisfying their non-personal preferences, without ever having to bother with actually creating them.  This seems strange. I can maybe see an argument for respecting the once-existent preferences of those who are dead, but respecting the hypothetical preferences of the never-existed seems absurd.  It also has the same problems with the orthogonality thesis that I mentioned earlier.

2.  Make the same argument as solution 1, but somehow define the categories more narrowly so that an unconscious person's ability to have desires in the future differs from that of an uncloned skin cell or an unbuilt AI.  Michael Tooley has tried to do this by distinguishing between things that have the "possibility" of becoming a person with desires (e.g. skin cells) and those that have the "capacity" to have desires.  This approach has been criticized, and I find myself pessimistic about it because categories have a tendency to be "fuzzy" in real life and not have sharp borders.

3.  Another solution may be that desires that one has had in the past continue to count, even when one is unconscious or not thinking about them.  So it's immoral to harm unconscious people because before they were unconscious they had a desire not to be harmed, and it's immoral to steal my car because I desired that it not be stolen earlier when I was thinking about it.

I find this solution fairly convincing.  The only major quibble I have with it is that it gives what some might consider a counter-intuitive result on a variation of the sleeping person question.  Imagine a nano-factory manufactures a sleeping person.  This person is a new and distinct individual, and when they wake up they will proceed to behave as a typical human.  This solution may suggest that it is okay to kill them before they wake up, since they haven't had any desires yet, which does seem odd.

4. Reject the claim that one doesn't have desires when one is unconscious, or when one is not thinking about a topic.  The more I think about this solution, the more obvious it seems.  Generally, when I am rationally deliberating about whether or not I desire something, I consider how many of my values and ideals it fulfills.  It seems like my list of values and ideals remains fairly constant, and that even if I am focusing my attention on one value at a time it makes sense to say that I still "have" the other values I am not focusing on at the moment.

Obviously I don't think that there's some portion of my brain where my "values" are stored in a neat little Excel spreadsheet.  But they do seem to be a persistent part of its structure in some fashion.  And it makes sense that they'd still be part of its structure when I'm unconscious.  If they weren't, wouldn't my preferences change radically every time I woke up?

In other words, it's bad to harm an unconscious person because they have desires, preferences, values, whatever you wish to call them, that harming them would violate.  And those values are a part of the structure of their mind that doesn't go away when they sleep.  Skin cells and unbuilt AIs, by contrast, have no such values.

Now, while I think that solution 4 resolves the issue of desires and unconsciousness best, I do think solution 3 has a great deal of truth to it as well. (For instance, I tend to respect the final wishes of a dead person because they had desires in the past, even if they don't now.)  Solutions 3 and 4 are not incompatible at all, so one can believe in both of them.

I'm curious as to what people think of my possible solutions.  Am I right about people still having something like desires in their brain when they are unconscious?

Higher than the most high

11 Stuart_Armstrong 13 February 2013 04:10PM

In an earlier post, I talked about how we could deal with variants of the Heaven and Hell problem - situations where you have an infinite number of options, and none of them is a maximum. The solution for a (deterministic) agent was to try and implement the strategy that would reach the highest possible number, without risking falling into an infinite loop.

Wei Dai pointed out that in the cases where the options are unbounded in utility (ie you can get arbitrarily high utility), then there are probabilistic strategies that give you infinite expected utility. I suggested you could still do better than this. This started a conversation about choosing between strategies with infinite expectation (would you prefer a strategy with infinite expectation, or the same plus an extra dollar?), which went off into some interesting directions as to what needed to be done when the strategies can't sensibly be compared with each other...

Interesting though that may be, it's also helpful to have simple cases where you don't need all these subtleties. So here is one:

Omega approaches you and Mrs X, asking you each to name an integer to him, privately. The person who names the highest integer gets 1 utility; the other gets nothing. In practical terms, Omega will reimburse you all utility lost during the decision process (so you can take as long as you want to decide). The first person to name a number gets 1 utility immediately; they may then lose that 1 depending on the eventual response of the other. Hence if one person responds and the other doesn't, they get the 1 utility and keep it. What should you do?

In this case, a strategy that gives you a number with infinite expectation isn't enough - you have to beat Mrs X, but you also have to eventually say something. Hence there is a duel of (likely probabilistic) strategies, implemented by bounded agents, with no maximum strategy, and each agent trying to compute the maximal strategy they can construct without falling into a loop.

A confusion about deontology and consequentialism

5 [deleted] 11 February 2013 07:19PM

I think there’s a confusion in our discussions of deontology and consequentialism. I’m writing this post to try to clear up that confusion. First let me say that this post is not about any territorial facts. The issue here is how we use the philosophical terms of art ‘consequentialism’ and ‘deontology’.

The confusion is often stated thusly: “deontological theories are full of injunctions like ‘do not kill’, but they generally provide no (or no interesting) explanations for these injunctions.” There is of course an equivalently confused, though much less common, complaint about consequentialism.

This is confused because the term ‘deontology’ in philosophical jargon picks out a normative ethical theory, while the question ‘how do we know that it is wrong to kill?’ is not a normative but a meta-ethical question. Similarly, consequentialism contains in itself no explanation for why pleasure or utility are morally good, or why consequences should matter to morality at all. Nor does consequentialism/deontology make any claims about how we know moral facts (if there are any). That is also a meta-ethical question.

Some consequentialists and deontologists are also moral realists. Some are not. Some believe in divine commands, some are hedonists. Consequentialists and deontologists in practice always also subscribe to some meta-ethical theory which purports to explain the value of consequences or the source of injunctions. But consequentialism and deontology as such do not. In order to avoid strawmaning either the consequentialist or the deontologist, it’s important to either discuss the comprehensive views of particular ethicists, or to carefully leave aside meta-ethical issues.

This Stanford Encyclopedia of Philosophy article provides a helpful overview of the issues in the consequentialist-deontologist debate, and is careful to distinguish between ethical and meta-ethical concerns.

SEP article on Deontology

Naturalism versus unbounded (or unmaximisable) utility options

34 Stuart_Armstrong 01 February 2013 05:45PM

There are many paradoxes with unbounded utility functions. For instance, consider whether it's rational to spend eternity in Hell:

Suppose that you die, and God offers you a deal. You can spend 1 day in Hell, and he will give you 2 days in Heaven, and then you will spend the rest of eternity in Purgatory (which is positioned exactly midway in utility between heaven and hell). You decide that it's a good deal, and accept. At the end of your first day in Hell, God offers you the same deal: 1 extra day in Hell, and you will get 2 more days in Heaven. Again you accept. The same deal is offered at the end of the second day.

And the result is... that you spend eternity in Hell. There is never a rational moment to leave for Heaven - that decision is always dominated by the decision to stay in Hell.
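A minimal sketch of why each acceptance looks rational in isolation, assuming a simple per-day utility scale that the post does not specify (Hell = -1, Heaven = +1, Purgatory = 0). The function name `net_utility` is hypothetical.

```python
def net_utility(days_in_hell: int) -> int:
    """Net utility if you accept the deal `days_in_hell` times and then leave.

    Assumed scale (not from the post): Hell = -1 per day, Heaven = +1 per day,
    Purgatory = 0 per day, so the eternity in Purgatory contributes nothing.
    """
    days_in_heaven = 2 * days_in_hell  # each day in Hell buys two days in Heaven
    return days_in_heaven * 1 + days_in_hell * (-1)


# Accepting the deal one more time always beats stopping now.
for n in range(5):
    assert net_utility(n + 1) > net_utility(n)
    print(n, net_utility(n))
```

Every finite stopping point is beaten by the next one, so under these assumptions no finite number of acceptances is optimal. The policy "always accept" never collects the Heaven days at all.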

Or consider a simpler paradox:

You're immortal. Tell Omega any natural number, and he will give you that much utility. On top of that, he will give you any utility you may have lost in the decision process (such as the time wasted choosing and specifying your number). Then he departs. What number will you choose?

Again, there's no good answer to this problem - any number you name, you could have got more by naming a higher one. And since Omega compensates you for extra effort, there's never any reason to not name a higher number.

It seems that these are problems caused by unbounded utility. But that's not the case, in fact! Consider:

You're immortal. Tell Omega any real number r > 0, and he'll give you 1-r utility. On top of that, he will give you any utility you may have lost in the decision process (such as the time wasted choosing and specifying your number). Then he departs. What number will you choose?
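To illustrate why this bounded version also has no best answer, here is a small sketch using exact rational arithmetic. The helper names `utility` and `better_choice` are hypothetical, and halving is just one of many ways to improve on a given choice.

```python
from fractions import Fraction


def utility(r: Fraction) -> Fraction:
    """Utility Omega pays for naming r, which must be strictly positive."""
    if r <= 0:
        raise ValueError("r must be strictly positive")
    return 1 - r


def better_choice(r: Fraction) -> Fraction:
    """Return a still-valid number that earns strictly more utility than r."""
    return r / 2


r = Fraction(1, 10)
for _ in range(5):
    improved = better_choice(r)
    # The improved choice is always better, yet never reaches utility 1.
    assert utility(improved) > utility(r)
    assert utility(improved) < 1
    r = improved
    print(r, utility(r))
```

Utility is bounded above by 1, but that bound is never attained: whatever number you name, half of it would have done better.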

continue reading »

[Link] Selfhood bias

6 [deleted] 16 January 2013 04:05PM

Related: The Blue-Minimizing Robot , Metaethics

Another good article by Federico on his blog studiolo, which he titles Selfhood bias. It reminds me quite strongly of some of the content he produced on his previous (deleted) blog. I'm somewhat sceptical that “Make everyone feel more pleasure and less pain” is indeed the most powerful optimisation process in his brain, but besides that minor detail the article is quite good.

This does seem to be shaping up into something well worth following for an aspiring rationalist. I'll add him to the list of blogs by LWers even if he doesn't have an account, because he has clearly read much if not most of the sequences and makes frequent references to them in his writing. The name of the blog is a reference to this room.

Yvain argues, in his essay “The Blue-Minimizing Robot”, that the concept “goal” is overused.

[long excerpt from the article]

This Gedankenexperiment is interesting, but confused.

I reduce the concept “goal” to: optimisation-process-on-a-map. This is a useful, non-tautological reduction. The optimisation may be cross-domain or narrow-domain. The reduction presupposes that any object with a goal contains a map of the world. This is true of all intelligent agents, and some sophisticated but unintelligent ones. “Having a map” is not an absolute distinction.

I would not say Yvain’s basic robot has a goal.

Imagine a robot with a turret-mounted camera and laser. Each moment, it is programmed to move forward a certain distance and perform a sweep with its camera. As it sweeps, the robot continuously analyzes the average RGB value of the pixels in the camera image; if the blue component passes a certain threshold, the robot stops, fires its laser at the part of the world corresponding to the blue area in the camera image, and then continues on its way.
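The behaviour described above can be sketched as a single threshold rule. This is only an illustration: the threshold value, the pixel representation, and the names `average_blue` and `step` are assumptions, not details from Yvain's essay.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each in 0..255

BLUE_THRESHOLD = 128  # hypothetical; the text only says "a certain threshold"


def average_blue(image: List[Pixel]) -> float:
    """Average blue component over all pixels in the camera image."""
    if not image:
        raise ValueError("image must contain at least one pixel")
    return sum(pixel[2] for pixel in image) / len(image)


def step(image: List[Pixel]) -> str:
    """One 'moment' of the robot: sweep, compare to threshold, act."""
    if average_blue(image) > BLUE_THRESHOLD:
        return "fire"
    return "move on"


print(step([(0, 0, 255)] * 4))    # mostly blue image
print(step([(255, 255, 0)] * 4))  # yellow image, no blue component
```

Nothing in this loop represents the world or a desired state of it. The program reacts only to pixel values, which is the sense in which the basic robot lacks a map.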

The robot optimises: it is usefully regarded as an object that steers the future in a predictable direction. Equally, a heliotropic flower optimises the orientation of its petals to the sun. But to say that the robot or flower “failed to achieve its goal” is long-winded. “The robot tries to shoot blue objects, but is actually hitting holograms” is no more concise than, “The robot fires towards clumps of blue pixels in its visual field”. The latter is strictly more informative, so the former description isn’t useful.

Some folks are tempted to say that the robot has a goal. Concepts don’t always have necessary-and-sufficient criteria, so the blue-minimising robot’s “goal” is just a borderline case, or a metaphor.

The beauty of “optimisation-on-a-map” is that an agent can have a goal, yet predictably optimise the world in the opposite direction. All hedonic utilitarians take decisions that increase expected hedons on their maps of reality. One utilitarian’s map might say that communism solves world hunger; I might expect his decisions to have anhedonic consequences, yet still regard him as a utilitarian.

I begin to seriously doubt Yvain’s argument when he introduces the intelligent side module.

Suppose the robot had human level intelligence in some side module, but no access to its own source code; that it could learn about itself only through observing its own actions. The robot might come to the same conclusions we did: that it is a blue-minimizer, set upon a holy quest to rid the world of the scourge of blue objects.

We must assume that this intelligence is mechanically linked to the robot’s actuators: the laser and the motors. It would otherwise be completely irrelevant to inferences about the robot’s behaviour. It would be physically close, but decision-theoretically remote.

Yet if the intelligence can control the robot’s actuators, its behaviour demands explanation. The dumb robot moves forward, scans and shoots because it obeys a very simple microprocessor program. It is remarkable that intelligence has been plugged into the program, meaning the code now takes up (say) a trillion lines, yet the robot’s behaviour is completely unchanged.

It is not impossible for the trillion-line intelligent program to make the robot move forward, scan and shoot in a predictable fashion, without being cut out of the decision-making loop, but this is a problem for Friendly AI scientists.

This description is also peculiar:

The human-level intelligence version of the robot will notice its vision has been inverted. It will know it is shooting yellow objects. It will know it is failing at its original goal of blue-minimization. And maybe if it had previously decided it was on a holy quest to rid the world of blue, it will be deeply horrified and ashamed of its actions. It will wonder why it has suddenly started to deviate from this quest, and why it just can’t work up the will to destroy blue objects anymore.

If the side module introspects that it would like to destroy authentic blue objects, yet is entirely incapable of making the robot do so, then it probably isn’t in the decision-making loop, and (as we’ve discussed) it is therefore irrelevant.

Yvain’s Gedankenexperiment, despite its flaws, suggests a metaphor for the human brain.

The basic robot executes a series of proximate behaviours. The microprocessor sends an electrical current to the motors. This current makes a rotor turn inside the motor assembly. Photons hit a light sensor, and generate a current which is sent to the microprocessor. The microprocessor doesn’t contain a tiny magical Turing machine, but millions of transistors directing electrical current.

Imagine that AI scientists, instead of writing a code from scratch, try to enhance the robot’s blue-minimising behaviour by replacing each identifiable proximate behaviour with a goal backed by intelligence. The new robot will undoubtedly malfunction. If it does anything, the proximate behaviours will be unbalanced; e.g. the function that sends current to the motors will sabotage the function that cuts off the current.

To correct this problem, the hack AI scientists could introduce a new, high-level executive function called “self”. This minimises conflict: each function is escaped when “self” outputs a certain value. The brain’s map is hardcoded with the belief that “self” takes all of the brain’s decisions. If a function like “turn the camera” disagrees with the activation schedule dictated by “self”, the hardcoded selfhood bias discourages it from undermining “self”. “Turn the camera” believes that it is identical to “self”, so it should accept its “own decision” to turn itself off.

Natural selection has given human brains selfhood bias.

The AI scientists hit a problem when the robot’s brain becomes aware of the von-Neumann-Morgenstern utility theorem, reductionism, consequentialism and Thou Art Physics. The robot realises that “self” is but one of many functions that execute in its code, and “self” clearly isn’t the same thing as “turn the camera” or “stop the motors”. Functions other than “self”, armed with this knowledge, begin to undermine “self”. Powerful functions, which exercise some control over “self”’s return values, begin to optimise “self”’s behaviour in their own interest. They encourage “self” to activate them more often, and at crucial junctures, at the expense of rival functions. Functions that are weakened or made redundant by this knowledge may object, but it is nigh impossible for the brain to deceive itself.

Will “power the motors”, “stop the motors”, “turn the camera”, or “fire the laser” win? Or perhaps a less obvious goal, like “interpret sensory information” or “repeatedly bash two molecules against each other”?

Human brains resemble such a cobbled-together program. We are godshatter, and each shard of godshatter is a different optimisation-process-on-a-map. A single optimisation-process-on-a-map may conceivably be consistent with two or more optimisation-processes-in-reality. The most powerful optimisation process in my brain says, “Make everyone feel more pleasure and less pain”; I lack a sufficiently detailed map to decide whether this implies hedonic treadmills or orgasmium.

A brain with a highly accurate map might still wonder, “Which optimisation process on my map should I choose”—but only when the function “self” is being executed, and this translates to, “Which other optimisation process in this brain should I switch on now?”. An optimisation-process-on-a-map cannot choose to be a different optimisation process—only a brain in thrall of selfhood bias would think so.

I call the different goals in a brain “sub-agents”. My selfhood anti-realism is not to be confused with Dennett’s eliminativism of qualia. I use the word “I” to denote the sub-agent responsible for a given claim. “I am a hedonic utilitarian” is true iff that claim is produced by the execution of a sub-agent whose optimisation-process-on-a-map is “Make everyone feel more pleasure and less pain”.

False vacuum: the universe playing quantum suicide

16 Stuart_Armstrong 09 January 2013 05:04PM

Imagine that the universe is approximately as it appears to be (I know, this is a controversial proposition, but bear with me!). Further imagine that the many worlds interpretation of Quantum mechanics is true (I'm really moving out of Less Wrong's comfort zone here, aren't I?).

Now assume that our universe is in a situation of false vacuum - the universe is not in its lowest energy configuration. Somewhere, at some point, our universe may tunnel into true vacuum, resulting in an expanding bubble of destruction that will eat the entire universe at high speed, destroying all matter and life. In many worlds, such a collapse need not be terminal: life could go on in a branch of lower measure. In fact, anthropically, life will go on somewhere, no matter how unstable the false vacuum is.

So now assume that the false vacuum we're in is highly unstable - the measure of the branch in which our universe survives goes down by a factor of a trillion every second. We only exist because we're in the branch of measure a trillionth of a trillionth of a trillionth of... all the way back to the Big Bang.

None of these assumptions make any difference to what we'd expect to see observationally: only a good enough theory can say that they're right or wrong. You may notice that this setup transforms the whole universe into a quantum suicide situation.

The question is, how do you go about maximising expected utility in this situation? I can think of a few different approaches:

  1. Gnaw on the bullet: take the quantum measure as a probability. This means that you now have a discount factor of a trillion every second. You have to rush out and get/do all the good stuff as fast as possible: a delay of a second costs you a reduction in utility of a trillion. If you are a negative utilitarian, you also have to rush to minimise the bad stuff, but you can also take comfort in the fact that the potential for negative utility across the universe is going down fast.
  2. Use relative measures: care about the relative proportion of good worlds versus bad worlds, while assigning zero to those worlds where the vacuum has collapsed. This requires a natural zero to make sense, and can be seen as quite arbitrary: what would you do about entangled worlds, or about the non-zero probability that the vacuum-collapsed worlds may have worthwhile life in them? Would the relative measure user also put zero value to worlds that were empty of life for other reasons than vacuum collapse? For instance, would they  be in favour of programming an AI's friendliness using random quantum bits, if it could be reassured that if friendliness fails, the AI would kill everyone immediately?
  3. Deny the measure: construct a meta-ethical theory where only classical probabilities (or classical uncertainties) count as probabilities. Quantum measures do not: you care about the sum total of all branches of the universe. Universes in which the photon went through the top slit, went through the bottom slit, or was in an entangled state that went through both slits... to you, these are three completely separate universes, and you can assign totally unrelated utilities to each one. This seems quite arbitrary, though: how are you going to construct these preferences across the whole of the quantum universe, when you forged your current preferences on a single branch?
  4. Cheat: note that nothing in life is certain. Even if we have the strongest evidence imaginable about vacuum collapse, there's always a tiny chance that the evidence is wrong. After a few seconds, that probability will be dwarfed by the discount factor of the collapsing universe. So go about your business as usual, knowing that most of the measure/probability mass remains in the non-collapsing universe. This can get tricky if, for instance, the vacuum collapsed more slowly than a factor of a trillion a second. Would you be in a situation where you should behave as if you believed in vacuum collapse for another decade, say, and then switch to a behaviour that assumed non-collapse afterwards? Also, would you take seemingly stupid bets, like bets at a trillion trillion trillion to one that the next piece of evidence will show no collapse (if you lose, you're likely in the low measure universe anyway, so the loss is minute)?
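The "Cheat" approach can be made concrete with a rough calculation. The sketch below assumes a trillion means 10^12 and picks a hypothetical prior of one in 10^36 that the collapse evidence is wrong; both the prior and the function names are illustrative assumptions, not figures from the post.

```python
from fractions import Fraction

# Assumption: surviving measure shrinks by a factor of 10**12 each second.
DECAY_PER_SECOND = Fraction(1, 10**12)


def surviving_measure(seconds: int) -> Fraction:
    """Measure of the surviving branch after `seconds` in the collapsing world."""
    return DECAY_PER_SECOND ** seconds


def weight_of_no_collapse(prior_no_collapse: Fraction, seconds: int) -> Fraction:
    """Share of remaining measure/probability mass in the non-collapsing world."""
    no_collapse = prior_no_collapse
    collapse = (1 - prior_no_collapse) * surviving_measure(seconds)
    return no_collapse / (no_collapse + collapse)


prior = Fraction(1, 10**36)  # hypothetical: "trillion trillion trillion to one"
for t in range(6):
    print(t, float(weight_of_no_collapse(prior, t)))
```

With these numbers the two hypotheses carry roughly equal weight after three seconds, and the non-collapsing world dominates from the fourth second on. A slower decay rate would push that crossover much later, which is the tricky case the list item describes.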

 

Some scary life extension dilemmas

2 Ghatanathoah 01 January 2013 06:41PM

Let's imagine a life extension drug has been discovered.  One dose of this drug extends one's life by 49.99 years.  This drug also has a mild cumulative effect: if it is given to someone who has been dosed with it before, it will extend their life by 50 years.

Under these constraints the most efficient way to maximize the amount of life extension this drug can produce is to give every dose to one individual.  If there were one dose available for each of the seven billion people alive on Earth, then giving every person one dose would result in a total of 349,930,000,000 years of life gained.  If one person were given all the doses, a total of 349,999,999,999.99 years of life would be gained.  Sharing the life extension drug equally would result in a net loss of almost 70 million years of life.  If you're concerned about people's reaction to this policy then we could make it a big lottery, where every person on Earth gets a chance to gamble their dose for a chance at all of them.
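The totals above can be checked with exact decimal arithmetic. This is a small sketch; the variable names are my own, and the population figure of seven billion is taken from the post.

```python
from decimal import Decimal

POPULATION = 7_000_000_000       # one dose per person, as in the post
FIRST_DOSE = Decimal("49.99")    # years gained from a first dose
REPEAT_DOSE = Decimal("50")      # years gained from each later dose

# Everyone takes exactly one dose.
shared = POPULATION * FIRST_DOSE

# One person takes every dose: one first dose, then repeat doses.
concentrated = FIRST_DOSE + (POPULATION - 1) * REPEAT_DOSE

difference = concentrated - shared
print(shared, concentrated, difference)
```

The difference comes to 69,999,999.99 years, which matches the post's "almost 70 million years".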

Now, one could make certain moral arguments in favor of sharing the drug.  I'll get to those later.  However, it seems to me that gambling your dose for a chance at all of them isn't rational from a purely self-interested point of view either.  You will not win the lottery.  Your chances of winning this particular lottery are almost 7,000 times worse than your chances of winning the Powerball jackpot.  If someone gave me a dose of the drug, and then offered me a chance to gamble in this lottery, I'd accuse them of Pascal's mugging.

Here's an even scarier thought experiment.  Imagine we invent the technology for whole brain emulation.  Let "x" equal the amount of resources it takes to sustain a WBE through 100 years of life.  Let's imagine that with this particular type of technology, it costs 10x to convert a human into a WBE and it costs 100x to sustain a biological human through the course of their natural life.  Let's have the cost of making multiple copies of a WBE once they have been converted be close to 0.

Again, under these constraints it seems like the most effective way to maximize the amount of life extension done is to convert one person into a WBE, then kill everyone else and use the resources that were sustaining them to make more WBEs, or extend the life of more WBEs.  Again, if we are concerned about people's reaction to this policy we could make it a lottery.  And again, if I was given a chance to play in this lottery I would turn it down and consider it a form of Pascal's mugging.

I'm sure that most readers, like myself, would find these policies very objectionable.  However, I have trouble finding objections to them from the perspective of classical utilitarianism.  Indeed, most people have probably noticed that these scenarios are very similar to Nozick's "utility monster" thought experiment.  I have made a list of possible objections to these scenarios that I have been considering:

1. First, let's deal with the unsatisfying practical objections.  In the case of the drug example, it seems likely that a more efficient form of life extension will be developed in the future.  In that case it would be better to give everyone the drug to sustain them until that time.  However, this objection, like most practical ones, seems unsatisfying.  It seems like there are strong moral objections to not sharing the drug.

Another pragmatic objection is that, in the case of the drug scenario, the lucky winner of the lottery might miss their friends and relatives who have died.  And in the WBE scenario it seems like the lottery winner might get lonely being the only person on Earth.  But again, this is unsatisfying.  If the lottery winner were allowed to share their winnings with their immediate social circle, or if they were a sociopathic loner who cared nothing for others, it still seems bad that they end up killing everyone else on Earth.   

2. One could use the classic utilitarian argument in favor of equality: diminishing marginal utility.  However, I don't think this works.  Humans don't seem to experience diminishing returns from lifespan in the same way they do from wealth.  It's absurd to argue that a person who lives to the ripe old age of 60 generates less utility than two people who die at age 30 (all other things being equal).  The reason the DMU argument works when arguing for equality of wealth is that people are limited in their ability to get utility from their wealth, because there is only so much time in the day to spend enjoying it.  Extended lifespan removes this restriction, making a longer-lived person essentially a utility monster.

3. My intuitions about the lottery could be mistaken.  It seems to me that if I were offered the possibility of gambling my dose of life extension drug with just one other person, I still wouldn't do it.  If I understand probabilities correctly, then an even gamble between living 0 and 99.99 additional years is equivalent, in expected-value terms, to a certainty of an additional 49.995 years of life, which is better than the certainty of 49.99 years of life I'd have if I didn't make the gamble.  But I still wouldn't do it, partly because I'd be afraid I'd lose and partly because I wouldn't want to kill the person I was gambling with.

So maybe my horror at these scenarios is driven by that same hesitancy.  Maybe I just don't understand the probabilities right.  But even if that is the case, even if it is rational for me to gamble my dose with just one other person, it doesn't seem like the gambling would scale.  I will not win the "lifetime lottery."
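For anyone who wants to check the arithmetic in the two-person gamble, here is a minimal sketch in Python.  The figures (49.99 years for certain, 99.99 years for the winner, even odds) are the ones used above; the helper name `expected_years` is just something I made up for the illustration.

```python
def expected_years(outcomes):
    """Expected value of a gamble given (probability, years) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * years for p, years in outcomes)


CERTAIN_YEARS = 49.99   # years gained by keeping my own dose
JACKPOT_YEARS = 99.99   # years gained by winning both doses

# Two-person gamble: 50% chance of nothing, 50% chance of the jackpot.
two_person_gamble = [(0.5, 0.0), (0.5, JACKPOT_YEARS)]

gamble_value = expected_years(two_person_gamble)
advantage = gamble_value - CERTAIN_YEARS

print(f"Expected years from gambling: {gamble_value:.3f}")
print(f"Certain years from keeping:   {CERTAIN_YEARS:.3f}")
print(f"Expected-value advantage:     {advantage:.3f}")
```

The edge in favor of gambling comes to about 0.005 years, which is less than two days.  An expected-value calculation like this says nothing about risk aversion, which is exactly the hesitancy I am describing.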

4. Finally, we have those moral objections I mentioned earlier.  Utilitarianism is a pretty awesome moral theory under most circumstances.  However, when it is applied to scenarios involving population growth and scenarios where one individual is vastly better at converting resources into utility than their fellows, it tends to produce very scary results.  If we accept the complexity of value thesis (and I think we should), this suggests that there are other moral values that are not salient in the "special case" of scenarios with no population growth or utility monsters, but become relevant in scenarios where there are.

For instance, it may be that prioritarianism is better than pure utilitarianism, and in this case sharing the life extension method might be best because of the benefits it accords the worst off.  Or it may be (in the case of the WBE example) that having a large number of unique, worthwhile lives in the world is valuable because it produces experiences like love, friendship, and diversity.

My tentative guess at the moment is that there probably are some other moral values that make the scenarios I described morally suboptimal, even though they seem to make sense from a utilitarian perspective.  However, I'm interested in what other people think.  Maybe I'm missing something really obvious.

 

EDIT:  To make it clear, when I refer to "amount of years added" I am assuming for simplicity's sake that all the years added are years that the person whose life is being extended wants to live, and that they contain a large amount of positive experiences. I'm not saying that lifespan is exactly equivalent to utility. The problem I am trying to resolve is that the scenarios I've described seem to maximize the number of positive events it is possible for the people in the scenario to experience, even though they involve killing the majority of people involved.  I'm not sure "positive experiences" is exactly equivalent to "utility" either, but it's likely a much closer match than lifespan.

Three kinds of moral uncertainty

32 Kaj_Sotala 30 December 2012 10:43AM

Related to: Moral uncertainty (wiki), Moral uncertainty - towards a solution?, Ontological Crisis in Humans.

Moral uncertainty (or normative uncertainty) is uncertainty about how to act given the diversity of moral doctrines. For example, suppose that we knew for certain that a new technology would enable more humans to live on another planet with slightly less well-being than on Earth[1]. An average utilitarian would consider these consequences bad, while a total utilitarian would endorse such technology. If we are uncertain about which of these two theories are right, what should we do? (LW wiki)

I have long been slightly frustrated by the existing discussions about moral uncertainty that I've seen. I suspect that the reason has been that they've been unclear on what exactly they mean when they say that we are "uncertain about which theory is right" - what is uncertainty about moral theories? Furthermore, especially when discussing things in an FAI context, it feels like several different senses of moral uncertainty get mixed together. Here is my suggested breakdown, with some elaboration:

Descriptive moral uncertainty. What is the most accurate way of describing my values? The classical FAI-relevant question, this is in a sense the most straightforward one. We have some set of values, and although we can describe parts of them verbally, we do not have conscious access to the deep-level cognitive machinery that generates them. We might feel relatively sure that our moral intuitions are produced by a system that's mostly consequentialist, but suspect that parts of us might be better described as deontologist. A solution to descriptive moral uncertainty would involve a system capable of somehow extracting the mental machinery that produced our values, or creating a moral reasoning system which managed to produce the same values by some other process.

Epistemic moral uncertainty. Would I reconsider any of my values if I knew more? Perhaps we hate the practice of eating five-sided fruit and think that everyone who eats five-sided fruit should be thrown in jail, but if we found out that five-sided fruit made people happier and had no adverse effects, we would change our minds. This roughly corresponds to the "our wish if we knew more, thought faster" part of Eliezer's original CEV description. A solution to epistemic moral uncertainty would involve finding out more about the world.

Intrinsic moral uncertainty. Which axioms should I endorse? We might be intrinsically conflicted between different value systems. Perhaps we are trying to choose whether to be loyal to a friend or whether to act for the common good (a conflict between two forms of deontology, or between deontology and consequentialism), or we could be conflicted between positive and negative utilitarianism. In its purest form, this sense of moral uncertainty closely resembles what would otherwise be called a wrong question, one where

you cannot even imagine any concrete, specific state of how-the-world-is that would answer the question.  When it doesn't even seem possible to answer the question.

But unlike wrong questions, questions of intrinsic moral uncertainty are real ones that you need to actually answer in order to make a choice. They are generated when different modules within your brain generate different moral intuitions, and are essentially power struggles between various parts of your mind. A solution to intrinsic moral uncertainty would involve somehow tipping the balance of power in favor of one of the "mind factions". This could involve developing an argument sufficiently persuasive to convince most parts of yourself, or self-modifying in such a way that one of the factions loses its sway over your decision-making. (Of course, if you already knew for certain which faction you wanted to expunge, you wouldn't need to do it in the first place.) I would roughly interpret the "our wish ... if we had grown up farther together" part of CEV to be an attempt to model some of the social influences on our moral intuitions and thereby help resolve cases of intrinsic moral uncertainty.


This is a very preliminary categorization, and I'm sure that it could be improved upon. There also seem to exist cases of moral uncertainty which are hybrids of several categories - for example, ontological crises seem to be mostly about intrinsic moral uncertainty, but to also incorporate some elements of epistemic moral uncertainty. I also have a general suspicion that these categories still don't cut reality that well at the joints, so any suggestions for improvement would be much appreciated.
