
Yes, rule-based systems might respond faster, and that is sometimes preferable.

Let me back up. I categorize ethical systems into different levels of meta. "Personal ethics" are the ethical system an individual agent follows. Efficiency, and the agent's limited knowledge, intelligence, and perspective, are big factors.

"Social ethics" are the ethics a society agrees on. AFAIK, all existing ethical theorizing supposes that these are the same, and that an agent's ethics, and its society's ethics, must be the same thing. This makes no sense; casual observation shows this is not the case. People have ethical codes, and they are seldom the same as the ethical codes that society tells them they should have. There are obvious evolutionary reasons for this. Social ethics and personal ethics are often at cross-purposes. Social ethics are inherently dishonest, because the most effective way of maximizing social utility is often to deceive people. We expect, for instance, that telling people there is a distinction between personal ethics and social ethics should be against every social ethics in existence.

(I don't mean that social ethics are necessarily exploiting people. Even if you sincerely want the best outcome for people, and they have personal ethics such that you don't need to deceive them into cooperating, many will be too stupid or in too much of a hurry to get good results if given full knowledge of the values that the designers of the social ethics were trying to optimize. Evolution may be the designer.)

"Meta-ethics" is honest social ethics - trying to figure out what we should maximize, in a way that is not meant for public consumption - you're not going to write your conclusions on stone tablets and give them to the masses, who wouldn't understand them anyway. When Eliezer talks about designing Friendly AI, that's meta-ethics (I hope). And that's what I'm referring to here when I talk about encoding human values into an AI.

Roughly, meta-ethics is "correct and thorough" ethics, where we want to know the truth and get the "right" answer (if there is one) about what to optimize.

Social ethics and personal ethics are likely to be rule-based, and that may be appropriate. Meta-ethics is an abstract thing, carried out, e.g., by philosophers in journal articles; and speed of computation is typically not an issue.

Any rule-based system can be transformed into a utilitarian system, but not vice-versa. Any system that can produce a choice between any two outcomes or actions imposes a complete ordering on all possible outcomes or actions, and is therefore utilitarian.
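As a rough illustration of the first claim, here is a minimal sketch of how a pairwise chooser can be turned into a ranking, and the ranking read as a utility function. The rule list, the outcomes, and the helper names (`RULES`, `BROKEN_RULE`, `prefer`, `utility_from_chooser`) are all hypothetical, and the sketch assumes the chooser's preferences are transitive and never tie; without that assumption no consistent ordering exists.

```python
from functools import cmp_to_key

# Hypothetical rule-based system: rules listed most important first.
RULES = ["do not kill", "do not steal", "do not lie"]

# Hypothetical outcomes, each labelled with the rule it breaks (None = no rule broken).
BROKEN_RULE = {
    "push the man": "do not kill",
    "take the bread": "do not steal",
    "tell a white lie": "do not lie",
    "do nothing": None,
}

def prefer(a, b):
    """Return whichever outcome the rule-based system would pick."""
    def acceptability(outcome):
        rule = BROKEN_RULE[outcome]
        # Breaking a less important rule (or none) is more acceptable.
        return len(RULES) if rule is None else RULES.index(rule)
    return a if acceptability(a) >= acceptability(b) else b

def utility_from_chooser(outcomes, prefer):
    """Rank outcomes using only pairwise choices; the rank serves as the utility.

    Assumes the choices are transitive and strict (no cycles, no ties).
    """
    def compare(a, b):
        if a == b:
            return 0
        return 1 if prefer(a, b) == a else -1

    ranked = sorted(outcomes, key=cmp_to_key(compare))  # worst to best
    return {outcome: rank for rank, outcome in enumerate(ranked)}

utility = utility_from_chooser(list(BROKEN_RULE), prefer)
best = max(utility, key=utility.get)
```

The resulting numbers are only ordinal: they reproduce the chooser's ranking, but say nothing about how much better one outcome is than another.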


Human errors, human values

by PhilGoetz

9th Apr 2011


The trolley problem

In 2009, a pair of computer scientists published a paper enabling computers to behave like humans on the trolley problem (PDF here). They developed a logic that a computer could use to justify not pushing one person onto the tracks in order to save five other people. They described this feat as showing "how moral decisions can be drawn computationally by using prospective logic programs."

I would describe it as devoting a lot of time and effort to cripple a reasoning system by encoding human irrationality into its logic.

Which view is correct?

Dust specks

Eliezer argued that we should prefer 1 person being tortured for 50 years over 3^^^3 people each once getting a barely-noticeable dust speck in their eyes. Most people choose the many dust specks over the torture. Some people argued that "human values" includes having a utility aggregation function that rounds tiny (absolute value) utilities to zero, thus giving the "dust specks" answer. No, Eliezer said; this was an error in human reasoning. Is it an error, or a value?
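To make the two positions concrete, here is a minimal sketch of a plain linear aggregation next to one that rounds tiny utilities to zero. All of the numbers and names (`SPECK`, `TORTURE`, `PEOPLE`, `THRESHOLD`) are illustrative assumptions; 3^^^3 is far too large to represent, so a merely large population stands in for it.

```python
from fractions import Fraction

# Illustrative assumptions, not values from the original argument.
SPECK = Fraction(-1, 10**9)      # disutility of one dust speck
TORTURE = -10**9                 # disutility of 50 years of torture
PEOPLE = 10**30                  # stand-in for 3^^^3
THRESHOLD = Fraction(1, 10**6)   # magnitudes below this are rounded to zero

def linear_total(utility_each, count):
    """Sum identical per-person utilities with no rounding."""
    return utility_each * count

def rounded_total(utility_each, count, threshold):
    """Treat any per-person utility smaller in magnitude than threshold as zero."""
    if abs(utility_each) < threshold:
        return 0
    return utility_each * count

specks_linear = linear_total(SPECK, PEOPLE)
specks_rounded = rounded_total(SPECK, PEOPLE, THRESHOLD)

# Higher (less negative) total is preferred.
linear_choice = "torture" if TORTURE > specks_linear else "specks"
rounded_choice = "torture" if TORTURE > specks_rounded else "specks"
```

Under these assumed numbers the linear aggregator picks torture and the rounding aggregator picks specks, so the disagreement is located entirely in the choice of aggregation function.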

Sex vs. punishment

In Crime and punishment, I argued that people want to punish criminals, even if there is a painless, less-costly way to prevent crime. This means that people value punishing criminals. This value may have evolved to accomplish the social goal of reducing crime. Most readers agreed that, since we can deduce this underlying reason, and accomplish it more effectively through reasoning, preferring to punish criminals is an error in judgement.

Most people want to have sex. This value evolved to accomplish the goal of reproducing. Since we can deduce this underlying reason, and accomplish it more efficiently than by going out to bars every evening for ten years, is this desire for sex an error in judgement that we should erase?

The problem for Friendly AI

Until you come up with a procedure for determining, in general, when something is a value and when it is an error, there is no point in trying to design artificial intelligences that encode human "values".

(P.S. - I think that necessary, but not sufficient, preconditions for developing such a procedure, are to agree that only utilitarian ethics are valid, and to agree on an aggregation function.)

Tags: Trolley Problem, Consequentialism, Ethics & Morality, Moral Uncertainty, AI


138 comments, sorted by top scoring

[-]Johnicholas14y180

The way that we can resolve values vs. errors is by endorsing symmetries.

For example, Rawls's "veil of ignorance" enjoins us to design a society on the assumption that we might be anyone in that society - we might have any degree of talent or disability, any taste or preferences, and so on. This is permutation symmetry.

If we have two situations that we believe are exactly analogous (for example, the trolley car problem and a similar problem with a subway car), then we call any systematic difference in our human intuitions an error, and we choose one of the two intuitions to endorse as applying to both cases. (I don't know that people systematically differ in their survey responses to questions about trolley car problems vs. subway car problems, but I wouldn't be surprised.)

In forming a notion of values and errors, we are choosing a priority order among various symmetries and various human intuitions. Utilitarians prioritize the analogy between flipping the switch and pushing the fat man over the intuition that we should not push the fat man.

[-]PhilGoetz14y140

That's a good idea. I wonder if anyone has done a trolley-problem survey, phrasing it in the terms, "Would you rather live in a society where people would do X or Y?"

7shokwave14y

For one data point, I'd rather live in a world where people made 'push the fat man' decisions. As per Lightwave's comment, if the likelihoods are not skewed, I have a 5/6 chance of being on the tracks, and a 1/6 chance of being the fat man. I can't in good conscience choose the option that doesn't maximise my chances of survival.
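The arithmetic behind the 5/6 and 1/6 figures can be sketched as follows. It assumes you are equally likely to be any of the six people involved and that pushing always stops the trolley; the function name `survival_chance` is hypothetical.

```python
from fractions import Fraction

ON_TRACKS = 5   # people in the trolley's path
BYSTANDERS = 1  # the fat man

def survival_chance(society_pushes):
    """Chance of surviving when assigned one of the six roles at random.

    If society pushes, the five on the tracks live and the bystander dies;
    if not, only the bystander lives.
    """
    survivors = ON_TRACKS if society_pushes else BYSTANDERS
    return Fraction(survivors, ON_TRACKS + BYSTANDERS)

push = survival_chance(True)      # Fraction(5, 6)
no_push = survival_chance(False)  # Fraction(1, 6)
```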

3NancyLebovitz14y

I don't know if anyone's done the survey, but it's a very interesting question. In other words, I'm not sure how I'd answer it.

2[anonymous]14y

This is, however, not the question anybody really faces. It is, for sure, the question that we are often asked to face when doing moral and political philosophy. Kant's "categorical imperative" is similar (via Wpedia): But the reality is that our individual acts do not automatically become universal laws. We really do not get to choose our society from among all possible societies (at best, we get to choose from the much narrower range of actual societies). These imagined shortcuts are attractive to the impatient philosopher who wishes to retrace ten thousand years of development with a single thought, and who fancies that he can do better with these mental shortcuts, but they have little to do with reality, which renders their results of dubious value. Very few people have been anywhere near to a position in which they can choose their society. And the ones near that position - such as the supposedly most powerful man in the world (currently Obama) are themselves highly constrained in what they are able to do and what they are inclined to do given their incentives, so that the result is not much like choosing what society they want to live in. Even absolute dictators such as Castro and Kim Jong-Il, while I'm sure they're taking care of themselves (which is all they ultimately seem to care about), probably have not built anything like the societies they dreamed of building. The incomplete Ryugyong Hotel stands as a symbol of the gulf between their aspirations and the reality of what they produced. The choices that we all do have are local and concrete, and our local and concrete answers to these questions are, in aggregate, I think the most powerful factor shaping custom and morality, though they do it slowly, as the Colorado River carved the Grand Canyon. There are, however, mental toolboxes which provide shortcuts for possibly understanding and anticipating the outcome of long years of past and future societal evolution. Economics and game theory are two of th

2dschwarz14y

I have to respectfully disagree with your position. Kant's point, and the point of similar people who make the sweeping universalizations that you dislike, is that it is only in such idealized circumstances that we can make rational decisions. What makes a decision good or bad is whether it would be the decision rational people would endorse in a perfect society. The trouble is not moving from our flawed world to an ideal world. The trouble is taking the lesson we've learned from considering the ideal world and applying it to the flawed world. Kant's program is widely considered to be a failure because it fails to provide real guidelines for the real world. Basically, my point is that asking the Rawlsian "Would you prefer to live in a society where people do X" is valid. However, one may answer that question with "yes" and still rationally refrain from doing X. So your general point, that local and concrete decisions rule the day, still stands. Personally, though, I try to approach local and concrete decisions the way that Rawls does.

1[anonymous]14y

I actually happen to think that human morality is a fit topic for empirical inquiry, same as human language. This is a wildly different approach from either the Kantian or the Rawlsian approach. To study English, we look at the actual practices and we (possibly) develop hypotheses about the development of English and of language in general. What we do not do - in an empirical study of English - is ask ourselves what grammar, what pronunciation, what meanings we would prefer in a perfect society. Such questions are what the creators of Esperanto asked themselves (I presume). Kant and Rawls are trying to do the moral equivalent of inventing Esperanto. I, in contrast, think that morality is something that, like English and French, already exists in the world, possibly varying a bit from place to place. I realize that Kant and Rawls seek to critique our actual practices. It may seem puzzling for me to say so since I just explained my preferred approach as empirical, but so do I. But I do so from a different direction. Just as linguists will distinguish between natural language as it arises spontaneously among speakers, and the pedantic rules endorsed by language mavens, so do I distinguish between morality as it would arise spontaneously among people, and the laws raised over us by legislatures.

2amcknight13y

I think this is the way that a lot of philosophy is done. Identifying symmetries in order to attach more to your intuition pumps. (By the way, great response! It's the only one that directly addresses the main issue raised in the article, as of May 2012.)

[-]rwallace14y180

Cast in consequentialist terms, the reason we shouldn't push the fat man in the second trolley problem is that we are fallible, and when we believe committing an unethical act will serve the greater good, we are probably wrong.

Thought experiments aside, supposing that scenario came up in real life, and I tried actually pushing the fat man, what would happen? Answer: either I'd end up in a tussle with an angry fat man demanding to know why I just tried to kill him, while whatever chance I might have had of shouting a warning to the people in the path of the trolley was lost, or I'd succeed a second too late and then I'd have committed murder for nothing. And when the media got hold of the story and spread it far and wide - which they probably would, it's exactly the kind of ghoulish crap they love - it might help spread the idea that in a disaster, you can't afford to devote all your attention to helping your neighbors, because you need to spare some of it for watching out for somebody trying to kill you for the greater good. That could easily cost more than five lives.

If some future generation ever builds a machine whose domain and capabilities are such that it is called on to make... (read more)

9PhilGoetz14y

That's a good observation, but it doesn't completely solve the problem. The problem here is not just the trolley problem. The problem is that people disagree on whether not pushing the fat man is a value, or a bug. The trolley problem is just one example of the difficulty of determining this in general.

There is a large literature on the trolley problem, and on how to solve the trolley problem, and the view taken in the paper, which was arrived at by many experts after studying the problem and conducting polls and other research, is that humans have a moral value called the "principle of double effect": Is this a value, or a bug? As long as we can't all agree on that, there's no reason to expect we can correctly figure out what are values and what are bugs.

There are really two problems:

1. Come up with a procedure to determine whether a behavior is a value or an error.
2. Convince most other people in the world that your procedure is correct.

Personally, I think a reasonable first step is to try to restrict ethics to utilitarian approaches. We'll never reach agreement as long as there are people still trying to use rule-based ethics (such as the "double effect" rule). The difficulty of getting most people to agree that there are no valid non-utilitarian ethical frameworks is just a small fraction of the difficulty of the entire program of agreeing on human values.

7TCB14y

Perhaps I am missing something here, but I don't see why utilitarianism is necessarily superior to rule-based ethics. An obvious advantage of a rule-based moral system is the speed of computation. Situations like the trolley problem require extremely fast decision-making. Considering how many problems local optima cause in machine learning and optimization, I imagine that it would be difficult for an AI to assess every possible alternative and pick the one which maximized overall utility in time to make such a decision. Certainly, we as humans frequently miss obvious alternatives when making decisions, especially when we are upset, as most humans would be if they saw a trolley about to crash into five people. Thus, having a rule-based moral system would allow us to easily make split-second decisions when such decisions were required. Of course, we would not want to rely on a simple rule-based moral system all the time, and there are obvious benefits to utilitarianism when time is available for careful deliberation. It seems that it would be advantageous to switch back and forth between these two systems based on the time available for computation. If the rules in a rule-based ethical system were derived from utilitarian concerns, and were chosen to maximize the expected utility over all situations to which the rule might be applied, would it not make sense to use such a rule-based system for very important, split-second decisions?

4PhilGoetz14y

Yes, rule-based systems might respond faster, and that is sometimes preferable. Let me back up. I categorize ethical systems into different levels of meta. "Personal ethics" are the ethical system an individual agent follows. Efficiency, and the agent's limited knowledge, intelligence, and perspective, are big factors. "Social ethics" are the ethics a society agrees on. AFAIK, all existing ethical theorizing supposes that these are the same, and that an agent's ethics, and its society's ethics, must be the same thing. This makes no sense; casual observation shows this is not the case. People have ethical codes, and they are seldom the same as the ethical codes that society tells them they should have. There are obvious evolutionary reasons for this. Social ethics and personal ethics are often at cross-purposes. Social ethics are inherently dishonest, because the most effective way of maximizing social utility is often to deceive people. We expect, for instance, that telling people there is a distinction between personal ethics and social ethics should be against every social ethics in existence. (I don't mean that social ethics are necessarily exploiting people. Even if you sincerely want the best outcome for people, and they have personal ethics such that you don't need to deceive them into cooperating, many will be too stupid or in too much of a hurry to get good results if given full knowledge of the values that the designers of the social ethics were trying to optimize. Evolution may be the designer.) "Meta-ethics" is honest social ethics - trying to figure out what we should maximize, in a way that is not meant for public consumption - you're not going to write your conclusions on stone tablets and give them to the masses, who wouldn't understand them anyway. When Eliezer talks about designing Friendly AI, that's meta-ethics (I hope). And that's what I'm referring to here when I talk about encoding human values into an AI. Roughly, meta-ethics is "correct

1[anonymous]14y

People do, but how much of that disagreement is between people who have been exposed to utilitarian and consequentialist moral philosophy, and people who have not? The linked article says: The key word is "consistent". The article does not (in this quote, and as far as I can see) highlight the disagreement that you are talking about. I, of course, am aware of this disagreement - but a large fraction of the people that I discuss this topic with are utilitarians. What the quote from the article suggests to me is that, outside a minuscule population of people who have been exposed to utilitarianism, there is not significant disagreement on this point. If this is the case, then utilitarianism may have created this problem, and the solution may be as simple as rejecting utilitarianism.

3PhilGoetz14y

And here I thought you were going to conclude that this showed that the majority reaction was in error.

3[anonymous]14y

You stated a problem: how to get people to agree. You gave your solution to the problem here (my emphasis) I pointed out, however, that it is apparently utilitarianism that has introduced the disagreement in the first place. I explained why that seems to be so. So the problem may be utilitarianism. If so, then the solution is to reject it.

1Richard_Kennaway14y

How are you judging the validity of an ethical framework? Everything I've read on the subject (which is not a huge amount) assesses ethical systems by constructing intuition-pumping examples (such as the trolley problem, or TORTURE vs. SPECKS, or whatever), and inviting the reader to agree that such-and-such a system gives the right, or the wrong answer to such-and-such an example. But what ethical system produces these judgements, with respect to which other ethical systems are being evaluated?

4PhilGoetz14y

That's the question I'm asking, not the question I'm answering. :)

[-]wobster10914y110

I'm a bit skeptical of using majority survey response to determine "morality." After all, given a Bayesian probability problem, (the exact problem was patients with cancer tests, with a chance of returning a false positive,) most people will give the wrong answer, but we certainly don't want our computers to make this kind of error.

As to the torture vs. dust specks, when I thought about it, I decided first that torture was unacceptable, and then tried to modify my utility function to round to zero, etc. I was very appalled with myself to find that I decided the answer in advance, and then tried to make my utility function fit a predetermined answer. It felt an awful lot like rationalizing. I don't know if everyone else is doing the same thing, but if you are, I urge you to reconsider. If we always go with what feels right, what's the point of using utility functions at all?
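For readers who have not seen the cancer-test problem mentioned above, here is a minimal worked sketch of the Bayesian calculation. The comment gives no figures, so the prevalence, sensitivity, and false-positive rate below are illustrative assumptions, and `posterior` is a hypothetical helper name.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) by Bayes' theorem."""
    true_positives = prior * sensitivity
    false_positives = (1 - prior) * false_positive_rate
    return true_positives / (true_positives + false_positives)

# Assumed figures: 1% of patients have the disease, the test catches 80% of
# real cases, and 9.6% of healthy patients test positive anyway.
chance = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"{chance:.1%}")
```

With these assumed inputs the answer comes out under 10%, far below the 70-80% range that the intuitive (and wrong) answer tends to land in.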

3[anonymous]14y

Morality may be the sort of thing that people are especially likely to get right. Specifically, morality may be a set of rules created, supported, and observed by virtually everyone. If so, then a majority survey response about morality may be much like a majority survey response about the rules of chess, restricted to avid chess players (i.e., that subset of the population which observes and supports the rules of chess as a nearly daily occurrence, just as virtually the whole of humanity observes and supports the rules of morality on a daily basis). If you go to a chess tournament and ask the participants to demonstrate how the knight moves in chess, then (a) the vast majority will almost certainly give you the same answer, and (b) that answer will almost certainly be right.

1TheOtherDave14y

One point could be to formalize our feelings about what is right.

8David_Gerard14y

As long as you take care not to overextend. Today my hypothesis is that moralities are sets of cached answers to game theory (possibly cached in our genes), and extending those rules beyond what they're tested against is likely to lead to trouble. Humans try hard to formalise their moralities, but that doesn't make it a good idea per se. (On the other hand, it may require explanation as to why they do.)

5TheOtherDave14y

Yes, part of an accurate description is identifying the boundary conditions within which that description applies, and applying it outside that boundary is asking for trouble. Agreed. I don't see how this is any different for folk morality than for folk physics, folk medicine, folk sociology, or any other aspect of human psychology. For my own part, I find that formalizing my intuitions (moral and otherwise) is a useful step towards identifying the biases that those intuitions introduce into my thinking. I also find that I want to formalize other people's intuitions as a way of subverting the "tyranny of structurelessness" -- that is, the dynamic whereby a structure that remains covert is thereby protected from attack and can operate without accountability. Moral intuitions are frequently used this way.

7David_Gerard14y

Oh yeah. My point - if I have a point, which I may or may not do - is that you can't do it on the level of the morality itself and get good results, as that's all cached derived results; you have to go to metamorality, i.e. game theory (at least), not to risk going over the edge into silliness. It's possible this says nothing and adds up to normality, which is the "may not do" bit. I'm currently reading back through abstruse game theory posts on LessWrong and particularly this truly marvellous book and realising just how damn useful this stuff is going to be in real life. Free will as undiscoverability?

2TheOtherDave14y

Oh! (blink) That's actually a very good point. I endorse having it, should you ever do.

0David_Gerard14y

Looks like proper philosophers have been working through the notion since the 1970s. It would be annoying to have come up with a workable version of libertarianism.

0David_Gerard14y

Found a bit of popular science suggesting I'm on the right track about the origins. (I'm ignoring the Liberal/Conservative guff, that just detracts from the actual point and leads me to think less of the researcher.) I don't want to actually have to buy a copy of this, but it looks along the right lines. The implication that overextending the generated rules without firmly checking against the generator's reasons leads to trouble - and is what often leads to trouble - is mine, but would, I'd hope, follow fairly obviously.

0David_Gerard14y

I'm hoping not to have to read the entirety of LessWrong (and I thought the sequences were long) before being able to be confident I have indeed had it :-) May I particularly strongly recommend the Schelling book. Amazing. I'm getting useful results in such practical fields as dealing with four-year-olds and surly teenagers already.

2cousin_it14y

Same here. I think Schelling's book has helped me win at life more than all of LW did. That's why I gave it such a glowing review :-)

0David_Gerard14y

Now you need to find a book that similarly pwns the field of dog training.

2TheOtherDave14y

Awesome! I also found "Don't Shoot The Dog" very useful in those fields, incidentally.

3David_Gerard14y

"Every parent needs to learn the basics of one, avoiding a nuclear holocaust and two, dog training."

3PhilGoetz14y

Can we use folk physics and the development of physics as a model for the proper relationship between "folk ethics" and ethics?

1[anonymous]14y

In game theory, a stable solution such as a Nash equilibrium is not necessarily one that maximizes aggregate utility. A game theory approach is for this reason probably at odds with a utilitarian approach to morality. If the game theory approach to morality is right, then utilitarianism is probably wrong.
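The standard illustration of this gap is the Prisoner's Dilemma. The sketch below uses a conventional but assumed payoff table (the specific numbers are not from this thread) and checks which move pairs are Nash equilibria and which pair maximizes total utility; `PAYOFFS`, `MOVES`, and `is_nash` are hypothetical names.

```python
from itertools import product

MOVES = ("cooperate", "defect")

# Assumed payoffs: (row player's utility, column player's utility).
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}

def is_nash(row, col):
    """True if neither player can gain by unilaterally switching moves."""
    row_ok = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(alt, col)][0] for alt in MOVES)
    col_ok = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, alt)][1] for alt in MOVES)
    return row_ok and col_ok

nash_equilibria = [pair for pair in product(MOVES, MOVES) if is_nash(*pair)]
utilitarian_best = max(PAYOFFS, key=lambda pair: sum(PAYOFFS[pair]))
```

Here the only equilibrium is mutual defection (total utility 2), while mutual cooperation gives the highest total (6), which is the divergence the comment describes.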

[-]CronoDAS14y100

There's another version of the trolley problem that's even squickier than the "push a man onto the track" version...

“A brilliant transplant surgeon has five patients, each in need of a different organ, each of whom will die without that organ. Unfortunately, there are no organs available to perform any of these five transplant operations. A healthy young traveler, just passing through the city the doctor works in, comes in for a routine checkup. In the course of doing the checkup, the doctor discovers that his organs are compatible with all five of his dying patients. Suppose further that if the young man were to disappear, no one would suspect the doctor.”

-- Judith Jarvis Thomson, The Trolley Problem, 94 Yale Law Journal 1395-1415 (1985)

For some reason, it's a lot less comfortable to endorse murdering the patient than it is to endorse pushing the fat man onto the track...

[-]Desrtopa14y190

That one was raised by a visiting philosopher at my college as an argument (from intuition) against utilitarianism. I pointed out that if we tended to kill patients to harvest them to save more patients, people would be so fearful of being harvested that they would tend not to visit hospitals at all, leading to a greater loss of health and life. So in this case, in any realistic formulation, the less comfortable option is also the one that leads to less utility.

I suspect that this version feels even less comfortable than the trolley dilemma because it includes the violation of an implicit social contract, that if you go into a hospital, they'll try to make you healthier, not kill you. But while violating implicit social contracts tends to be a bad idea, that's certainly not to say that there's any guarantee that the utilitarian thing to do in some situations won't be massively uncomfortable.

5PhilGoetz14y

There are a number of science fiction stories about uncomfortable utilitarian choices. "The Cold Equations" is the most famous. I think Heinlein wrote a novel that had a character who was in charge of a colony that ran out of power, and so he killed half of them in order for the remaining life support to be enough to let the others live until relief arrived. No one stopped him at the time, but after they were safe, they branded him a war criminal or something like that.

4NancyLebovitz14y

I don't think that's a Heinlein. I don't have a specific memory of that story, and his work didn't tend to be that bleak. I'm willing to be surprised if someone has a specific reference.

3shokwave14y

There's also Eliezer's Three Worlds Collide, which has a short aside on ships trying to take on just one more passenger and getting caught in the nova. And I think the movie Titanic had an officer cold-bloodedly executing a man who tried to get onto a full lifeboat, potentially sinking it.

2mkehrt14y

It's possible that you are referring to the secondary plot line of Chasm City by Alastair Reynolds in which gur nagvureb wrggvfbaf unys gur uvoreangvba cbqf va uvf fgnefuvc, nyybjvat vg gb neevir orsber gur bguref va gur syrrg naq fb tnva zvyvgnel nqinagntr.

1PhilGoetz14y

No, that's different. I was referring to a commander who saved lives, but was condemned for doing that instead of letting everybody die.

0PhilGoetz14y

Does less-wrong have rot13 functionality built in?

2mkehrt14y

No, I used http://www.rot13.com .
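For anyone who would rather not use a website, rot13 is also available through Python's standard `codecs` module. This is a minimal sketch; the wrapper name `rot13` and the sample string are just for illustration.

```python
import codecs

def rot13(text):
    """Rotate each ASCII letter 13 places; applying it twice restores the input."""
    return codecs.encode(text, "rot_13")

encoded = rot13("Hello")   # "Uryyb"
decoded = rot13(encoded)   # "Hello"
```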

0[anonymous]14y

Alastair Reynolds's "Chasm City" has a similar back-story. Several colony ships are heading to a new planet, but after generations in space have developed cold-war style hostilities. The captain of one of the ships kills half the cryo-preserved colonists and jettisons their weight so he doesn't have to slow his ship as soon as the other three. Arriving several weeks before the rest, his colonists get all the best colony landing spots and dominate the planet. He is immediately captured and executed as a war criminal, but generations later people view him with mixed emotions - a bit of a monster, yet one who sacrificed himself in order that his people could win the planet.

[-]Lightwave14y120

If the likelihood of me needing a life-saving organ transplant at some point in my life is the same as for most other people, then I think I'd bite the bullet and agree to a system in which random healthy people are killed for their organs. Why? Because I'd have 5x the chance of being saved than being killed.

[-]wedrifid14y100

Because I'd have 5x the chance of being saved than being killed.

Except, of course, for the chance of being slain in the inevitable civil war that ensues. ;)

7Alicorn14y

I remember a short story - title and author escape me - where this was actually much like what was going on. Everyone had their relevant types on file, and if some number of people needed an organ you could supply, you were harvested for said organs. The protagonist got notified that there were nearly enough people who needed his organs and he went undercover and visited them all, thinking he'd kill one and get out of it, but he finds that they aren't what he expected (e.g. the one who needs a liver is a girl with hepatitis, not some drunk) and decides not to, and then one dies anyway and he's off the hook.

3CronoDAS14y

Larry Niven wrote a number of short stories about organ transplants; in one of them, "The Jigsaw Man", the primary source of organs for transplant is executions of criminals, which has led to more and more crimes being punishable by death. The main character of the story, who is currently in jail and awaiting trial, escapes through what amounts to a stroke of luck, and finds out that the organ banks are right next to the jail. Certain that he is about to be recaptured and eventually executed, he decides to commit a crime worthy of the punishment he is going to receive: destroying a large number of the preserved organs. At the end of the story, he's brought to trial only for the crime he originally committed: running red lights.

0Alicorn14y

I've read that story, but it's not the one I was thinking of in the grandparent.

0CronoDAS14y

I didn't intend to suggest that "The Jigsaw Man" was the story in question.

0Swimmer963 (Miranda Dixon-Luinenburg) 14y

That sounds like an interesting short story...I wish you remembered the title so I could go track it down.

5PhilGoetz14y

In current practice, organ transplant recipients are typically old people who die shortly after receiving the transplant. The problem is still interesting; but you have to impose some artificial restrictions.

0Lightwave14y

Sure, it's just a thought experiment, like trolley problems. I've seen it used in arguments against consequentialism/utilitarianism, but I'm not sure how many utilitarians bite this bullet (I guess it depends what type of consequentialist/utilitarian you are).

6benelliott14y

I noticed this as well. Pushing the fat man seemed obvious to me, and I wondered why everyone made such a fuss about it until I saw this dilemma.

6NancyLebovitz14y

Hypotheses: The hospital scenario involves a lot more decisions, so it seems as though there's more rule breaking. You want a hard rule that medical personnel won't do injury to the people in their hands. The trolley scenario evokes prejudice against fat people. It needs variants like redirecting another trolley that has many fewer people in it to block the first trolley, or perhaps knocking two football players (how?) into the trolley's path.

[-]fubarobfusco14y110

You want a hard rule that medical personnel won't do injury to the people in their hands.

Specifically: You don't want future people to avoid seeking medical treatment — or to burn doctors at the stake — out of legitimate fear of being taken apart for their organs. Even if you tell the victims that it's in the greater good for doctors to do that once in a while, the victims' goals aren't served by being sacrificed for the good of five strangers. The victims' goals are much better served by going without a medical checkup, or possibly leading a mob to kill all the doctors.

There is a consequentialist reason to treat human beings as ends rather than means: If a human figures out that you intend to treat him or her as a means, this elicits a whole swath of evolved-in responses that will interfere with your intentions. These range from negotiation ("If you want to treat me as a means, I get to treat you as a means too"), to resistance ("I will stop you from doing that to me"), to outright violence ("I will stop you, so you don't do that to anyone").

3djcb14y

Of course you can add factors to the thought experiment that even while following utilitarianism will make you decide not to harvest the traveler for his organs. But that's also dodging the problem it tries to show -- the problem that sometimes being strictly utilitarian leads to uncomfortable conclusions -- that is, conclusions that conflict with the kind of 'folk', 'knee-jerk' morality we seem to have.

3Unnamed14y

It depends what you consider the starting point for building the scenario. If you start by taking the story seriously as a real-world scenario, taking place in a real hospital with real people, then these are relevant considerations that would naturally arise as you were thinking through the problem, not additional factors that need to be added on. The work comes in removing factors to turn the scenario into an idealized thought experiment that boxes utilitarianism into one side, in opposition to our intuitive moral judgments. And if it's necessary to make extensive or unrealistic stipulations in order to rule out seemingly important considerations, then that raises questions about how much we should be concerned about this thought experiment.

2fubarobfusco14y

Sure. But what's the point of putting "doctor" in the thought-experiment if it isn't to arouse the particular associations that people have about doctors — one of which is the notion that doctors are trusted with unusual levels of access to other people's bodies? It's those associations that lead to people's folk morality coming up with different answers to the "trolley" form and the "doctor" form of the problem. A philosophical system of ethics that doesn't add up to folk morality most of the time, over historical facts would be readily recognized as flawed, or even as not really ethics at all but something else. A system of ethics that added up to baby-eating, paperclipping, self-contradiction, or to the notion that it's evil to have systems of ethics, for that matter — would not be the sort of ethics worth wanting.

0djcb14y

Well, if a different thought experiment leads to different 'folk-morality'-based conclusions while it doesn't make a difference from a strictly utilitarian view point, that shows they are not fully compatible, doesn't it? Obviously, you can make them agree again by adding things, but that does not resolve the original problem. For the success of an ethical system it is indeed important to resonate with folk morality, but I also think the phenotype of everyday folk morality is a hodge-podge of biases and illusions. If we took the basics of folk morality (what would that be, maybe... golden rule + utilitarianism?), I think something more consistent could be forged.

4fubarobfusco14y

I have two responses, based on two different interpretations of your response: I don't see why a "strict utilitarian" would have to be a first-order utilitarian, i.e. one that only sees the immediate consequences of its actions and not the consequences of others' responses to its actions. To avoid dealing with the social consequences (that is, extrapolated multi-actor consequences) of an act means to imagine it being performed in a moral vacuum: a place where nobody outside the imagined scene has any way of finding out what happened or responding to it. But breaking a rule when nobody is around to be injured by it or to report your rule-breaking to others, is a significantly different thought-experiment from breaking a rule under normal circumstances. Humans (and, perhaps, AIs) are not designed to live in a world where the self is the only morally significant actor. They have to care about what other morally significant persons care about, and at more than a second-order level: they need to see not just smiling faces, but have the knowledge that there are minds behind those smiling faces. And in any event, human cognition does not perform well in social or cognitive isolation: put a person in solitary confinement and you can predict that he or she will suffer and experience certain forms of cognitive breakdown; keep a scientific mind in isolation from a scientific community and you can expect that you will get a kook, not a genius. ---------------------------------------- Some people seem to treat the trolley problem and the doctor problem as the same problem stated in two different ways, in such a way as to expose a discrepancy in human moral reasoning. If so, this might be analogous to the Wason selection task, which exposes a discrepancy in human symbolic reasoning. Wason demonstrates that humans reason more effectively about applying social rules than about applying arithmetical rules isomorphic to those social rules. I've always imagined this as humans usi

0[anonymous]14y

Utilitarians can thereby extricate themselves from their prima facie conclusion that it's right to kill the innocent man. However, the solution has the form: "We cannot do what is best utility-wise because others, who are not utilitarians, will respond in ways that damage utility to an even greater extent than we have increased it." However, this kind of solution doesn't speak very well for utilitarianism, for consider an alternative: "We cannot do what is best paperclip-wise because others, who are not paperclippers, will respond in ways that tend to reduce the long term future paperclip-count." In fact, Clippy can 'get the answer right' on a surprisingly high proportion of moral questions if he is prepared to be circumspect, consider the long term, and keep in mind that no-one but him is maximizing paperclips. But then this raises the question: Assuming we lived in a society of utilitarians, who feel no irrational fear at the thought of being harvested for the greater good, and no moral indignation when others are so harvested, would this practice be 'right'? Would that entire society be 'morally preferable' to ours?

1Unnamed14y

There is an alternate version where the man on the footbridge is wearing a heavy backpack, rather than being fat. That's the scenario that Josh Greene & colleagues used in this paper, for instance.

0wedrifid14y

Fat people are OK but you have a problem with football players? There are an awful lot of people who are interested in decision problems who might just say "push 'em" as group-affiliation humor!

2NancyLebovitz14y

The overt reason for pushing a fat man is that it's the way to only kill one person while mustering sufficient weight to stop the trolley. It seems plausible that what's intended is a very fat man, or you could just specify large person. Two football players seems like a way of specifying a substantial amount of weight while involving few people.

3Emile14y

Two additional things are in play here: 1) As others said, there's a breach of an implicit social contract, which explains some squeamishness 2) In this scenario, the "normal" person is the young traveler, he's the one readers are likely to associate with. I'd be inclined to bite the bullet too, i.e. I might prefer living in a society in which things like that happen, provided it really is better (i.e. it doesn't just result in fewer people visiting doctors etc.). But in this specific scenario, there would be a better solution: the doctor offers to draw lots among the patients to know which of them will be sacrificed to have his organs distributed among the remaining four; so the patients have a choice between agreeing to that (80% chances of survival) and certain death.

4DSimon14y

I like this idea. For the thought experiment at hand, though, it seems too convenient. Suppose the dying patients' organs are mutually incompatible with each other; only the young traveler's organs will do. In that scenario, should the traveler's organs be distributed?

8Emile14y

There's probably a least convenient possible world in which I'd bite the bullet and agree that it might be right for the doctor to kill the patient. Suppose that on planets J and K, doctors are robots, and that it's common knowledge that they are "friendly" consequentialists who take the actions that maximize the expected health of their patients ("friendly" in the sense that they are "good genies" whose utility function matches human morality, i.e. they don't save the life of a patient that wants to die, don't value "vegetables" as much, etc.). But on planet J, robot doctors treat each patient in isolation, maximizing his expected health, whereas on planet K doctors maximize the expected health of their patients as a whole, even if that means killing one to save five others. I would prefer to live on planet K than on planet J, because even if there's a small probability p that I'll have my organs harvested to save five other patients, there's also a probability 5 * p that my life will be saved by a robot doctor's cold utilitarian calculation.
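The planet J versus planet K comparison can be put in numbers. This is only a minimal sketch, assuming (as the comment does) that each harvest kills one person and saves five, and that being harvested and being saved are separate events. The function name and the example value of p are illustrative and not part of the original comment.

```python
def planet_k_net_survival_gain(p_harvested: float, saved_per_harvest: int = 5) -> float:
    """Net change in one's survival probability from living on planet K instead of J.

    Assumption: the chance of being saved is saved_per_harvest * p_harvested,
    as stated in the comment, and the two events do not overlap.
    """
    p_saved = saved_per_harvest * p_harvested
    return p_saved - p_harvested


# Illustrative value only: a one-in-a-million chance of being harvested.
gain = planet_k_net_survival_gain(1e-6)
print(gain > 0)
```

Under these assumptions the gain is positive for any p > 0 whenever more than one person is saved per harvest. The sketch says nothing about the side effects raised elsewhere in the thread, such as people avoiding doctors.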

0DSimon14y

Does this include putting less value on patients who would only live a short while longer (say, a year) with a transplant than without? AIUI this is typical of transplant patients.

3Emile14y

Probably yes, which would mean that in many cases the sacrifice wouldn't be made (though - least convenient possible world again - there are cases where it would).

2Armok_GoB14y

I'm not. If I hadn't heard about this or the trolley problem or equivalent I'd probably do it without thinking and then be surprised when people criticised the decision.

[-]Swimmer963 (Miranda Dixon-Luinenburg) 14y90

Most people choose the many dust specks over the torture. Some people argued that "human values" includes having a utility aggregation function that rounds tiny (absolute value) utilities to zero, thus giving the "dust specks" answer. No, Eliezer said; this was an error in human reasoning. Is it an error, or a value?

I'm not sure. I think the answer most people give on this has more to do with fairness than rounding to zero. Yeah, it's annoying for me to get a dust speck in my eye, but it's unfair that someone should be tortured fo...

4PhilGoetz14y

As Eliezer pointed out, if it's fairness, then you probably have a curved but continuous utility function - and with the numbers involved, it has to be a curve specifically tailored to the example.

3Luke Stebbing14y

Where did Eliezer talk about fairness? I can't find it in the original two threads. This comment talked about sublinear aggregation, but there's a global variable (the temperature of the, um, globe). Swimmer963 is talking about personally choosing specks and then guessing that most people would behave the same. Total disutility is higher, but no one catches on fire. If I was forced to choose between two possible events, and if killing people for organs had no unintended consequences, I'd go with the utilitarian cases, with a side order of a severe permanent guilt complex. On the other hand, if I were asked to accept the personal benefit, I would behave the same as Swimmer963 and with similar expectations. Interestingly, if people are similar enough that TDT applies, my personal decisions become normative. There's no moral dilemma in the case of torture vs specks, though, since choosing torture would result in extreme psychological distress times 3^^^3.

0PhilGoetz14y

When Eliezer wrote, I am taking the inferential step that he was responding to everyone who appealed to non-linear aggregation, including those who just said "we value fairness" without saying or knowing that a technical way of saying that was "we compute a sum over all individuals i of f(utility(i)), where f is convex."
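To make the "sum over all individuals i of f(utility(i))" phrasing concrete, here is a rough sketch comparing a linear f with one convex f. It rests on several assumptions that are not in the thread: f is applied to harm (negative utility), squaring is used as an arbitrary convex choice, and the harm values and head counts are made up and far smaller than 3^^^3.

```python
from typing import Callable


def total_badness(harm_per_person: float, people: int,
                  f: Callable[[float], float]) -> float:
    """Sum of f(harm) over `people` individuals who each suffer the same harm."""
    return people * f(harm_per_person)


def linear(h: float) -> float:
    return h


def convex(h: float) -> float:
    # Squaring is one arbitrary convex choice; it penalizes concentrated harm.
    return h ** 2


# Made-up numbers: one person with harm 1000 vs. a million people with harm 0.01.
for name, f in [("linear", linear), ("convex", convex)]:
    torture = total_badness(1000.0, 1, f)
    specks = total_badness(0.01, 1_000_000, f)
    print(name, "prefers", "specks" if specks < torture else "torture")
```

With these toy numbers the linear sum treats the specks as worse in total, while the squared version treats the single large harm as worse. With a large enough crowd the speck total overtakes again even under squaring, which fits the earlier remark that the curve would have to be tailored to the numbers involved.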

2Swimmer963 (Miranda Dixon-Luinenburg) 14y

A pure utilitarian who could grasp a number as large as 3^^^3 might choose the one person being tortured. My point was that intuitively, the unfairness of torture jumps out more than the huge, huge number of people being minorly annoyed. Maybe fairness as an intuition is more a flaw than a value. That's actually an interesting thought. I'm going to ponder that now for a while.

5TheOtherDave14y

My own feeling is that fairness as an intuition is very useful in small groups but starts to break down as the group gets larger. Which is what I would expect for an intuition that evolved in the context of small groups.

1Normal_Anomaly14y

This matches my intuition on the subject. It also matches my intuition of the problem of Nozick's Utility Monster. Yes, total utility will be maximized if we let the monster eat everyone, but it introduces a large disparity: huge utility for the monster, huge disutility for everyone else. The question is, is this a "valid value" or a problem? The only way I can see to answer this is to ask if I would self-modify to not caring about fairness, and I don't know the answer.

0Houshalter8y

But remove human agency and imagine the torturer isn't a person. Say you can remove a dust speck from your eye, but the procedure has a 1/3^^^3 chance of failing and giving you injuries equivalent to torturing you for 50 years. Now imagine 3^^^3 people make a similar choice. One of them will likely fail the procedure and get tortured.
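How likely "likely" is can be checked for the general case of n people each facing a 1/n chance of failure. A small sketch, assuming the failures are independent; 3^^^3 itself is far too large to represent, so smaller values of n stand in for it, and the function name is illustrative.

```python
import math


def p_at_least_one_failure(n: int) -> float:
    """P(at least one failure) among n >= 2 independent tries, each failing with chance 1/n."""
    # 1 - (1 - 1/n)**n, computed with log1p/expm1 to stay accurate for huge n.
    return -math.expm1(n * math.log1p(-1.0 / n))


for n in (10, 10**6, 10**12):
    print(n, p_at_least_one_failure(n))
```

As n grows this approaches 1 - 1/e, roughly 63%, while the expected number of failures is exactly one for every n.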

[-][anonymous]14y70

I don't want humans to make decisions where they kill one person to save another. The trolley problem feels bad to us because, in that situation usually, it's never that clear. Omega is never leaning over your shoulder, explaining to you that killing the fat man will save those people; you just have to make a guess, and human guesses can be wrong. What I suspect humans are doing is a hidden probability calculation that says "well, there's probably a chance of x that I'll save those people, which isn't high enough to chance it". There's an argument to...

1drethelin14y

I strongly agree with this. Humans should be morally discouraged from making life or death decisions for other humans because of human fallibility. Individuals not only do not know enough in general to make flash decisions correctly about these kinds of probabilities, but also do not share enough values to make these decisions. The rules need to say you can't volunteer other people to die for your cause.

[-]PhilGoetz14y110

Humans make life-and-death decisions for other humans every day. The President decides to bomb Libya or enter Darfur to prevent a genocide. The FDA decides to approve or ban a drug. The EPA decides how to weigh deaths from carcinogens produced by industry, vs. jobs. The DOT decides how to weigh travel time vs. travel deaths.

3NancyLebovitz14y

Note that those are all decisions which have been off-loaded to large institutions. People rarely make overt life and death decisions in their private lives.

6nerzhin14y

Overt is the key word. When you buy a car that's cheaper than a Volvo, or drive over the speed limit, or build a house that cannot withstand a magnitude 9 earthquake, you are making a life and death decision.

2drethelin14y

no. The phrase "life or death decision" does not mean this and this is not how it's used.

1drethelin14y

yes, and this is a series of examples of decisions that almost everyone is discouraged from making themselves. Other examples include a police officer's decision to use lethal force or whether a firefighter goes back into the collapsing building one more time. These are people specifically trained and encouraged to be better at making these judgments, and EVEN then we still prefer the police officer to always take the non-lethal path. The average person is and I think should in general be discouraged from making life or death decisions for other people.

[-]Vaniver14y60

It seems difficult to have this conversation once you've concluded only utilitarian ethics are valid, because the problem is whether or not utilitarian ethics are valid. (I'm excluding utilitarian ethics that are developed to a point where they are functionally deontological from consideration.)

Whether or not you are trying to maximize social status or some sort of abstract quality seems to be the issue, and I'm not sure it's possible to have an honest conversation about that, since one tends to improve social status by claiming to care (and/or actually caring) about an abstract quality.

[-]shokwave14y60

The principle of double effect is interesting:

Harming another individual is permissible if it is the foreseen consequence of an act that will lead to a greater good; in contrast, it is impermissible to harm someone else as an intended means to a greater good.

The distinction to me looks something like the difference between

"Take action -> one dies, five live" and "Kill one -> five live"

Where the salient difference is whether the act is morally permissible on its own. So a morally neutral act like flipping a switch allows the pe...

5PhilGoetz14y

Is evolution fast enough to have evolved this instinct in the past 4000 years? IIRC, anthropologists have found murder was the most common cause of death for men in some primitive tribes. There can't have been a strong instinct against murder in tribal days, because people did it frequently.

4shokwave14y

It may not even be instinctual; it could be purely cultural conditioning that makes us instinctively refuse murder-like options. Actually, on balance cultural conditioning seems far more likely.

2Normal_Anomaly14y

Yes, this makes sense. Culture has been changing faster than genes for a long time now. If you're right, shokwave's point still stands.

2jknapka14y

Murder is the most common cause of death today for some groups (young African American males, for example). I don't believe it is correct in general that intentional killing was the most common cause of death in primitive tribes; and if it was the case in specific groups, they were exceptional. The citation that occurs to me immediately is "Sex at Dawn" (Ryan & Jetha), which goes to some trouble to debunk the Hobbesian view that primitive life was "nasty, brutish, and short". (Also, my partner is a professional anthropologist with a lot of experience with indigenous South American populations, and we discuss this kind of thing all the time, FWIW.) When population density is very low and resources (including social resources such as access to sexual partners) plentiful, there is no reason murder should be common (if by "murder" we mean the intentional killing of another in order to appropriate their resources). Even in groups where inter-group violence was common (certain American Indian groups, for example), that violence was generally of a demonstrative nature, and usually ended when one group had asserted its dominance, rather than going on until the ground was littered with corpses. The depictions we see of these conflicts in the media are often heavily over-dramatized. Actually, upon further thought... Even if killing wasn't the point of such inter-group conflicts, it's possible that if those conflicts supplied sufficiently many male deaths, then that sort of "murder" might in fact have been the most common cause of male death in some groups. It is pretty certain, though, that intentional killing within social groups was an extremely rare occurrence, likely to have been met with severe social consequences. (Whereas killing an out-group individual might have been viewed as positively virtuous, probably not analogous to our concept of "murder" at all. Edit: more like "war", I guess :-P ) As for evolving a specific aversion to murder... I think we've a general p

3TheOtherDave14y

Yeah, it seems moderately plausible to me that in primitive tribes the killing of out-group individuals as part of inter-group violence would be a lot like war.

0wedrifid14y

I wouldn't call that an understatement. The difference between inter-tribe violence and 'war' is non-trivial.

0TheOtherDave14y

Hm. If you're motivated to expand on that, I'd be interested.

0Barry_Cotter14y

War requires a great deal more organisation, clarity of purpose and discipline. If you've ever read much fantasy it's the difference between a great big fight with warriors (people who may know how to fight, and fight extremely well individually) and soldiers (people who fight as part of a unit, and can be more or less relied upon to follow orders, usually there will be more than one type of unit, each of which has specific strengths and weaknesses and tactical roles.) Obviously there is a continuum, but at one end we have set piece battles with cavalry, infantry, ranged weapon units of whatever type, and at another skirmishes between loose groups of men who have not trained to fight as a team, and are not capable of e.g. retreating in good order, and are much more likely to attack before the order goes out than soldiers.

2TheOtherDave14y

I agree that there's a continuum between engagements involving complex arrangements of heterogenous specialized combat and support units at one extreme, and engagements involving simple arrangements of homogenous combat units at another. I agree that the former requires more explicit strategy and more organization than the latter. I mostly agree that the former requires more discipline and more clarity of purpose than the latter. I agree that certain tactical and strategic maneuvers (e.g., retreating in good order or attacking in a coordinated fashion) become much easier as you traverse that continuum. I'm not entirely convinced that "war" doesn't equally well denote positions all along that continuum, but I guess that's a mere dispute over definitions and not particularly interesting. (nods) OK, fair enough. Thanks for the clarification.

[-]Giles14y60

Maybe there's some value in creating an algorithm which accurately models most people's moral decisions... it could be used as the basis for a "sane" utility function by subsequently working out which parts of the algorithm are "utility" and which are "biases".

(EDIT: such a project would also help us understand human biases more clearly.)

Incidentally, I hope this "double effect" idea is based around more than just this trolley thought experiment. I could get the same result they did with the much simpler heuristic "don't use dead bodies as tools".

7PhilGoetz14y

If I wrote an algorithm that tried to maximize expected value, and computed value as a function of the number of people left alive, it would choose in both trolley problems to save the maximum number of people. That would indicate that the human solution to the second problem, to not push someone onto the tracks, was a bias. Yet the authors of the paper did not make that interpretation. They decided that getting a non-human answer meant the computer did not yet have morals. So, how do you decide what to accurately model? That's where you make the decision about what is moral.
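The kind of algorithm described here can be sketched in a few lines. This is a toy illustration, not the program from the paper: the scenario table and the function name are hypothetical, the survivor counts follow the usual statement of the two problems, and outcomes are treated as certain so that expected value is just the survivor count.

```python
from typing import Dict

# Each option maps to the number of people left alive out of the six involved.
# The counts follow the usual statement of the two problems (an assumption here).
SCENARIOS: Dict[str, Dict[str, int]] = {
    "switch": {"do nothing": 1, "flip switch": 5},
    "footbridge": {"do nothing": 1, "push": 5},
}


def best_action(options: Dict[str, int]) -> str:
    """Pick the action whose outcome leaves the most people alive."""
    return max(options, key=options.get)


for name, options in SCENARIOS.items():
    print(name, "->", best_action(options))
```

Because value depends only on the number left alive, the two scenarios look identical to this maximizer, so it acts in both. Any difference between the cases has to come from something the value function leaves out.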

1Giles14y

I agree the authors of the paper are idiots (or seem to be - I only skimmed the paper). But the research they're doing could still be useful, even if not for the reason they think.

[-]Manfred14y50

Eh, if people want to copy human typical (I'll call it "folk") morality, that probably won't end too badly, and it seems like good practice for modeling other complicated human thought patterns.

Whether it's the right morality to try and get machines to use or not gets pretty meta-ethical. However, if the audience is moved by consistency you might use a trolley-problem analogy to claim that building a computer is analogous to throwing a switch and so by folk morality you should be more consequentialist, so making a computer that handles the trolley problem using folk morality is wrong if folk morality is right, and also wrong if folk morality is wrong.

[-]djcb14y40

Interesting read!

I think most people are fundamentally following 'knee-jerk-morality', with the various (meta)ethical systems as a rationalization. This is evidenced by the fact that answers in the trolley-problem differ when some (in the ethical system) morally-neutral factors are changed -- for example, whether something happens through action or inaction.

The paper shows that some of the rules of a rationalization of knee-jerk-morality can be encoded in a Prolog program. But if the problem changes a bit (say, the involuntary-organ-transplant-case), you'll ne...

[-]Matt_Simpson14y30

As a side note, using the word "utilitarian" is potentially confusing. The standard definition of a utilitarian is someone who thinks we should maximize the aggregate utility of all humans/morally relevant agents, and it comes with a whole host of problems. I'm pretty sure all you mean by "utilitarian" is that our values, whatever they are, should be/are encoded into a utility function.

-1PhilGoetz14y

Yes. I don't think that's standard anymore. The terms "total utilitarian" and "average utilitarian" are generally recognized, where "total utilitarian" means what you called "utilitarian".

5Vladimir_M14y

Maybe I'm misreading this exchange, but there seems to be some confusion between individual utility functions and utilitarianism as an ethical system. An individual utility function as per von Neumann and Morgenstern is defined only up to a constant term and multiplication by a positive factor. Individual vN-M utility functions therefore cannot be compared, aggregated, or averaged across individuals, which is what any flavor of utilitarianism requires one way or another (and which invariably leads into nonsense, in my opinion).
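The point about vN-M utilities being defined only up to a positive affine transformation can be shown with a toy example. This is a sketch with made-up numbers and hypothetical helper names: rescaling one person's utility function leaves that person's own ranking untouched but flips the result of naively summing across people.

```python
def rescale(u: dict, a: float, b: float) -> dict:
    """Positive affine transform a*u + b; represents the same preferences when a > 0."""
    assert a > 0
    return {option: a * value + b for option, value in u.items()}


def favourite(u: dict) -> str:
    """The option this person ranks highest."""
    return max(u, key=u.get)


def sum_winner(*utilities: dict) -> str:
    """The option with the highest naive sum of utilities across people."""
    options = utilities[0].keys()
    return max(options, key=lambda o: sum(u[o] for u in utilities))


# Made-up utilities for two people over two options.
alice = {"A": 1.0, "B": 0.0}
bob = {"A": 0.0, "B": 0.6}
bob_rescaled = rescale(bob, a=10.0, b=3.0)

print(favourite(bob), favourite(bob_rescaled))  # Bob's own ranking is unchanged
print(sum_winner(alice, bob), sum_winner(alice, bob_rescaled))
```

Both versions of Bob's function describe the same preferences, yet the summed winner changes from A to B. So the sum depends on an arbitrary choice of scale rather than on anything in the preferences themselves.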

4steven046114y

It's only preference utilitarianism that aggregates individual vN-M utility functions. Other kinds of utilitarianism can use other measures of quality of life, such as pleasure minus pain; these measures have their own difficulties, but they don't have this particular difficulty.

3Vladimir_M14y

You're right, it's not true that all sorts of utilitarianism require aggregation of vN-M utility functions. That was an imprecise statement on my part. However, as far as I can tell, any sort of utilitarianism requires comparing, adding, or averaging of some measure of utility across individuals, and I'm not aware of any such measure for which this is more meaningful than for the vN-M utility functions. (If you know of any examples, I'd be curious to hear them.)

0TimFreeman14y

Estimates of individual utility functions can be averaged, if you do it right, so far as I can tell. A possible estimate of everybody's utility is a computable function that given a person id and the person's circumstances, returns a rational number in the interval [0,1]. Discard the computable functions inconsistent with observed behavior of people. Average over all remaining possibilities weighting by the universal prior, thus giving you an estimated utility for each person in the range [0, 1]. We're estimating utilities for humans, not arbitrary hypothetical creatures, so there's an approximate universal minimum utility (torturing you and everyone you care about to death) and an approximate maximum utility (you get everything you want). We're estimating everybody's utility with one function, so an estimate that says that I don't like to be tortured will be simpler than one that doesn't even if I have never been tortured, because other people have attempted to avoid torture. Does that proposal make sense? (I'm concerned that I may have been too brief.) Does anything obvious break if you average these across humans?
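A toy, finite version of this proposal might look like the sketch below. Everything in it is an assumption made for illustration: the real proposal ranges over all computable functions and uses the universal prior, which is not computable, so here a hand-picked list of hypotheses with made-up complexity scores (in bits) and weights of 2^-bits stands in for it. Consistency with behavior is modelled as simple revealed preference.

```python
from typing import Callable, List, Tuple

Hypothesis = Callable[[str, str], float]   # (person, circumstance) -> utility in [0, 1]
Observation = Tuple[str, str, str]         # (person, picked, rejected)


def consistent(h: Hypothesis, observed: List[Observation]) -> bool:
    """A hypothesis fits if nobody is seen picking something it rates strictly lower."""
    return all(h(p, picked) >= h(p, rejected) for p, picked, rejected in observed)


def estimate(hypotheses: List[Tuple[Hypothesis, int]], observed: List[Observation],
             person: str, circumstance: str) -> float:
    """Prior-weighted average over the hypotheses that fit the observations.

    The weight 2**-bits is a toy stand-in for the universal prior.
    """
    kept = [(h, 2.0 ** -bits) for h, bits in hypotheses if consistent(h, observed)]
    if not kept:
        raise ValueError("no hypothesis is consistent with the observations")
    total = sum(w for _, w in kept)
    return sum(w * h(person, circumstance) for h, w in kept) / total


def nobody_likes_torture(person: str, c: str) -> float:
    return 0.0 if c == "tortured" else 1.0


def everybody_likes_torture(person: str, c: str) -> float:
    return 1.0 if c == "tortured" else 0.0


def only_me_likes_torture(person: str, c: str) -> float:
    # Special-cases one person, so it is given a higher complexity score below.
    if person == "me":
        return everybody_likes_torture(person, c)
    return nobody_likes_torture(person, c)


hypotheses = [(nobody_likes_torture, 3), (everybody_likes_torture, 3),
              (only_me_likes_torture, 6)]
observed = [("bob", "unharmed", "tortured")]  # Bob was seen avoiding torture

print(estimate(hypotheses, observed, "me", "tortured"))
```

In this toy setup the estimate for "me" being tortured comes out low even though "me" was never observed, because the simpler hypothesis that fits Bob's behavior dominates the weighted average. That matches the claim that other people's attempts to avoid torture inform the estimate for everyone.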

5Vladimir_M14y

As far as I see, your proposal is well-defined and consistent. However, even if we ignore all the intractable problems with translating it into any practical answers about concrete problems (of which I'm sure you're aware), this is still only one possible way to aggregate and compare utilities interpersonally, with no clear reason why you would use it instead of some other one that would favor and disfavor different groups and individuals.

1TimFreeman14y

Analysis paralysis is one path to defeat. I agree with you that my proposed scheme is computationally intractable, and that it has other issues too. IMO the other issues can be fixed and I hope to get feedback on a completed version at some point. Assuming the fixes are good, we'd then have an unimplementable specification of a way to fairly balance the interests of different people, and a next step would be to look for some implementable approximation to it. That would be an improvement over not having a specification, right? The implied principle here seems to be that if we can't find a unique way to balance the interests of different people, we shouldn't do it at all. I believe there are multiple plausible schemes, so we will be paralyzed as long as we refuse to pick one and continue. There is precedent for this -- many cultural norms are arbitrary, for example. I wish I actually had multiple plausible schemes to consider. I can think of some with obvious bugs, but it doesn't seem worthwhile to list them here. I could also make a trivial change by proposing unfair weights (maybe my utility gets a weight of 1.1 in the average and everyone else gets a weight of 1, for example). If anybody can propose an interestingly different alternative, I'd love to hear it. Also, if I incorrectly extracted the principle behind the parent post, I'd like to be corrected.

2Matt_Simpson14y

"Average" and "total" utilitarian are just two different ways of specifying what "aggregate" means though. To my knowledge, none of the standard utilitarian positions (outside of lesswrong) say "maximize your own values." (I'm willing to be corrected here.) To LWers, it's not confusing, but to most outsiders, they'll probably come away with a different message than you intended.

[-]Richard_Kennaway14y30

To program a computer to tell right from wrong, first you must yourself know what is right and what is wrong. The authors obtained this knowledge, in a limited domain, from surveys of people's responses to trolley problems, then implemented in Prolog a general principle suggested by those surveys.

One may argue with the validity of the surveys, the fitting of the general principle to the results of those surveys, or the correctness with which the principle was implemented -- because one can argue with anything -- but as a general way of going about this I don't see a problem with it.

Can you unpack your comment about "encoding human irrationality"?

3PhilGoetz14y

Saying it's encoding human irrationality is taking the viewpoint that the human reaction to the fat-man trolley problem is an error of reasoning, where the particular machinery humans use to decide what to do gives an answer that does not maximize human values. It makes some sense to say that a human is a holistic entity that can't be divided into "values" and "algorithms". I argued that point in "Only humans can have human values". But taking that view, together with the view that you should cling to human values, means you can't be a transhumanist. You can't talk about improving humans, because implementing human values comes down to being human. Any "improvement" to human reasoning means giving different answers, which means getting "wrong" answers. And you can't have a site like LessWrong, that talks about how to avoid errors that humans systematically make - because, like in the trolley problem case, you must claim they aren't errors, they're value judgements.

2Richard_Kennaway14y

You can still have a LessWrong, because one can clearly demonstrate that people avoidably draw wrong conclusions from unreliable screening tests, commit conjunction fallacies, and so on. There are agreed ways of getting at the truth on these things and people are capable of understanding the errors that they are making, and avoiding making those errors. Values are a harder problem. Our only source of moral knowledge (assuming there is such a thing, but those who believe there is not must dismiss this entire conversation as moonshine) is what people generally do and say. If contradictions are found, where does one go for evidence to resolve them?

1PhilGoetz14y

You're right - there is a class of problems for which we can know what the right answer is, like the Monty Hall problem. (Although I notice that the Sleeping Beauty problem is a math problem on which we were not able to agree on what the right answer was, because people had linguistic disagreements on how to interpret the meaning of the problem.)

0DSimon14y

Even when holding a view that human values can't be improved, rationality techniques are still useful, because human values conflict with each other and have to be prioritized or weighted. If I value knowing the truth, and I also in the holistic sense "value" making the conjunction fallacy, then LessWrong is still helpful to me provided I value the first more than the second, or if the weighting is such that the net value score is increased even though the individual conjunction fallacy value is decreased.

[-]thomblake14y20

I've been more or less grappling with this problem lately with respect to my dissertation. If someone asks you to make sure a robot is ethical, what do they mean? It seems like most would want something like the machine described above, that manages to somehow say "ew" to the same stimuli a human would.

And then, if you instead actually make an ethical machine, you haven't solved the problem as specified.

[-][anonymous]14y20

there is no point in trying to design artificial intelligences than encode human "values".

I think you mean to say "that encode human 'values'"...?

0PhilGoetz14y

Yup. Thanks.

[-][anonymous]14y-10

The problem of whether or not to push the person onto the tracks resembles the following problem.

Imagine that each of five patients in a hospital will die without an organ transplant. The patient in Room 1 needs a heart, the patient in Room 2 needs a liver, the patient in Room 3 needs a kidney, and so on. The person in Room 6 is in the hospital for routine tests. Luckily (for them, not for him!), his tissue is compatible with the other five patients, and a specialist is available to transplant his organs into the other five. This operation would save the

...

5wobster10914y

Another similar problem that I've encountered runs thus: suppose we're in a scenario where it's one person's life against a million, or a billion, or all the people in the world. Suppose aliens are invading and will leave Earth be if we were to kill an arbitrarily-determined innocent bystander. Otherwise, they will choose an arbitrary person, take him to safety, and destroy Earth, along with everyone else. In that case, consensus seems to be that the lives of everyone on Earth far outweigh a healthy innocent's rights. The largest difference between the two cases is numbers: five people becomes six billion. If there is another difference, I have yet to find it. But if it is simply a difference in numbers, then whatever justification people use to choose the healthy man over five patients ought to apply here as well.

3Normal_Anomaly14y

Within the thought experiment, the difference is simply numbers and people are giving the wrong answer, as long as you specify that this would increase the total number of years lived (many organ recipients are old and will die soon anyway). Outside the experiment in the realm of public policy, it is wrong to kill the "donor" in this one case because of the precedent it would set: people would be afraid to go to the hospital for fear of being killed for their organs. And if this was implemented by law, there would be civil unrest that would more than undo the good done.

0[anonymous]14y

It sounds like you're saying that the thought experiment is unfixably wrong, since it can't be made to match up with reality "outside the experiment". If that's the case, then I question whether people are "giving the wrong answer". Morals are useful precisely for those cases where we often do not have enough facts to make a correct decision based only on what we know about a situation. For most people most of the time, doing the moral thing will pay off, and not doing the moral thing will ultimately not, even though it will quite often appear to for a short while after.

2DSimon14y

At a practical level, there's another significant difference between the two cases: confidence in the probabilities. As has been pointed out above, the thought experiment with the donors has a lot of utilitarian implications that are farther out than just the lives of the five people in the doctor's room. Changing the behavior of doctors will change the behavior of others, since they will anticipate different things happening when they interact with doctors. On the other hand, we haven't got much basis for predicting how choosing one of the two scenarios will influence the aliens, or even thinking that they'll come back.

4Armok_GoB14y

I never got this example; it's obvious to me that you should do the operation, and that the only reason not to is the dumbness of red tape and lynch mobs being extremely irrational.

4DSimon14y

If you learned that doctors actually regularly did this sort of thing, would that change the probability that you'd go and get a somewhat important but non-critical operation (e.g. wisdom teeth removal)?

1prase14y

Since the risk of that happening to me would be quite low (at least two times lower than the risk of needing a transplant myself, and probably much lower even than that) it wouldn't be rational to alter the behaviour, but I would certainly feel nervous in the hospital.

0Armok_GoB14y

You know what this made me think of? Those people that say that if medical care was free there'd be no incentive not to go to the doctor for trivial things... >:D

4SilasBarta14y

What Alicorn said -- most people aren't going to overuse free medical care, at least not through that vector. And any deliberate, artificial "visit inflation" is going to be from doctors who order unnecessary visits in order to score extra fees, not from patients, who would generally not prefer to have to schedule around new appointments and spend a long time in the waiting room. This kind of overuse does happen, of course, but it's due to the tiny set of people ("hypochondriacs") who do go to the doctor for every little thing, and raise costs for anyone pooled with them (via taxes or health insurance). Or from people who use the ER as their checkup and force others who are in severe pain but not "visibly dying" to suffer longer. (I know you were joking but it needed to be said anyway.)

8[anonymous]14y

The basics of economics are not suspended for medicine. "Overuse" is a judgment and therefore not useful for a dispassionate discussion, but people do commonly buy more (notice I say "more", which is not a judgment) of something when it costs less. I do this all the time when shopping for food. If there is a sale on a good brand of sardines, I might buy twenty cans at once (and I'll go through them pretty fast too). Now of course, if something is sufficiently inexpensive then a long queue will likely form, and once the queue is formed then that will limit consumption of the service or good, but does not bring the consumption back down to the original level. Consumption levels off but probably at a higher level. But key point: don't use judgment terms like overuse unless you want to kill your mind. Did I buy "too much" sardines? Well, maybe I completely cleared out the shelf, or maybe I took half and some other person took the other half, leaving no sardines for anyone else. Who is to judge that I took "too much"? But we can describe what happened without passing judgment: at the sale price, the sardines were quickly cleared out, leaving no sardines for any further customers. That's a shortage. Lowering the price below the market level may create a shortage - which is not a judgment, it is a description of what happens. Alternatively, lowering the price may create a queue. Offloading the price to a third party may, rather than creating a shortage or queue, lead to increased use and thus an increase in price. And so on. Immediately jumping in with judgment words like overuse triggers the emotions and makes thought difficult.

1SilasBarta14y

You're right, I did use the wrong term there, and in a way that encourages sloppy thinking. I was just trying to dispel the vision some people have of the population being ultra price-sensitive to doctor visit payments, and therefore just inches away from overloading the system this way if it became free. I think that is a common but unrealistic model of the dynamics of health care decisions, in particular because it ignores the non-monetary costs of doctor visits. And since you bring up the topic, the health care sector is many, many degrees removed from market-based identification of efficient production/consumption levels, in some ways intractably (because of public unwillingness to let people go without certain kinds of care on the basis of not affording it, for example).

4Alicorn14y

I find going to the doctor massively inconvenient in terms of scheduling, having to interact with people to make an appointment, trying to convey the six hundred things that are wrong with me to a doctor who only wanted to spend ten minutes with me, etc... if I didn't have to cough up a small co-pay after every visit this would affect my finances but I doubt it would make me actually go to the doctor more often.

2Armok_GoB14y

Yeah, I was just telling a joke. I personally live in a place where it's free for most things, and we don't have a problem with that. But let's not get into mindkiller territory.

0Alicorn14y

Assuming I believe that doctors do this by themselves rather than hiring goons to help them, I'd go, but bring a friend or relative - ideally one of the doctors I'm related to so he could better notice anything dodgy going on, and take over without killing me in a pinch if he had to tackle the doctor who was about to harvest my organs.

7DSimon14y

On the doctor's home territory, though, that would be tricky; they might easily have a half-dozen nurses standing by with tranquilizers ready to knock out any potential interferers. This would eventually lead to medical force escalation, and then medical feudalism. You go to your "own" hospital knowing for sure that the doctors there will not take out your organs without permission, since you're part of their tribe. However, that hospital has to put in place a strong defensive perimeter to stop any task forces from the hospital tribe down the street from breaking in and stealing organs for themselves. And of course, if you turn out to need an organ, then your hospital would deploy its own tribal ninja doctors to sneak into the enemy hospital and retrieve whatever is needed...

0NancyLebovitz14y

Definitely a good enough premise for satirical science fiction.

1DSimon14y

Or as an in-universe explanation for the heavily armed doctors in Team Fortress.

[-]Kai-o-logos14y-20

Dust specks – I completely disagree with Eliezer’s argument here. The hole in Yudkowsky’s logic, I believe, is not only the curved utility function, but also the main fact that discomfort cannot be added like numbers. The dust speck incident is momentary. You barely notice it, you blink, it's gone, and you forget about it for the rest of your life. Torture, on the other hand, leaves lasting emotional damage on the human psyche. Furthermore, discomfort is different than pain. If, for example the hypothetical replaced the torture with 10000 people getting a no...

4loqi14y

When you say that pain is "fundamentally different" than discomfort, do you mean to imply that it's a strictly more important consideration? If so, your theory is similar to Asimov's One Law of Robotics, and you should stop wasting your time thinking about "discomfort", since it's infinitely less important than pain. Stratified utility functions don't work.
