The way we can resolve the question of values vs. errors is by endorsing symmetries.
For example, Rawls's "veil of ignorance" enjoins us to design a society on the assumption that we might be anyone in that society - we might have any degree of talent or disability, any tastes or preferences, and so on. This is permutation symmetry.
If we have two situations that we believe are exactly analogous (for example, the trolley car problem and a similar problem with a subway car), then we call any systematic difference in our human intuitions an error, and we choose one of the two intuitions to endorse as applying to both cases. (I don't know that people systematically differ in their survey responses to questions about trolley car problems vs. subway car problems, but I wouldn't be surprised.)
In forming a notion of values and errors, we are choosing a priority order among various symmetries and various human intuitions. Utilitarians prioritize the analogy between flipping the switch and pushing the fat man over the intuition that we should not push the fat man.
That's a good idea. I wonder if anyone has done a trolley-problem survey, phrasing it in terms of "Would you rather live in a society where people would do X or Y?"
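As an aside, here is a minimal sketch of what the permutation-symmetry / veil-of-ignorance evaluation mentioned above could look like computationally. The payoff numbers and the equal weighting of positions are my own illustrative assumptions, not anything from the comment:

```python
def veil_of_ignorance_scores(society):
    """Evaluate a society without knowing which position you will occupy.

    Permutation symmetry is encoded by weighting every position equally;
    which statistic you then take (mean, min, ...) is a further choice.
    """
    payoffs = list(society.values())
    return {
        "expected utility": sum(payoffs) / len(payoffs),  # utilitarian reading
        "worst case": min(payoffs),                       # Rawlsian maximin reading
    }

# Hypothetical per-position payoffs, purely for illustration.
feudal = {"lord": 100, "merchant": 30, "serf": 5}
egalitarian = {"lord": 60, "merchant": 50, "serf": 40}

print(veil_of_ignorance_scores(feudal))       # {'expected utility': 45.0, 'worst case': 5}
print(veil_of_ignorance_scores(egalitarian))  # {'expected utility': 50.0, 'worst case': 40}
```

Note that the symmetry only fixes the equal weighting of positions; whether you then take the mean (utilitarian) or the minimum (Rawls's own maximin) is exactly the kind of priority choice among intuitions described above.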
Cast in consequentialist terms, the reason we shouldn't push the fat man in the second trolley problem is that we are fallible, and when we believe committing an unethical act will serve the greater good, we are probably wrong.
Thought experiments aside, supposing that scenario came up in real life, and I tried actually pushing the fat man, what would happen? Answer: either I'd end up in a tussle with an angry fat man demanding to know why I just tried to kill him, while whatever chance I might have had of shouting a warning to the people in the path of the trolley was lost, or I'd succeed a second too late and then I'd have committed murder for nothing. And when the media got hold of the story and spread it far and wide - which they probably would, it's exactly the kind of ghoulish crap they love - it might help spread the idea that in a disaster, you can't afford to devote all your attention to helping your neighbors, because you need to spare some of it for watching out for somebody trying to kill you for the greater good. That could easily cost more than five lives.
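To put the fallibility point in numbers, a rough sketch (the probabilities are assumptions for illustration, and the calculation ignores the trust-eroding media effect described above): the sacrifice is certain, but the rescue only happens if your "this serves the greater good" judgment is actually correct.

```python
def expected_net_lives(p_correct, saved=5, sacrificed=1):
    """Expected net lives from sacrificing one person to (maybe) save several.

    The sacrifice is certain; the rescue only happens if the belief that it
    serves the greater good is correct (probability p_correct).
    """
    return p_correct * saved - sacrificed

for p in (0.1, 0.2, 0.5, 0.9):
    print(p, expected_net_lives(p))
# Break-even at p_correct = 0.2; below that, the "heroic" push costs lives
# in expectation, before counting any wider loss of trust.
```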
If some future generation ever builds a machine whose domain and capabilities are such that it is called on to make...
I'm a bit skeptical of using majority survey response to determine "morality." After all, given a Bayesian probability problem (the exact problem was about patients and a cancer test with a chance of returning a false positive), most people will give the wrong answer, but we certainly don't want our computers to make this kind of error.
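For reference, a worked version of that kind of base-rate problem (the 1% prevalence, 80% sensitivity, and 9.6% false-positive rate below are the standard textbook illustration numbers, not necessarily the exact ones from any particular survey):

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test) by Bayes' theorem."""
    p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
    return sensitivity * prevalence / p_positive

# Classic numbers from base-rate-neglect studies: 1% prevalence, 80% sensitivity,
# 9.6% false-positive rate.
print(posterior_given_positive(0.01, 0.80, 0.096))  # ~0.078, not the ~0.8 many people answer
```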
As to the torture vs. dust specks, when I thought about it, I decided first that torture was unacceptable, and then tried to modify my utility function to round to zero, etc. I was rather appalled at myself to find that I had decided the answer in advance and then tried to make my utility function fit a predetermined answer. It felt an awful lot like rationalizing. I don't know if everyone else is doing the same thing, but if you are, I urge you to reconsider. If we always go with what feels right, what's the point of using utility functions at all?
There's another version of the trolley problem that's even squickier than the "push a man onto the track" version...
“A brilliant transplant surgeon has five patients, each in need of a different organ, each of whom will die without that organ. Unfortunately, there are no organs available to perform any of these five transplant operations. A healthy young traveler, just passing through the city the doctor works in, comes in for a routine checkup. In the course of doing the checkup, the doctor discovers that his organs are compatible with all five of his dying patients. Suppose further that if the young man were to disappear, no one would suspect the doctor.”
-- Judith Jarvis Thomson, The Trolley Problem, 94 Yale Law Journal 1395-1415 (1985)
For some reason, it's a lot less comfortable to endorse murdering the patient than it is to endorse pushing the fat man onto the track...
That one was raised by a visiting philosopher at my college as an argument (from intuition) against utilitarianism. I pointed out that if we tended to kill patients to harvest them to save more patients, people would be so fearful of being harvested that they would tend not to visit hospitals at all, leading to a greater loss of health and life. So in this case, in any realistic formulation, the less comfortable option is also the one that leads to less utility.
I suspect that this version feels even less comfortable than the trolley dilemma because it includes the violation of an implicit social contract, that if you go into a hospital, they'll try to make you healthier, not kill you. But while violating implicit social contracts tends to be a bad idea, that's certainly not to say that there's any guarantee that the utilitarian thing to do in some situations won't be massively uncomfortable.
If the likelihood of me needing a life-saving organ transplant at some point in my life is the same as for most other people, then I think I'd bite the bullet and agree to a system in which random healthy people are killed for their organs. Why? Because I'd have 5x the chance of being saved as of being killed.
Because I'd have 5x the chance of being saved as of being killed.
Except, of course, for the chance of being slain in the inevitable civil war that ensues. ;)
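The arithmetic behind the "5x" claim, made explicit as a toy calculation (it deliberately ignores the trust-collapse and civil-war effects raised in the replies):

```python
# Under the harvesting policy, each incident kills 1 healthy visitor and saves
# 5 patients. Behind a veil of ignorance, assume you are equally likely to be
# any of the 6 people involved.
saved_per_incident, killed_per_incident = 5, 1
people_involved = saved_per_incident + killed_per_incident

p_saved = saved_per_incident / people_involved    # 5/6
p_killed = killed_per_incident / people_involved  # 1/6

print(saved_per_incident / killed_per_incident)  # 5.0: five times as likely to be saved as killed
print(p_saved * (+1) + p_killed * (-1))          # about 0.67 expected lives gained per incident
```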
You want a hard rule that medical personnel won't do injury to the people in their hands.
Specifically: You don't want future people to avoid seeking medical treatment — or to burn doctors at the stake — out of legitimate fear of being taken apart for their organs. Even if you tell the victims that it's in the greater good for doctors to do that once in a while, the victims' goals aren't served by being sacrificed for the good of five strangers. The victims' goals are much better served by going without a medical checkup, or possibly leading a mob to kill all the doctors.
There is a consequentialist reason to treat human beings as ends rather than means: If a human figures out that you intend to treat him or her as a means, this elicits a whole swath of evolved-in responses that will interfere with your intentions. These range from negotiation ("If you want to treat me as a means, I get to treat you as a means too"), to resistance ("I will stop you from doing that to me"), to outright violence ("I will stop you, so you don't do that to anyone").
Most people choose the many dust specks over the torture. Some people argued that "human values" includes having a utility aggregation function that rounds tiny (absolute value) utilities to zero, thus giving the "dust specks" answer. No, Eliezer said; this was an error in human reasoning. Is it an error, or a value?
I'm not sure. I think the answer most people give on this has more to do with fairness than rounding to zero. Yeah, it's annoying for me to get a dust speck in my eye, but it's unfair that someone should be tortured fo...
I don't want humans to make decisions where they kill one person to save another. The trolley problem feels bad to us because, in that situation, it's usually never that clear. Omega is never leaning over your shoulder, explaining to you that killing the fat man will save those people - you just have to make a guess, and human guesses can be wrong. What I suspect humans are doing is a hidden probability calculation that says "well, there's probably a chance of x that I'll save those people, which isn't high enough to chance it". There's an argument to...
Humans make life-and-death decisions for other humans every day. The President decides to bomb Libya or enter Darfur to prevent a genocide. The FDA decides to approve or ban a drug. The EPA decides how to weigh deaths from carcinogens produced by industry, vs. jobs. The DOT decides how to weigh travel time vs. travel deaths.
It seems difficult to have this conversation once you've concluded only utilitarian ethics are valid, because the problem is whether or not utilitarian ethics are valid. (I'm excluding from consideration utilitarian ethics that are developed to the point where they are functionally deontological.)
Whether or not you are trying to maximize social status or some sort of abstract quality seems to be the issue, and I'm not sure it's possible to have an honest conversation about that, since one tends to improve social status by claiming to care (and/or actually caring) about an abstract quality.
The principle of double effect is interesting:
Harming another individual is permissible if it is the foreseen consequence of an act that will lead to a greater good; in contrast, it is impermissible to harm someone else as an intended means to a greater good.
The distinction to me looks something like the difference between
"Take action -> one dies, five live" and "Kill one -> five live"
Where the salient difference is whether the act is morally permissible on its own. So a morally neutral act like flipping a switch allows the pe...
Maybe there's some value in creating an algorithm which accurately models most people's moral decisions... it could be used as the basis for a "sane" utility function by subsequently working out which parts of the algorithm are "utility" and which are "biases".
(EDIT: such a project would also help us understand human biases more clearly.)
Incidentally, I hope this "double effect" idea is based around more than just this trolley thought experiment. I could get the same result they did with the much simpler heuristic "don't use dead bodies as tools".
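As a rough illustration, here is a minimal sketch in Python (not the paper's Prolog, and with made-up action encodings of my own) of how a double-effect-style rule could be checked mechanically: harm is tolerated as a foreseen side effect of a net-positive act, but not as the intended means to it.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    net_lives_saved: int     # foreseen consequences, summed
    harm_is_the_means: bool  # is harming someone how the good is produced?

def permissible_by_double_effect(action: Action) -> bool:
    """Permit acts with a foreseen net good, unless the harm itself is the
    intended means of producing that good."""
    return action.net_lives_saved > 0 and not action.harm_is_the_means

# Toy encodings of the two trolley cases; the flags are my own simplification.
flip_switch  = Action("divert trolley onto side track", net_lives_saved=4, harm_is_the_means=False)
push_fat_man = Action("push man into trolley's path",   net_lives_saved=4, harm_is_the_means=True)

print(permissible_by_double_effect(flip_switch))   # True
print(permissible_by_double_effect(push_fat_man))  # False
```

On these two toy cases the result is indistinguishable from the simpler "don't use dead bodies as tools" heuristic, since that is essentially what the harm_is_the_means flag encodes.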
Eh, if people want to copy typical human morality (I'll call it "folk" morality), that probably won't end too badly, and it seems like good practice for modeling other complicated human thought patterns.
Whether or not it's the right morality to try to get machines to use gets pretty meta-ethical. However, if the audience is moved by consistency, you might use a trolley-problem analogy to claim that building a computer is itself analogous to throwing a switch, so by folk morality you should be more consequentialist about it; making a computer that handles the trolley problem using folk morality is then wrong if folk morality is right, and also wrong if folk morality is wrong.
Interesting read!
I think most people are fundamentally following 'knee-jerk-morality', with the various (meta)ethical systems as rationalizations. This is evidenced by the fact that answers to the trolley problem differ when factors that are morally neutral (within the ethical system) are changed -- for example, whether something happens through action or inaction.
The paper shows that some of the rules of a rationalization of knee-jerk-morality can be encoded in a Prolog program. But if the problem changes a bit (say, the involuntary-organ-transplant-case), you'll ne...
As a side note, using the word "utilitarian" is potentially confusing. The standard definition of a utilitarian is someone who thinks we should maximize the aggregate utility of all humans/morally relevant agents, and it comes with a whole host of problems. I'm pretty sure all you mean by "utilitarian" is that our values, whatever they are, should be/are encoded into a utility function.
To program a computer to tell right from wrong, first you must yourself know what is right and what is wrong. The authors obtained this knowledge, in a limited domain, from surveys of people's responses to trolley problems, then implemented in Prolog a general principle suggested by those surveys.
One may argue with the validity of the surveys, the fitting of the general principle to the results of those surveys, or the correctness with which the principle was implemented -- because one can argue with anything -- but as a general way of going about this I don't see a problem with it.
Can you unpack your comment about "encoding human irrationality "?
I've been more or less grappling with this problem lately with respect to my dissertation. If someone asks you to make sure a robot is ethical, what do they mean? It seems like most would want something like the machine described above, that manages to somehow say "ew" to the same stimuli a human would.
And then, if you instead actually make an ethical machine, you haven't solved the problem as specified.
there is no point in trying to design artificial intelligences than encode human "values".
I think you mean to say "that encode human 'values'"...?
The problem of whether or not to push the person onto the tracks resembles the following problem.
...Imagine that each of five patients in a hospital will die without an organ transplant. The patient in Room 1 needs a heart, the patient in Room 2 needs a liver, the patient in Room 3 needs a kidney, and so on. The person in Room 6 is in the hospital for routine tests. Luckily (for them, not for him!), his tissue is compatible with the other five patients, and a specialist is available to transplant his organs into the other five. This operation would save the
Dust specks – I completely disagree with Eliezer’s argument here. The hole in Yudkowsky’s logic, I believe, is not only the curved utility function, but also the fact that discomfort cannot be added like numbers. The dust speck incident is momentary. You barely notice it, you blink, it's gone, and you forget about it for the rest of your life. Torture, on the other hand, leaves lasting emotional damage on the human psyche. Furthermore, discomfort is different from pain. If, for example, the hypothetical replaced the torture with 10000 people getting a no...
The trolley problem
In 2009, a pair of computer scientists published a paper enabling computers to behave like humans on the trolley problem (PDF here). They developed a logic that a computer could use to justify not pushing one person onto the tracks in order to save five other people. They described this feat as showing "how moral decisions can be drawn computationally by using prospective logic programs."
I would describe it as devoting a lot of time and effort to cripple a reasoning system by encoding human irrationality into its logic.
Which view is correct?
Dust specks
Eliezer argued that we should prefer 1 person being tortured for 50 years over 3^^^3 people each once getting a barely-noticeable dust speck in their eyes. Most people choose the many dust specks over the torture. Some people argued that "human values" includes having a utility aggregation function that rounds tiny (absolute value) utilities to zero, thus giving the "dust specks" answer. No, Eliezer said; this was an error in human reasoning. Is it an error, or a value?
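A minimal sketch of the two aggregation rules at issue, with toy magnitudes of my own choosing (the threshold and the utilities are arbitrary; only the mechanism matters):

```python
def linear_aggregate(outcomes):
    """Additive aggregation: every disutility counts, however small.
    `outcomes` is a list of (per-person utility, number of people) pairs."""
    return sum(u * n for u, n in outcomes)

def rounding_aggregate(outcomes, threshold=1e-6):
    """Aggregation that rounds individually negligible utilities to zero,
    as some commenters proposed "human values" actually do."""
    return sum(u * n for u, n in outcomes if abs(u) >= threshold)

# Toy magnitudes; 3^^^3 is unimaginably larger than 10**21, which only
# strengthens the linear-aggregation result.
torture = [(-1e9, 1)]        # one person tortured for 50 years
specks  = [(-1e-9, 10**21)]  # a barely noticeable dust speck each

print(linear_aggregate(specks) < linear_aggregate(torture))
# True: under linear aggregation the specks are worse in total, so choose the torture.
print(rounding_aggregate(specks) < rounding_aggregate(torture))
# False: the rounding rule makes the specks vanish, so choose the dust specks.
```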
Sex vs. punishment
In Crime and punishment, I argued that people want to punish criminals, even if there is a painless, less-costly way to prevent crime. This means that people value punishing criminals. This value may have evolved to accomplish the social goal of reducing crime. Most readers agreed that, since we can deduce this underlying reason, and accomplish it more effectively through reasoning, preferring to punish criminals is an error in judgement.
Most people want to have sex. This value evolved to accomplish the goal of reproducing. Since we can deduce this underlying reason, and accomplish it more efficiently than by going out to bars every evening for ten years, is this desire for sex an error in judgement that we should erase?
The problem for Friendly AI
Until you come up with a procedure for determining, in general, when something is a value and when it is an error, there is no point in trying to design artificial intelligences that encode human "values".
(P.S. - I think that necessary, but not sufficient, preconditions for developing such a procedure, are to agree that only utilitarian ethics are valid, and to agree on an aggregation function.)