Followup to: Ethical Injunctions

During World War II, Knut Haukelid and three other saboteurs sank a civilian Norwegian ferry ship, the SF Hydro, carrying a shipment of deuterium for use as a neutron moderator in Germany's atomic weapons program.  Eighteen dead, twenty-nine survivors.  And that was the end of the Nazi nuclear program.  Can you imagine a Hollywood movie in which the hero did that, instead of coming up with some amazing clever way to save the civilians on the ship?

Stephen Dubner and Steven Levitt published the work of an anonymous economist turned bagelseller, Paul F., who dropped off baskets of bagels and came back to collect money from a cashbox, and also collected statistics on payment rates.  The current average payment rate is 89%.  Paul F. found that people on the executive floor of a company steal more bagels; that people with security clearances don't steal any fewer bagels; that telecom companies have robbed him and that law firms aren't worth the trouble.

Hobbes (of Calvin and Hobbes) once said:  "I don't know what's worse, the fact that everyone's got a price, or the fact that their price is so low."

If Knut Haukelid sold his soul, he held out for a damned high price—the end of the Nazi atomic weapons program.

Others value their integrity less than a bagel.

One suspects that Haukelid's price was far higher than most people would charge, if you told them to never sell out.  Maybe we should stop telling people they should never let themselves be bought, and focus on raising their price to something higher than a bagel?

But I really don't know if that's enough.

The German philosopher Fichte once said, "I would not break my word even to save humanity."

Raymond Smullyan, in whose book I read this quote, seemed to laugh and not take Fichte seriously.

Abraham Heschel said of Fichte, "His salvation and righteousness were apparently so much more important to him than the fate of all men that he would have destroyed mankind to save himself."

I don't think they get it.

If a serial killer comes to a confessional, and confesses that he's killed six people and plans to kill more, should the priest turn him in?  I would answer, "No."  If not for the seal of the confessional, the serial killer would never have come to the priest in the first place.  All else being equal, I would prefer the world in which the serial killer talks to the priest, and the priest gets a chance to try and talk the serial killer out of it.

I use the example of a priest, rather than a psychiatrist, because a psychiatrist might be tempted to break confidentiality "just this once", and the serial killer knows that.  But a Catholic priest who broke the seal of the confessional—for any reason—would face universal condemnation from his own church.  No Catholic would be tempted to say, "Well, it's all right because it was a serial killer."

I approve of this custom and its absoluteness, and I wish we had a rationalist equivalent.

The trick would be establishing something of equivalent strength to a Catholic priest who believes God doesn't want him to break the seal, rather than the lesser strength of a psychiatrist who outsources their tape transcriptions to Pakistan.  Otherwise serial killers will, quite sensibly, use the Catholic priests instead, and get less rational advice.

Suppose someone comes to a rationalist Confessor and says:  "You know, tomorrow I'm planning to wipe out the human species using this neat biotech concoction I cooked up in my lab."  What then?  Should you break the seal of the confessional to save humanity?

It appears obvious to me that the issues here are just those of the one-shot Prisoner's Dilemma, and I do not consider it obvious that you should defect on the one-shot PD if the other player cooperates in advance on the expectation that you will cooperate as well.

There are issues with trustworthiness and how the sinner can trust the rationalist's commitment.  It is not enough to be trustworthy; you must appear so.  But anything that mocks the appearance of trustworthiness, while being unbound from its substance, is a poor signal; the sinner can follow that logic as well.  Perhaps once neuroimaging is a bit more advanced, we could have the rationalist swear under a truthtelling machine that they would not break the seal of the confessional even to save humanity.

There's a proverb I failed to Google, which runs something like, "Once someone is known to be a liar, you might as well listen to the whistling of the wind."  You wouldn't want others to expect you to lie, if you have something important to say to them; and this issue cannot be wholly decoupled from the issue of whether you actually tell the truth.  If you'll lie when the fate of the world is at stake, and others can guess that fact about you, then, at the moment when the fate of the world is at stake, that's the moment when your words become the whistling of the wind.

I don't know if Fichte meant it that way, but his statement makes perfect sense as an ethical thesis to me.  It's not that one person's personal integrity is worth more, as terminal valuta, than the entire world.  Rather, losing all your ethics is not a pure advantage.

Being believed to tell the truth has advantages, and I don't think it's so easy to decouple that from telling the truth.  Being believed to keep your word has advantages; and if you're the sort of person who would in fact break your word to save humanity, the other may guess that too.  Even intrapersonal ethics can help protect you from black swans and fundamental mistakes.  That logic doesn't change its structure when you double the value of the stakes, or even raise them to the level of a world.  Losing your ethics is not like shrugging off some chains that were cool to look at, but were weighing you down in an athletic contest.

This I knew from the beginning:  That if I had no ethics I would hold to even with the world at stake, I had no ethics at all.  And I could guess how that would turn out.

 

Part of the sequence Ethical Injunctions

Next post: "Ethics Notes"

Previous post: "Ethical Injunctions"


Hrm. I'd think "avoid destroying the world" itself to be an ethical injunction too. (modulo all relevant caveats like all minds on earth uploading and deciding collectively to use the rest of the matter composing the planet for some other purpose, blah blah blah, you know what I mean)

"Even intrapersonal ethics can help protect you from black swans and fundamental mistakes. That logic doesn't change its structure when you double the value of the stakes, or even raise them to the level of a world."

I'm not so sure. The kind of deontological ethics that you are talking about works well in human social interactions. When removed from that context, why do you think that it will still work?

For example, Knut Haukelid broke a deontological rule in order to make a significant dent in the subjective probability (given his knowledge) of the Nazis winning WWII. I think that he did the right thing, and I think that in such an extreme case one ought to act according to the greater good.

The problem is working out when one is in a sufficiently extreme case. For the readers of Overcoming Bias, and those interested in the singularity, this is a tough question. Clever men, such as yourself, tell us that the fate of the entire human race rests upon solving the FAI problem. Does this count as extreme? Does it count as extreme enough to justify damaging one's personal life, one's friends or family?

My answer to such questions of "greater good" versus "duty" used to be to favor the former, but my experiences in life have shown me that it is better to try to avoid such choices. Looking back on the times when I have stuck by my friends or my duties to my disadvantage, and the times where I have betrayed people or lied, (Yes, I have done both several times), I realize that in every single case there was a third option available if I had just thought about the problem clearly enough.

If somebody was planning to destroy the world, the rationalist could stop him and not break his oath of honesty by simply killing the psychopath. Then if the rationalist were caught and arrested and still didn't reveal why he had committed murder, perhaps even being condemned to death for the act but never breaking his oath of honesty, now that would make an awesome movie.

It might make an awesome movie, but if it were expected behaviour, it would defeat the point of the injunction. In fact if rationalists were expected to look for workarounds of any kind it would defeat the point of the injunction. So the injunction would have to be, not merely to be silent, but not to attempt to use the knowledge divulged to thwart the one making the confession in any way except by non-coercive persuasion.

Or alternatively, not to ever act in a way such that if the person making the confession had expected it they would have avoided making the confession.

Not that a rationalist Confessor should do such a thing, but I wonder if a Catholic priest is theologically allowed to kill sinners so long as they never say why. That would be an awesome loophole, and just the sort of thing to drive more traffic to the rationalists.

I suspect, though, that this is more of a Jewish thought than a Catholic thought. Any professional Catholics feel free to chime in.

I wonder if a Catholic priest is theologically allowed to kill sinners so long as they never say why

I don't think they are, any more than they are allowed to kill anyone else.

I don't know the Catholic church's current take on this, but the Bible does require the death penalty for a large number of crimes, and Jesus agreed with that penalty. If there was no state-sponsored death penalty, and nobody else was willing, my religious knowledge fails me on whether an individual or a Catholic priest would be forbidden, allowed, or required to perform the execution by this, and I'm unsure if or how that's affected by the context of a confessional.

http://www.vatican.va/archive/ccc_css/archive/catechism/p3s2c2a5.htm

2267 Assuming that the guilty party's identity and responsibility have been fully determined, the traditional teaching of the Church does not exclude recourse to the death penalty, if this is the only possible way of effectively defending human lives against the unjust aggressor.

If, however, non-lethal means are sufficient to defend and protect people's safety from the aggressor, authority will limit itself to such means, as these are more in keeping with the concrete conditions of the common good and more in conformity to the dignity of the human person.

Today, in fact, as a consequence of the possibilities which the state has for effectively preventing crime, by rendering one who has committed an offense incapable of doing harm - without definitely taking away from him the possibility of redeeming himself - the cases in which the execution of the offender is an absolute necessity "are very rare, if not practically nonexistent."

If a serial killer comes to a confessional, and confesses that he's killed six people and plans to kill more, should the priest turn him in? I would answer, "No." If not for the seal of the confessional, the serial killer would never have come to the priest in the first place.

It's important to distinguish two ways this argument might work. The first is that the consequences of turning him in are bad, because future killers will be (or might be) less likely to seek advice from priests. That's a fairly straightforward utilitarian argument.

But the second is that turning him in is somehow bad, regardless of the consequences, because the world in which every "confessor" did as you do is a self-defeating, impossible world. This is more of a Kantian line of thought.

Eliezer, can you be explicit which argument you're making? I thought you were a utilitarian, but you've been sounding a bit Kantian lately. :)

Your deontological ethics are tiresome. Why not just be a utilitarian and lie your way to a better tomorrow?

Put more seriously, I would think that being believed to put the welfare of humanity ahead of concerns about personal integrity could have significant advantages itself.

Or put another way, when it's time to shut up and do the impossible (save humanity, say), that doesn't seem like a good time to defer to pre-established theories, of ethics or anything else. Refer, yes; defer, no. You say to beware of cleverness, be wary of thinking you're smarter than your ethics (meaning deontological beliefs and intuitions). That discussion sounded like a Hofstadter's Law ("It always takes longer than you expect, even when you take Hofstadter's Law into account.") for ethics. Yet, when the chips are down, when you've debugged your hardware as best you can, isn't our cleverness what we have to trust? What else could there be? After all, as you yourself said, rationality is ultimately about winning, and so however much you hedge against black swans and corrupt hardware, it can't be an infinite amount, and there must come a point where you should stop and do what your brain computes is the right thing to do.

If my ethics don't tell me to save the world, I have no ethics at all.

There seems to be a conflict here between not lying to yourself, and holding a traditional rule that suggests you ignore your rationality.

This is A way to deal with running on untrusted hardware, but I am far from convinced it is optimal.

Crossman: there's a third argument, which is that even if the consequences of keeping the secret are overall worse than those of betraying the confidence even after the effect you discuss, turning yourself into someone who will never betray these secrets no matter what the consequences and advertising yourself as such in an impossible-to-fake way may overall have good consequences. In other words, you might turn away from consequentialism on consequentialist grounds.

Another example where unfakeably advertising irrationality can (at least in theory) serve you is threats. My only way of stopping you from taking over the world is that I have the power to destroy the world and you. Now, if you take over the world, there's no possible advantage to destroying it, so I won't, so you can take the world over. But if I put a lunatic in charge of the button who believably will carry out the threat, you will be deterred; the same applies if I can become that lunatic.

However, overall I think that the arguments against turning yourself into a lunatic are pretty strong, and in fact I suspect that consequentialism has the best consequences.
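The threat argument in this comment can be sketched as a two-step backward induction. This is only an illustration: the move names and payoff numbers below are hypothetical, and only their ordering matters.

```python
# Hypothetical payoffs as (challenger, defender); chosen only so that
# destroying the world is the worst outcome for both sides.
PAYOFFS = {
    ("stay_out", None): (0, 0),
    ("take_over", "yield"): (10, -10),
    ("take_over", "destroy"): (-100, -100),
}


def defender_response(committed):
    """What the defender does after a takeover has already happened."""
    if committed:
        # The "lunatic in charge of the button": carries out the threat regardless.
        return "destroy"
    # An uncommitted defender picks whatever is best for itself at that point.
    return max(("yield", "destroy"),
               key=lambda r: PAYOFFS[("take_over", r)][1])


def challenger_move(committed):
    """The challenger anticipates the defender's response and moves accordingly."""
    response = defender_response(committed)
    take = PAYOFFS[("take_over", response)][0]
    stay = PAYOFFS[("stay_out", None)][0]
    return "take_over" if take > stay else "stay_out"


if __name__ == "__main__":
    for committed in (False, True):
        print(committed, challenger_move(committed))
```

Under these assumed numbers, the uncommitted defender yields and so the challenger takes over, while the credibly committed defender deters the takeover. The sketch says nothing about whether becoming that defender is wise overall, which is the question the comment goes on to raise.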

Given that there are already lots of people seeking to stop atrocities, the presence of one more person trying to do the same seems likely to be irrelevant. But there are very few people who have even a chance to speak with atrocity-planners and possibly persuade them to do otherwise - effectively none.

Trying to decide what to do without looking at what most other people are likely to do is impossible. Taking the behavior of others into account, it is quite reasonable for one person to put another strategy into play.

If talking has even a small chance of working, the utility of applying it as well as other prevention strategies is greater than the others alone.

Psy-Kosh: Hrm. I'd think "avoid destroying the world" itself to be an ethical injunction too.

The problem is that this is phrased as an injunction over positive consequences. Deontology does better when it's closer to the action level and negative rather than positive.

Imagine trying to give this injunction to an AI. Then it would have to do anything that it thought would prevent the destruction of the world, without other considerations. Doesn't sound like a good idea.

Crossman: Eliezer, can you be explicit which argument you're making? I thought you were a utilitarian, but you've been sounding a bit Kantian lately.

If all I want is money, then I will one-box on Newcomb's Problem. I don't think that's quite the same as being a Kantian, but it does reflect the idea that similar decision algorithms in similar epistemic states will tend to produce similar outputs.

Clay: Put more seriously, I would think that being believed to put the welfare of humanity ahead of concerns about personal integrity could have significant advantages itself.

The whole point here is that "personal integrity" doesn't have to be about being a virtuous person. It can be about trying to save the world without any concern for your own virtue. It can be the sort of thing you'd want a pure nonsentient decision agent to do.

There seems to be a conflict here between not lying to yourself, and holding a traditional rule that suggests you ignore your rationality.

Your rationality is the sum of your full abilities, all components, including your wisdom about what you refrain from doing in the presence of what seem like good reasons.

Psy-Kosh: Hrm. I'd think "avoid destroying the world" itself to be an ethical injunction too.

The problem is that this is phrased as an injunction over positive consequences. Deontology does better when it's closer to the action level and negative rather than positive.

Imagine trying to give this injunction to an AI. Then it would have to do anything that it thought would prevent the destruction of the world, without other considerations. Doesn't sound like a good idea.

So, I realize this is really old, but it helped trip the threshold for this idea I'm rolling between my palms.

Do we suspect that a proper AI would interpret "avoid destroying the world" as something like

avoid(prevent self from being cause of) destroying(analysis indicates destruction threshold ~= 10% landmass remaining habitable, etc.) the world(interpret as earth, human society...)

(like a modestly intelligent genie)

or do we have reason to suspect that it would hash out the phrase to something more like how a human would read it (given that it's speaking English, which it learned from humans)?

This idea isn't quite fully formed yet, but I think there might be something to it.

I am glad Stanislav Petrov, contemplating his military oath to always obey his superiors and the appropriate guidelines, never read this post.

Yvain: I am glad Stanislav Petrov, contemplating his military oath to always obey his superiors and the appropriate guidelines, never read this post.

An interesting point, for several reasons.

First, did Petrov actually swear such an oath, and would it apply in such fashion as to require him to follow the written policy rather than using his own military judgment?

Second, you might argue that Petrov's oath wasn't intended to cover circumstances involving the end of the world, and that a common-sense exemption should apply when the stakes suddenly get raised hugely beyond the intended context of the original oath. I think this fails, because Petrov was regularly in charge of a nuclear-war installation and so this was exactly the sort of event his oath would be expected to apply to.

Third, the Soviets arguably implemented what I called strategy 1 in this comment: Petrov did the right thing, and was censured for it anyway.

Fourth - maybe, on sober reflection, we wouldn't have wanted the Soviets to act differently! Yes, the written policy was stupid. And the Soviet Union was undoubtedly censuring Petrov out of bureaucratic coverup, not for reasons of principle. But do you want the Soviet Union to have a written, explicit policy that says, "Anyone can ignore orders in a nuclear war scenario if they think it's a good idea," or even an explicit policy that says "Anyone who ignores orders in a nuclear war scenario, who is later vindicated by events, will be rewarded and promoted"?

Paul, that's a good point.

Eliezer: If all I want is money, then I will one-box on Newcomb's Problem.

Mmm. Newcomb's Problem features the rather weird case where the relevant agent can predict your behaviour with 100% accuracy. I'm not sure what lessons can be learned from it for the more normal cases where this isn't true.
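For the imperfect-predictor case raised here, a rough expected-payoff calculation can be sketched as follows. The amounts ($1,000,000 and $1,000) are the usual textbook figures, not anything stated in this thread, and the function names are made up for illustration. The calculation also assumes the prediction is correlated with the actual choice, which is the very point in dispute between one-boxers and two-boxers.

```python
def expected_payoffs(accuracy, big=1_000_000, small=1_000):
    """Return (one_box, two_box) expected payoffs.

    `accuracy` is the assumed probability that the predictor guesses
    the chooser's decision correctly.
    """
    # One-boxer: the opaque box is full only if the predictor was right.
    one_box = accuracy * big
    # Two-boxer: the opaque box is full only if the predictor was wrong,
    # and the small box is collected either way.
    two_box = (1 - accuracy) * big + small
    return one_box, two_box


def break_even_accuracy(big=1_000_000, small=1_000):
    """Accuracy at which both choices have the same expected payoff."""
    # Solve accuracy * big == (1 - accuracy) * big + small for accuracy.
    return 0.5 + small / (2 * big)


if __name__ == "__main__":
    for accuracy in (0.5, 0.6, 0.9, 1.0):
        print(accuracy, expected_payoffs(accuracy))
    print("break-even:", break_even_accuracy())
```

On these assumed figures the break-even accuracy is only slightly above one half, so under this way of counting the predictor does not need to be anywhere near perfect for one-boxing to come out ahead.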

@Allan: The agent need not predict your overall behavior, only the outcome. If you are creating such an agent, you are creating a situation in which a system will arrange the future context based on a deep analysis of the environment, and your other actions form that environment. Orchestrating actions among intelligent agents requires this kind of reasoning.

"One suspects that Haukelid's price was far higher than most people would charge"

Or that he routinely killed people and just didn't mind.

Your sense of morality is so wayward.

Question - would you lie in order to win the AI box experiment?

Eliezer,

Crossman and Crowley make very good points above, delineating three possible types of justification for some of the things you say:

1) Don't turn him in because the negative effects of the undermining of the institution will outweigh the benefits

2) Don't turn him in because [some non-consequentialist reason on non-consequentialist grounds]

3) Don't turn him in because you will have rationally/consequentialistly tied yourself to the mast making it impossible to turn him in to achieve greater benefits.

(1) and (3) are classic pieces of consequentialism, the first dating back at least to Mill. If your reason is like those, then you are probably a consequentialist and there is no need to reinvent the wheel: I can provide some references for you. If you support (2), perhaps on some kind of Newcomb's problem grounds, then this deserves a clear explanation. Why, on account of a tricky paradoxical situation that may not even be possible, will you predictably start choosing to make things worse in situations that are not Newcomb situations? Unless you are explicit about your beliefs, we can't help debug them effectively, and you then can't hold them with confidence for they won't have undergone peer scrutiny. [The same still goes for your meta-ethical claims].

If the serial killer comes to the priest and says, 'I have killed six people and plan to kill more. You, Father, included.' Does the priest have license to act out of self-preservation? If not, are you crazy? If so, what does that do to your argument?

http://en.wikipedia.org/wiki/I_Confess_%28film%29

The whole idea of the film is that a murderer comes to a priest and confesses having killed someone, then tries to get the priest falsely suspected of committing the killing himself. The priest comes close to being convicted and executed for the murder, because he can never say or do anything based on the confession he heard.

Toby, my actual stance on the core issue is that it is a Newcomblike problem. You observe the seal of the confessional for the same reason that you one-box on Newcomb's Problem, cooperate in the oneshot Prisoner's Dilemma, or keep your word as Parfit's Hitchhiker: namely, to win.

And if we were talking about superintelligences dealing with other superintelligences, this would be the whole of the law.

It's not easy to transport Newcomblike problems to humans - who cannot make rigorous inferences about each other's probable initial conditions, cannot make rigorous deductions about decisions given the initial condition, and who can only guess at the degree of similarity of decision processes.

But it's by no means obvious that a human should two-box on Newcomb's Problem - it seems that people's choices on Newcomb's Problem do correlate to other facets of their personality, which means that one-boxers against a human Omega might still do better on average. It's by no means clear that humans should go around defecting in the Prisoner's Dilemma, because for us, such situations are often iterated. Our PDs are rarely True PDs where you really don't care at all about the other person. It's by no means clear that humans should believe themselves obligated to break their word to Parfit's Hitchhiker, because we are not perfect liars.

If that lacks the crispness of, for example, the rule that you should not adopt mysterious answers to mysterious questions - well, not every question that I consider has a nice, crisp, massively supported answer. Some of them do. Those are nice. And I even prefer to write about questions that are clear to me, than areas where the borders are fuzzy. But I felt that I had to write about ethics anyway - all things considered.

Eliezer,

So you have a form of deontological ethics based on Newcomb's problem? Now that is very unusual. I can't see how that could be plausible, but hope that you will surprise me. Obviously it is something important enough for a post (or many), so I won't ask you to elaborate any further in the comments.

Googling Parfit and hitchhiker returns some fans of both Derek and Guide to the Galaxy, and a few academic papers behind a paywall. Is there a summary of his example online somewhere?

Since no one so far seems to have mentioned it, there was in fact a Hollywood-style film made (albeit in the UK, not Hollywood), with a mixture of British and American stars, based on the Haukelid/Norsk Hydro story, called "The Heroes of Telemark". Despite being a war/action movie, it actually somehow managed to understate the historical reality.

http://www.imdb.com/title/tt0059263/

Keith, Eliezer: from what I remember of Catholic doctrine (I grew up one), breaking the seal of confession is a lesser sin than murder, as murder is a mortal sin. You go straight to hell for that one, no passing go, especially as Jesus specifically said 'do not kill' is one of the strongest commandments. Breaking the seal, IIRC, is 'just' de-frocking and excommunication (which may or may not condemn you to hell), which are only undoable by the Pope.

However, mortal sins can be forgiven, and I recall that self-defense lessens the gravity of the offense. So given the hypothetical case of a sinner who is going to kill the priest, I think the thing to do would be to kill the sinner; but in the case of the sinner killing a bunch of other people & specifically excepting the priest (so he can't claim self-defense as in the first case), that's harder. I suppose it comes down to whether you think you can convince the Pope that you were justified in breaking the seal.

TGGP, regarding Parfit's Hitchhiker:

As near as I can tell, it's from his 1984 book "Reasons and Persons." (http://en.wikipedia.org/wiki/Reasons_and_Persons) The hitchhiker is, indeed, living in Douglas Adams' universe, where teleportation "is as nice and instinctive as a kick in the head." I believe he's using teleportation as a metaphor for the thread of personal identity between T1 and T2, and the moral obligation person at T1 has for person at T2. A quote from the wiki goes:

"Part 3 argues for a reductive account of personal identity; rather than accepting the claim that our existence is a deep, significant fact about the world, Parfit's account of personal identity is like this:

At time 1, there is a person. At a later time 2, there is a person. These people seem to be the same person. Indeed, these people share memories and personality traits. But there are no further facts in the world that make them the same person."

I'm including this here so nobody else has to spend an hour searching down broken google links. I might be a little bit off - I had to make a lot of inferences, so don't take it as gospel - and if anyone knows better, please correct my errors.

:)

This is an old post, but it might be worth mentioning that psychiatrists are actually required by law to break confidentiality if you tell them that you are planning on killing someone and they believe that, indeed, you are going to do it.

Really? What is this, market protection for Catholic priests? I guess you literally do have to start a federally recognized church (thankfully they recognize atheist churches) in order to have the Order of Silent Confessors.

Yeah, psychiatrists have a duty to act if they believe you're a danger to yourself or others. If they don't, they can be sued by the victims.

More, I think, rent-seeking by a long-established and powerful constituency. Doctors and lawyers also have very strong privileges, while psychiatrists seem to be less well organized and not as old.

(Quick! In a lobbying fight between the APA and the AMA, who would win? Between the APA and ABA? Between local bars and local psychiatrists?)

Confessor is not the only role of a psychiatrist. If you go to a psychiatrist because you are depressed and want help, you PREFER the world where they're allowed to intervene.

Except that they are only allowed to intervene if you are going to harm another or yourself.

If ethics must be held to in the face of the annihilation of everything, then I will proudly state that I have no ethics, only value judgments. Would I kill babies? Yes, to save the life of the mother. Would I kill innocents who had helped me? Yes, to save more. On an interesting aside, I would not torture an innocent for 40 years to prevent 3^^^^3 people from getting a speck of dust in their eyes assuming no further consequences from any of that dust. I would not walk away from Omelas, I would stay to tear it down.

Hobbes (of Calvin and Hobbes)

Of course! WHO else?!

I just love this kind of quoting. Hilarious.

[This comment is no longer endorsed by its author]

I think this is the quote you were looking for.

(Obligatory)

‘No, Lady,’ [Sam] answered. ‘To tell the truth, I wondered what you were talking about. I saw a star through your finger. But if you’ll pardon my speaking out, I think my master was right. I wish you’d take his Ring. You’d put things to rights. You’d stop them digging up the gaffer and turning him adrift. You’d make some folk pay for their dirty work.’

‘I would,’ she said. ‘That is how it would begin. But it would not stop with that, alas! We will not speak more of it. Let us go!’

I was thinking about this a few months ago, and since people have multiple "one shot" instances of the Prisoner's Dilemma within their lifetimes it might make sense for general rules about "one shot" instances to arise. This sort of interacts with Hofstadter's ideas about superrationality, too. I don't remember the thought very well, but hopefully this comment sort of gets the idea across.

Can you imagine a Hollywood movie in which the hero did that, instead of coming up with some amazing clever way to save the civilians on the ship?

Jack Bauer might do it.