It’s common to think that someone else is arguing in bad faith. In a recent blog post, Nate Soares claims that this intuition is both wrong and harmful:
I believe that the ability to expect that conversation partners are well-intentioned by default is a public good. An extremely valuable public good. When criticism turns to attacking the intentions of others, I perceive that to be burning the commons. Communities often have to deal with actors that in fact have ill intentions, and in that case it's often worth the damage to prevent an even greater exploitation by malicious actors. But damage is damage in either case, and I suspect that young communities are prone to destroying this particular commons based on false premises.
To be clear, I am not claiming that well-intentioned actions tend to have good consequences. The road to hell is paved with good intentions. Whether or not someone's actions have good consequences is an entirely separate issue. I am only claiming that, in the particular case of small high-trust communities, I believe almost everyone is almost always attempting to do good by their own lights. I believe that propagating doubt about that fact is nearly always a bad idea.
It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?
What reason do we have to believe that we’re systematically overestimating this? If we’re systematically overestimating it, why should we believe that it’s adaptive to suppress this?
There are plenty of reasons why we might make systematic errors on things that are too infrequent or too inconsequential to yield a lot of relevant-feeling training data or matter much for reproductive fitness, but social intuitions are a central case of the sort of things I would expect humans to get rightby default. I think the burden of evidence is on the side disagreeing with the intuitions behind this extremely common defensive response, to explain what bad actors are, why we are on such a hair-trigger against them, and why we should relax this.
Nate continues:
My models of human psychology allow for people to possess good intentions while executing adaptations that increase their status, influence, or popularity. My models also don’t deem people poor allies merely on account of their having instinctual motivations to achieve status, power, or prestige, any more than I deem people poor allies if they care about things like money, art, or good food. […]
One more clarification: some of my friends have insinuated (but not said outright as far as I know) that the execution of actions with bad consequences is just as bad as having ill intentions, and we should treat the two similarly. I think this is very wrong: eroding trust in the judgement or discernment of an individual is very different from eroding trust in whether or not they are pursuing the common good.
Nate's argument is almost entirely about mens rea - about subjective intent to make something bad happen. But mens rea is not really a thing. He contrasts this with actions that have bad consequences, which are common. But there’s something in the middle: following an incentive gradient that rewards distortions. For instance, if you rigorously A/B test your marketing until it generates the presentation that attracts the most customers, and don’t bother to inspect why they respond positively to the result, then you’re simply saying whatever words get you the most customers, regardless of whether they’re true. In such cases, whether or not you ever formed a conscious intent to mislead, your strategy is to tell whichever lie is most convenient; there was nothing in your optimization target that forced your words to be true ones, and most possible claims are false, so you ended up making false claims.
More generally, if you try to control others’ actions, and don’t limit yourself to doing that by honestly informing them, then you’ll end up with a strategy that distorts the truth, whether or not you meant to. The default state for any given constraint is that it has not been applied to someone's behavior. To say that someone has the honest intent to inform is a positive claim about their intent. It's clear to me that we should expect this to sometimes be the case - sometimes people perceive a convergent incentive to inform one another, rather than a divergent incentive to grab control. But, if you do not defend yourself and your community against divergent strategies unless there is unambiguous evidence, then you make yourself vulnerable to those strategies, and should expect to get more of them.The default hypothesis should be that any given constraint has not been applied to someone's behavior. To say that someone has the honest intent to inform is a positive claim about their intent. It's clear to me that we should expect this to sometimes be the case - sometimes people have a convergent incentive to inform one another, rather than a divergent incentive to grab control.
I’ve been criticizing EA organizations a lot for deceptive or otherwise distortionary practices (see here and here), and one response I often get is, in effect, “How can you say that? After all, I've personally assured you that my organization never had a secret meeting in which we overtly resolved to lie to people!”
Aside from the obvious problems with assuring someone that you're telling the truth, this is generally something of a nonsequitur. Your public communication strategy can be publicly observed. If it tends to create distortions, then I can reasonable infer that you’re following some sort of incentive gradient that rewards some kinds of distortions. I don’t need to know about your subjective experiences to draw this conclusion. I don’t need to know your inner narrative. I can just look, as a member of the public, and report what I see.
Acting in bad faith doesn’t make you intrinsically a bad person, because there’s no such thing. And besides, it wouldn't be so common if it required an exceptionally bad character. But it has to be OK to point out when people are not just mistaken, but following patterns of behavior that are systematically distorting the discourse - and to point this out publicly so that we can learn to do better, together.
I'm very glad that you asked this! I think we can come up with some decent heuristics:
If you start out with some sort of inbuilt bad faith detector, try to see when, in retrospect, it's given you accurate readings, false positives, and false negatives. I catch myself doing this without having planned to on a System 1 level from time to time. It may be possible, if harder, to do this sort of intuition reshaping in response to evidence with System 2. Note that it sometimes takes a long time, and that sometimes you never figure out, whether or not your bad-faith-detecting intuitions were correct.
There's debate about whether a bad-faith-detecting intuition that fires when someone "has good intentions" but ends up predictably acting in ways that hurt you (especially to their own benefit) is "correct". My view is that the intuition is correct; defining it as incorrect and then acting in social accordance with it being incorrect incentivizes others to manipulate you by being/becoming good at making themselves believe they have good intentions when they don't, which is a way of destroying information in itself. Hence why allowing people to get away with too many plausibly deniable things destroys information: if plausible deniability is a socially acceptable defense when it's obvious someone has hurt you in a way that benefits them, they'll want to blind themselves to information about how their own brains work. (This is a reason to disagree with many suggestions made in Nate's post. If treating people like they generally have positive intentions reduces your ability to do collaborative truth-seeking with others on how their minds can fail in ways that let you down--planning fallacy is one example--then maybe it would be helpful to socially disincentivize people from misleading themselves this way by giving them critical feedback, or at least not tearing people down for being ostracizers when they do the same).
Try to evaluate other's bad faith detectors by the same mechanism as in the first point; if they give lots of correct readings and not many false ones (especially if they share their intuitions with you before it becomes obvious to you whether or not they're correct), this is some sort of evidence that they have strong and accurate bad-faith-detecting intuitions.
The above requires that you know someone well enough for them to trust you with this data, so a quicker way to evaluate other's bad-faith-detecting intuitions is to look at who they give feedback to, criticize, praise, etc. If they end up attacking or socially qualifying popular people who are later revealed to have been acting in bad faith, or if they end up praising or supporting ones who are socially suspected of being up to something who are later revealed to have been acting in good faith, these are strong signals of them having accurate bad-faith-detecting intuitions.
Done right, bad-faith-detecting intuitions should let you make testable predictions about who will impose costs or provide benefits to you and your friends/cause; these intuitions become more valuable as you become more accurate at evaluating them. Bad-faith-detecting intuitions might not "taste" like Officially Approved Scientific Evidence,
and we might not respect them much around here, but they should tie back into reality, and be usable to help you make better decisions than you'd been able to make without using them.
It’s common to think that someone else is arguing in bad faith. In a recent blog post, Nate Soares claims that this intuition is both wrong and harmful:
It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?
What reason do we have to believe that we’re systematically overestimating this? If we’re systematically overestimating it, why should we believe that it’s adaptive to suppress this?
There are plenty of reasons why we might make systematic errors on things that are too infrequent or too inconsequential to yield a lot of relevant-feeling training data or matter much for reproductive fitness, but social intuitions are a central case of the sort of things I would expect humans to get right by default. I think the burden of evidence is on the side disagreeing with the intuitions behind this extremely common defensive response, to explain what bad actors are, why we are on such a hair-trigger against them, and why we should relax this.
Nate continues:
Nate's argument is almost entirely about mens rea - about subjective intent to make something bad happen. But mens rea is not really a thing. He contrasts this with actions that have bad consequences, which are common. But there’s something in the middle: following an incentive gradient that rewards distortions. For instance, if you rigorously A/B test your marketing until it generates the presentation that attracts the most customers, and don’t bother to inspect why they respond positively to the result, then you’re simply saying whatever words get you the most customers, regardless of whether they’re true. In such cases, whether or not you ever formed a conscious intent to mislead, your strategy is to tell whichever lie is most convenient; there was nothing in your optimization target that forced your words to be true ones, and most possible claims are false, so you ended up making false claims.
More generally, if you try to control others’ actions, and don’t limit yourself to doing that by honestly informing them, then you’ll end up with a strategy that distorts the truth, whether or not you meant to. The default state for any given constraint is that it has not been applied to someone's behavior. To say that someone has the honest intent to inform is a positive claim about their intent. It's clear to me that we should expect this to sometimes be the case - sometimes people perceive a convergent incentive to inform one another, rather than a divergent incentive to grab control. But, if you do not defend yourself and your community against divergent strategies unless there is unambiguous evidence, then you make yourself vulnerable to those strategies, and should expect to get more of them.The default hypothesis should be that any given constraint has not been applied to someone's behavior. To say that someone has the honest intent to inform is a positive claim about their intent. It's clear to me that we should expect this to sometimes be the case - sometimes people have a convergent incentive to inform one another, rather than a divergent incentive to grab control.
I’ve been criticizing EA organizations a lot for deceptive or otherwise distortionary practices (see here and here), and one response I often get is, in effect, “How can you say that? After all, I've personally assured you that my organization never had a secret meeting in which we overtly resolved to lie to people!”
Aside from the obvious problems with assuring someone that you're telling the truth, this is generally something of a nonsequitur. Your public communication strategy can be publicly observed. If it tends to create distortions, then I can reasonable infer that you’re following some sort of incentive gradient that rewards some kinds of distortions. I don’t need to know about your subjective experiences to draw this conclusion. I don’t need to know your inner narrative. I can just look, as a member of the public, and report what I see.
Acting in bad faith doesn’t make you intrinsically a bad person, because there’s no such thing. And besides, it wouldn't be so common if it required an exceptionally bad character. But it has to be OK to point out when people are not just mistaken, but following patterns of behavior that are systematically distorting the discourse - and to point this out publicly so that we can learn to do better, together.
(Cross-posted at my personal blog.)
[EDITED 1 May 2017 - changed wording of title from "behavior" to "disposition"]