I agree that most relevant bad behavior isn't going to feel from the inside like an attempt to mislead, and I think that rationalists sometimes either ignore this or else have an unfounded optimism about nominal alignment.
It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?
In the evolutionary context, our utterances and conscious beliefs are optimized for their effects on others, and not merely for accuracy. Believing and claiming bad things about competitors is a typical strategy. Prima facie, accusations of bad faith are particularly attractive since they can be levied on sparse evidence yet are rationally compelling. Empirically, accusations of bad faith are particularly common.
Acting in bad faith doesn’t make you intrinsically a bad person, because there’s no such thing.
This makes an interesting contrast with the content of the post. The feeling that some people are bad is a strong and central social intuition. Do you think you've risen to the standard of evidence you are requesting here? It seems to me that you are largely playing the same game people normally play, and then trying to avoid norms that regulate the game by disclaiming "I'm not playing the game."
The feeling that some people are bad is a strong and central social intuition. Do you think you've risen to the standard of evidence you are requesting here?
Nope! Good point.
It seems to me that you are largely playing the same game people normally play, and then trying to avoid norms that regulate the game by disclaiming "I'm not playing the game."
Here's a specific outcome I would like to avoid: ganging up on the individuals saying misleading things, replacing them with new individuals who have better track records, but doing nothing to alter the underlying incentives. That would be really bad. I think we actually have exceptionally high-integrity individuals in critical leadership positions now, in ways that make the problem of perceived incentive to distort much easier to solve than it might otherwise be.
I don't actually know how not to play the same old game yet, but I am trying to construct a way.
I see you aiming to construct a way and making credible progress, but I worry that you're trying to do too many things at once and are going to cause lasting damage by the time you figure it out.
Specifically, the "confidence game" framing of the previous post moved it from "making an earnest good-faith effort to talk about things" to "the majority of the post's content is making a status move"[1] (particularly in the context of your other recent posts, and exacerbated by this one), and if I were using the framing of this current post I'd say both the previous post and this one have bad intent.
I don't think that's a good framing - I think it's important that you (and folk at OpenPhil and at CEA) do not just have an internally positive narrative but are actually trying to do things that cash out to "help each other" (in a broad sense of "help each other"). But I'm worried that this will not remain the case much longer if you continue on your current trajectory.
A year ago, I was extremely impressed with the work you were doi...
It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?
Human punishment of free riders helps ensure there are few free riders. Our fear and surprise responses are ridiculously oversensitive because of the asymmetric costs of Type I vs. Type II errors. Etc...
Evolution, too, is into massive A/B testing with no optimisation target that includes truth.
I think the burden of evidence is on the side disagreeing with the intuitions behind this extremely common defensive response...
Note also that most groups treat their intuitions about whether or not someone is acting in bad faith as evidence worth taking seriously, and that we're remarkable in how rarely we tend to allow our bad-faith-detecting intuitions to lead us to reach the positive conclusion that someone is acting in bad faith. Note also that we have a serious problem with not being able to effectively deal with Gleb-like people, sexual predators, e...
The binary classification leads to problems. We can distinguish cooperative intent, defecting intent, and hostile intent. The person who optimizes his marketing for conversions without regard for the truth is defecting: he is neither cooperative nor hostile.
There's such a thing as hostile intent. Some people do intend to cause harm to other people, but those aren't the people with whom we have problems in this community.
Just loudly repeating what you said using my own words... when we talk about optimizing for truth (or any other X), there are essentially 3 options (and of course any mix of them)...
And while it is bad form to accuse someone of optimizing against truth, it makes sense to suspect that people are simply not optimizing for truth... which -- especially when they optimize for something else -- usua...
It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?
You may not be wrong, but I don't think it would necessarily be surprising. We adapted under social conditions radically different from those that exist today. It may no longer be adaptive.
Hypothesis: In small tribes and family groups assumptions of bad faith may have served to help negotiate away from unreasonable positions while strong familial ties and respected third parties mos...
So, a couple of thoughts:
1) ascribing intent to behavior is one of the best ways to control someone's behavior, and it's deeply baked into our reactions. You are much more likely to get someone to conform to your desires if you say "you're intentionally behaving badly, stop it" than if you say "I don't like that outcome, but you didn't mean it". Your mind is biased toward seeing (and believing, so you can more forcefully make the accusation) much stronger intent than actually exists.
2) intent is a much better predictor of future behav...
For more explanation on how incentive gradients interact with and allow the creation of mental modules that can systematically mislead people without intent to mislead, see False Faces.
It’s common to think that someone else is arguing in bad faith. In a recent blog post, Nate Soares claims that this intuition is both wrong and harmful:
It would be surprising, if bad intent were so rare in the relevant sense, that people would be so quick to jump to the conclusion that it is present. Why would that be adaptive?
What reason do we have to believe that we’re systematically overestimating this? If we’re systematically overestimating it, why should we believe that it’s adaptive to suppress this?
There are plenty of reasons why we might make systematic errors on things that are too infrequent or too inconsequential to yield a lot of relevant-feeling training data or matter much for reproductive fitness, but social intuitions are a central case of the sort of things I would expect humans to get right by default. I think the burden of evidence is on the side disagreeing with the intuitions behind this extremely common defensive response, to explain what bad actors are, why we are on such a hair-trigger against them, and why we should relax this.
Nate continues:
Nate's argument is almost entirely about mens rea - about subjective intent to make something bad happen. But mens rea is not really a thing. He contrasts this with actions that have bad consequences, which are common. But there’s something in the middle: following an incentive gradient that rewards distortions. For instance, if you rigorously A/B test your marketing until it generates the presentation that attracts the most customers, and don’t bother to inspect why they respond positively to the result, then you’re simply saying whatever words get you the most customers, regardless of whether they’re true. In such cases, whether or not you ever formed a conscious intent to mislead, your strategy is to tell whichever lie is most convenient; there was nothing in your optimization target that forced your words to be true ones, and most possible claims are false, so you ended up making false claims.
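To make this concrete, here is a minimal simulation sketch (in Python, with entirely made-up numbers - the claim pool size, conversion rates, and fraction of true claims are illustrative assumptions, not data). Nothing in the selection step refers to truth, so when most candidate claims are false, the winning claim is usually false too:

```python
import random

# Illustrative assumptions: most candidate claims are false, and how well
# a claim converts is independent of whether it is true.
NUM_CLAIMS = 100        # candidate marketing claims per test
P_TRUE = 0.1            # fraction of candidate claims that happen to be true
VISITORS_PER_ARM = 500  # visitors shown each claim during the A/B test

def winning_claim_is_true(rng):
    claims = [
        {"true": rng.random() < P_TRUE, "rate": rng.uniform(0.01, 0.10)}
        for _ in range(NUM_CLAIMS)
    ]
    def observed_conversions(claim):
        # The optimization target: conversions only. Truth never appears here.
        return sum(rng.random() < claim["rate"] for _ in range(VISITORS_PER_ARM))
    winner = max(claims, key=observed_conversions)
    return winner["true"]

rng = random.Random(0)
trials = 100
true_winners = sum(winning_claim_is_true(rng) for _ in range(trials))
print(f"winning claim was true in {true_winners}/{trials} trials")
# Expect roughly P_TRUE of winners to be true: the process promotes
# whichever claim converts best, true or not.
```

The sketch is only meant to show the shape of the argument: the selection pressure is entirely on conversions, so the winning claim is true only by coincidence.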
More generally, if you try to control others’ actions, and don’t limit yourself to doing that by honestly informing them, then you’ll end up with a strategy that distorts the truth, whether or not you meant to. The default state for any given constraint is that it has not been applied to someone's behavior. To say that someone has the honest intent to inform is a positive claim about their intent. It's clear to me that we should expect this to sometimes be the case - sometimes people perceive a convergent incentive to inform one another, rather than a divergent incentive to grab control. But, if you do not defend yourself and your community against divergent strategies unless there is unambiguous evidence, then you make yourself vulnerable to those strategies, and should expect to get more of them.
I’ve been criticizing EA organizations a lot for deceptive or otherwise distortionary practices (see here and here), and one response I often get is, in effect, “How can you say that? After all, I've personally assured you that my organization never had a secret meeting in which we overtly resolved to lie to people!”
Aside from the obvious problems with assuring someone that you're telling the truth, this is generally something of a non sequitur. Your public communication strategy can be publicly observed. If it tends to create distortions, then I can reasonably infer that you’re following some sort of incentive gradient that rewards some kinds of distortions. I don’t need to know about your subjective experiences to draw this conclusion. I don’t need to know your inner narrative. I can just look, as a member of the public, and report what I see.
Acting in bad faith doesn’t make you intrinsically a bad person, because there’s no such thing. And besides, it wouldn't be so common if it required an exceptionally bad character. But it has to be OK to point out when people are not just mistaken, but following patterns of behavior that are systematically distorting the discourse - and to point this out publicly so that we can learn to do better, together.
(Cross-posted at my personal blog.)
[EDITED 1 May 2017 - changed wording of title from "behavior" to "disposition"]