There's a common thread that runs through a lot of odd human behavior that I've recognized:
- People often accept surface explanations of their own and others' habits when the more nefarious explanations would reflect badly on them.
- The media we make for ourselves presents people as far more willing to go out of their way to defy incentives and help others than they actually are, even when you account for storytelling conveniences.
- People tend to trust that organizations like hospitals, nonprofits, and state bureaucracies will self-organize toward pursuing their nominal goals so long as they claim to be doing so, even if those organizations lack strong internal incentives to do so.
- People are quick to argue, without much evidence, that those involved in terrible atrocities were or are anomalously evil, rather than representative examples of how much average people respect the lives of strangers.
- People are shocked by, and often go into outright denial about, the purpose and effective output of major human institutions. Someone had to write an entire book about how education isn't about learning before people started to notice that it isn't. And plenty of people still don't!
To summarize: people are really charitable. They're charitable about the people they know, and the people they don't know. They're charitable about experts, institutions, and the society in which they live. Even people who pride themselves on being independent thinkers seem to take for granted that their hospitals or schools are run by people who just want to make life better for them. When they do snap out of these delusions, it seems to take a lot of intellectual effort, and a lot of explicit thinking about incentives, that is unnecessary for them in other contexts.
The bias is not distributed equally. In my experience, there's a connection between people's niceness and their proclivity for extending unwarranted trust to others.
My old high school Theology teacher, Mr. Portman, was the nicest person I've ever met. The students took advantage of him, like the rest of the nice teachers, correctly inferring that such teachers were less likely to stick up for themselves. One year he ran a charity drive by selling conflict-free chocolate bars he had bought with his own money, intending to donate the profits to anti-slavery charities. He was such an honest soul that he let kids in his class take them on a verbal promise that they'd pay him later. Even at the upscale high school I went to, they almost never did.
I think it's a generally accepted observation about kind people that honor and naivete go hand in hand. There are lots of folk explanations for this tendency; for example, a lot of people say that virtuous people generalize from one example and assume others are "like them".
Unfortunately, none of these explanations accounts for an additional fact of my experience: the bias seems to apply only to nice people, not to mean people. It's much rarer that I encounter someone so cynical about others' motivations that they start avoiding trustworthy people. If the problem were that nice people are generalizing from their internal experiences, why is it that even the self-declared psychopaths I meet seem ~basically correctly calibrated about how likely others are to mess with them?
To answer this, I think it's helpful to view the situation through the lens of game theory, as a toy model. Imagine people like Mr. Portman as running around implementing certain algorithms in one of those Prisoner's Dilemma tournaments.
Most people are not running CooperateBot or DefectBot in the general sense. They're running something between FairBot (which cooperates with you if it can establish that you'll cooperate with it) and PrudentBot (which cooperates only if it can establish both that you'll cooperate with it and that you'd defect against a sucker like DefectBot). And to run these algorithms in the real world, you naturally need to make probabilistic assessments about the behavior of other people.
In theory, any combination of FairBots and PrudentBots cooperates. Given good line of sight into one another's decision procedures, they would all trade swimmingly.
In practice, in a world full of PrudentBots, you want to present as a FairBot, regardless of what you actually are. Why? Because simple algorithms are easier to verify, and tit-for-tat is the simplest possible algorithm that still receives good treatment. Trading with a PrudentBot is doable, but riskier. You'll get fewer trading opportunities that way, because a would-be trading partner needs to convey something more specific than "I will cooperate": they need to make you believe "I will cooperate with you iff you cooperate with me".
On the other hand, if almost everyone around you is already a FairBot, the simplest and most effective identity becomes CooperateBot, not FairBot. In FairBotLand, cooperating with everyone just works, and provides a killer logfile. Sure, you may get taken advantage of once in a while, but depending on your environment that might be an acceptable price if it means the FairBots can clock you as trustworthy more often.
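To make the toy model concrete, here's a minimal iterated-tournament sketch in Python. It uses tit-for-tat as a crude stand-in for FairBot (the real FairBot from the open-source game theory literature reasons about proofs, which doesn't fit in a few lines), and the payoff matrix and round count are my own illustrative assumptions, not from any particular tournament.

```python
# Toy iterated Prisoner's Dilemma with standard illustrative payoffs.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def cooperate_bot(my_hist, their_hist):
    return "C"

def defect_bot(my_hist, their_hist):
    return "D"

def tit_for_tat(my_hist, their_hist):
    # FairBot-ish: open with cooperation, then mirror the last move.
    return their_hist[-1] if their_hist else "C"

def play(a, b, rounds=100):
    """Run two strategies against each other; return total scores."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        hist_a.append(move_a); hist_b.append(move_b)
        score_a += pay_a; score_b += pay_b
    return score_a, score_b

# In "FairBotLand", a CooperateBot does exactly as well as the natives:
print(play(tit_for_tat, tit_for_tat))    # → (300, 300)
print(play(cooperate_bot, tit_for_tat))  # → (300, 300)
# A DefectBot exploits only the first round, then gets punished forever:
print(play(defect_bot, tit_for_tat))     # → (104, 99)
```

The point of the sketch: against a population of mirror-strategies, unconditional cooperation costs you nothing in payoff terms, and it is a strictly simpler identity to verify from the outside.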
So assuming you lived in a relatively nice environment, and wanted to be known as a simple, clean trading partner, how would you actually convey either of these things? Not everyone has your log. You could just say "I follow the golden rule" or "I give people the benefit of the doubt" - but you might be lying.
Well, most people are indeed running something like tit-for-tat, and treat people they like far better than people they don't. So one nice adaptation for assuring others you'll be kind is having a pro-human cognitive bias: specifically, one that suggests a positive view of how people treat one another. In this frame, unnecessary charitability is a costly signal of friendliness: it demonstrates that you can be fooled, but also exposes you to more trading opportunities. It's a trust exercise.
I think this analysis also explains another detail: why a lot of virtue signaling seems so "misplaced". When most people I know think of virtue signaling, they're not usually imagining direct acts of charity, like donating to the AMF or saving children drowning in ponds. Sometimes people still call that stuff virtue signaling, but to my mind it's not the central example. What I imagine when I think of virtue signaling is dramatic, public displays of compassion toward people who either don't deserve it or can't reciprocate. For a long time I couldn't understand why people's attempts to display "virtue" were so ineffective at actually improving society.
But it makes a lot more sense if the point of the adaptation is to signal friendliness, not necessarily to show you're "net-positive" in an abstract EA sense. What an act like Martha McKay's shows is not just that the person cares about others in general, but also that they are dramatically optimistic about human nature, and unlikely to take advantage of you if you decide to interact with them.
To be clear, people like Mr. Portman or Ms. McKay are actually nice. They're generally prosocial people. When you're doing character analysis of others, you should take into account that cynicism is a bad sign. But you can imagine a lot of left-right squabbling over criminal justice reform as the left accusing the right of being unscrupulous and evil, and the right accusing the left of misunderstanding human nature. Both accusations are true: the left, being staffed with more empathetic people, is more prone to a humans-are-wonderful bias and thus more willing to entertain bizarre policies like police abolition. The right, being less sympathetic, genuinely doesn't care much about the participants in the criminal justice system, but is also less likely to adopt naive restorative-justice positions for social reasons.
When it comes to this particular bias, I think there's a balance to be struck. Insofar as you have to pretend people are nicer than they are in order to be kind to them, I think you should do that. But your impact will be better if you at least notice when that's what you're doing, and try to keep it from bleeding into your policy analysis.
By the way, if we take the game theory and logic here at all seriously, there's a corollary of Löb's Theorem: if you defect given proof that your counterparty will defect, and the other party defects given proof that you will, then you both will, logically, defect against each other, with no choice in the matter. (And if you additionally declare that you cooperate given proof that your partner will cooperate, you've just declared a logical contradiction.)
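For the curious, here is a sketch of that derivation in provability logic, writing $\Box$ for "provably" and $D_A$, $D_B$ for "A defects" and "B defects". The notation and axiom names are mine; the result follows the standard Löbian argument.

```latex
% Assumptions (stated as provable policies, so they may be boxed):
%   (1) \Box D_B \to D_A    -- A defects given proof that B defects
%   (2) \Box D_A \to D_B    -- B defects given proof that A defects
\begin{align*}
\Box(\Box D_A \to D_B) &\;\vdash\; \Box\Box D_A \to \Box D_B
  && \text{distribute $\Box$ over (2)}\\
\Box D_A \to \Box\Box D_A &\;\vdash\; \Box D_A \to \Box D_B
  && \text{axiom 4 of GL, chained with the above}\\
\Box D_B \to D_A &\;\vdash\; \Box D_A \to D_A
  && \text{chain with (1)}\\
\Box(\Box D_A \to D_A) &\;\vdash\; \Box D_A
  && \text{L\"ob's theorem}\\
\Box D_A &\;\vdash\; D_A \text{ and } D_B
  && \text{by $\Box D_A \to D_A$ and (2)}
\end{align*}
```

So two mutually defensive policies prove each other into defection with no further input: the "proof that the other will defect" materializes from the structure of the policies themselves.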
If I had to pack this result into a "wise" phrase, I'd put it this way:
Good is not a universally valid response to Evil. Evil is not a universally valid response to Evil either. Seek that which will bring about a Good equilibrium.