Take a second to imagine what being a child was like throughout most of human history. You were born with a huge and underdeveloped brain, designed for soaking in information from your surroundings like a sponge. But you weren’t able to freely follow your curiosity: even if you had loving, caring parents, you still faced frequent physical danger from nature and other people, severe scarcity, and rigid cultural norms that governed acceptable behavior within your community, with harsh penalties for stepping out of line. You had to learn fast and reliably how to stay safe, and how to stay on the good side of the adults around you, especially your parents, whose care for you was a matter of life or death.

Even after you grew up and passed the period of most acute danger, you’d still face many threats of violence and scarcity. Your ability to avoid these depended in large part on your relationships: holding a respected position within your tribe was the key pathway to a good life, whereas exclusion from your tribe was tantamount to execution. So “danger and rejection” isn’t an ad-hoc combination: our brains are primed to think of them as the same thing; and conversely, to equate safety and love. I’ll call the latter combination “security” (which I think of as a combination of “physical security” and “emotional security”, although I’ll mostly be focusing on the latter). Children are learning machines, and what they learn above all is strategies for achieving security; because the opinions of other people are so powerful, “being good” in ways which receive approval from the group is one of the central strategies they learn.

How literally should we take this story? It’s clear that describing humans as optimizing for a single goal is a big oversimplification. But it’s hard to overstate how powerful the drive for security is. Think of the many girls who override the drive to eat because part of their brain is convinced that being skinnier will make others desire and love them. Think of the many boys who override their sex drives because part of their brain is convinced that hitting on girls would lead to broader social rejection. Think of the many suicidal adults who override their literal survival drive in response to problems with their relationships or careers. If these drives can be quashed by the drive for security, then anything can be.

I think there are two main reasons that our drive for security is hard to see. One is that, as adults, we’re often not optimizing directly for security, but instead following heuristics and strategies developed to achieve proxies for security in our childhood environment. In other words: on an emotional level, our brains often “cache” conclusions from childhood, which become hard to override as adults. I like the way that Malcolm Ocean describes the result of this caching: “everyone is basically living in a dream mashup of their current external situation and their old emotional meanings”. But those old emotional meanings are adapted for a childhood environment very different from our adult environment, and are therefore often deeply counterproductive. For example, once we are able to stand up to people yelling at us, or we are able to leave abusive relationships, we may still be slow to do so because we’ve cached the conclusion that we’re helpless in situations like these.

The second reason that our drive for security is hard to see: because we don’t directly understand what’s behind them, counterproductive strategies often get stuck in self-reinforcing feedback loops. Consider a child who becomes fixated on the belief: “if I’m skinnier then people will love me”. She becomes skinnier, and of course it doesn’t fix anything. The reasonable move is to discard that belief. But, perversely, this failure can also be taken as evidence for that belief: the only reason it didn’t work, she might think, is because she didn’t become skinny enough. I’m reminded of the common political strategy of blaming all failures of your policies on the fact that you didn’t have enough power to implement them comprehensively enough—an obviously flawed argument which nevertheless often works, because confirmation bias is so powerful.

Similarly, consider someone who sees a small fault in their partner (e.g. he’s messy), is driven by past trauma to read a deeper message into it (“he doesn’t clean up because he doesn’t love me”) and starts making snarky comments in response to any mess. At some level, those comments are trying to evoke love, but typically they have exactly the opposite effect: they cause resentment or anger in their partner, which only makes the fear of abandonment worse.

So a small “snag” in our brains can grow into a tangled mess of defensiveness and self-destructiveness. And in particular, small experiences of fear or hurt from when we’re young and vulnerable can give rise to complex protective strategies, like the ones described above. Those strategies get defensive when criticized, or when they lose influence, because they have a deep-rooted belief that their approach is the only way to avoid danger; and this can cause them to be reinforced even when they lead to very counterproductive behavior.

The memories of those painful experiences are called “traumas” (although I often use “microtrauma” to convey that these are typically much less severe than the type of trauma that causes PTSD); and the combination of them and the protective strategies that build up around them are called emotional schemas. Thinking about people's behavior in terms of traumas and emotional schemas has helped me be much more empathetic, by highlighting the ways in which even their hurtful or harmful actions come from them trying their best to take care of themselves; and by seeing the "inner child" that is motivating much of their behavior.

I want to flag here that I’m asking for quite a lot of suspension of disbelief from many readers. The explanation I've just given is reminiscent of just-so stories which appeal to evolutionary psychology in unrigorous ways—and, like most of those stories, it seems very hard to falsify. You can probably construct some narrative in which any given goal is an extrapolation of any given negative experience; and indeed, given how unreliable our memories are, you can sometimes get people to make up childhood memories wholesale. Without having seen this frame concretely improve people’s lives, is it even worth your time to read about it? Additionally, there’s something very panglossian about this—it’s postulating that a lot of cruel, destructive or even evil behavior is driven by fundamentally sympathetic and innocent goals.[1] That may feel naive, or unjust, or even dangerous. Isn’t the ability to hold people responsible for their actions what stands between us and anarchy?

I shared many of these concerns when first introduced to these ideas. Since then they've proven so incredibly effective in my personal experience that I'm convinced there's something very important here. All of my claims are based on various psychotherapy techniques which seem to work well for many others too; and the claims I've made so far have some neuroscientific backing (see here in particular). I do want to flag that just because a technique works doesn't mean that its underlying ontology is correct (and that the claims I'll be making later on have less scientific backing). But my primary goal in this sequence is to convey the way I think about these things, rather than doing a more rigorous review of the evidence.

Since I'm a very cerebral person, though, it's particularly important to me that there's some high-level justification for why the counterintuitive things I'm describing make sense. I'll be giving that intellectual justification in post #8. In advance of that, I just want to appeal to common sense. We’ve all seen people pursuing self-destructive behaviors, often with disastrous yet totally foreseeable consequences. We’ve also all seen people who have very complicated and emotionally loaded relationships with their parents, even decades later, after their parents have almost no power over their lives. And we’ve all seen people who are incredibly (“irrationally”) touchy about certain topics. When you zoom out and think about this with fresh eyes, many of these behaviors are so baffling that the answer has to be counterintuitive in some sense.[2]

And yet I think there’s another sense in which this answer is very intuitive, because it allows us to reason about our psychology by recycling many of the intuitions that we already apply to social interactions. I’ll explain that perspective in the next post.

You can think of protective strategies as trying to give our "inner children" safety and love—but often backfiring.
  1. I’m reminded of a quote from Scott Alexander’s Unsong: “Evil… was hollow, more brittle than glass, lighter than a feather, thinner than a hair, tinier than a dust speck, so tiny it barely even existed at all. Evil was the world’s dumbest joke, the flimsiest illusion.”

  2. To be clear, though, none of my claims should be considered universal; there will always be a bunch of psychological outliers (e.g. sociopaths) to whom they don’t apply. Although even then, it's notable that sociopaths often had extremely traumatic childhoods. And it’s striking to read accounts of the “banality” of evil, and how many of the most heinous war criminals thought of themselves as trying to do the right thing, took care of their friends, and loved their families.

5 comments

The only counterintuitive thing about this post is that you expect the readers to find it counterintuitive! It's pretty obvious to those of us who remember our childhoods and have enough self-awareness to notice ourselves reliving them over and over...

I'm a big fan of the Replacing Guilt series. But I've always found the "guilt" part troubling because it always felt there was something more behind, something even more primitive.

Perhaps it's just me or people like me but now I believe that thing is fear. Completely subjectively I had an experience recently while watching my thoughts (inspired by https://www.lesswrong.com/posts/bbB4pvAQdpGrgGvXH/tuning-your-cognitive-strategies) and noticed that certain chains of thoughts terminated as if at a wall made up of this panicky feeling, the one where you feel your chest tighten and your breathing become shallow and difficult. It felt a lot like fear.

Looking forward to your next posts in the series!

Thank you very much for this sequence. I knew fear was a great influence (or impediment) over my actions, but I hadn't given it such a concrete form, and especially a weapon (= excitement) to combat it, until now.

Following matto's comment, I went through the Tuning Your Cognitive Strategies exercise, spotting microthoughts and extracting the cognitive strategies and deltas between such microthoughts. When evaluating a possible action, the (emotional as much as cognitive) delta "consider action X -> tiny feeling in my chest or throat -> meh, I'm not sure about X" seemed quite recurring. Thanks to your pointers on fear and to introspecting about it, I have added "-> are you feeling fear? -> yes, I have this feeling in my chest -> is this fear helpful? -> Y, so no -> can you replace fear with excitement?" (a delta about noticing deltas) as a cognitive strategy.

Why I (beware of other-optimizing) can throw away fear in most situations is that I have developed the mental techniques, awareness and strength to counter the negatives which fear wants to point at.

Like many, I developed fear as a kid, in response to being criticised or rejected, at a time when I didn't have the mental tools to deal with these situations. For example, I took things too personally, thought others' reactions were about me and my identity, and failed to put myself in others' shoes and understand that when other kids criticise it is often unfounded and just to have a laugh. To protect my identity I developed aversion, a bias towards inaction, and fear of failure and of being criticised. This propagated to also lead to demotivation, self-doubt, and underconfidence.

Now I can evaluate whether fear is an emotion worth having. Fear points at something real and valuable: the desire to do things well and be liked. But as I said, for me personally fear is something I can do away with in most situations because I have the tools to respond better to negative feedback. If I write an article and it gets downvoted, I won't take it as a personal issue that hurts my intrinsic worth; I will use the feedback to improve and update my strategies. In several cases, excitement can be much more useful (and motivating, leading to action) than fear: excitement of commenting or writing on LessWrong over fear of saying the wrong thing; excitement of talking or being with a girl rather than fear of rejection.

Dear Richard,

I stumbled upon this particular post while first exploring LessWrong and researching the best knowledge the community has been able to reach on the topic of "drives, intentions and pursuit of goals" as related to the non-human agency of risky AI.

Thanks for creating this sequence on fear, which looks to be a well-thought-out thesis with an interesting proposal for a fear-reducing strategy. The reason I'm making this my first comment on the forum is that the topic also resonates with my personal experience, as I see it has done for others in the comments section. Therapy has helped me touch on some childhood memories and has been productive in reshaping some of my thinking and behavior for the better.

And I also think it is necessary to understand where our own goals come from if we want to align them with AI.

I think what could improve your writing and reasoning in this post, although you might balance it out with the other posts that I have not read yet, is to distinguish a bit more the doubt you have about the instrumental value of the strategy and the ontology of the fear. I think you can rightly posit that behaviors emerge out of the need for the child to adapt and you could reference (even) more sources that point to this. Also, therapy that takes the person back to these memories to transform them is evidenced to work.

What I am missing is an investigation into how the fear, undesirable in situations that might cognitively or intelligently be observed as non-threatening, is part of the story someone tells themselves about themselves; their self-perceived identity. Like you say, any given goal could have at its base any given negative experience, although some might be more easily correlated through observation of their features (fear of intimacy because of abandonment or abuse, etc.); any story could be thought up by the person that helps them cope, at the risk of expending rationality.

So while you do touch on these things, and probably in other posts as well, I think that by revisiting your writing and making up your mind about some things you could take out the caveats and make the post a bit more authoritative, especially from the paragraph onwards that starts with "I want to flag...".

Looking forward to your response.

Strongly upvoted because this is an important topic and because it could help quite a lot of potential readers.

Thanks for writing it Richard.