Our conceptual understanding of 'motivated cognition', and why it's defective as a cognitive algorithm - the "Bottom Line" insight.
"Defective" isn't quite enough; you want a prescription to replace it with. Saying "this is a bad habit" seems less useful than saying "here is a good habit."
There are two obvious prescriptions I see: provide correct rationales for decisions, or do not provide rationales for decisions. Which prescription you shoot for has a radically different impact on what exercises you do, and so should be talked about quite a bit. It may be desirable to try to wipe out rationalization first, and then progress to correct rationales.
One exercise might be asking "who will this convince?" and "whose desires do I want to maximize?". Lucy probably doesn't actually expect Marvin to be swayed by the plight of Big Sugar, and probably doesn't actually suspect that Marvin will believe she's motivated by the plight of Big Sugar, and so that deflection may be the wrong play because it's incredible.
It seems to me that social incentives will swallow most internal incentives here. If I can get more out of others by rationalizing, then it may be a losing move for me to not rationalize, and so it may be more profitable to focus specifically on internal desire-desire conflicts. If Marvin will buy the cake for Lucy if she gives Marvin-optimized reasons, then Lucy should seek to determine whether or not she wants the cake for Lucy-optimized reasons and then present the case to Marvin in terms of Marvin-optimized reasons.
When Lucy senses one desire to eat a whole chocolate cake, and comes up with the sugar industry reason, perhaps Lucy should ask which Lucy that represents (Altruist Lucy) and which other Lucys want to have a say on the issue. Thin Lucy and Cheap Lucy might both think that Lucy shouldn't buy the cake, and Sweet Tooth Lucy wants that cake.
And when Lucy simulates their internal discussion, she quickly realizes that Altruist Lucy doesn't actually care much about the cake issue, compared to the other three. If Altruist Lucy were fully modeled here, she'd probably side with Cheap Lucy (as those dollars can do more good elsewhere). And so the question is what tradeoff Lucy wants to make between the preferences of Thin Lucy, Cheap Lucy, and Sweet Tooth Lucy.
Notice that the rationalization is an explicit call for alliances or disguise in this model. Only three Lucys are really interested, and the weight is against the cake, but Sweet Tooth Lucy can call in other Lucys by constructing arguments that tangentially involve them. That should be a costly move: at the beginning of a decision, Lucy should determine which Lucys are most relevant to the decision, and then be skeptical of attempts to bring in other Lucys.
The first exercise would be labeling the desires involved in a decision. I suspect there will generally be at least three, but in some decisions one or two will dominate. It might be useful to start with decisions where one desire dominates, and then move to where two desires agree, and then three desires agree, and then start introducing conflicting desires.
Jack tripped, and is falling. He notices a desire to stop his fall.
Healthy Jack wants to not get hurt.
Jack tripped, and is falling, within sight of his girlfriend. He notices a desire to stop his fall.
Healthy Jack wants to not get hurt and Impressive Jack wants to not make a fool of himself. They agree on the recommended action.
On a lazy Saturday afternoon, Jack notices a desire to do a mildly dangerous trick in front of his girlfriend.
Impressive Jack wants to show off, and Healthy Jack wants to not get hurt. They disagree on the recommended action.
The second exercise would be declaring other desires as invalid (or possibly valid). This one seems like it could be done either as a worksheet - "does Cheap Jack have anything important to say about Jack tripping, conditioned on Healthy Jack and Impressive Jack already being in the discussion?" - or better yet, socially, in which someone describes a recent decision they faced, which three desires they thought were the most important, and then their partner / other members of the group seek to argue for the inclusion of other desires. It's not yet clear how to get a good balance of suggestions that should be shot down and suggestions that should be considered more deeply, and assigning any sort of points to performance in this exercise could cause motivated cognition, which is bad.
The third exercise would be finding a quick way to resolve this competition between desires. This seems to be the area where it's hardest to be prescriptive; different methods will fit different minds. Here are a few I can think of:
Summarize each desire's case in a single sentence, put all the sentences next to each other, and choose one side or the other.
Summarize each desire's case in a single sentence, then go with the one that's most compelling.
Summarize each desire's case in a single sentence, assign each a weight, and then randomly determine which desire to go with (using the weights).
Take the proposed courses of action, and then find compromises along the axis of each desire. Cheap Lucy could be satisfied more and Sweet Tooth Lucy only a little less if Lucy just bought a bag of sugar and ate some of it. Thin Lucy could be satisfied more and Sweet Tooth Lucy only a little less if Lucy bought a cake made with Splenda instead of sugar. Imagine the expanded alternative set and choose one of the options in it.
I'm sure there are more.
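The weighted-random method above is the most mechanical of these, so here is a minimal sketch of it in Python. The function name, the one-sentence cases, and the weights are all made up for illustration; how Lucy would actually arrive at the weights is left open.

```python
import random


def choose_desire(cases, rng=None):
    """Return one desire's name, drawn with probability proportional to its weight.

    `cases` maps a desire's name to a (one_sentence_case, weight) pair.
    """
    rng = rng or random.Random()
    names = list(cases)
    weights = [cases[name][1] for name in names]
    if not names or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("need at least one desire with a positive weight")
    # random.choices samples proportionally to the given weights.
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical one-sentence cases and weights for the cake decision.
lucy_cases = {
    "Thin Lucy": ("A whole cake undoes a week of careful eating.", 3),
    "Cheap Lucy": ("That money could go to something that lasts.", 2),
    "Sweet Tooth Lucy": ("Chocolate cake would be delicious right now.", 4),
}

winner = choose_desire(lucy_cases)
print(f"{winner}: {lucy_cases[winner][0]}")
```

Passing a seeded `random.Random` as `rng` makes the draw repeatable, which might be handy if the method were used in a group exercise.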
"Break down what your parts have to say into parts" would be an interesting counter to rationalization - I think I'll have to call this an immediate $50 award on the grounds that I intend to test the skill itself, never mind how to teach it.
(The Exercise Prize series of posts is the Center for Applied Rationality asking for help inventing exercises that can teach cognitive skills. The difficulty is coming up with exercises interesting enough, with a high enough hedonic return, that people actually do them and remember them; this often involves standing up and performing actions, or interacting with other people, not just working alone with an exercise booklet and a pencil. We offer prizes of $50 for any suggestion we decide to test, and $500 for any suggestion we decide to adopt. This prize also extends to LW meetup activities and good ideas for verifying that a skill has been acquired. See here for details.)
The following awards have been made: $550 to Palladias, $550 to Stefie_K, $50 to lincolnquirk, and $50 to John_Maxwell_IV. See the bottom for details. If you've earned a prize, please PM StephenCole to claim it. (If you strongly believe that one of your suggestions Really Would Have Worked, consider trying it at your local Less Wrong meetup. If it works there, send us some participant comments; this may make us update enough to test it.)
Motivated cognition is the way (all? most?) brains generate false landscapes of justification in the presence of attachments and flinches. It's not enough for the human brain to attach to the sunk cost of a PhD program, so that we are impelled in our actions to stay - no, that attachment can also go off and spin a justificational landscape to convince the other parts of ourselves, even the part that knows about consequentialism and the sunk cost fallacy, to stay in the PhD program.
We're almost certain that the subject matter of "motivated cognition" isn't a single unit, probably more like 3 or 8 units. We're also highly uncertain of where to start teaching it. Where we start will probably end up being determined by where we get the best suggestions for exercises that can teach it - i.e., end up being determined by what we (the community) can figure out how to teach well.
The cognitive patterns that we use to actually combat motivated cognition seem to break out along the following lines:
And also:
Exercises to teach all of these are desired, but I'm setting apart the Rationalization Patterns into a separate SotW, since there are so many that I'm worried 1-4 won't get fair treatment otherwise. This SotW will focus on items 1-3 above; #4 seems like more of a separate unit.
Conceptual understanding / insights / theoretical background:
The core reasons why rationalization doesn't work are given in The Bottom Line and Rationalization. The Bayesian analysis of selective search is given in What Evidence Filtered Evidence? and Conservation of Expected Evidence.
For further discussion, see the entire Against Rationalization sequence, also The Meditation on Curiosity (for the Litany of Tarski).
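For quick reference, the identity behind Conservation of Expected Evidence can be written in standard probability notation. The symbols H (hypothesis) and E (evidence) are labels chosen here, not quoted from the linked posts:

```latex
P(H) = P(H \mid E)\,P(E) + P(H \mid \neg E)\,P(\neg E)
```

In words: the prior equals the expected posterior, so if seeing E would raise your probability for H, then seeing not-E must lower it (assuming E is neither certain nor impossible).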
Some key concepts (it'd be nice if some exercise taught a gut-level understanding thereof, although as always the goal is to teach skills rather than concepts):
(We might also need an exercise just for getting people to understand the concept of motivated cognition at all. When Anna and Michael ran their first session on motivated cognition, they found that while most participants immediately recognized the notion of 'rationalization' from examples like Lucy above, several people had no idea what they were talking about - they didn't see why anyone would ever want to use a technique like the Litany of Tarski. Yes, we know you're skeptical, we also couldn't see how that could possibly be true a priori, but sometimes the evidence just punches you in the nose. After some investigation, it seems entirely possible that Alicorn has simply never rationalized, ever. Other cases (not Alicorn's) suggest that some people might have a very low need for verbal justification; even if they feel guilty about breaking their diet, they feel no urge to invent an elaborate excuse - they just break their diet. On the other hand, LW!Hermione failed to reproduce this experiment - she couldn't find anyone who didn't immediately recognize "rationalization" after 10 tries with her friends. We notice we are confused.)
(The upshot is that part of the challenge of constructing a first unit on motivated cognition may be to "Explain to some participants what the heck a 'rationalization' is, when they don't remember any internal experience of that" or might even be "Filter out attendees who don't rationalize in the first place, and have them do a different unit instead." Please don't be fascinated by this problem at the expense of the primary purpose of the unit, though; we're probably going to award at most 1 prize on this subtopic, and more likely 0, and there's an existing thread for further discussion.)
Countering the rationalization impulse / restoring truth-seeking:
The Tarski method: This is the new name of what we were previously calling the Litany of Tarski: "If the sky is blue, I want to believe the sky is blue; if the sky is not blue, I want to believe the sky is not blue; let me not become attached to beliefs I may not want."
Example: Suppose you walk outside on a fall day wearing a short-sleeved shirt, when you feel a slightly chill breath of air on your arms. You wonder if you should go back into the house and get a sweater. But that seems like work; and so your mind quickly notes that the Sun might come out soon and then you wouldn't need the sweater.
Diagram:
Visualizing all 4 quadrants of this binary proposition - the world is like A and I believe A, the world is like B and I believe A, etc. - should, in principle, emotionally confirm the truth of the proposition: "If it will be cold, I want to believe it's cold; if it's not cold, I want to believe it's not cold; let me not become attached to beliefs I may not want."
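As a rough illustration of the four quadrants just described, here is a small Python sketch that enumerates them for the sweater example. The function name and the wording of the labels are assumptions made for this sketch; it is a way of listing the cases, not a description of how the exercise was actually run.

```python
from itertools import product


def tarski_quadrants(proposition, negation):
    """List all four (world, belief) combinations for a binary proposition.

    The world is enumerated first, the belief second, and each quadrant
    is marked accurate when the belief matches the world.
    """
    quadrants = []
    for world_is_true, belief_is_true in product([True, False], repeat=2):
        quadrants.append({
            "world": proposition if world_is_true else negation,
            "belief": proposition if belief_is_true else negation,
            "accurate": world_is_true == belief_is_true,
        })
    return quadrants


for q in tarski_quadrants("it will be cold", "it will not be cold"):
    label = "belief matches world" if q["accurate"] else "belief is mistaken"
    print(f"World: {q['world']:<20} | I believe: {q['belief']:<20} | {label}")
```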
Eliezer and Anna, when using this method against the temptation to believe X, visualize only the quadrant "The world is not like X and I believe X" to remind themselves of the consequences; e.g. we would only visualize the "You are cold!" quadrant. Michael Smith (aka "Val", short for Valentine) says that after some practice on this technique as a kata, he was able to visualize all 4 quadrants quickly and that visualizing all 4 seemed to help.
Val also used an upside-down W-diagram with the two worlds at the top and the four beliefs at the bottom, to emphasize the idea that the world is there first, and is fixed, and we have only a choice of what to believe within a fixed world, not a choice of which background world to live in. The Tarski Method embodies a "Start from the world" mental process in which you visualize the world being there first, and your belief coming afterward; a similar "Start from the world" rule is likewise emphasized in the Bayes unit, wherein one starts from a world and asks about the probability of the evidence, rather than starting from the evidence and trying to make it match up with a world.
When we actually tested a unit based on asking people to draw Tarski squares, it didn't work very well - possibly because people didn't seem to understand what it was for, or when they would use it; possibly because it wasn't a group exercise. In any case, we already tried teaching this the obvious way ("Go draw Tarski squares!") and it didn't work. But it still seems worth teaching if someone can invent a better exercise, because it's something that multiple CfAR people actually use to counter the rationalization impulse / restore truthseeking in real life.
Become Curious: Detect non-curiosity and become curious. Anna's main alarm signal is noticing that she's not curious in the middle of a conversation - that she doesn't have an impulse-to-find-out the answer - at which point she tries to make herself curious about the subject of discussion. Besides visualizing the not-X-and-believe-X quadrant of the Tarski diagram, this is also something you may be able to do by brute introspection - remember the feeling of curiosity, and try to call it up. (This is probably in the top 3 most important things I learned from Anna. -- EY)
Take Pride in Your Objectivity: Julia teaches this as a primary counter in her Combat Reflexes unit (how to avoid instantly defending or attacking). Eliezer does this every time he admits he's wrong on the Internet - congratulates himself on being such a great rationalist, in order to apply counter-hedons to the flash of pain that would otherwise be associated.
Visualize a Fixed Probability: This is what Eliezer used as a child to stop being scared of the dark - he would deliberately visualize a murderer standing with a knife behind a door, then visualize his own thoughts having no effect on the fixed probability that any such murderer was actually present. In other words, the notion of a "true probability" that his thoughts couldn't affect, countered the fear of thoughts affecting reality. Visualizing there being a fixed frequency of worlds, or a lawful probability that a Bayesian agent would assign, can help in perceiving the futility of rationalization because you're trying to use arguments to move a lawful probability that is fixed. This is also part of the domain of Lawful Uncertainty, the notion that there are still rules which apply even when we're unsure (not presently part of any unit).
Imagine the Revelation: Anna imagines that the answer is about to be looked up on the Internet, that Omega is about to reveal the answer, etc., to check whether her thoughts would change if she were potentially about to be embarrassed right now. This detects belief-alief divergence, and also provides a truthseeking impulse.
Knowing the Rules: And finally, if you have sufficient mastery of probability theory or decision theory, you may have a procedure to follow which is lawful enough, and sufficiently well-understood, that rationalization can't influence it much without the mistake being blatant even to you. (In a sense, this is what most of Less Wrong is about - reducing the amount of self-honesty required by increasing the obviousness of mistakes.)
Noticing flinches and attachments, and raising them to conscious attention:
A trigger for use of curiosity-restoration or the Tarski Method: Noticing what it feels like for your mind to:
Anna's anti-rationalization makes heavy use of noticing suspect situations where the outside view says she might rationalize - cases where her status is at stake, and so on - and specific keywords like "I believe that" or "No, I really believe that". She wants to try training people to notice likely contexts for rationalization, and to figure out keywords that might indicate rationalization in themselves. (Eliezer has never tried to train himself to notice keywords because he figures his brain will just train itself to avoid the trigger phrase; and he worries about likely-context training because he's seen failure modes where no amount of evidence or sound argument is enough to overcome the suspicion of rationalization once it's been invoked.)
Awards for previous SotW suggestions:
$550 to Palladias for the Monday-Tuesday game, which has been tested ($50) and now adopted ($500) into the Be Specific unit (though it might be moved to some sort of Anticipation unit later on).
$550 to Stefie_K for her suggestion to have the instructor pretend to be someone who really wants you to invest in their company, but is never specific; also $50 to daenrys for the "More Specific!" improv-game suggestion. In combination these inspired the Vague Consultant game ("Hi, I'm a consultant, I'm here to improve your business processes!" "How?" "By consulting with stakeholders!") which has now been adopted into the Be Specific unit.
$50 to lincolnquirk for the "Channel Paul Graham" game, which we tested. We all thought this would work - it was our highest-rated candidate suggestion - but it didn't get positive audience feedback. Congratulations to lincolnquirk on a good suggestion nonetheless.
We haven't yet tested, but definitely intend to at least test, and are hence already awarding $50 to, the following idea:
$50 to John Maxwell IV for the Choose Your Own Adventure suggestion for the Consequentialism unit.
To claim a prize, send a LessWrong private message (so we know it originates from the same LW user account) to StephenCole.