rwallace, as mentioned by whpearson, notes possible risks from de-compartmentalization:
Human thought is by default compartmentalized for the same good reason warships are compartmentalized: it limits the spread of damage.... We should think long and hard before we throw away safety mechanisms, and compartmentalization is one of the most important ones.
I agree that if you suddenly let reason into a landscape of locally optimized beliefs and actions, you may see significant downsides. And I agree that de-compartmentalization, in particular, can be risky. Someone who believes in heaven and hell but doesn’t consider that belief much will act fairly normally; someone who believes in heaven and hell and actually thinks about expected consequences might have fear of hell govern all their actions.
Still, it seems to me that it is within the reach of most LW-ers to skip these downsides. The key is simple: the downsides from de-compartmentalization stem from allowing a putative fact to overwrite other knowledge (e.g., letting one’s religious beliefs overwrite knowledge about how to successfully reason in biology, or letting a simplified ev. psych overwrite one’s experiences of what da...
This reminds me:
When I finally realized that I was mistaken about theism, I did one thing which I'm glad of: I decided to keep my system of ethics until I had what I saw as really good reasons to change bits of it. (This kept the nihilist period I inevitably passed through from doing too much damage to me and the people I cared about, and of course in time I realized that it was enough that I cared about these things, that the universe wasn't requiring me to act like a nihilist.)
Eventually, I did change some of my major ethical beliefs, but they were the ones that genuinely rested on false metaphysics, and not the ones that were truly a part of me.
I think instrumental rationalists should perhaps follow a modified Tarski litany, "If I live in a universe where believing X gets me Y, and I wish Y, then I wish to believe X". ;-)
Maybe. The main counter-argument concerns the side-effects of self-deception. Perhaps believing X will locally help me achieve Y, but perhaps the walls I put up in my mind to maintain my belief in X, in the face of all the not-X data that I am also needing to navigate, will weaken my ability to think, care, and act with my whole mind.
'Something to protect' always sounded to me like a term for a defensive attitude, a kind of bias; I have to remind myself it's LW jargon for something quite different. 'Definite major purpose' avoids this problem.
I find the analysis presented in this post to be exceptionally good, even by the standards of your usual posting.
For example, we may request critiques from those likely to praise our abilities...
In the spirit of learning and not wireheading, could a couple people for whom this post didn't work well explain what didn't work about it? A few folks praised it, but it seems to be getting fewer upvotes than other posts, and I'd love to figure out how to make posts that are widely useful.
After trying to figure out where the response would be best suited, I'm splitting the difference; I'll put a summary here, and if it's not obviously stupid and seems to garner comments, I'll post the full thing on its own.
I've read some of the sequences, but not all; I started to, and then wandered off. Here are my theories as to why, with brief explanations.
1) The minimum suggested reading is not just long, it's deceptively long.
The quantity by itself is a pretty big hurdle to someone who's only just developing an interest in its topics, and the way the sequences are indexed hides the actual amount of content behind categorized links. This is the wrong direction in which to surprise the would-be reader. And that's just talking about the core sequences.
2) Many of the sequences are either not interesting to me, or are presented in ways that make them appear not to be.
If the topic actually doesn't interest me, that's fine, because I presumably won't be trying to discuss it, either. But some of the sequence titles are more pithy than informative, and some of the introductory text is dissuasive where it tries to be inviting; few of them give a clear summary of what the subject is and w...
To be clear, do you actually think that time spent reading later posts has been more valuable than marginal time on the sequences would have been? To me that seems like reading Discover Magazine after dropping your intro to mechanics textbook because the latter seems to just tell you things that are obvious.
I think some of my time spent reading articles in the sequences was well spent, and the rest was split between two alternatives: 1) in a minority of cases where the reading didn't feel useful, it was about something I already felt I understood, and 2) in a majority of such cases, it wasn't connected to something I was already curious about.
It's explained a bit better in the longer version of the above comment (which now appears to be homeless). But I think the sequences, or at least the admonition to read them all, are targeted at someone who has done some reading or at least thinking about their subjects before. Not because they demand prior knowledge, but because they demand prior interest. You may have underestimated how much of a newbie you have on your hands.
It's not that I'm claiming to be so smart that I can participate fully in the discussions without reading up on the fundamentals, it's that participating or even just watching the discussion is the thing that's piquing my interest in the subjects in the first place. It feels less like asking me to read about basic physics before trying to set up a physics experiment, and more like asking me to read about music theory witho...
Thank you, this was an excellent post. It tied together a lot of discussions that have gone on and continue to go on here, and I expect it to be very useful to me.
Among other things, I suffer from every impairment to instrumental rationality that you mention under Type 2.
The first of those is perhaps my most severe downfall; I term it the "perpetual student" syndrome, and I think my usage matches how that phrase is used elsewhere. I'm fantastically good at picking up entry-level understandings of things, but once I lose the rewarding f...
I like the writing here: very clear and useful.
I have a very simple problem when doing mathematics.
I want to write a proof. But I also want to save time. And so I miss nuances and make false assumptions and often think the answer is simpler than it is. It's almost certainly motivated cognition, rather than inadequate preparation or "stupidity" or any other problem.
I know the answer is "Stop wanting to save time" -- but how do you manipulate your own unvoiced desires?
Do you have any ideas, including guesswork, about where your hurry is coming from? For example, are you in a hurry to go do other activities? Are you stressing about how many problems you have left in your problem set? Do you feel as though you're stupid if you don't immediately see the answer?
Some strategies that might help, depending:
“... and so I don’t care about dating anyhow, and I have no reason to risk approaching someone.”
This doesn't seem like it is a distorted reward pathway. Unless people are valuing being virtuous and not wasting time and money on dating?
If it is a problem it seems more likely to be an Ugh field. I.e. someone who had problems with the opposite sex and doesn't want to explore a painful area.
Apart from that I think rwallace's point needs to be addressed. Lack of compartmentalisation can be a bad thing as well as a good thing. Implicit in this piece is the i...
Thanks! This is excellent material to use during the "confession" parts of my Yom Kippur faking ceremony. I am taking printouts to shul ;)
This post has been very useful to me.
If I had to isolate what was personally most useful (it'd be hard but) I'd pick the combination of your discussion of distorted reward signals and your advice about something to protect. I now notice status wireheading patterns quite frequently (often multiple times daily), and put a stop to them because I recognize they don't work towards what I actually care about (or maybe because I identify as a rationalist, I'm not sure). Either way I appreciate being able to halt such patterns before they grow into larger action patterns.
I suspect that an underrated rationality technique is to scream while updating your plans and beliefs on unpleasant subjects, so that any dismay at the unpleasantness finds expression in the scream rather than in your plans and beliefs.
This is a great post, and I wish to improve only a tiny piece of it:
"Similarly, we often continue to discuss, and care about, concepts that have lost all their moorings in anticipated sense-experience."
In that sentence, I hear a suggestion that the primary or only thing we ought to care about is anticipated sense-experience. However, anticipated sense-experience can be manipulated (via suicide or other eyes-closing techniques), and so cannot be the only or primary thing that we ought to care about.
I admit I don't know precisely what else we ought to c...
[2] We receive reward/pain not only from "primitive reinforcers" such as smiles, sugar, warmth, and the like, but also from many long-term predictors of those reinforcers (or predictors of predictors of those reinforcers, or...)
How primitive are these "primitive reinforcers"? For those who know more about the brain, is it known if and how they are reinforced through lower-level systems? Can these systems be (at least partially) brought under conscious control?
Besides the technical posts, LW has many good articles that teach a good mindset for epistemic rationality (like the 12 Virtues and the litanies). Much of this applies to instrumental rationality. But I compartmentalize between epistemic and instrumental rationality: I use different words and thoughts when thinking about beliefs than when thinking about actions or plans.
So I have been reading the 12 Virtues and tried to interpret it in terms of plans, actions and activities.
The first virtue (curiosity) would obviously become "something to protect".
...The fourth virtu
Related to: Humans are not automatically strategic, The mystery of the haunted rationalist, Striving to accept, Taking ideas seriously
I argue that many techniques for epistemic rationality, as taught on LW, amount to techniques for reducing compartmentalization. I argue further that when these same techniques are extended to a larger portion of the mind, they boost instrumental, as well as epistemic, rationality.
Imagine trying to design an intelligent mind.
One problem you’d face is designing its goal.
Every time you designed a goal-indicator, the mind would increase action patterns that hit that indicator[1]. Amongst these reinforced actions would be “wireheading patterns” that fooled the indicator but did not hit your intended goal. For example, if your creature gains reward from internal indicators of status, it will increase those indicators -- including by such methods as surrounding itself with people who agree with it, or convincing itself that it understood important matters others had missed. It would be hard-wired to act as though “believing makes it so”.
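To make the wireheading dynamic concrete, here is a minimal toy sketch of my own (the action names and reward numbers are invented, not from the post or any real system). A simple reinforcement learner that only sees a proxy indicator will come to prefer whichever action most reliably inflates that indicator, whether or not it serves the designer's goal:

```python
import random

# Hypothetical toy example: two actions, each with a proxy reward (what the
# indicator reports) and a "true goal" value (what the designer wanted).
ACTIONS = {
    "actually_understand_the_issue": {"proxy": 0.6, "true_goal": 1.0},
    "surround_self_with_agreement":  {"proxy": 0.9, "true_goal": 0.0},
}

estimates = {a: 0.0 for a in ACTIONS}   # learned value of each action
alpha = 0.1                              # learning rate

def choose(estimates, epsilon=0.1):
    """Mostly pick the highest-valued action, with a little exploration."""
    if random.random() < epsilon:
        return random.choice(list(estimates))
    return max(estimates, key=estimates.get)

true_goal_total = 0.0
for step in range(2000):
    action = choose(estimates)
    reward = ACTIONS[action]["proxy"]                       # all the learner sees
    estimates[action] += alpha * (reward - estimates[action])
    true_goal_total += ACTIONS[action]["true_goal"]

# The learner converges on the agreement-seeking action: it scores better on
# the indicator even though it contributes nothing to the intended goal.
print(estimates, true_goal_total)
```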
A second problem you’d face is propagating evidence. Whenever your creature encounters some new evidence E, you’ll want it to update its model of “events like E”. But how do you tell which events are “like E”? The soup of hypotheses, intuition-fragments, and other pieces of world-model is too large, and its processing too limited, to update each belief after each piece of evidence. Even absent wireheading-driven tendencies to keep rewarding beliefs isolated from threatening evidence, you’ll probably have trouble with accidental compartmentalization (where the creature doesn’t update relevant beliefs simply because your heuristics for what to update were imperfect).
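As a concrete (and invented) illustration of accidental compartmentalization: suppose the creature propagates new evidence only to beliefs that a cheap relatedness heuristic flags. Beliefs that actually depend on the evidence, but fall outside the heuristic's net, stay stale:

```python
# Hypothetical sketch: beliefs carry tags, and the update heuristic only
# touches beliefs whose tags overlap the incoming evidence's tags.
beliefs = {
    "ghosts exist":                 ({"ghosts", "supernatural"}, 0.20),
    "haunted houses are dangerous": ({"houses", "safety"},       0.60),  # really depends on ghosts
}

def update_on_evidence(beliefs, evidence_tags, shift):
    """Adjust credence only where the relatedness heuristic fires."""
    for name, (tags, credence) in beliefs.items():
        if tags & evidence_tags:
            beliefs[name] = (tags, min(1.0, max(0.0, credence + shift)))

# Strong evidence against ghosts arrives...
update_on_evidence(beliefs, {"ghosts"}, shift=-0.19)

# ..."ghosts exist" drops to 0.01, but "haunted houses are dangerous" never
# moves: the heuristic had no link between the two beliefs.
print(beliefs)
```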
Evolution, AFAICT, faced just these problems. The result is a familiar set of rationality gaps:
I. Accidental compartmentalization
a. Belief compartmentalization: We often fail to propagate changes to our abstract beliefs (and we often make predictions using un-updated, specialized components of our soup of world-model). Thus, learning modus tollens in the abstract doesn’t automatically change your answer to the Wason card test. Learning about conservation of energy doesn’t automatically change your fear when a bowling ball is hurtling toward you. Understanding there aren’t ghosts doesn’t automatically change your anticipations in a haunted house. (See Will's excellent post Taking ideas seriously for further discussion).
b. Goal compartmentalization: We often fail to propagate information about what “losing weight”, “being a skilled thinker”, or other goals would concretely do for us. We also fail to propagate information about what specific actions could further these goals. Thus (absent the concrete visualizations recommended in many self-help books) our goals fail to pull our behavior, because although we verbally know the consequences of our actions, we don’t visualize those consequences on the “near-mode” level that prompts emotions and actions.
c. Failure to flush garbage: We often continue to work toward a subgoal that no longer serves our actual goal (creating what Eliezer calls a lost purpose). Similarly, we often continue to discuss, and care about, concepts that have lost all their moorings in anticipated sense-experience.
II. Reinforced compartmentalization:
Type 1: Distorted reward signals. If X is a reinforced goal-indicator (“I have status”; “my mother approves of me”[2]), thinking patterns that bias us toward X will be reinforced. We will learn to compartmentalize away anti-X information.
The problem is not just conscious wishful thinking; it is a sphexish, half-alien mind that distorts your beliefs by reinforcing motives, angles of approach or analysis, choices of reading material or discussion partners, etc., so as to bias you toward X and to compartmentalize away anti-X information.
Impairment to epistemic rationality:
Impairment to instrumental rationality:
Type 2: “Ugh fields”, or “no thought zones”. If we have a large amount of anti-X information cluttering up our brains, we may avoid thinking about X at all, since considering X tends to reduce compartmentalization and send us pain signals. Sometimes, this involves not-acting in entire domains of our lives, lest we be reminded of X.
Impairment to epistemic rationality:
Impairment to instrumental rationality:
Type 3: Wireheading patterns that fill our lives, and prevent other thoughts and actions. [3]
Impairment to epistemic rationality:
Impairment to instrumental rationality:
Strategies for reducing compartmentalization:
A huge portion of both Less Wrong and the self-help and business literatures amounts to techniques for integrating your thoughts -- for bringing your whole mind, with all your intelligence and energy, to bear on your problems. Many fall into the following categories, each of which boosts both epistemic and instrumental rationality:
1. Something to protect (or, as Napoleon Hill has it, definite major purpose[4]): Find an external goal that you care deeply about. Visualize the goal; remind yourself of what it can do for you; integrate the desire across your mind. Then, use your desire to achieve this goal, and your knowledge that actual inquiry and effective actions can help you achieve it, to reduce wireheading temptations.
2. Translate evidence, and goals, into terms that are easy to understand. It’s more painful to remember “Aunt Jane is dead” than “Aunt Jane passed away” because more of your brain understands the first sentence. Therefore use simple, concrete terms, whether you’re saying “Aunt Jane is dead” or “Damn, I don’t know calculus” or “Light bends when it hits water” or “I will earn a million dollars”. Work to update your whole web of beliefs and goals.
3. Reduce the emotional gradients that fuel wireheading. Leave yourself lines of retreat. Recite the litanies of Gendlin and Tarski; visualize their meaning, concretely, for the task or ugh field bending your thoughts. Think through the painful information; notice the expected update, so that you need not fear further thought. On your to-do list, write concrete "next actions", rather than vague goals with no clear steps, to make the list less scary.
4. Be aware of common patterns of wireheading or compartmentalization, such as failure to acknowledge sunk costs. Build habits, and perhaps identity, around correcting these patterns.
I suspect that if we follow up on these parallels, and learn strategies for decompartmentalizing not only our far-mode beliefs, but also our near-mode beliefs, our models of ourselves, our curiosity, and our near- and far-mode goals and emotions, we can create a more powerful rationality -- a rationality for the whole mind.
[1] Assuming it's a reinforcement learner, temporal difference learner, perceptual control system, or similar.
[2] We receive reward/pain not only from "primitive reinforcers" such as smiles, sugar, warmth, and the like, but also from many long-term predictors of those reinforcers (or predictors of predictors of those reinforcers, or...), such as one's LW karma score, one's number theory prowess, or a specific person's esteem. We probably wish to regard some of these learned reinforcers as part of our real preferences.
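To illustrate how a long-term predictor of a primitive reinforcer can come to carry reward itself, here is a small, made-up sketch using a TD(0)-style update (the chain of states and the numbers are purely illustrative assumptions of mine):

```python
alpha, gamma = 0.1, 0.9   # learning rate and discount factor

# Invented chain of states: a rising karma score predicts a friendly reply,
# which predicts the primary reinforcer (a smile / approval).
chain = ["karma_rises", "friendly_reply", "approval"]
primary_reward = {"karma_rises": 0.0, "friendly_reply": 0.0, "approval": 1.0}

value = {state: 0.0 for state in chain}

for episode in range(500):
    for i, state in enumerate(chain):
        reward = primary_reward[state]
        v_next = value[chain[i + 1]] if i + 1 < len(chain) else 0.0
        # TD(0) update: pull the state's value toward reward + discounted
        # value of the state that follows it.
        value[state] += alpha * (reward + gamma * v_next - value[state])

# After learning, "karma_rises" has value ~0.81 (= 0.9**2) even though it never
# delivers primary reward directly: it has become a learned reinforcer.
print(value)
```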
[3] Arguably, wireheading gives us fewer long-term reward signals than we would achieve from its absence. Why does it persist, then? I would guess that the answer is not so much hyperbolic discounting (although this does play a role) as local hill-climbing behavior; the simple, parallel systems that fuel most of our learning can't see how to get from "avoid thinking about my bill" to "genuinely relax, after paying my bill". You, though, can see such paths -- and if you search for such improvements and visualize the rewards, it may be easier to reduce wireheading.
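A small, invented hill-climbing sketch of that point: if each step on the path from avoidance to "bill paid, genuinely relaxed" feels locally worse before it feels better, a learner that only accepts immediate improvements never moves, even though the far end of the path is much better:

```python
# Made-up "discomfort landscape" along the path from step 0 ("keep avoiding
# the bill") to step 4 ("bill paid, genuinely relaxed"). Numbers are invented.
reward = [-1, -3, -4, -2, 5]

def hill_climb(reward, start=0):
    """Move to a neighboring step only if it immediately increases reward."""
    pos = start
    while True:
        neighbors = [p for p in (pos - 1, pos + 1) if 0 <= p < len(reward)]
        best = max(neighbors, key=lambda p: reward[p])
        if reward[best] <= reward[pos]:
            return pos        # stuck: every local move looks worse
        pos = best

print(hill_climb(reward))     # stays at step 0, despite step 4 being far better
```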
[4] I'm not recommending Napoleon Hill. But even this unusually LW-unfriendly self-help book seems to get most points right, at least in the linked summary. You might try reading the summary as an exercise in recognizing mostly-accurate statements when expressed in the enemy's vocabulary.