If you find yourself on a playing field where everyone else is a TrollBot (players who cooperate with you if and only if you cooperate with DefectBot) then you should cooperate with DefectBots and defect against TrollBots.
An example from real life: DefectBot = God, TrollBots = your religious neighbors. God does not reward you for your prayers, but your neighbors may punish you socially for lack of trying. You defect against your neighbors by secretly being a member of an atheist community, and generally by not punishing other nonbelievers.
I wonder what techniques we could use to make the compartmentalization stronger and easier to turn off when it's no longer needed. Clear boundaries. A possible solution would be to use the different set of beliefs only while wearing a silly hat. Not literally silly, because I might want to use it in public without handicapping myself. But some environmental reminder. An amulet, perhaps?
When I was being raised an Orthodox Jew, there were several talismans that served essentially this purpose (though of course my rabbis would not have phrased it that way).
she who wears the magic bracelet of future-self delegation prefers to do as she is ordered: http://i.imgur.com/5Bfq4we.png
One major example of a situation where you'll want to hack your terminal goals is if you're single and want to get into a relationship: you're far more likely to succeed if you genuinely enjoy the company of members-of-your-preferred-sex even when you don't think that it will lead to anything.
Agreed. Also helpful is if the parts of you with close access to e.g. your posture and voice tone have an unshakable belief in your dominance within the tribe, and your irresistible sex appeal. In fact social interaction in general is the best example of somewhere that dark side rationality is helpful.
This is the best article on lesswrong in some time. I think it should at least be considered for entry into the sequences. It raises some extremely important challenges to the general ethos around here.
Hmmm. I suspect it depends on the circumstances. If you are a (male) pirate and you want a pirate girlfriend, but the only females in your immediate surroundings are ninjas, you will not find their constant discussions of shurikens and stealthy assassinations enjoyable when you only want to talk about muskets and ship-boardings.
Then go somewhere else. Duh. :-)
This makes it sound trivial. Would you consider it trivial for someone ship-wrecked on a Ninja Island with a peg-leg and various other piraty injuries?
Then go somewhere else. Duh. :-)
You can't. You need money to repair your ship, and only ninjas hire in this economy...
You don't enjoy company of most members-of-your-preferred-sex, but are hopeful that there are people out there that you could spend your life with. The problem is that finding them is painful, because you have to spend time with people whose company you won't enjoy during the search.
By hacking yourself to enjoy their company you make the search actually pleasant. Though hopefully your final criteria do not change.
By hacking yourself to enjoy their company you make the search actually pleasant.
"We understand how dangerous a mask can be. We all become what we pretend to be." ― Patrick Rothfuss, The Name of the Wind
"No man, for any considerable period, can wear one face to himself and another to the multitude, without finally getting bewildered as to which may be the true." ― Nathaniel Hawthorne, The Scarlet Letter
Anyone who knows me knows that I'm quite familiar with the dark arts. I've even used hypnosis to con Christians into atheism a half dozen times. The tempting idea is that dark arts can be used for good - and that the ends justify the means. I've since changed my mind.
The thing is, even though I don't advocate dark arts for persuasion let alone rationality, I almost entirely agree with the actions you advocate. I just disagree strongly with the frame through which you look at them.
For example, I am heavily into what you call "changing terminal goals", however I disagree that I'm changing terminal goals. If I recognize that pursuing instrumental goal A for sake of "terminal" goal B is the best way to achieve goal B, I'll self modify in the way you describe. I'll also do that thing you frame as "being inconsistent" where I make sure to notice if chasing goal A is no longer the best way to achieve goal B, I self modify to stop chasing goal A. If you make sure to remember that step, goals are not sticky. You chase goal A "for its own sake" iff it is the best way to achieve goal B. That's what instrumental goals are.
The way I see it, the differen...
To each their own.
The way I see it, the difference in motivation comes not from "terminal vs instrumental", but from how you're focusing your attention.
This may be true for small subgoals, but I feel it's difficult for large goals. Consider learning to program. In my experience, it is much easier to become a good programmer if you actually love programming. Even if you successfully choose to focus on programming and manage not to be distracted by your "real" goals, the scheduler acts differently if you've decided to program versus if you love programming. The difference is in the details, like how you'll mentally debug a project you're working on while riding the bus, or scribble ideas in a notebook while in class, things that the scheduler wouldn't even consider if you've shifted your focus but haven't actually made programming an end unto itself.
If you can achieve the same level of commitment merely by shifting your focus, more power to you. In my experience, there is an extra boost I get from a task being an end in its own right.
That said, as I mentioned in the post, I seldom use terminal-goal-modification myself. Part of the point of that section was to ...
This may be true for small subgoals, but I feel it's difficult for large goals. Consider learning to program. In my experience, it is much easier to become a good programmer if you actually love programming.
That's the part I agree with.
I'm not a full blown programmer, but I have loved programming to the point of working on it for long stretches and losing sleep because I was too drawn to it to let my mind rest. I still call that kind of thing (and even more serious love for programming) "instrumental"
If you can achieve the same level of commitment merely by shifting your focus, more power to you. In my experience, there is an extra boost I get from a task being an end in its own right.
It's hard to describe in a single comment, but it's not the same as just "Hmmm... You make a good point. I guess I should focus on the instrumental goal". It's not a conscious decision to willpower some focus. I'm talking about the same process you are when you "switch an instrumental goal to terminal". It has all the same "qualia" associated with it.
It's just that I don't agree with calling "that thing I do because I love doing it" a "term...
Please write your own article. This is worthy content, but thousand-word comments are an awful medium.
Heh, I liked using the placebo effect for morning coffee at my office (they had decaf and regular). I didn't really want to acquire a caffeine dependency, so I'd pour a bit of caffeinated coffee into my cup (not a consistent amount), and then fill it the rest of the way with decaf. Then, as I walked back to my desk, I'd think to myself, "Wow, there's a mysterious amount of caffeine in this cup! I might wind up with a lot of energy!"
Worked pretty well.
Consider the Prisoner's Dilemma.
You really should call this "the Prisoner's dilemma with shared source code," because the only strategies for the True Prisoner's Dilemma are CooperateBot and DefectBot, of which it is obviously better to be DefectBot.
Overall, I found most of this post... aggravating? When you separated out local and global beliefs, I was pleased, and I wished you had done the same with 'optimality.' When you're playing the game of "submit source code to a many-way PD tournament, where you know that there's a DefectBot and many TrollBots," then the optimal move is "submit code which cooperates with DefectBot," even though you end up losing points to DefectBot. That doesn't seem contentious.
But it seems to me that anything you can do with dark arts, I can do with light arts. The underlying insight there is "understand which game you are playing"- the person who balks at cooperating with DefectBot has not scoped the game correctly. Similarly, the person who says "my goal is to become a rockstar" or "my goal is to become a writer" is playing the game where they talk about becoming a rockstar or a writer, n...
OK, I upvoted it before reading, and now that I have read it, I wish there were a karma transfer feature, so I could upvote it a dozen times more :) Besides the excellent content, the writing is exemplary (engaging multi-level state-explain-summarize style, with quality examples throughout).
By the way, speaking of karma transfer, here is one specification of such a feature: anyone with, say, 1000+ karma should be able to specify the number of upvotes to give, up to 10% of their total karma (diluted 10x for Main posts, since currently each Main upvote gives 10 karma points to OP). The minimum and transfer thresholds are there to prevent misuse of the feature with sock puppets.
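For what it's worth, the rule as I read it could be sketched roughly like this (hypothetical names and numbers, obviously not an actual LessWrong feature):

```python
MIN_KARMA = 1000        # transfer threshold, to keep sock puppets out
MAX_FRACTION = 0.10     # at most 10% of the giver's total karma per transfer
MAIN_POST_WEIGHT = 10   # each Main upvote currently grants 10 karma to the OP

def max_upvotes_to_give(giver_karma, is_main_post):
    if giver_karma < MIN_KARMA:
        return 0
    budget = giver_karma * MAX_FRACTION
    per_upvote = MAIN_POST_WEIGHT if is_main_post else 1  # the 10x dilution for Main
    return int(budget // per_upvote)

print(max_upvotes_to_give(1200, is_main_post=False))  # 120
print(max_upvotes_to_give(1200, is_main_post=True))   # 12
```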
Now, back to the subject at hand. What you call compartmentalization and what jimmy calls attention shifting I imagine in terms of the abstract data type "stack": in your current context you create an instance of yourself with a desired set of goals, then push your meta-self on stack and run the new instance. It is, of course, essential that the instance you create actually pops the stack and yields control at the right time, (and does not go on creating and running more instances, until you get a st...
Huh. Never noticed that.
Presumably you joined a while ago, when there weren't so many intimidating high-karma users around
For example, I happen to have just over 10k karma, does it make me a clique member? What about TheOtherDave, or Nancy?
Yes on all counts. You're clearly the cool kids here.
How do you tell if someone is in this clique?
You see them talk like they know each other. You see them using specialized terms without giving any context because everybody knows that stuff already. You see their enormous, impossible karma totals and wonder if they've hacked the system somehow.
How does someone in the clique tell if she is?
Dunno. It probably looks completely different from the other side. I'm just saying that's what it feels like (and this is bad for attracting new members), not that's what it's really like.
And I doubt it's unique to LessWrong
Relevant link: The Tyranny of Structurelessness (which is mostly talking about real-life political groups, but still, much of it is relevant):
...Contrary to what we would like to believe, there is no such thing as a structureless group. Any group of people of whatever nature that comes together for any length of time for any purpose will inevitably structure itself in some fashion [...]
Elites are nothing more, and nothing less, than groups of friends who also happen to participate in the same political activities. They would probably maintain their friendship whether or not they were involved in political activities; they would probably be involved in political activities whether or not they maintained their friendships. It is the coincidence of these two phenomena which creates elites in any group and makes them so difficult to break.
These friendship groups function as networks of communication outside any regular channels for such communication that may have been set up by a group. If no channels are set up, they function as the only networks of communication. Because people are friends, because they usually share the same values and orientat
Great post!
That said, I've found that cultivating a gut-level feeling that what you're doing must be done, and must be done quickly, is an extraordinarily good motivator. It's such a strong motivator that I seldom explicitly acknowledge it. I don't need to mentally invoke "we have to study or the world ends".
This reminds me of the fact that, like many others, on many tasks I tend to work best right before the deadline, while doing essentially nothing in the time before that deadline. Of course, for some tasks this is actually the rational way of approaching things (if they can be accomplished sufficiently well in that time just before the deadline). But then there are also tasks for which it would be very useful if I could temporarily turn on a belief saying that this is urgent.
Interestingly, there are also tasks that I can only accomplish well if there isn't any major time pressure, and a looming deadline prevents me from getting anything done. For those cases, it would be useful if I could just turn off the belief saying that I don't have much time left.
This was awesome, but the last bit terrified me. I don't know what that means exactly, but I think it means I shouldn't do it. I'm definitely going to try the placebo thing. Unfortunately, the others don't seem... operationalized? enough to really try implementing - what is it that you do to create a compartment in which you are unstoppable? How do you convince that part of yourself/yourself in that mode of that?
Yeah, that's a good question. I haven't given any advice as to how to set up a mental compartment, and I doubt it's the sort of advice you're going to find around these parts :-)
Setting up a mental compartment is easier than it looks.
First, pick the idea that you want to "believe" in the compartment.
Second, look for justifications for the idea and evidence for the idea. This should be easy, because your brain is very good at justifying things. It doesn't matter if the evidence is weak, just pour it in there: don't treat it as weak probabilistic evidence, treat it as "tiny facts".
It's very important that, during this process, you ignore all counter-evidence. Pick and choose what you listen to. If you've been a rationalist for a while, this may sound difficult, but it's actually easy. Your brain is very good at reading counter-evidence and disregarding it offhand if it doesn't agree with what you "know". Fuel that confirmation bias.
Proceed to regulate information intake into the compartment. If you're trying to build up "Nothing is Beyond My Grasp", then every time that you succeed at something, feed that pride and success into the compartment. Every time you fail, though, simply remind yourself that you knew it was a compartment, and this isn't too surprising, and don't let the compartment update.
Before long, you'll have this discontinuous belief that's completely out of sync with reality.
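If you want the mechanism laid bare, here is a toy model of that recipe as one-sided updating (an illustrative sketch of mine, not a claim about how brains actually implement compartments):

```python
# The global belief updates on every outcome; the compartment only lets successes in.
import random
random.seed(0)

true_success_rate = 0.5
global_belief = 0.5
compartment = 0.5

for _ in range(200):
    succeeded = random.random() < true_success_rate
    # Outside the compartment: a running average over all the evidence.
    global_belief += 0.05 * ((1.0 if succeeded else 0.0) - global_belief)
    # Inside the compartment: feed in the successes, refuse to update on the failures.
    if succeeded:
        compartment += 0.05 * (1.0 - compartment)

print(round(global_belief, 2))  # stays near the true rate
print(round(compartment, 2))    # crawls toward certainty, out of sync with reality
```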
This article is awesome! I've been doing this kind of stuff for years with regards to motivation, attitudes, and even religious belief. I've used the terminology of "virtualisation" to talk about my thought-processes/thought-rituals in carefully defined compartments that give me access to emotions, attitudes, skills, etc. I would otherwise find difficult. I even have a mental framework I call "metaphor ascendence" to convert false beliefs into virtualised compartments so that they can be carefully dismantled without loss of existing ...
Excellent post. I'm glad to find some more stuff on LessWrong that's directly applicable to real life and the things I'm doing right now.
Overall, an excellent post. It brought up some very clever ideas that I had never thought of or previously encountered.
I do, however, think that your colloquial use of the phrase "terminal value" is likely to confuse and/or irritate a lot of the serious analytic-philosophy crowd here; it might be wise to use some other word or other phrase for your meaning, which seems to be closer to "How an idealized[1] utility-maximizing agent would represent its (literal) Terminal Values internally". Perhaps a "Goal-in-itself"? A "motivationally core goal"?
Another small example. I have a clock near the end of my bed. It runs 15 minutes fast. Not by accident: it's been reset many times and then set back to 15 minutes fast. I know it's fast; we even call it the "rocket clock". None of this knowledge diminishes its effectiveness at getting me out of bed sooner, and making me feel more guilty for staying up late. Works very well.
Glad to discover I can now rationalise it as entirely rational behaviour and simply the dark side (where "dark side" only serves to increase perceived awesomeness anyway).
EDIT: I think the effects were significantly worse than this and caused a ton of burnout and emotional trauma. Turns out thinking the world will end with 100% probability if you don't save it, plus having heroic responsibility, can be a little bit tough sometimes...
I worry most people will ignore the warnings around willful inconsistency, so let me self-report that I did this and it was a bad idea. Central problem: It's hard to rationally update off new evidence when your system 1 is utterly convinced of something. And I think this screwed with my epistemi...
Just read a paper suggesting that "depleting" willpower is a mechanism that gradually prioritizes "want to do" goals over "should/have to do" goals. I'm guessing that "terminal goal hacking" could be seen as a way to shift goals from the "should/have to do" category to the "want to do" category, thus providing a massive boost in one's ability to actually do them.
Ah! So that's what I've been doing wrong. When I tried to go to the gym regularly with the goal of getting stronger/bigger/having more energy, the actual process of exercising was merely instrumental to me so I couldn't motivate myself to do it consistently. Two of my friends who are more successful at exercising than me have confirmed that for them exercising is both instrumental and a goal in and of itself.
But while I'm down with the idea of hacking terminal goals, I have no idea how to do that. Whereas compartmentalizing is easy (just ignore evidence against the position you want to believe), goal hacking sounds very difficult. Any suggestions/resources for learning how to do this?
and I am confident that I can back out (and actually correct my intuitions) if the need arises.
Did you ever do this? Or are you still running on some top-down overwritten intuitive models?
If you did back out, what was that like? Did you do anything in particular, or did this effect fade over time?
Ideally, we would be just as motivated to carry out instrumental goals as we are to carry out terminal goals. In reality, this is not the case. As a human, your motivation system does discriminate between the goals that you feel obligated to achieve and the goals that you pursue as ends unto themselves.
I don't think that this is quite right actually.
If the psychological link between them is strong in the right way, the instrumental goal will feel as appealing as the terminal goal (because succeeding at the instrumental goal feels like making progress on th...
What great timing! I've just started investigating the occult and chaos magick (with a 'k') just to see if it works.
Another nail hit squarely on the head. Your concept of a strange playing field has helped crystallize an insight I've been grappling with for a while -- a strategy can be locally rational even if it is in some important sense globally irrational. I've had several other insights which are specific instances of this and which I only just realized are part of a more general phenomenon. I believe it can be rational to temporarily suspend judgement in the pursuit of certain kinds of mystical experiences (and have done this with some small success), and I believ...
I really liked the introduction - really well done. (shminux seems to agree!)
Some constructive criticisms:
'There are playing fields where you should cooperate with DefectBot, even though that looks completely insane from a naïve viewpoint. Optimality is a feature of the playing field, not a feature of the strategy.' - I like your main point made with TrollBot, but this last sentence doesn't seem like a good way of summing up the lesson. What the lesson seems to be in my eyes is: strategies' being optimal or not is playing-field relative. So you could say t...
I've crossed paths with many a confused person who (without any explicit thought on their part) had really silly terminal goals. We've all met people who are acting as if "Acquire Money" is a terminal goal, never noticing that money is almost entirely instrumental in nature. When you ask them "but what would you do if money was no issue and you had a lot of time", all you get is a blank stare.
I'm incapable of comprehending this mental state. If I didn't see so much evidence all around me that there are lots and lots of people who seem to just want money as...
I think I've been able to make outstanding progress last year in improving rationality and starting to work on real problems, mostly because of megalomaniac beliefs that were somewhat compartmentalised but that I was able to feel at a gut level each time I had to start working.
Lately, as a result of my progress, I've started slowing down because I was able to come to terms with these megalomaniac beliefs and realise at a gut level they weren't accurate, so a huge chunk of my drive faded, and my predictions about my goals updated on what I felt I could r...
I think the distinction (and disjunction) between instrumental and terminal goals is an oversimplification, at least when applied to motivation (as you've demonstrated). My current understanding of goal-setting is that instrumental goals can also be terminal in the sense that one enjoys or is in the habit of doing them.
To take the rock star example: It's not a lie to enjoy practicing or to enjoy making music, but it's still true that getting good at music is also instrumental to the goal of becoming a rock star. I might say that making music as being instru...
Great content, but this post is significantly too long. Honestly, each of your main points seems worthy of a post of its own!
I propose that we reappropriate the white/black/grey hat terminology from the Linux community, and refer to black/white/grey cloak rationality. Someday perhaps we'll have red cloak rationalists.
To summarize: belief in things that are not actually true may have a beneficial impact on your day-to-day life?
You don't really require any level of rationality skill to arrive at that conclusion, but the writeup is quite interesting.
Just don't fall into the trap of thinking I am going to swallow this placebo and feel better, because I know that even though placebo does not work... crap. Let's start from the beginning....
The major points are:
The insane thing is that you can think "I am going to swallow this placebo and feel better, even though I know it's a placebo" and it still works. Like I said, brains are weird.
How do you know compartmentalisation is more mentally efficient than dissonance of mental representations of will power?
First, appreciation: I love that calculated modification of self. These, and similar techniques, can be very useful if put to use in the right way. I recognize myself here and there. You did well to abstract it all out this clearly.
Second, a note: You've described your techniques from the perspective of how they deviate from epistemic rationality - "Changing your Terminal Goals", "Intentional Compartmentalization", "Willful inconsistency". I would've been more inclined to describe them from the perspective of their central eff...
As a comment about changing an instrumental value to a terminal value, I'm just going to copy and paste from a recent thread, as it seems equally relevant here.
The recent thread about Willpower as a Resource identified the fundamental issue.
There are tasks we ought to do, and tasks we want to do, and the latter don't suffer from willpower limitations. Find tasks that serve your purpose that you want to do. Then do your best to remind yourself that you want the end, and therefore you want the means. Attitude is everything.
That said, it's hard to overstate how useful it is to have a gut-level feeling that there's a short, hard timeline.
When I was in college, about the only way I could avoid completing assignments just before (and occasionally after) the deadline, was to declare that assignments were due within hours of being assigned.
The second and third sections were great. But is the idea of 'terminal goal hacking' actually controversial? Without the fancy lingo, it says that it's okay to learn how to genuinely enjoy new activities and turn that skill to activities that don't seem all that fun now but are useful in the long term. This seems like a common idea in discourse about motivation. I'd be surprised if most people here didn't already agree with it.
This made the first section boring to me and I was about to conclude that it's yet another post restating obvious things in needlessly complicated terms and walk away. Fortunately, I kept on reading and got to the fun parts, but it was close.
Yes, good post.
A related technique might be called the "tie-in." So for example, the civil rights advocate who wants to quit smoking might resolve to donate $100 to the KKK if he smokes another cigarette. So that the goal of quitting smoking gets attached to the goal of not wanting to betray one's passionately held core beliefs.
In fact, one could say that most motivational techniques rely on some form of goal-tweaking.
I think that at least part of the benefit from telling yourself "I don't get ego depletion" is from telling yourself "I don't accept ego depletion as an excuse to stop working".
If you model motivation as trying to get long-term goals accomplished while listening to a short-term excuse-to-stop-working generator, it matches up pretty well. I did a short test just now with telling myself "I don't accept being tired as an excuse to stop rasping out a hole", and I noticed at least two attempts to use just that excuse that I'd normally take.
I mean actions that feel like ends unto themselves. I am speaking of the stuff you wish you were doing when you're doing boring stuff, the things you do in your free time just because they are fun, the actions you don't need to justify.
If I'm hungry, eating feels like an end unto itself. Once I'm full, not eating feels like an end unto itself. If I'm bored doing a task, doing something else feels like an end unto itself, until I've worked on a new task enough to be bored. All my values seem to need justification to an extent, and labeling any of them as terminal values seems to cause significant cognitive dissonance at some point in time.
I haven't finished this post yet, but the first section (on "hacking terminal values") seems to be best illuminated by the very paragraph in another, older post (by Yvain) I happened to be reading in another tab.
To wit:
...When I speak of "terminal goals" I mean actions that feel like ends unto themselves. I am speaking of the stuff you wish you were doing when you're doing boring stuff, the things you do in your free time just because they are fun, the actions you don't need to justify.
This seems like the obvious meaning of "terminal
Nice post. However, it might be better to characterize the first two classes as beliefs which are true because of the belief, instead of as false beliefs (which is important so as not to unconsciously weaken our attachment to truth). For example, in your case of believing that water will help you feel better, the reason you believe it is that it is actually true by virtue of the belief; similarly, when the want-to-be rock star enjoys making music for its own sake, the belief that making music is fun is now true.
In many games there is no "absolutely optimal" strategy. Consider the Prisoner's Dilemma. The optimal strategy depends entirely upon the strategies of the other players. Entirely.
In the standard, one-shot, non-cooperative Prisoner's Dilemma, "Defect" is the optimal strategy, regardless of what the other player does.
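For anyone who wants that spelled out, here is a quick dominance check under the usual payoff ordering (the numbers below are arbitrary; only T > R > P > S matters):

```python
T, R, P, S = 5, 3, 1, 0  # temptation, reward, punishment, sucker's payoff
my_payoff = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}

for their_move in ("C", "D"):
    assert my_payoff[("D", their_move)] > my_payoff[("C", their_move)]
# Defect beats Cooperate against either move the other player makes: a dominant strategy.
```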
Great post. The points strike me as rather obvious, but now that there's an articulately-written LessWrong post about them I can say them out loud without embarrassment, even among fellow rationalists.
That said, the post could probably be shorter. But not too much shorter! Too short wouldn't look as respectable.
Note: the author now disclaims this post, and asserts that his past self was insufficiently skilled in the art of rationality to "take the good and discard the bad" even when you don't yet know how to justify it. You can, of course, get all the benefits described below, without once compromising your epistemics.
Today, we're going to talk about Dark rationalist techniques: productivity tools which seem incoherent, mad, and downright irrational. These techniques include changing your terminal goals, intentional compartmentalization, and willful inconsistency.
I expect many of you are already up in arms. It seems obvious that consistency is a virtue, that compartmentalization is a flaw, and that one should never modify their terminal goals.
I claim that these 'obvious' objections are incorrect, and that all three of these techniques can be instrumentally rational.
In this article, I'll promote the strategic cultivation of false beliefs and condone mindhacking on the values you hold most dear. Truly, these are Dark Arts. I aim to convince you that sometimes, the benefits are worth the price.
Changing your Terminal Goals
In many games there is no "absolutely optimal" strategy. Consider the Prisoner's Dilemma. The optimal strategy depends entirely upon the strategies of the other players. Entirely.
Intuitively, you may believe that there are some fixed "rational" strategies. Perhaps you think that even though complex behavior is dependent upon other players, there are still some constants, like "Never cooperate with DefectBot". DefectBot always defects against you, so you should never cooperate with it. Cooperating with DefectBot would be insane. Right?
Wrong. If you find yourself on a playing field where everyone else is a TrollBot (players who cooperate with you if and only if you cooperate with DefectBot) then you should cooperate with DefectBots and defect against TrollBots.
Consider that. There are playing fields where you should cooperate with DefectBot, even though that looks completely insane from a naïve viewpoint. Optimality is not a feature of the strategy, it is a relationship between the strategy and the playing field.
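For concreteness, here is a tiny payoff sketch of that playing field (a toy model; the payoff values, the field size, and the helper names are my own assumptions, not anything specified in this post):

```python
# Toy model of a field containing one DefectBot and several TrollBots.
# A TrollBot cooperates with you iff you cooperated with DefectBot.
T, R, P, S = 5, 3, 1, 0  # assumed standard PD payoffs: temptation, reward, punishment, sucker

def my_payoff(my_move, their_move):
    return {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}[(my_move, their_move)]

def total_score(cooperate_with_defectbot, move_vs_trollbots, n_trollbots=10):
    score = my_payoff("C" if cooperate_with_defectbot else "D", "D")  # DefectBot always defects
    trollbot_move = "C" if cooperate_with_defectbot else "D"          # TrollBots watch that game
    return score + n_trollbots * my_payoff(move_vs_trollbots, trollbot_move)

print(total_score(False, "D"))  # the "never cooperate with DefectBot" purist: 1 + 10*1 = 11
print(total_score(True, "D"))   # cooperate with DefectBot, defect against TrollBots: 0 + 10*5 = 50
```

The purist loses to DefectBot and to every TrollBot; the "insane" strategy eats one sucker's payoff and harvests the rest of the field.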
Take this lesson to heart: in certain games, there are strange playing fields where the optimal move looks completely irrational.
I'm here to convince you that life is one of those games, and that you occupy a strange playing field right now.
Here's a toy example of a strange playing field, which illustrates the fact that even your terminal goals are not sacred:
Imagine that you are completely self-consistent and have a utility function. For the sake of the thought experiment, pretend that your terminal goals are distinct, exclusive, orthogonal, and clearly labeled. You value your goals being achieved, but you have no preferences about how they are achieved or what happens afterwards (unless the goal explicitly mentions the past/future, in which case achieving the goal puts limits on the past/future). You possess at least two terminal goals, one of which we will call A.

Omega descends from on high and makes you an offer. Omega will cause your terminal goal A to become achieved over a certain span of time, without any expenditure of resources. As a price of taking the offer, you must switch out terminal goal A for terminal goal B. Omega guarantees that B is orthogonal to A and all your other terminal goals. Omega further guarantees that you will achieve B using less time and resources than you would have spent on A. Any other concerns you have are addressed via similar guarantees.

Clearly, you should take the offer. One of your terminal goals will be achieved, and while you'll be pursuing a new terminal goal that you (before the offer) don't care about, you'll come out ahead in terms of time and resources which can be spent achieving your other goals.
So the optimal move, in this scenario, is to change your terminal goals.
There are times when the optimal move of a rational agent is to hack its own terminal goals.
You may find this counter-intuitive. It helps to remember that "optimality" depends as much upon the playing field as upon the strategy.
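If it helps to see the bookkeeping, here is the Omega trade as a few lines of arithmetic (the specific numbers are mine and purely illustrative; the only thing that matters is that B is guaranteed to cost less than A would have):

```python
# Illustrative resource accounting for Omega's offer.
resources = 100
cost_A = 30   # what achieving goal A would have cost you on your own
cost_B = 10   # Omega guarantees B costs strictly less than A would have

decline = resources - cost_A   # 70 left for your other goals; A achieved by your own effort
accept = resources - cost_B    # 90 left; A achieved for free, and the new goal B achieved too

assert accept > decline        # every goal you originally cared about comes out at least as well off
```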
Next, I claim that such scenarios are not restricted to toy games where Omega messes with your head. Humans encounter similar situations on a day-to-day basis.
Humans often find themselves in a position where they should modify their terminal goals, and the reason is simple: our thoughts do not have direct control over our motivation.
Unfortunately for us, our "motivation circuits" can distinguish between terminal and instrumental goals. It is often easier to put in effort, experience inspiration, and work tirelessly when pursuing a terminal goal as opposed to an instrumental goal. It would be nice if this were not the case, but it's a fact of our hardware: we're going to do X more if we want to do X for its own sake as opposed to when we force X upon ourselves.
Consider, for example, a young woman who wants to be a rockstar. She wants the fame, the money, and the lifestyle: these are her "terminal goals". She lives in some strange world where rockstardom is wholly dependent upon merit (rather than social luck and network effects), and decides that in order to become a rockstar she has to produce really good music.
But here's the problem: She's a human. Her conscious decisions don't directly affect her motivation.
In her case, it turns out that she can make better music when "Make Good Music" is a terminal goal as opposed to an instrumental goal.
When "Make Good Music" is an instrumental goal, she schedules practice time on a sitar and grinds out the hours. But she doesn't really like it, so she cuts corners whenever akrasia comes knocking. She lacks inspiration and spends her spare hours dreaming of stardom. Her songs are shallow and trite.
When "Make Good Music" is a terminal goal, music pours forth, and she spends every spare hour playing her sitar: not because she knows that she "should" practice, but because you couldn't pry her sitar from her cold dead fingers. She's not "practicing", she's pouring out her soul, and no power in the 'verse can stop her. Her songs are emotional, deep, and moving.
It's obvious that she should adopt a new terminal goal.
Ideally, we would be just as motivated to carry out instrumental goals as we are to carry out terminal goals. In reality, this is not the case. As a human, your motivation system does discriminate between the goals that you feel obligated to achieve and the goals that you pursue as ends unto themselves.
As such, it is sometimes in your best interest to modify your terminal goals.
Mind the terminology, here. When I speak of "terminal goals" I mean actions that feel like ends unto themselves. I am speaking of the stuff you wish you were doing when you're doing boring stuff, the things you do in your free time just because they are fun, the actions you don't need to justify.
This seems like the obvious meaning of "terminal goals" to me, but some of you may think of "terminal goals" more akin to self-endorsed morally sound end-values in some consistent utility function. I'm not talking about those. I'm not even convinced I have any.
Both types of "terminal goal" are susceptible to strange playing fields in which the optimal move is to change your goals, but it is only the former type of goal — the actions that are simply fun, that need no justification — which I'm suggesting you tweak for instrumental reasons.
I've largely refrained from goal-hacking, personally. I bring it up for a few reasons:
I've crossed paths with many a confused person who (without any explicit thought on their part) had really silly terminal goals. We've all met people who are acting as if "Acquire Money" is a terminal goal, never noticing that money is almost entirely instrumental in nature. When you ask them "but what would you do if money was no issue and you had a lot of time", all you get is a blank stare.
Even the LessWrong Wiki entry on terminal values describes a college student for which university is instrumental, and getting a job is terminal. This seems like a clear-cut case of a Lost Purpose: a job seems clearly instrumental. And yet, we've all met people who act as if "Have a Job" is a terminal value, and who then seem aimless and undirected after finding employment.
These people could use some goal hacking. You can argue that Acquire Money and Have a Job aren't "really" terminal goals, to which I counter that many people don't know their ass from their elbow when it comes to their own goals. Goal hacking is an important part of becoming a rationalist and/or improving mental health.
Goal-hacking in the name of consistency isn't really a Dark Side power. This power is only Dark when you use it like the musician in our example, when you adopt terminal goals for instrumental reasons. This form of goal hacking is less common, but can be very effective.
I recently had a personal conversation with Alexei, who is earning to give. He noted that he was not entirely satisfied with his day-to-day work, and mused that perhaps goal-hacking (making "Do Well at Work" an end unto itself) could make him more effective, generally happier, and more productive in the long run.
Goal-hacking can be a powerful technique, when correctly applied. Remember, you're not in direct control of your motivation circuits. Sometimes, strange though it seems, the optimal action involves fooling yourself.
You don't get good at programming by sitting down and forcing yourself to practice for three hours a day. I mean, I suppose you could get good at programming that way. But it's much easier to get good at programming by loving programming, by being the type of person who spends every spare hour tinkering on a project. Because then it doesn't feel like practice, it feels like fun.
This is the power that you can harness, if you're willing to tamper with your terminal goals for instrumental reasons. As rationalists, we would prefer to dedicate to instrumental goals the same vigor that is reserved for terminal goals. Unfortunately, we find ourselves on a strange playing field where goals that feel justified in their own right win the lion's share of our attention.
Given this strange playing field, goal-hacking can be optimal.
You don't have to completely mangle your goal system. Our aspiring musician from earlier doesn't need to destroy her "Become a Rockstar" goal in order to adopt the "Make Good Music" goal. If you can successfully convince yourself to believe that something instrumental is an end unto itself (i.e., terminal), while still believing that it is instrumental, then more power to you.
This is, of course, an instance of Intentional Compartmentalization.
Intentional Compartmentalization
As soon as you endorse modifying your own terminal goals, Intentional Compartmentalization starts looking like a pretty good idea. If Omega offers to achieve A at the price of dropping A and adopting B, the ideal move is to take the offer after finding a way to not actually care about B.

A consistent agent cannot do this, but I have good news for you: You're a human. You're not consistent. In fact, you're great at being inconsistent!
You might expect it to be difficult to add a new terminal goal while still believing that it's instrumental. You may also run into strange situations where holding an instrumental goal as terminal directly contradicts other terminal goals.
For example, our aspiring musician might find that she makes even better music if "Become a Rockstar" is not among her terminal goals.
This means she's in trouble: She either has to drop "Become a Rockstar" and have a better chance at actually becoming a rockstar, or she has to settle for a decreased chance that she'll become a rockstar.
Or, rather, she would have to settle for one of these choices — if she wasn't human.
I have good news! Humans are really really good at being inconsistent, and you can leverage this to your advantage. Compartmentalize! Maintain goals that are "terminal" in one compartment, but which you know are "instrumental" in another, then simply never let those compartments touch!
This may sound completely crazy and irrational, but remember: you aren't actually in control of your motivation system. You find yourself on a strange playing field, and the optimal move may in fact require mental contortions that make epistemic rationalists shudder.
Hopefully you never run into this particular problem (holding contradictory goals in "terminal" positions), but this illustrates that there are scenarios where compartmentalization works in your favor. Of course we'd prefer to have direct control of our motivation systems, but given that we don't, compartmentalization is a huge asset.
Take a moment and let this sink in before moving on.
Once you realize that compartmentalization is OK, you are ready to practice my second Dark Side technique: Intentional Compartmentalization. It has many uses outside the realm of goal-hacking.
See, motivation is a fickle beast. And, as you'll remember, your conscious choices are not directly attached to your motivation levels. You can't just decide to be more motivated.
At least, not directly.
I've found that certain beliefs — beliefs which I know are wrong — can make me more productive. (On a related note, remember that religious organizations are generally more coordinated than rationalist groups.)
It turns out that, under these false beliefs, I can tap into motivational reserves that are otherwise unavailable. The only problem is, I know that these beliefs are downright false.
I'm just kidding, that's not actually a problem. Compartmentalization to the rescue!
Here are a couple of example beliefs that I keep locked away in my mental compartments, bound up in chains. Every so often, when I need to be extra productive, I don my protective gear and enter these compartments. I never fully believe these things — not globally, at least — but I'm capable of attaining "local belief", of acting as if I hold these beliefs. This, it turns out, is enough.
Nothing is Beyond My Grasp
We'll start off with a tame belief, something that is soundly rooted in evidence outside of its little compartment.
I have a global belief, outside all my compartments, that nothing is beyond my grasp.
Others may understand things more easily or more quickly than I do. People smarter than myself grok concepts with less effort than I. It may take me years to wrap my head around things that other people find trivial. However, there is no idea that a human has ever had that I cannot, in principle, grok.
I believe this with moderately high probability, just based on my own general intelligence and the fact that brains are so tightly clustered in mind-space. It may take me a hundred times the effort to understand something, but I can still understand it eventually. Even things that are beyond the grasp of a meager human mind, I will one day be able to grasp after I upgrade my brain. Even if there are limits imposed by reality, I could in principle overcome them if I had enough computing power. Given any finite idea, I could in theory become powerful enough to understand it.
This belief, itself, is not compartmentalized. What is compartmentalized is the certainty.
Inside the compartment, I believe that Nothing is Beyond My Grasp with 100% confidence. Note that this is ridiculous: there's no such thing as 100% confidence. At least, not in my global beliefs. But inside the compartments, while we're in la-la land, it helps to treat Nothing is Beyond My Grasp as raw, immutable fact.
You might think that it's sufficient to believe Nothing is Beyond My Grasp with very high probability. If that's the case, you haven't been listening: I don't actually believe Nothing is Beyond My Grasp with an extraordinarily high probability. I believe it with moderate probability, and then I have a compartment in which it's a certainty.
It would be nice if I never needed to use the compartment, if I could face down technical problems and incomprehensible lingo and being really out of my depth with a relatively high confidence that I'm going to be able to make sense of it all. However, I'm not in direct control of my motivation. And it turns out that, through some quirk in my psychology, it's easier to face down the oppressive feeling of being in way over my head if I have this rock-solid "belief" that Nothing is Beyond My Grasp.
This is what the compartments are good for: I don't actually believe the things inside them, but I can still act as if I do. That ability allows me to face down challenges that would be difficult to face down otherwise.
This compartment was largely constructed with the help of The Phantom Tollbooth: it taught me that there are certain impossible tasks you can do if you think they're possible. It's not always enough to know that if I believe I can do a thing, then I have a higher probability of being able to do it. I get an extra boost from believing I can do anything.
You might be surprised about how much you can do when you have a mental compartment in which you are unstoppable.
My Willpower Does Not Deplete
Here's another: My Willpower Does Not Deplete.
Ok, so my willpower actually does deplete. I've been writing about how it does, and discussing methods that I use to avoid depletion. Right now, I'm writing about how I've acknowledged the fact that my willpower does deplete.
But I have this compartment where it doesn't.
Ego depletion is a funny thing. If you don't believe in ego depletion, you suffer less ego depletion. This does not eliminate ego depletion.
Knowing this, I have a compartment in which My Willpower Does Not Deplete. I go there often, when I'm studying. It's easy, I think, for one to begin to feel tired, and say "oh, this must be ego depletion, I can't work anymore." Whenever my brain tries to go there, I wheel this bad boy out of his cage. "Nope", I respond, "My Willpower Does Not Deplete".
Surprisingly, this often works. I won't force myself to keep working, but I'm pretty good at preventing mental escape attempts via "phantom akrasia". I don't allow myself to invoke ego depletion or akrasia to stop being productive, because My Willpower Does Not Deplete. I have to actually be tired out, in a way that doesn't trigger the My Willpower Does Not Deplete safeguards. This doesn't let me keep going forever, but it prevents a lot of false alarms.
In my experience, the strong version (My Willpower Does Not Deplete) is much more effective than the weak version (My Willpower is Not Depleted Yet), even though it's more wrong. This probably says something about my personality. Your mileage may vary. Keep in mind, though, that the effectiveness of your mental compartments may depend more on the motivational content than on degree of falsehood.
Anything is a Placebo
Placebos work even when you know they are placebos.
This is the sort of madness I'm talking about, when I say things like "you're on a strange playing field".
Knowing this, you can easily activate the placebo effect manually. Feeling sick? Here's a freebie: drink more water. It will make you feel better.
No? It's just a placebo, you say? Doesn't matter. Tell yourself that water makes it better. Put that in a nice little compartment, save it for later. It doesn't matter that you know what you're doing: your brain is easily fooled.
Want to be more productive, be healthier, and exercise more effectively? Try using Anything is a Placebo! Pick something trivial and non-harmful and tell yourself that it helps you perform better. Put the belief in a compartment in which you act as if you believe the thing. Cognitive dissonance doesn't matter! Your brain is great at ignoring cognitive dissonance. You can "know" you're wrong in the global case, while "believing" you're right locally.
For bonus points, try combining objectives. Are you constantly underhydrated? Try believing that drinking more water makes you more alert!
Brains are weird.
Truly, these are the Dark Arts of instrumental rationality. Epistemic rationalists recoil in horror as I advocate intentionally cultivating false beliefs. It goes without saying that you should use this technique with care. Remember to always audit your compartmentalized beliefs through the lens of your actual beliefs, and be very careful not to let incorrect beliefs leak out of their compartments.
If you think you can achieve similar benefits without "fooling yourself", then by all means, do so. I haven't been able to find effective alternatives. Brains have been honing compartmentalization techniques for eons, so I figure I might as well re-use the hardware.
It's important to reiterate that these techniques are necessary because you're not actually in control of your own motivation. Sometimes, incorrect beliefs make you more motivated. Intentionally cultivating incorrect beliefs is surely a path to the Dark Side: compartmentalization only mitigates the damage. If you make sure you segregate the bad beliefs and acknowledge them for what they are then you can get much of the benefit without paying the cost, but there is still a cost, and the currency is cognitive dissonance.
At this point, you should be mildly uncomfortable. After all, I'm advocating something which is completely epistemically irrational. We're not done yet, though.
I have one more Dark Side technique, and it's worse.
Willful Inconsistency
I use Intentional Compartmentalization to "locally believe" things that I don't "globally believe", in cases where the local belief makes me more productive. In this case, the beliefs in the compartments are things that I tell myself. They're like mantras that I repeat in my head, at the System 2 level. System 1 is fragmented and compartmentalized, and happily obliges.
Willful Inconsistency is the grown-up, scary version of Intentional Compartmentalization. It involves convincing System 1 wholly and entirely of something that System 2 does not actually believe. There's no compartmentalization and no fragmentation. There's nowhere to shove the incorrect belief when you're done with it. It's taken over the intuition, and it's always on. Willful Inconsistency is about having gut-level intuitive beliefs that you explicitly disavow.
Your intuitions run the show whenever you're not paying attention, so if you're willfully inconsistent then you're going to actually act as if these incorrect beliefs are true in your day-to-day life, unless you forcibly override your default actions. Ego depletion and distraction make you vulnerable to yourself.
Use this technique with caution.
This may seem insane even to those of you who took the previous suggestions in stride. That you must sometimes alter your terminal goals is a feature of the playing field, not the agent. The fact that you are not in direct control of your motivation system readily implies that tricking yourself is useful, and compartmentalization is an obvious way to mitigate the damage.
But why would anyone ever try to convince themselves, deep down at the core, of something that they don't actually believe?
The answer is simple: specialization.
To illustrate, let me explain how I use willful inconsistency.
I have invoked Willful Inconsistency on only two occasions, and they were similar in nature. Only one instance of Willful Inconsistency is currently active, and it works like this:
I have completely and totally convinced my intuitions that unfriendly AI is a problem. A big problem. System 1 operates under the assumption that UFAI will come to pass in the next twenty years with very high probability.
You can imagine how this is somewhat motivating.
On the conscious level, within System 2, I'm much less certain. I solidly believe that UFAI is a big problem, and that it's the problem that I should be focusing my efforts on. However, my error bars are far wider, my timespan is quite broad. I acknowledge a decent probability of soft takeoff. I assign moderate probabilities to a number of other existential threats. I think there are a large number of unknown unknowns, and there's a non-zero chance that the status quo continues until I die (and that I can't later be brought back). All this I know.
But, right now, as I type this, my intuition is screaming at me that the above is all wrong, that my error bars are narrow, and that I don't actually expect the status quo to continue for even thirty years.
This is just how I like things.
See, I am convinced that building a friendly AI is the most important problem for me to be working on, even though there is a very real chance that MIRI's research won't turn out to be crucial. Perhaps other existential risks will get to us first. Perhaps we'll get brain uploads and Robin Hanson's emulation economy. Perhaps it's going to take far longer than expected to crack general intelligence. However, after much reflection I have concluded that despite the uncertainty, this is where I should focus my efforts.
The problem is, it's hard to translate that decision down to System 1.
Consider a toy scenario, where there are ten problems in the world. Imagine that, in the face of uncertainty and diminishing returns from research effort, I have concluded that the world should allocate 30% of resources to problem A, 25% to problem B, 10% to problem C, and 5% to each of the remaining problems.
Because specialization leads to massive benefits, it's much more effective to dedicate 30% of researchers to working on problem A rather than having all researchers dedicate 30% of their time to problem A. So presume that, in light of these conclusions, I decide to dedicate myself to problem A.
Here we have a problem: I'm supposed to specialize in problem A, but at the intuitive level problem A isn't that big a deal. It's only 30% of the problem space, after all, and it's not really that much worse than problem B.
This would be no issue if I were in control of my own motivation system: I could put the blinders on and focus on problem A, crank the motivation knob to maximum, and trust everyone else to focus on the other problems and do their part.
But I'm not in control of my motivation system. If my intuitions know that there are a number of other similarly worthy problems that I'm ignoring, if they are distracted by other issues of similar scope, then I'm tempted to work on everything at once. This is bad, because output is maximized if we all specialize.
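Here is the toy scenario's arithmetic, under the assumed premise that returns to focused effort are superlinear, say because of ramp-up costs and accumulated expertise. The production function and exponent are arbitrary choices of mine, only there to illustrate why whole researchers beat fractional ones:

```python
# Toy comparison: ten researchers, ten problems, the allocation from the scenario above.
def output(hours):
    return hours ** 1.5  # assumed superlinear returns to focused effort

researchers, hours = 10, 1000
weights = [0.30, 0.25, 0.10] + [0.05] * 7

# Every researcher splits their own time across all ten problems:
split = sum(researchers * output(hours * w) for w in weights)

# Researchers specialize, assigned to problems in the same proportions:
specialized = sum(researchers * w * output(hours) for w in weights)

print(round(split), round(specialized))  # specialization yields noticeably more total output
```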
Things get especially bad when problem A is highly uncertain and unlikely to affect people for decades if not centuries. It's very hard to convince the monkey brain to care about far-future vagaries, even if I've rationally concluded that those are where I should dedicate my resources.
I find myself on a strange playing field, where the optimal move is to lie to System 1.
Allow me to make that more concrete:
I'm much more motivated to do FAI research when I'm intuitively convinced that we have a hard 15 year timer until UFAI.
Explicitly, I believe UFAI is one possibility among many and that the timeframe should be measured in decades rather than years. I've concluded that it is my most pressing concern, but I don't actually believe we have a hard 15 year countdown.
That said, it's hard to overstate how useful it is to have a gut-level feeling that there's a short, hard timeline. This "knowledge" pushes the monkey brain to go all out, no holds barred. In other words, this is the method by which I convince myself to actually specialize.
This is how I convince myself to deploy every available resource, to attack the problem as if the stakes were incredibly high. Because the stakes are incredibly high, and I do need to deploy every available resource, even if we don't have a hard 15 year timer.
In other words, Willful Inconsistency is the technique I use to force my intuition to feel as if the stakes are as high as I've calculated them to be, given that my monkey brain is bad at responding to uncertain vague future problems. Willful Inconsistency is my counter to Scope Insensitivity: my intuition has difficulty believing the results when I do the multiplication, so I lie to it until it acts with appropriate vigor.
This is the final secret weapon in my motivational arsenal.
I don't personally recommend that you try this technique. It can have harsh side effects, including feelings of guilt, intense stress, and massive amounts of cognitive dissonance. I'm able to do this in large part because I'm in a very good headspace. I went into this with full knowledge of what I was doing, and I am confident that I can back out (and actually correct my intuitions) if the need arises.
That said, I've found that cultivating a gut-level feeling that what you're doing must be done, and must be done quickly, is an extraordinarily good motivator. It's such a strong motivator that I seldom explicitly acknowledge it. I don't need to mentally invoke "we have to study or the world ends". Rather, this knowledge lingers in the background. It's not a mantra, it's not something that I repeat and wear thin. Instead, it's this gut-level drive that sits underneath it all, that makes me strive to go faster unless I explicitly try to slow down.
This monkey-brain tunnel vision, combined with a long habit of productivity, is what keeps me Moving Towards the Goal.
Those are my Dark Side techniques: Willful Inconsistency, Intentional Compartmentalization, and Terminal Goal Modification.
I expect that these techniques will be rather controversial. If I may be so bold, I recommend that discussion focus on goal-hacking and intentional compartmentalization. I acknowledge that willful inconsistency is unhealthy and I don't generally recommend that others try it. By contrast, both goal-hacking and intentional compartmentalization are quite sane and, indeed, instrumentally rational.
These are certainly not techniques that I would recommend CFAR teach to newcomers, and I remind you that "it is dangerous to be half a rationalist". You can royally screw yourself over if you're still figuring out your beliefs as you attempt to compartmentalize false beliefs. I recommend only using them when you're sure of what your goals are and confident about the borders between your actual beliefs and your intentionally false "beliefs".
It may be surprising that changing terminal goals can be an optimal strategy, and that humans should consider adopting incorrect beliefs strategically. At the least, I encourage you to remember that there are no absolutely rational actions.
Modifying your own goals and cultivating false beliefs are useful because we live in strange, hampered control systems. Your brain was optimized with no concern for truth, and optimal performance may require self deception. I remind the uncomfortable that instrumental rationality is not about being the most consistent or the most correct, it's about winning. There are games where the optimal move requires adopting false beliefs, and if you find yourself playing one of those games, then you should adopt false beliefs. Instrumental rationality and epistemic rationality can be pitted against each other.
We are fortunate, as humans, to be skilled at compartmentalization: this helps us work around our mental handicaps without sacrificing epistemic rationality. Of course, we'd rather not have the mental handicaps in the first place: but you have to work with what you're given.
We are weird agents without full control of our own minds. We lack direct control over important aspects of ourselves. For that reason, it's often necessary to take actions that may seem contradictory, crazy, or downright irrational.
Just remember this, before you condemn these techniques: optimality is as much an aspect of the playing field as of the strategy, and humans occupy a strange playing field indeed.