In which I attempt to apply findings from behavioral psychology to my own life.
Behavioral Psychology Finding #1: Habituation
The psychological process of "extinction" or "habituation" occurs when a stimulus is administered repeatedly to an animal, causing the animal's response to gradually diminish. You can imagine that if you were to eat your favorite food for breakfast every morning, it wouldn't be your favorite food after a while. Habituation tends to happen the fastest when the following three conditions are met:
- The stimulus is delivered frequently
- The stimulus is delivered in small doses
- The stimulus is delivered at regular intervals
Source is here.
Applied Habituation
I had a project I was working on that was really important to me, but whenever I started working on it I would get demoralized. So I habituated myself to the project: I alternated 2 minutes of work with 2 minutes of sitting in the yard for about 20 minutes. This worked.
Interestingly enough, about halfway through this exercise I realized that what was really making it difficult for me to work on my project was the fact that it involved so many choices. So as my 20 minutes progressed, I started spending my 2 minutes of work trying to make the most difficult decisions I could. This habituation to decision demoralization seems to have had an immediate, fairly lasting impact on a wide variety of activities.
I'm really looking forward to hearing from someone who attempts to apply habituation to an ugh field.
Applied Habituation in Reverse
If you want to enjoy your favorite song until the day you die, dance to it infrequently at irregular intervals while it plays full blast. (Reversed conditions for habituation.)
Behavioral Psychology Finding #2: Intermittent Reinforcement
Slot machines are so engaging because they deliver rewards at random. If slot machines paid out small rewards on every round, playing them would be like work.
Applied Intermittent Reinforcement
For a while, there was a time-consuming chore that I was required to do every evening. I would often put it off until 2-3 AM and work while sleepy as a result.
To solve this problem, I started eating a gummy worm with 50% probability each time I did the chore at a pre-determined time early in the evening. (I gave myself the first two gummy worms with 100% probability to start things off.) My success rate with this method was very high.
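The reward rule described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `should_reward`, its parameters, and the use of Python's `random` module are illustrative choices, not something the routine above specifies. Any fair coin would do the same job.

```python
import random

def should_reward(completions_so_far, probability=0.5, guaranteed_first=2, rng=random):
    """Decide whether an on-time chore completion earns a treat.

    completions_so_far: number of on-time completions before this one.
    The first `guaranteed_first` completions are always rewarded; after
    that, the treat is handed out with the given probability.
    """
    if completions_so_far < guaranteed_first:
        return True
    return rng.random() < probability

if __name__ == "__main__":
    # Simulate one week of doing the chore on time every evening.
    for evening in range(7):
        treat = should_reward(evening)
        print(f"evening {evening + 1}: {'gummy worm' if treat else 'no treat'}")
```

The first two evenings always print "gummy worm"; after that the outcome varies from run to run, which is the point of the schedule.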
Further Research
Another self-help technique I've had tremendous success with is using Linux's cron utility to cause Firefox tabs to open periodically and tell me to switch activities if I'm wasting time. However, I've found that forcing myself to switch activities is highly stressful.
Perhaps it's possible to habituate the negative response to activity switching by having practice sessions where you periodically switch between distraction and work? Or maybe you could use intermittent reinforcement and randomly decide to give yourself something nice whenever you successfully switch to a higher-quality activity.
(I'm not experimenting with these at the moment because I'm currently fairly happy with my work/relaxation balance.)
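For anyone who wants to try the cron-driven reminder mentioned above, here is a minimal sketch of a script that cron could run. Everything specific in it is an assumption: the script name `remind.py`, the paths in the example crontab line, and the reminder URL are hypothetical. It uses Python's standard `webbrowser` module, which opens the system's default browser rather than Firefox specifically.

```python
import sys
import webbrowser

def open_reminder(url, opener=webbrowser.open_new_tab):
    """Open the reminder page in a new browser tab.

    `opener` is injectable so the function can be exercised without
    actually launching a browser.
    """
    return opener(url)

# Hypothetical crontab entry (every 30 minutes; on X11, cron jobs usually
# need DISPLAY set to reach the desktop session):
#   */30 * * * * DISPLAY=:0 /usr/bin/python3 /home/me/remind.py file:///home/me/switch.html

if __name__ == "__main__":
    if len(sys.argv) > 1:
        open_reminder(sys.argv[1])
    else:
        print("usage: remind.py URL")
```

Passing the URL on the command line keeps the reminder text in an ordinary HTML file that can be edited without touching the script.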
Thanks to Psychohistorian for reminding me I wanted to write about this. I'm hoping he won't get mad at me for writing on the same topic he did so soon after his post.
(To summarize my upcoming point in tl;dr form: if you don't find yourself rationalizing "maybe I'm onto the pattern" while your stomach rumbles as you contemplate the upside of getting gummi bears marginally more often, you might be tickling a different variety-seeking mechanism than you think. Nothing wrong with that, but if you want to get really good at optimizing that tickle, detailed knowledge about which mechanism it is might be helpful.)
From time to time when reading technical articles related to effective strategies for artificial agents faced with "n-armed bandit" problems, I am reminded of observed animal behavior patterns like PREE (the partial reinforcement extinction effect), and wonder how close the correspondence might be. N-armed bandits have been studied for a long time, and it seems like an obvious conjecture, but I have never seen much analysis of this. I never encounter such analysis spontaneously when people talk about a particular psych observation, even at ML-friendly sites like LW. And hunting for it with e.g. Google "partial reinforcement n-armed bandit" suggests that it must be a pretty obscure topic, because in the articles I find, the analysis I am looking for is swamped by different topics like reinforcement learning, and obscure topics like how a web designer trying to optimize humans' response to the website can usefully think of his website A/B testing as an n-armed bandit problem.
Can anyone recommend systematic attempts to explore how close this correspondence might be?
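For readers who haven't met the term, here is a minimal sketch of an agent facing an n-armed bandit. It uses epsilon-greedy, one simple textbook strategy, chosen purely for illustration; nothing here claims that animals or gamblers implement this rule. The payout probabilities, the function name, and the parameter values are all made up for the example.

```python
import random

def run_epsilon_greedy(payout_probs, pulls=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy agent.

    payout_probs: hidden win probability of each arm (unknown to the agent).
    Returns (pull counts per arm, estimated win rate per arm, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms
    total_reward = 0
    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1 if rng.random() < payout_probs[arm] else 0
        counts[arm] += 1
        # Incremental update of the running mean reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, estimates, total_reward

if __name__ == "__main__":
    counts, estimates, total = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated win rates:", [round(e, 2) for e in estimates])
```

The relevant feature for this discussion is the exploration step: the agent keeps sampling arms that currently look bad, because an unreliable payout history might still hide a better arm.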
Of the usual pop psychology examples of overresponse to partial reinforcement, it looks to me as though gambling truly is narrowly tuned to the PREE phenomenon, and is working essentially by fooling an agent designed to solve a bandit problem. Other examples, however, tend to be sufficiently ambiguous or contradictory in various ways that I think something unrelated could be going on. Humans can respond to variety in all sorts of positive ways. E.g., (dammit, I'm going blank on the name of) the classic confounding effect in industrial productivity studies where change itself, in either direction, can easily cause a positive effect independent of whether the new situation is objectively better in any useful sense.
Notice that successful gambling operations are contrived so that if you ever did discover even a small pattern (e.g., 53% success instead of 48%) it follows by perfectly correct analysis that the discovery would be enormously valuable. Under such extreme conditions, even a small nudge from a simple n-armed bandit heuristic (like a nagging intuition corresponding to a high prior probability that high variance implies a significant probability of discovering something that improves performance by a mere 10%) can get amplified to dramatically wrong behavior. Also notice that there is a strong observed pattern of compulsive gamblers fooling themselves into thinking they are finding small patterns. If gambling were a case of partial reinforcement directly tickling purely subconscious deep structures unrelated to n-armed banditry, then "I'm onto the pattern" might still sometimes be used to rationalize the irrational behavior, but it's not clear why it would be a strongly favored rationalization (compared to, e.g., "risk taking makes me glamorous").
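To make the 48% versus 53% example concrete, here is a small sketch of the arithmetic. It assumes an even-money bet (win or lose exactly the stake), which is a simplification introduced for this example and not a description of any particular casino game; the function name is likewise illustrative.

```python
def expected_profit_per_bet(win_probability, stake=1.0):
    """Expected profit of an even-money bet.

    With probability p the bettor gains `stake`, otherwise loses it,
    so the expectation is stake * (p - (1 - p)) = stake * (2p - 1).
    """
    return stake * (2 * win_probability - 1)

if __name__ == "__main__":
    for p in (0.48, 0.53):
        profit = expected_profit_per_bet(p)
        print(f"p = {p:.2f}: expected profit per unit staked = {profit:+.2f}")
```

Under that assumption, moving from 48% to 53% flips the expectation from losing 4 cents per dollar staked to winning 6 cents, which is why even a small genuine pattern would be so valuable over many bets.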
Compare this to behavior patterns that aren't observed. E.g., I've never heard of anyone making cigarettes qualitatively more addictive by making them unpredictable, e.g., by selling mixed packs of placebo and nicotinic cigarettes. Could this be because there's no way for the robot to get rationally excited about the enormous upside of spotting a small pattern in such randomness? (And anything close to this which does succeed, e.g. toy prizes in cereal boxes, tends to be successful for only about as long as the robot's inputs from the world model let it be strongly uncertain about the upside.)
People do claim to spot partial-reinforcement-related phenomena in other behavior patterns which can't easily be explained as a bandit problem heuristic being tricked. E.g., people often accuse World of Warcraft and similar games of manipulating the intermittent reward mechanism to cause addictive behavior. WoW treasures are indeed randomized, and people do indeed become fascinated by the game, and I don't see how the robot could be getting excited about a huge upside of spotting the pattern. But WoW is in the entertainment industry. WoW developers could have saved a substantial amount of money by hiring far fewer artists to create far fewer kinds of trees and other decorations, but that would be a bad idea. Hollywood could save even more money by aggressively reusing sets and actors and props and scripts between movies. Even in extremes like soap operas where many customers are looking for repetitive essentially-predictable escape, a successful entertainment product benefits from many kinds of variety. It seems to me that the positive importance of randomizing treasures needn't be explained by partial reinforcement any more than the positive importance, in a soap opera in which villains walk onto the stage in hundreds of different episodes, of avoiding a clear pattern of villains entering stage right every single time.