Ah, delicious dark chocolate M&Ms, colorfully filling a glass jar with your goodness. How do I love thee? About four of you an hour. Here's a brief rundown of my most recent motivation hacking experiment.
1. Gwern has an interesting article arguing that Massive Open Online Courses (MOOCs) may shift the learning advantage from intelligence toward conscientiousness (actually he's not sure about the intelligence part). This shift occurs because MOOCs select for higher-quality instruction and better feedback, broadly speaking and over time, but it's much harder to stay on task without a malevolent instructor and bad grades breathing down your neck. This thesis jibes with my own experience; if I get stuck on a math problem, I just google "an intuitive approach to x," and I usually find a couple of people begging to teach me the concept. But it's harder to get started and to stay focused than in a classroom.
2. Given that knowledge compounds and grants increasing advantages, I'd really like to keep taking advantage of MOOCs. Some MOOCs are better than others, but many are better than your standard college course - and they're free. For a non-technical guy getting technical, like me, it's a golden age of education. So, it would be great if I were highly conscientious. Gwern points out that conscientiousness is a relatively stable Big Five personality trait.
3. The question then becomes, can conscientiousness be developed? Well, I'm not a Cartesian agent, so wouldn't it make sense to reward myself for conscientiousness? Enter the M&Ms. I set a daily target for pomodoros. When I finish a pomodoro, I get a big peanut M&M or two small ones. If I finish two in a row, I get two servings, and so on. In this way, I encourage myself to get started, and then to keep going to build Deep Focus. Each pomodoro becomes cause for celebration, and I find my rapid progress through pomodoros (and chocolate) energizing, where long periods of distraction were tiring.
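One way to read the streak rule above, as a rough Python sketch. It assumes that the k-th pomodoro in an unbroken run earns k servings and that a missed pomodoro resets the run; the post doesn't spell this out, and the function name `servings_for_day` is made up for illustration.

```python
def servings_for_day(results):
    """Count M&M servings earned over a day.

    results: list of booleans, one per attempted pomodoro,
             True if it was completed without distraction.
    Assumption: the k-th pomodoro in a row earns k servings,
    and an abandoned pomodoro resets the streak to zero.
    """
    streak = 0
    total = 0
    for completed in results:
        if completed:
            streak += 1
            total += streak  # 1 serving, then 2, then 3, ...
        else:
            streak = 0
    return total


# Two in a row, one abandoned, then one more completed.
print(servings_for_day([True, True, False, True]))
```

Under that reading, long unbroken runs pay off much faster than the same number of scattered pomodoros, which is the point of rewarding "keep going" as well as "get started."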
This has worked fantastically well for the last two weeks. I hit my pomodoro target for paid work, then switch to educational work. I plan to keep it up, and maybe I'll use chocolate as motivation somewhere else as well. Now back to my M&Ms, green, yellow, blue, orange, brown, red . . .
There's a nice conventional categorisation of behaviour modification programmes that goes like this:
Fixed-ratio: a reward is given after a fixed number of nonreinforced responses (e.g. an M&M after every pomodoro, or every fifth pomodoro).
Fixed-interval: a reward is given after a fixed interval of time (e.g. you might always set the pomodoro for 25 minutes as per convention).
Variable-ratio: a reward is given after a variable number of nonreinforced responses (e.g. you flip a coin after every pomodoro to decide whether you get an M&M).
Variable-interval: a reward is given after a variable time interval (e.g. you find some way to determine how long to set the pomodoro, perhaps with a lower bound).
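To make the four categories concrete, here is a minimal Python sketch that maps each one onto the pomodoro-and-M&M setup. The function names, the default values (every 5th pomodoro, a fair coin, 20 to 40 minutes), and the uniform draw are all illustrative assumptions rather than anything from the literature.

```python
import random


def fixed_ratio(completed, n=5):
    """FR: reward on every n-th completed pomodoro."""
    return completed > 0 and completed % n == 0


def variable_ratio(rng, p=0.5):
    """VR: after each pomodoro, reward with probability p (the coin flip)."""
    return rng.random() < p


def fixed_interval(minutes=25):
    """FI: the pomodoro (and so the wait for a reward) is always the same length."""
    return minutes


def variable_interval(rng, low=20, high=40):
    """VI: the pomodoro length is drawn at random, here uniformly in [low, high]."""
    return rng.randint(low, high)


rng = random.Random()
print("FR reward on 5th pomodoro?", fixed_ratio(5))
print("VR reward this time?", variable_ratio(rng))
print("FI length:", fixed_interval(), "min")
print("VI length:", variable_interval(rng), "min")
```

The ratio functions answer "do I get an M&M now?", while the interval functions answer "how long until the next chance?", which is the distinction the categories turn on.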
The schedule of reinforcement you're using is left a bit vague. It looks like you're following an FR schedule but could also be doing an FI or VI schedule. But for the purposes of offering advice to people who might want to try something like this, I'll assume you're using either FR or FI.
Psychologists categorise schedules in that way because they want to study the effects of differences in reinforcement. In particular they've been interested in the effects of changes in schedules on the extinction of a behaviour. One major result from the literature (which is reported in most psych textbooks that include a chapter on learning theories) is that variable schedules (using either ratios of responses or time intervals) are much more resistant to extinction than fixed schedules. As an example, consider a slot machine at a casino; it doesn't have a fixed ratio of 1 reward for every nth try. Instead it varies the ratio of attempts to rewarded attempts, taking advantage of the much stronger reinforcement effect.
So my first piece of advice is: do not use fixed schedules. Varying the rate of reinforcement (either as a function of time or number of completed pomodoros) will help make the good habit you're trying to build stick if your pomodoro use is ever disrupted (because you're busy, you somehow forget, or whatever).
Another result from the literature is that ratio schedules produce higher response rates. This occurs because faster responding increases the likelihood of being rewarded sooner, since ratio schedules don't depend on time but on attempts. In many situations you might want to take advantage of this and opt for a VR schedule (say if you wanted to encourage a child to behave). In this case, though, it would probably only lead to extinction or abuses. Extinction because if your time intervals are somewhat long (say around 30-60 minutes), then the rewards might be given too infrequently to build your motivation and give you energy. Abuses because the big spaces between rewards might encourage you to cheat the system and eat some M&Ms anyhow because you want the energy.
That leads me to my second bit of advice: don't use a VR schedule; instead vary the time interval. I suggest finding some way to randomise the selection (like rolling dice, throwing darts, or having an algorithm spit out a number) and putting a lower bound on the time intervals (to give yourself enough time to build some flow and focus).
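As a rough illustration of the dice option, here is a small Python sketch that turns a die roll into a pomodoro length with a floor. The 20-minute lower bound, the 5-minute step, and the names `roll_interval` and `plan_session` are assumptions picked for the example; tune them to whatever gives you enough time to build focus.

```python
import random


def roll_interval(rng, lower_bound=20, step=5, sides=6):
    """Roll one die and map it to a pomodoro length in minutes.

    A roll of 1 gives the lower bound; each extra pip adds `step` minutes,
    so the defaults yield 20, 25, 30, 35, 40 or 45 minutes.
    """
    roll = rng.randint(1, sides)
    return lower_bound + step * (roll - 1)


def plan_session(rng, count=4, **kwargs):
    """Draw `count` pomodoro lengths up front for one work session."""
    return [roll_interval(rng, **kwargs) for _ in range(count)]


rng = random.Random()
print(plan_session(rng))
```

Drawing the length only when each pomodoro starts, rather than planning the whole session up front, keeps the interval unpredictable, which is closer to the spirit of a variable schedule.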
hmm, idea, how well'd this work: you have a machine that drops the reward with a certain low probability every second, but you have to put it back rather than eat it if you weren't doing the task?
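Here's a quick way to play with that idea in Python before building anything: a simulated dispenser that drops a reward each second with a small probability, where you only keep the drop if you were on task at that moment. The name `simulate_dispenser` and the default of 1/900 per second (about four drops an hour on average, since 3600/900 = 4) are just assumptions for the sketch.

```python
import random


def simulate_dispenser(on_task, p_drop=1 / 900, rng=None):
    """Simulate a per-second random reward dispenser.

    on_task: sequence of booleans, one per second, True if you were
             working on the task during that second.
    p_drop:  chance that a reward drops in any given second.
    Returns (kept, returned): rewards eaten vs. rewards put back.
    """
    rng = rng or random.Random()
    kept = 0
    returned = 0
    for working in on_task:
        if rng.random() < p_drop:
            if working:
                kept += 1
            else:
                returned += 1  # dropped while slacking: put it back
    return kept, returned


# One hour: 40 minutes on task, then 20 minutes distracted.
hour = [True] * (40 * 60) + [False] * (20 * 60)
print(simulate_dispenser(hour))
```

Because the drop times are random and only pay off while you're working, this behaves like a variable-interval schedule where time spent distracted simply forfeits rewards.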