Self-motivated hard work is the primary source of the intense, optimistic engagement known as flow, one of the greatest forms of happiness, which makes us come alive with purpose and potential (Csikszentmihalyi, 1975). Sadly, for most people work does not feel so rewarding most of the time. Instead, we often have to persevere through long periods of hard, painful, and unrewarding work when we could be doing something much more enjoyable. When faced with this motivational challenge, people often give up too easily, get sidetracked, or procrastinate (Steel, 2007).

The problem is not that we are unwilling or unable to work hard. To the contrary, we crave being productively engaged in challenging tasks. Thus, instead of blaming ourselves for our limited willpower, it may be more productive to take a critical look at the carrots and the sticks that are supposed to help us stay motivated. Who put them there and why? Are these incentives helpful, distracting, irrelevant, or out of sight? If you could place them differently and add new ones, where would they go?

Often, the problem is that the rewards we experience in the short run are misaligned with what we want to accomplish. In the short run, the extremely valuable work that brings us closer to our cherished goals can be aversive, while activities that are irrelevant or even opposed to everything we want to accomplish can be pleasant and rewarding. Hence, when we struggle to engage with something that we care about, perhaps we are not the problem but the incentives are, or as Jane McGonigal (2011) put it, "Reality is broken".

So, if reality is broken, then what can we do to fix it? One approach is to design better incentive structures that make the pursuit of our goals more engaging. If we want to go this way, then there is a lot to be learned from games, because their incentive structures are so well designed that they let people enjoy hard work for many hours on end (McGonigal, 2011). In the past five years, the success of video games has inspired the gamification of education, work, health, and business. Gamification is the use of game elements, like points, levels, badges, and quests, to engage, motivate, and nudge people in non-game contexts. There are even tools like SuperBetter and Habitica that individuals like you and me can use to gamify our own lives.

Previous studies have shown that gamification can have positive effects on motivation, engagement, behavior, learning outcomes, and health, but only when it is done right (Hamari, Koivisto, & Sarsa, 2014; Roepke, et al., 2015). When gamification is done wrong, it can have negative effects by incentivizing counter-productive behaviors. So far gamification has been an art, and there is very little science about how to do it right. This motivated my advisor and me to develop a practical theory of optimal gamification.

In this blog post I focus on how our theory could be applied in practice. If you would like to learn about the technical details or read more about our experiments, then please take a look at our CogSci paper (Lieder & Griffiths, submitted). I will start with a very brief summary of our method, provide an intuitive explanation of what it does, and then dive into how you can implement it in your own life. I will close with an outlook on how our method could be applied to gamify our todo lists.


Level 1: Optimal Gamification

Our method for optimal gamification draws on the theory of Markov decision processes (MDPs; Sutton & Barto, 1998) and the shaping theorem (Ng, Harada, & Russell, 1999). The basic idea is to align each action's immediate reward with its value in the long run. Therefore the points should complement the immediate rewards of doing something (e.g., how painful it is) by the value that it generates in the long run. Concretely, the points awarded for an activity should be chosen such that the right thing to do looks best in the short run when you combine how many points it is worth with how it feels when you do it. Furthermore, the points have to be assigned in such a way that when you undo something you lose as many points as you earned when you did it.

We evaluated the effectiveness of our method in two behavioral experiments. Our first experiment demonstrated that incentive structures designed by our method can indeed help people make better, less short-sighted decisions, especially when the course of action that is best in the long run is unpleasant in the short run. We also found that less principled approaches to gamification can encourage ruthless rushing towards a goal that causes more harm than good, and we showed that our method is guaranteed to avoid these perils. In the second experiment we found that the optimal incentive structures designed with our method can be effectively implemented using game elements like points and badges. These results suggest that the proposed method provides a principled way to leverage gamification to help people make better decisions.

Our method proceeds in three steps:

1.    Model the situation and the decision-maker's goals and options as an MDP.

2.    Solve the MDP to obtain the optimal value function V* or approximate it.

3.    Set the number of points for progressing from stage s to stage s' to V*(s')-V*(s).  

Intuitively, this means that the number of points that is awarded for doing something should reflect how much better the resulting state (i.e., s') is than the previous one (i.e., s). For instance, if achieving a goal is worth 1000 points, then completing 10% of the work required to reach the goal should be rewarded with 100 points. So let's think about how you could apply this approach right now without having to solve MDPs.
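
For readers who do want to see the three steps end to end, here is a minimal Python sketch on a toy problem. Everything in it is an assumption made for illustration: the five-stage chain, the two actions ("work" and "delay"), the immediate rewards of -1.0 and -0.2, and the goal value of 10 are all hypothetical, and the solver is plain undiscounted value iteration rather than the exact procedure from our paper.

```python
GOAL = 4                      # hypothetical: stages 0..4, stage 4 is the goal
ACTIONS = ("work", "delay")
GOAL_VALUE = 10.0             # assumed value of having reached the goal

def step(s, a):
    """Step 1 (model): working advances one stage, delaying stays put."""
    return s + 1 if a == "work" else s

def reward(s, a):
    """Step 1 (model): assumed immediate rewards.
    Work hurts (-1.0); delaying only wastes a little time (-0.2)."""
    return -1.0 if a == "work" else -0.2

def solve(sweeps=50):
    """Step 2: undiscounted value iteration; returns V* as a dict."""
    V = {s: 0.0 for s in range(GOAL + 1)}
    V[GOAL] = GOAL_VALUE      # terminal state carries the value of the goal
    for _ in range(sweeps):
        for s in range(GOAL):
            V[s] = max(reward(s, a) + V[step(s, a)] for a in ACTIONS)
    return V

def points(V, s, s_next):
    """Step 3: points for moving from s to s_next, i.e. V*(s') - V*(s)."""
    return V[s_next] - V[s]

V = solve()
for a in ACTIONS:
    felt = reward(0, a)
    total = felt + points(V, 0, step(0, a))
    print(f"{a:5s} feels {felt:+.1f}, feels {total:+.1f} once points are added")
```

Without points, delaying looks better in the short run (-0.2 versus -1.0). With the points added, working comes out at 0.0 and delaying stays at -0.2, so the action that is best in the long run also looks best right now.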


Level 2: Practical Implications

In my day-to-day life I try to approximate optimal gamification as follows:

1.    Set a concrete goal that you would like to achieve and figure out how many points it is worth, e.g. writing this blog post was worth 1000 points to me.

2.    Set several milestones along the way to the goal to divide the path into small steps that feel very manageable.

3.    For each milestone, determine how far you will have come when you get there as a percentage of the total distance to the goal, e.g. 10%, 20%, 30%, ..., 100% for the first, second, third, ..., and the tenth milestone respectively.

4.    Assign each milestone the corresponding fraction of the total value of achieving the goal, e.g. 100 points, 200 points, 300 points, ..., and 1000 points for the first, second, third, ..., and tenth milestone respectively.

5.    Figure out what you have to do to get from one milestone to the next. If this is a simple activity, then its reward should be the difference between the value of the next milestone and the value of the current milestone, e.g. 100 points. If it is a complex sequence of actions, then make it a subgoal and apply steps 1-3 to figure out how to achieve it.

6.    Once you are done with step 5, you can add those points to your todo-list.

7.    Now it is time to get things done and reward yourself. You start at 0 points, but whenever you complete one of the steps, you earn as many points as you have assigned to it and can increment your (daily) score.
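
The arithmetic in steps 3 to 5 and 7 can be sketched in a few lines of Python. The function name `milestone_rewards` and the example numbers are hypothetical; the only idea taken from the steps above is that each step earns the difference between the values of consecutive milestones.

```python
def milestone_rewards(goal_value, progress):
    """Turn cumulative progress fractions (the last one being 1.0)
    into the points earned for reaching each milestone."""
    # Step 4: each milestone is worth its fraction of the goal's value.
    values = [round(goal_value * p) for p in progress]
    # Step 5: the reward for a step is the difference to the previous milestone.
    previous = [0] + values[:-1]
    return [v - prev for v, prev in zip(values, previous)]

# Ten evenly spaced milestones on the way to a 1000-point goal.
even = milestone_rewards(1000, [i / 10 for i in range(1, 11)])

# Milestones do not have to be evenly spaced (hypothetical example).
uneven = milestone_rewards(1000, [0.05, 0.5, 1.0])

# Step 7: start at 0 and add the points of every completed step.
score = 0
for earned in even[:3]:       # the first three milestones are done
    score += earned
print(even, uneven, score)
```

However the milestones are spaced, the rewards always add up to the value of the goal, so no amount of slicing creates extra points.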

Earning these points can be very rewarding if you remind yourself what they stand for. If your goal was worth $1,000,000 to you and you assigned 1000 points to it, then 10 points should be worth $10,000 to you. But if this is not rewarding enough for you, you can think of ways that make the points more pleasurable. You could, for instance, make a high-score list that motivates you to beat your personal best day after day or start a high-score competition with your friends. You could also set yourself the goal to achieve a certain number of points by a certain time and promise yourself a treat if you achieve it.

There are many other ways that you could assign points to the items on your todo list. Feel free to do whatever works for you. But it may be useful to keep in mind that the way in which optimal gamification assigns points has several formal properties that are necessary to avoid negative side-effects:

a) Each item's score reflects how valuable it is in the long run.

Optimal gamification works because it aligns each action's immediate reward with its long-term value. To help you make better decisions the points should be designed such that the course of action that is best in the long run looks best in the short run. This entails incentivizing unpleasant or unrewarding activities that will pay off later—especially when their less productive alternatives are very rewarding in the short run.

b) Beware of cycles!

The shaping theorem (Ng, et al., 1999) requires that going back and forth between two states receives a net pseudo-reward of zero. When your pseudo-rewards along a cycle add up to a positive value, you may be incentivizing yourself to create unnecessary problems for yourself. This can happen when the action for which you reward yourself can only be executed in an undesirable state, and you do not equally punish yourself for falling back into that state. For instance, adding points for losing weight will inadvertently incentivize you to regain weight afterwards unless you subtract at least the same number of points for gaining weight. Similarly, if you reward yourself for solving interpersonal conflicts but don't punish yourself for creating them, then you may be setting yourself up for trouble. To avoid such problems, creating a problem must be punished by at least as many points as you earn by solving it.
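
A quick way to check a point scheme for this problem is to add up the points around a loop of states. The sketch below does this for the interpersonal-conflict example; the two states, their values, and both point schemes are hypothetical.

```python
def cycle_sum(transition_points, cycle):
    """Add up the points earned by walking once around a closed loop of states."""
    steps = zip(cycle, cycle[1:] + cycle[:1])
    return sum(transition_points(s, t) for s, t in steps)

# Hypothetical states and values for the conflict example.
VALUE = {"conflict": 0, "harmony": 50}

def naive(s, t):
    """Rewards solving a conflict but ignores creating one."""
    return 50 if (s, t) == ("conflict", "harmony") else 0

def value_difference(s, t):
    """Points equal to the change in value, so undoing costs what doing earned."""
    return VALUE[t] - VALUE[s]

loop = ["conflict", "harmony"]   # solve the conflict, then start it again
print(cycle_sum(naive, loop), cycle_sum(value_difference, loop))
```

The naive scheme pays 50 points per lap, so it can be farmed by creating and solving the same conflict over and over. The value-difference scheme pays exactly zero for any loop.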

c) Two ways to achieve the same goal should yield the same number of points.

The shaping theorem also requires that all paths that lead to the same final state (e.g., having submitted a paper by the deadline) should yield the same amount of reward. If this is not the case, your pseudo-rewards may bias you towards a suboptimal path. For instance, if you reward your all-nighter on the last night before the deadline by the reward value of a month's worth of work, you are incentivizing yourself to procrastinate. Similarly, if you reward one activity that leads towards your goal much more heavily than others, then you may be biasing yourself towards a reckless course of action that may achieve the goal at an unreasonably high cost. For instance, rewarding yourself 100 times as much for working 100% on a project than for working on it 50% might lead you to complete the project early at the expense of your health, your friendships, your education, and all your other projects. To avoid this problem, all paths that lead to the same state should yield the same amount of reward.
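
When the points for each step are the difference between the values of the new and the old state, this property holds automatically, because the intermediate values cancel out along any path. A small sketch with hypothetical states and values for the paper-deadline example:

```python
# Hypothetical values of the states on the way to a submitted paper.
VALUE = {"nothing": 0, "outline": 200, "draft": 700, "submitted": 1000}

def path_points(path):
    """Total points along a path when each step earns VALUE[new] - VALUE[old]."""
    return sum(VALUE[t] - VALUE[s] for s, t in zip(path, path[1:]))

steady = ["nothing", "outline", "draft", "submitted"]
all_nighter = ["nothing", "submitted"]
detour = ["nothing", "outline", "nothing", "outline", "draft", "submitted"]

print(path_points(steady), path_points(all_nighter), path_points(detour))
```

All three paths earn the same 1000 points, so the points themselves give no reason to prefer the all-nighter. Which path is actually better is then decided by how each one feels and what it costs, which is exactly where the decision should be made.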

d) Pseudo-rewards should be awarded for state-transitions instead of actions.

Many applications of gamification reward "good" actions with points regardless of when or how often these actions are taken. But according to the shaping theorem, the number of points must depend on the state in which the action is taken and the state that it leads to. If your pseudo-rewards were based only on what you do but not on when you do it, then you might keep rewarding yourself for something even when it is no longer valuable, because the underlying state has changed. For instance, at some point your reward for losing weight has to diminish or else you may be setting yourself up for anorexia.
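
The difference between the two kinds of reward can be made concrete with a small sketch. The value function below is purely an assumption for illustration (value is highest at a target and falls off on both sides), as are the names and the scale of 10 points per kilogram.

```python
POINTS_PER_KG = 10   # hypothetical scale

def value(kg_from_target):
    """Assumed value function: best at the target, worse above or below it."""
    return -POINTS_PER_KG * abs(kg_from_target)

def action_based(kg_lost):
    """Rewards the action itself, no matter which state you are in."""
    return POINTS_PER_KG * kg_lost

def transition_based(before, after):
    """Rewards the change in value between the old and the new state."""
    return value(after) - value(before)

# Losing one kilogram while 5 kg above the target, and again at the target.
print(action_based(1), transition_based(5, 4))    # both schemes reward this
print(action_based(1), transition_based(0, -1))   # only the action-based one still does
```

The action-based scheme keeps paying the same 10 points forever. The transition-based scheme pays 10 points while the change moves you towards the target and charges 10 points once it moves you away from it.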


Level 3: Todo-list gamification

My first practical application is to manually gamify my todo-list every morning. I find this very helpful and motivating: Assigning points to the items on my todo list makes me realize how much I value them. This is useful for prioritizing important tasks. Earning points allows me to perceive my progress more accurately and more vividly. This helps me feel great about getting something important done even when it was only a single item on my todo list and took me a lot of time and effort to accomplish. Conversely, the point scheme also prevents me from feeling so good about checking off small things that I become tempted to neglect the big ones that are much more important. Gamification thereby remedies the todo list's shortcoming that it makes each item seem equally important. I highly recommend gamifying your todo lists. It can be highly motivating. Yet, adding the points manually takes some effort and my point scheme is often somewhat arbitrary and probably suboptimal.

To make todo list gamification easier and more effective, I am planning to develop an easy-to-use website or app that will do optimal gamification for you. Its graphical user interface would allow you to create hierarchical todo-lists, ask you 1 or 2 simple questions about each item on your list, and then gamify your todo-list for you. To do this, it will translate your list and your answers into an MDP, compute its optimal value function, and use it to determine how valuable it is to complete each item. The tool could also help you set manageable subgoals and determine what is most important and should be done first.

In addition, a website or app can leverage additional game elements to make the points that you earn more rewarding: It can track your productivity and provide instant feedback that makes your progress more salient. It can send you on a quest that gives you a goal along with small actionable steps. The tool could allow you to realize that you are getting ever more productive by visualizing your progress over time. As you become more effective, you level up and your quests become increasingly challenging. It might include a scoreboard that lets you compete with yourself and/or others and win prizes for your performance. Last but not least, if you need an extra push, you can tie your points to social rewards, your favorite treat, money, or access to your favorite music, apps, or websites. There are many more possibilities, and I invite you to think about it and share your ideas. In brief, there is a wealth of opportunities to leverage game elements to make goal achievement fun and easy.
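
To give a flavor of what such a tool might compute, here is a rough Python sketch of a hierarchical todo list. It is not the planned tool and it does not solve an MDP: as a simpler stand-in, it splits the points of each item across its sub-items in proportion to an estimated share of the work. The `Item` class, the `share` field, and the example list are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    name: str
    share: float = 1.0                            # assumed relative amount of work
    children: list = field(default_factory=list)  # sub-items, if any

def assign_points(item, points, out=None):
    """Split an item's points across its sub-items in proportion to their
    shares, recursively; returns a dict mapping each leaf item to its points."""
    out = {} if out is None else out
    if not item.children:
        out[item.name] = round(points)
        return out
    total = sum(child.share for child in item.children)
    for child in item.children:
        assign_points(child, points * child.share / total, out)
    return out

post = Item("blog post", children=[
    Item("outline", 1),
    Item("draft", 6, [Item("intro", 1), Item("method", 2)]),
    Item("edit", 3),
])
todo = assign_points(post, 1000)
print(todo)
```

In this example the leaf items receive 100, 200, 400, and 300 points, which add up to the 1000 points of the top-level goal. A real tool would replace the proportional split with values computed from the optimal value function.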

Join me on my quest! An adventure awaits. 

Gamification can be a useful tool to make achieving your goals easier and more engaging. However, gamification only works when it is done right. The theory of MDPs and pseudo-rewards provides the formal tools needed to do gamification right. With the help of these tools we can design incentive structures that help people overcome motivational obstacles, do the right thing, and achieve their goals. But more research and development needs to be done to make optimal gamification practical.

If you have any thoughts or ideas for what to do next, noticed a problem with the approach, or would like to be part of our team and contribute to building a tool that helps people achieve their goals, please send me an e-mail.


References and recommended readings

Csikszentmihalyi, M. (1975). Beyond boredom and anxiety: the experience of play in work and games. San Francisco: Jossey-Bass.

Lieder, F., & Griffiths, T.L. (submitted). Helping people make better decisions using optimal gamification. CogSci 2016. [Manuscript]

McGonigal, J. (2011). Reality is broken: Why games make us better and how they can change the world. New York: Penguin.

McGonigal, J. (2015). SuperBetter: A revolutionary approach to getting stronger, happier, braver and more resilient–powered by the science of games. London, UK: Penguin Press.

Hamari, J., Koivisto, J., & Sarsa, H. (2014). Does gamification work?–A literature review of empirical studies on gamification. In 47th Hawaii international conference on system sciences (pp. 3025–3034).

Ng, A. Y., Harada, D., & Russell, S. (1999). Policy invariance under reward transformations: Theory and application to reward shaping. In I. Bratko & S. Dzeroski (Eds.), Proceedings of the 16th annual international conference on machine learning (Vol. 16, pp. 278–287). San Francisco, CA, USA: Morgan Kaufmann.

Roepke, A. M., Jaffee, S. R., Riffle, O. M., McGonigal, J., Broome, R., & Maxwell, B. (2015). Randomized controlled trial of SuperBetter, a smartphone-based/Internet-based self-help tool to reduce depressive symptoms. Games for Health Journal, 4(3), 235-246.

Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA, USA: MIT Press.

 

11 comments

Great. The summary I took home and added to my Anki deck (kind of TLDR):

Gamification works if

  • Each item's score reflects how valuable it is in the long run.
  • Partial completion should give partial points
  • Reversing should cancel points (otherwise: cycles of repeated falling back)
  • All paths should result in the same points (thus: reward state transitions instead of actions)

I thought most about this:

For instance, rewarding yourself 100 times as much for working 100% on a project than for working on it 50% might lead you to complete the project early at the expense of your health, your friendships, your education, and all your other projects.

This kind of assumes that all these other aspects also get scores; otherwise, by definition, they are less rewarding. I'm wondering whether it is a good idea to have a diverse set of things that get scored (kind of along the lines of rewarding complexity of human value). But I also wonder whether this takes focus out of the system. And working focussed on one specific topic is more efficient than spreading out one's energy. I wonder whether this can be priced in somehow or implies a certain minimum size of an item.

This is a topic I am very interested in and would like to see explored in depth, but the huge wall of text at the beginning (and in other parts) meant I couldn't read this article.

Please chop this into paragraphs.

Sounds exciting, but I see at least two problems:

1) For many things I want to do, I don't know exactly how valuable they are.

Imagine a student who wants to learn programming -- they may have an idea about "average programmer salary", but they don't know whether they are going to be better or worse than the average, and how much time it will take them to get there. How should they calculate the value of a lesson? Does learning two programming languages have twice as high a value as learning one of them? Or is the second one superfluous because at any given time they are likely to only use one? Or is the second one an insurance against scenarios where "something goes wrong with one specific programming language"? Generally, learning any new skill has this kind of problem.

2) What about "maintenance" tasks? Things that need to be done regularly, such as washing dishes or vacuuming one's room, that don't bring something new, but rather avoid the deterioration of one's state.

In some situations, the maintenance cost could be subtracted from the thing that requires the maintenance. For example, if washing the dishes is 10 points, and having food cooked is 100 points, I should actually only give myself 90 points when cooking lunch. (The math is the same, but either we can imagine it as "cooking = food + dirty dishes = 100 + (-10) = 90", or we could treat washing the dishes as the last stage of the "cooking lunch" project.) But for things like vacuuming the room, it is difficult to point out what exactly creates the debt. Maybe we could simplify some things as "daily costs of living (at given quality)", which means that every morning I would automatically get a few negative points for things getting worse without maintenance.

I think a better term to use is to "prioritize" rather than gamify.

I also think there are better ways to do tasks when taking into account different situations. Making a simpler meal might mean less effort and less time washing your dishes. Using disposable ones is also possible if you value your time more than the ever-increasing total money spent on them. I'd argue that daily tasks are the easiest to optimize, as you will inevitably get more practice and you'll see how stuff can be done better.

This kind of gamification algorithm is interesting, but doesn't sit well with the different types of productivity tasks that I encounter far more often. For example, it doesn't apply well to creating a habit, where you just have to do the same task over and over, or where there aren't clear paths to completion (and researching how to get there is part of the task).

Many games have the concept of a streak where you're rewarded for maintaining continuity. The rewards escalate with the length of the streak and the player is penalized when they break it. Basic reward structures for habit formation could be modeled like this.

Please fix your formatting.

Done! It is still not perfect, but hopefully good enough.

It is good enough.

Practical implementations of gamification:

The karma system on LessWrong creates a kind of game that encourages people to write more comments.

Trading on the stock exchange, or just watching it, can eat a lot of time. So a game may be addictive and destructive to other, non-gamified behavior. I know this from my own experience.

But looking at your own weight each morning is the best way to make losing it a game, and it is a known and working practice.

In fact, I now see that I spend most of my time on gamified activities, but some of them are not useful.