Making intentions concrete - Trigger-Action Planning

20 Kaj_Sotala 01 December 2016 08:34PM

I'll do it at some point.

I'll answer this message later.

I could try this sometime.

For most people, all of these thoughts have the same result. The thing in question likely never gets done - or if it does, it's only after remaining undone for a long time and causing a considerable amount of stress. Leaving the "when" ambiguous means that there isn't anything that would propel you into action.

What kinds of thoughts would help avoid this problem? Here are some examples:

  • When I find myself using the words "later" or "at some point", I'll decide on a specific time when I'll actually do it.
  • If I'm given a task that would take under five minutes, and I'm not in a pressing rush, I'll do it right away.
  • When I notice that I'm getting stressed out about something that I've left undone, I'll either do it right away or decide when I'll do it.
Picking a specific time or situation to serve as the trigger of the action makes it much more likely that it actually gets done.

Could we apply this more generally? Let's consider these examples:
  • I'm going to get more exercise.
  • I'll spend less money on shoes.
  • I want to be nicer to people.
These goals all have the same problem: they're vague. How will you actually implement them? As long as you don't know, you're also going to miss potential opportunities to act on them.

Let's try again:
  • When I see stairs, I'll climb them instead of taking the elevator.
  • When I buy shoes, I'll write down how much money I've spent on shoes this year.
  • When someone does something that I like, I'll thank them for it.
These are much better. They contain both a concrete action to be taken, and a clear trigger for when to take it.

Turning vague goals into trigger-action plans

Trigger-action plans (TAPs; known as "implementation intentions" in the academic literature) are "when-then" ("if-then", for you programmers) rules used for behavior modification [i]. A meta-analysis covering 94 studies and 8461 subjects [ii] found them to improve people's ability to achieve their goals [iii]. The goals in question included ones such as reducing the amount of fat in one's diet, getting exercise, using vitamin supplements, carrying on with a boring task, determination to work on challenging problems, and calling out racist comments. Many studies also allowed the subjects to set their own personal goals.

TAPs were found to work both in laboratory and real-life settings. The authors of the meta-analysis estimated the risk of publication bias to be small, as half of the studies included were unpublished ones.
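
To make the "if-then, for you programmers" analogy concrete, here is a purely illustrative sketch (my own, not from the article or the literature; all names are hypothetical) of a TAP as a rule that pairs a trigger predicate with a concrete action:

    # Illustrative sketch only: a TAP pairs a "when" (trigger) with a "then" (action).
    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class TAP:
        trigger: Callable[[str], bool]  # "when": does the current situation match?
        action: str                     # "then": the concrete thing to do

    taps = [
        TAP(trigger=lambda situation: "stairs" in situation,
            action="climb the stairs instead of taking the elevator"),
        TAP(trigger=lambda situation: "buying shoes" in situation,
            action="write down how much I've spent on shoes this year"),
    ]

    def matching_actions(situation: str) -> List[str]:
        """Return the action of every TAP whose trigger fires in this situation."""
        return [tap.action for tap in taps if tap.trigger(situation)]

    print(matching_actions("I see stairs ahead"))
    # -> ['climb the stairs instead of taking the elevator']

The point of the analogy is only that the trigger should be as mechanical to evaluate as a predicate like this: if noticing the situation requires deliberation, the rule is unlikely to fire.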

Designing TAPs

TAPs work because they help us notice situations where we could carry out our intentions. They also help automate the intentions: when a person is in a situation that matches the trigger, they are much more likely to carry out the action. Finally, they force us to turn vague and ambiguous goals into more specific ones.

A good TAP fulfills three requirements [iv]:
  • The trigger is clear. The "when" part is a specific, visible thing that's easy to notice. "When I see stairs" is good, "before four o'clock" is bad (when before four exactly?). [v]
  • The trigger is consistent. The action is something that you'll always want to do when the trigger is fulfilled. "When I leave the kitchen, I'll do five push-ups" is bad, because you might not have the chance to do five push-ups every time you leave the kitchen. [vi]
  • The TAP furthers your goals. Make sure the TAP is actually useful!
However, there is one group of people who may need to be cautious about using TAPs. One paper [vii] found that people who ranked highly on so-called socially prescribed perfectionism did worse on their goals when they used TAPs. These kinds of people are sensitive to other people's opinions about them, and are often highly critical of themselves. Because TAPs create an association between a situation and a desired way of behaving, they may make socially prescribed perfectionists anxious and self-critical. In two studies, TAPs made college students who were socially prescribed perfectionists (and only them) worse at achieving their goals.

For everyone else however, I recommend adopting this TAP:

When I set myself a goal, I'll turn it into a TAP.

Origin note

This article was originally published in Finnish at kehitysto.fi. It draws heavily on CFAR's material, particularly the workbook from CFAR's November 2014 workshop.

Footnotes

[i] Gollwitzer, P. M. (1999). Implementation intentions: strong effects of simple plans. American psychologist, 54(7), 493.

[ii] Gollwitzer, P. M., & Sheeran, P. (2006). Implementation intentions and goal achievement: A meta‐analysis of effects and processes. Advances in experimental social psychology, 38, 69-119.

[iii] Effect size d = .65, 95% confidence interval [.6, .7].

[iv] Gollwitzer, P. M., Wieber, F., Myers, A. L., & McCrea, S. M. (2010). How to maximize implementation intention effects. Then a miracle occurs: Focusing on behavior in social psychological theory and research, 137-161.

[v] Wieber, Odenthal & Gollwitzer (2009; unpublished study, discussed in [iv]) tested the effect of general and specific TAPs on subjects driving a simulated car. All subjects were given the goal of finishing the course as quickly as possible, while also damaging their car as little as possible. Subjects in the "general" group were additionally given the TAP, "If I enter a dangerous situation, then I will immediately adapt my speed". Subjects in the "specific" group were given the TAP, "If I see a black and white curve road sign, then I will immediately adapt my speed". Subjects with the specific TAP managed to damage their cars less than the subjects with the general TAP, without being any slower for it.

[vi] Wieber, Gollwitzer, et al. (2009; unpublished study, discussed in [iv]) tested whether TAPs could be made even more effective by turning them into an "if-then-because" form: "when I see stairs, I'll use them instead of taking the elevator, because I want to become more fit". The results showed that the "because" reasons increased the subjects' motivation to achieve their goals, but nevertheless made TAPs less effective.

The researchers speculated that the "because" might have changed the mindset of the subjects. While an "if-then" rule causes people to automatically do something, "if-then-because" leads people to reflect upon their motives and takes them from an implementative mindset to a deliberative one. Follow-up studies testing the effect of implementative vs. deliberative mindsets on TAPs seemed to support this interpretation. This suggests that TAPs are likely to work better if they can be carried out as consistently and with as little thought as possible.

[vii] Powers, T. A., Koestner, R., & Topciu, R. A. (2005). Implementation intentions, perfectionism, and goal progress: Perhaps the road to hell is paved with good intentions. Personality and Social Psychology Bulletin, 31(7), 902-912.

[Link] Finding slices of joy

4 Kaj_Sotala 28 November 2016 10:10AM

[Link] How Feasible Is the Rapid Development of Artificial Superintelligence?

7 Kaj_Sotala 24 October 2016 08:43AM

[Link] Software for moral enhancement (kajsotala.fi)

6 Kaj_Sotala 30 September 2016 12:12PM

[Link] An appreciation of the Less Wrong Sequences (kajsotala.fi)

5 Kaj_Sotala 30 September 2016 12:11PM

[link] MIRI's 2015 in review

9 Kaj_Sotala 03 August 2016 12:03PM

https://intelligence.org/2016/07/29/2015-in-review/

The introduction:

As Luke had done in years past (see 2013 in review and 2014 in review), I (Malo) wanted to take some time to review our activities from last year. In the coming weeks Nate will provide a big-picture strategy update. Here, I’ll take a look back at 2015, focusing on our research progress, academic and general outreach, fundraising, and other activities.

After seeing signs in 2014 that interest in AI safety issues was on the rise, we made plans to grow our research team. Fueled by the response to Bostrom’s Superintelligence and the Future of Life Institute’s “Future of AI” conference, interest continued to grow in 2015. This suggested that we could afford to accelerate our plans, but it wasn’t clear how quickly.

In 2015 we did not release a mid-year strategic plan, as Luke did in 2014. Instead, we laid out various conditional strategies dependent on how much funding we raised during our 2015 Summer Fundraiser. The response was great; we had our most successful fundraiser to date. We hit our first two funding targets (and then some), and set out on an accelerated 2015/2016 growth plan.

As a result, 2015 was a big year for MIRI. After publishing our technical agenda at the start of the year, we made progress on many of the open problems it outlined, doubled the size of our core research team, strengthened our connections with industry groups and academics, and raised enough funds to maintain our growth trajectory. We’re very grateful to all our supporters, without whom this progress wouldn’t have been possible.

[link] Simplifying the environment: a new convergent instrumental goal

4 Kaj_Sotala 22 April 2016 06:48AM

http://kajsotala.fi/2016/04/simplifying-the-environment-a-new-convergent-instrumental-goal/

Convergent instrumental goals (also basic AI drives) are goals that are useful for pursuing almost any other goal, and are thus likely to be pursued by any agent that is intelligent enough to understand why they’re useful. They are interesting because they may allow us to roughly predict the behavior of even AI systems that are much more intelligent than we are.

Instrumental goals are also a strong argument for why sufficiently advanced AI systems that were indifferent towards human values could be dangerous to humans, even if they weren't actively malicious: an AI's instrumental goals, such as self-preservation or resource acquisition, could come into conflict with human well-being. “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.”

I’ve thought of a candidate for a new convergent instrumental drive: simplifying the environment to make it more predictable in a way that aligns with your goals.

[link] Disjunctive AI Risk Scenarios

10 Kaj_Sotala 05 April 2016 12:51PM

Arguments for risks from general AI are sometimes criticized on the grounds that they rely on a series of linear events, each of which has to occur for the proposed scenario to go through. For example, that a sufficiently intelligent AI could escape from containment, that it could then go on to become powerful enough to take over the world, that it could do this quickly enough without being detected, etc.

The intent of the following series of posts is to briefly demonstrate that AI risk scenarios are in fact disjunctive: composed of multiple possible pathways, each of which could be sufficient by itself. To successfully control AI systems, it is not enough to simply block one of the pathways: they all need to be dealt with.
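
To see why disjunctiveness matters, here is a bit of illustrative arithmetic (my own, with made-up numbers, not from the posts): even if each individual pathway is blocked with fairly high probability, the chance that at least one remains open can stay uncomfortably large.

    # Illustrative arithmetic only (made-up numbers): for a disjunctive risk,
    # blocking each individual pathway with high probability is not enough.
    pathway_block_prob = 0.9   # assume each pathway is blocked 90% of the time
    num_pathways = 4

    p_all_blocked = pathway_block_prob ** num_pathways
    p_at_least_one_open = 1 - p_all_blocked
    print(f"{p_at_least_one_open:.2f}")  # 0.34: roughly a one-in-three chance some pathway stays open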

I've got two posts in this series up so far:

AIs gaining a decisive advantage discusses four different ways by which AIs could achieve a decisive advantage over humanity. The one-picture version is:

AIs gaining the power to act autonomously discusses ways by which AIs might come to act as active agents in the world, despite possible confinement efforts or technology. The one-picture version (which you may wish to click to enlarge) is:

These posts draw heavily on my old paper, Responses to Catastrophic AGI Risk, as well as some recent conversations here on LW. Upcoming posts will try to cover more new ground.

[paper] [link] Defining human values for value learners

5 Kaj_Sotala 03 March 2016 09:29AM

MIRI recently blogged about the workshop paper that I presented at AAAI.

My abstract:

Hypothetical “value learning” AIs learn human values and then try to act according to those values. The design of such AIs, however, is hampered by the fact that there exists no satisfactory definition of what exactly human values are. After arguing that the standard concept of preference is insufficient as a definition, I draw on reinforcement learning theory, emotion research, and moral psychology to offer an alternative definition. In this definition, human values are conceptualized as mental representations that encode the brain’s value function (in the reinforcement learning sense) by being imbued with a context-sensitive affective gloss. I finish with a discussion of the implications that this hypothesis has on the design of value learners.

Their summary:

Economic treatments of agency standardly assume that preferences encode some consistent ordering over world-states revealed in agents’ choices. Real-world preferences, however, have structure that is not always captured in economic models. A person can have conflicting preferences about whether to study for an exam, for example, and the choice they end up making may depend on complex, context-sensitive psychological dynamics, rather than on a simple comparison of two numbers representing how much one wants to study or not study.

Sotala argues that our preferences are better understood in terms of evolutionary theory and reinforcement learning. Humans evolved to pursue activities that are likely to lead to certain outcomes — outcomes that tended to improve our ancestors’ fitness. We prefer those outcomes, even if they no longer actually maximize fitness; and we also prefer events that we have learned tend to produce such outcomes.

Affect and emotion, on Sotala’s account, psychologically mediate our preferences. We enjoy and desire states that are highly rewarding in our evolved reward function. Over time, we also learn to enjoy and desire states that seem likely to lead to high-reward states. On this view, our preferences function to group together events that lead on expectation to similarly rewarding outcomes for similar reasons; and over our lifetimes we come to inherently value states that lead to high reward, instead of just valuing such states instrumentally. Rather than directly mapping onto our rewards, our preferences map onto our expectation of rewards.

Sotala proposes that value learning systems informed by this model of human psychology could more reliably reconstruct human values. On this model, for example, we can expect human preferences to change as we find new ways to move toward high-reward states. New experiences can change which states my emotions categorize as “likely to lead to reward,” and they can thereby modify which states I enjoy and desire. Value learning systems that take these facts about humans’ psychological dynamics into account may be better equipped to take our likely future preferences into account, rather than optimizing for our current preferences alone.
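
As a toy illustration of the "preferences track expected reward" idea summarized above (my own sketch with made-up numbers, not the model from the paper): an agent that ranks actions by the reward it expects them to lead to, rather than by any immediate reward, ends up with preferences of the kind described.

    # Toy sketch, illustrative numbers only: preferences as expectations of reward.
    reward = {
        "pass_exam": 10.0,
        "relax_tonight": 3.0,
        "fail_exam": -8.0,
    }

    # Learned beliefs about which outcomes each action tends to lead to.
    outcome_probs = {
        "study":     {"pass_exam": 0.8, "fail_exam": 0.2},
        "slack_off": {"relax_tonight": 0.6, "fail_exam": 0.4},
    }

    def expected_reward(action: str) -> float:
        """Reward of each possible outcome, weighted by how likely the action makes it."""
        return sum(p * reward[outcome] for outcome, p in outcome_probs[action].items())

    # The agent "prefers" actions with higher expected reward:
    # study -> 0.8*10 + 0.2*(-8) = 6.4, slack_off -> 0.6*3 + 0.4*(-8) = -1.4
    preferences = sorted(outcome_probs, key=expected_reward, reverse=True)
    print(preferences)  # ['study', 'slack_off']

On this picture, new experience that changes the learned probabilities also changes the expected rewards, and with them the preferences, which mirrors the point above about preferences shifting as we find new routes to high-reward states.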

Would be curious to hear whether anyone here has any thoughts. This is basically a "putting rough ideas together and seeing if they make any sense" kind of paper, aimed at clarifying the hypothesis and seeing whether others can find any obvious holes in it, rather than being at the stage of a serious scientific theory yet.

[link] "The Happiness Code" - New York Times on CFAR

13 Kaj_Sotala 15 January 2016 06:34AM

http://www.nytimes.com/2016/01/17/magazine/the-happiness-code.html

Long. Mostly quite positive, though it does spend a little while rolling its eyes at the Eliezer/MIRI connection and the craziness of taking things like cryonics and polyamory seriously.
