Less Wrong is a community blog devoted to refining the art of human rationality. Please visit our About page for more information.
I'll do it at some point.
I'll answer this message later.
I could try this sometime.
For most people, all of these thoughts have the same result. The thing in question likely never gets done - or if it does, it's only after remaining undone for a long time and causing a considerable amount of stress. Leaving the "when" ambiguous means that there isn't anything that would propel you into action.
What kinds of thoughts would help avoid this problem? Here are some examples:
- When I find myself using the words "later" or "at some point", I'll decide on a specific time when I'll actually do it.
- If I'm given a task that would take under five minutes, and I'm not in a pressing rush, I'll do it right away.
- When I notice that I'm getting stressed out about something that I've left undone, I'll either do it right away or decide when I'll do it.
- I'm going to get more exercise.
- I'll spend less money on shoes.
- I want to be nicer to people.
- When I see stairs, I'll climb them instead of taking the elevator.
- When I buy shoes, I'll write down how much money I've spent on shoes this year.
- When someone does something that I like, I'll thank them for it.
- The trigger is clear. The "when" part is a specific, visible thing that's easy to notice. "When I see stairs" is good, "before four o'clock" is bad (when before four exactly?). [v]
- The trigger is consistent. The action is something that you'll always want to do when the trigger is fulfilled. "When I leave the kitchen, I'll do five push-ups" is bad, because you might not have the chance to do five push-ups each time when you leave the kitchen. [vi]
- The TAP furthers your goals. Make sure the TAP is actually useful!
[i] Gollwitzer, P. M. (1999). Implementation intentions: strong effects of simple plans. American psychologist, 54(7), 493.
As Luke had done in years past (see 2013 in review and 2014 in review), I (Malo) wanted to take some time to review our activities from last year. In the coming weeks Nate will provide a big-picture strategy update. Here, I’ll take a look back at 2015, focusing on our research progress, academic and general outreach, fundraising, and other activities.
After seeing signs in 2014 that interest in AI safety issues was on the rise, we made plans to grow our research team. Fueled by the response to Bostrom’s Superintelligence and the Future of Life Institute’s “Future of AI” conference, interest continued to grow in 2015. This suggested that we could afford to accelerate our plans, but it wasn’t clear how quickly.
In 2015 we did not release a mid-year strategic plan, as Luke did in 2014. Instead, we laid out various conditional strategies dependent on how much funding we raised during our 2015 Summer Fundraiser. The response was great; we had our most successful fundraiser to date. We hit our first two funding targets (and then some), and set out on an accelerated 2015/2016 growth plan.
As a result, 2015 was a big year for MIRI. After publishing our technical agenda at the start of the year, we made progress on many of the open problems it outlined, doubled the size of our core research team, strengthened our connections with industry groups and academics, and raised enough funds to maintain our growth trajectory. We’re very grateful to all our supporters, without whom this progress wouldn’t have been possible.
Convergent instrumental goals (also basic AI drives) are goals that are useful for pursuing almost any other goal, and are thus likely to be pursued by any agent that is intelligent enough to understand why they’re useful. They are interesting because they may allow us to roughly predict the behavior of even AI systems that are much more intelligent than we are.
Instrumental goals are also a strong argument for why sufficiently advanced AI systems that were indifferent towards human values could be dangerous towards humans, even if they weren’t actively malicious: because the AI having instrumental goals such as self-preservation or resource acquisition could come to conflict with human well-being. “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.”
I’ve thought of a candidate for a new convergent instrumental drive: simplifying the environment to make it more predictable in a way that aligns with your goals.
Arguments for risks from general AI are sometimes criticized on the grounds that they rely on a series of linear events, each of which has to occur for the proposed scenario to go through. For example, that a sufficiently intelligent AI could escape from containment, that it could then go on to become powerful enough to take over the world, that it could do this quickly enough without being detected, etc.
The intent of my following series of posts is to briefly demonstrate that AI risk scenarios are in fact disjunctive: composed of multiple possible pathways, each of which could be sufficient by itself. To successfully control the AI systems, it is not enough to simply block one of the pathways: they all need to be dealt with.
I've got two posts in this series up so far:
AIs gaining a decisive advantage discusses four different ways by which AIs could achieve a decisive advantage over humanity. The one-picture version is:
AIs gaining the power to act autonomously discusses ways by which AIs might come to act as active agents in the world, despite possible confinement efforts or technology. The one-picture version (which you may wish to click to enlarge) is:
These posts draw heavily on my old paper, Responses to Catastrophic AGI Risk, as well as some recent conversations here on LW. Upcoming posts will try to cover more new ground.
View more: Next