How can I reduce existential risk from AI?

lukeprog

13th Nov 2012

Suppose you think that reducing the risk of human extinction is the highest-value thing you can do. Or maybe you want to reduce "x-risk" because you're already a comfortable First-Worlder like me and so you might as well do something epic and cool, or because you like the community of people who are doing it already, or whatever.

Suppose also that you think AI is the most pressing x-risk, because (1) mitigating AI risk could mitigate all other existential risks, but not vice-versa, and because (2) AI is plausibly the first existential risk that will occur.

In that case, what should you do? How can you reduce AI x-risk?

It's complicated, but I get this question a lot, so let me try to provide some kind of answer.

Meta-work, strategy work, and direct work

When you're facing a problem and you don't know what to do about it, there are two things you can do:

1. Meta-work: Amass wealth and other resources. Build your community. Make yourself stronger. Meta-work of this sort will be useful regardless of which "direct work" interventions turn out to be useful for tackling the problem you face. Meta-work also empowers you to do strategic work.

2. Strategy work: Purchase a better strategic understanding of the problem you're facing, so you can see more clearly what should be done. Usually, this will consist of getting smart and self-critical people to honestly assess the strategic situation, build models, make predictions about the effects of different possible interventions, and so on. If done well, these analyses can shed light on which kinds of "direct work" will help you deal with the problem you're trying to solve.

When you have enough strategic insight to have discovered some interventions that you're confident will help you tackle the problem you're facing, then you can also engage in:

3. Direct work: Directly attack the problem you're facing, whether this involves technical research, political action, particular kinds of technological development, or something else.

Thinking with these categories can be useful even though the lines between them are fuzzy. For example, you might have to do some basic awareness-raising in order to amass funds for your cause, and then once you've spent those funds on strategy work, your strategy work might tell you that a specific form of awareness-raising is useful for political action that counts as "direct work." Also, some forms of strategy work can feel like direct work, depending on the type of problem you're tackling.

Meta-work for AI x-risk reduction

Make money. Become stronger. Build a community, an audience, a movement. Store your accumulated resources in yourself, in your community, in a donor-advised fund, or in an organization that can advance your causes better than you can as an individual.

1. Make money: In the past 10 years, many people have chosen to start businesses or careers that (1) will predictably generate significant wealth they can spend on AI x-risk reduction, (2) will be enjoyable enough to "stick with it," and (3) will not create large negative externalities. But certainly, the AI x-risk reduction community needs a lot more people to do this! If you want advice, the folks at 80,000 Hours are the experts on "ethical careers" of this sort.
2. Become stronger: Sometimes it makes sense to focus on improving your productivity, your research skills, your writing skills, your social skills, etc. before you begin using those skills to achieve your goals. Example: Vladimir Nesov has done some original research, but mostly he has spent the last few years improving his math skills before diving into original research full-time.
3. Build a community / a movement: Individuals can change the world, but communities and movements can do even better, if they're well-coordinated. Read What Psychology Can Teach Us About Spreading Social Change. Launch (or improve) a Less Wrong group. Join a THINK group. Help grow and improve the existing online communities that tend to have high rates of interest in x-risk reduction: LessWrong, Singularity Volunteers, and 80,000 Hours. Help write short primers on crucial topics. To reach a different (and perhaps wealthier, more influential) audience, maybe help with something like the Singularity Summit.
4. Develop related skills in humanity: In other words, "make humanity stronger in ways that are almost certainly helpful for reducing AI x-risk" (though strategic research may reveal they are not nearly the most helpful ways to reduce AI x-risk). This might include, for example, getting better at risk analysis with regard to other catastrophic risks, or improving our generalized forecasting abilities by making wider use of prediction markets.
5. Fund a person or organization doing (3) or (4) above: The Singularity Institute probably does more AI x-risk movement building than anyone, followed by the Future of Humanity Institute. There are lots of organizations doing things that plausibly fall under (4).

Note that if you mostly contribute to meta-work, you should also donate a small sum (say, $15/mo) to strategy work or direct work. If you only contribute to meta-work for a while, an outside view (around SI, anyway) suggests there's a good chance you'll never manage to do anything non-meta. A perfect Bayesian agent might not optimize this way, but optimal philanthropy for human beings works differently.

Strategy work for AI x-risk reduction

How can we improve our ability to do long-term technological forecasting? Is AGI more likely to be safe if developed sooner (Goertzel & Pitt 2012) or later (Muehlhauser & Salamon 2012)? How likely is hard takeoff vs. soft takeoff? Could we use caged AGIs or WBEs to develop safe AGIs or WBEs? How might we reduce the chances of an AGI arms race (Shulman 2009)? Which interventions should we prioritize now, to reduce AI x-risk?

These questions and many others have received scant written analysis, unless you count the kind of written analysis that is (1) written with much vagueness and ambiguity, (2) written in the author's own idiosyncratic vocabulary, (3) written with few citations to related work, and (4) spread across a variety of non-linear blog articles, forum messages, and mailing list postings. (The trouble with that kind of written analysis is that it is mostly impenetrable or undiscoverable to most researchers, especially the ones who are very busy because they are highly productive and don't have time to comb through 1,000 messy blog posts.)

Here, then, is how you might help with strategy work for AI x-risk reduction:

1. Consolidate and clarify the strategy work currently only available in a disorganized, idiosyncratic form. This makes it easier for researchers around the world to understand the current state of play, and build on it. Examples include Chalmers (2010), Muehlhauser & Helm (2012), Muehlhauser & Salamon (2012), Yampolskiy & Fox (2012), and (much of) Nick Bostrom's forthcoming scholarly monograph on machine superintelligence.
2. Write new strategic analyses. Examples include Yudkowsky (2008), Sotala & Valpola (2012), Shulman & Sandberg (2010), Shulman (2010), Shulman & Armstrong (2009), Bostrom (2012), Bostrom (2003), Omohundro (2008), Goertzel & Pitt (2012), Yampolskiy (2012), and some Less Wrong posts: Muehlhauser (2012), Yudkowsky (2012), etc. See here for a list of desired strategic analyses (among other desired articles).
3. Assist with (1) or (2), above. This is what SI's "remote researchers" tend to do, along with many SI volunteers. Often, there are "chunks" of research that can be broken off and handed to people who are not an article's core authors, e.g. "Please track down many examples of the 'wrong wish' trope so I can use a vivid example in my paper" or "Please review and summarize the part of the machine ethics literature that has to do with learning preferences from examples."
4. Provide resources and platforms that make it easier for researchers to contribute to strategy work. Things like my AI risk bibliography and list of forthcoming and desired articles on AI risk make it easier for researchers to find relevant work, and to know what projects would be helpful to take on. SI's public BibTeX file and Mendeley group make it easier for researchers to find relevant papers. The AGI conference, and volumes like Singularity Hypotheses, provide publishing venues for researchers in this fledgling field. Recent improvements to the Less Wrong wiki will hopefully make it easier for researchers to understand the (relatively new) concepts relevant to AI x-risk strategy work. A scholarly AI risk wiki would be even better. It would also help to find editors of prestigious journals who are open to publishing well-written AGI risk papers, so that university researchers can publish on these topics without hurting their chances to get tenure.
5. Fund a person or organization doing any of the above. Again, the most obvious choices are the Singularity Institute or the Future of Humanity Institute. Most of the articles and "resources" above were produced by either SI or FHI. SI offers more opportunities for (3). The AGI conference is organized by Ben Goertzel and others, who are of course always looking for sponsors for the AGI conference.

Direct work for AI x-risk reduction

We are still at an early stage in doing strategy work on AI x-risk reduction. Because of this, most researchers in the field feel pretty uncertain about which interventions would be most helpful for reducing AI x-risk. Thus, they focus on strategic research, so they can purchase more confidence about which interventions would be helpful.

Despite this uncertainty, I'll list some interventions that at least some people have proposed for mitigating AI x-risk, focusing on the interventions that are actionable today.

Safe AGI research? Many proposals have been made for developing AGI designs with internal motivations beneficial to humans — including Friendly AI (Yudkowsky 2008) and GOLEM (Goertzel 2010) — but researchers disagree about which approaches are most promising (Muehlhauser & Helm 2012; Goertzel & Pitt 2012).
AI boxing research? Many proposals have been made for confining AGIs (Yampolskiy 2012; Armstrong et al. 2012). But such research programs may end up being fruitless, since it may be that a superintelligence will always be able to think its way out of any confinement designed by a human-level intelligence.
AI safety promotion? One may write about the importance of AI safety, persuade AI researchers to take up a greater concern for safety, and so on.
Regulate AGI development? Or not? Hughes (2001) and Daley (2011) call for regulation of AGI development. To bring this about, citizens could petition their governments and try to persuade decision-makers. McGinnis (2010) and Goertzel & Pitt (2012), however, oppose AGI regulation.
Accelerate AGI? Or not? Muehlhauser & Salamon (2012) recommend accelerating AGI safety research relative to AGI capabilities research, so that the first AGIs have a better chance of being designed to be safe. Goertzel & Pitt (2012), in contrast, argue that "the pace of AGI progress is sufficiently slow that practical work towards human-level AGI is in no danger of outpacing associated ethical theorizing," and argue that AGI development will be safest if it happens sooner rather than later.
Accelerate WBE? Or not? Participants in a 2011 workshop concluded that accelerating WBE probably increases AI x-risk, but Koene (2012) argues that WBE is safer than trying to create safe AGI.
Ban AGI and hardware development? Or not? Joy (2000) famously advocated a strategy of relinquishment, and Berglas (2009) goes so far as to suggest we abandon further computing power development. Most people, of course, disagree. In any case, it is doubtful that groups with such views could overcome the economic and military incentives for further computing and AGI development.
Foster positive values? Kurzweil (2005) and others argue that one way to increase the odds that AGIs will behave ethically is to increase the chances that the particular humans who create them are moral. Thus, one might reduce AI x-risk by developing training and technology for moral enhancement (Persson and Savulescu 2008).
Cognitive enhancement? Some have wondered whether the problem of safe AGI is so difficult that it will require cognitive enhancement for humans to solve it. Meanwhile, others worry that cognitive enhancement will only accelerate the development of dangerous technologies (Persson and Savulescu 2008).

Besides engaging in these interventions directly, one may of course help to fund them. I don't currently know of a group pushing for AGI development regulations, or for banning AGI development. You could accelerate AGI by investing in AGI-related companies, or you could accelerate AGI safety research (and AI boxing research) relative to AGI capabilities research by funding SI or FHI, who also probably do the most AI safety promotion work. You could fund research on moral enhancement or cognitive enhancement by offering grants for such research. Or, if you think "low-tech" cognitive enhancement is promising, you could fund organizations like Lumosity (brain training) or the Center for Applied Rationality (rationality training).

Conclusion

This is a brief guide to what you can do to reduce existential risk from AI. A longer guide could describe the available interventions in more detail, and present the arguments for and against each one. But that is "strategy work," and requires lots of time (and therefore money) to produce.