Thanks for putting all this stuff in one place!
It makes me kind of sad that we still have more or less no answer to so many big, important questions. Does anyone else share this worry?
Notice also that until now, we didn't even have a summary of the kind in this post! So yeah, we're still at an early stage of strategic work, which is why SI and FHI are spending so much time on strategic work.
I'll note, however, that I expect significant strategic insights to come from the technical work (e.g. FAI math). Such work should give us insight into how hard the problems actually are, what architectures look most promising, to what degree the technical work can be outsourced to the mainstream academic community, and so on.
I expect significant strategic insights to come from the technical work (e.g. FAI math).
Interesting point. I'm worried that, while FAI math will help us understand what is dangerous or outsourceable from our particular path, many many other paths to AGI are possible, and we won't learn from FAI math which of those other paths are dangerous or likely.
I feel like one clear winning strategy is safety promotion. It seems that almost no bad can come from promoting safety ideas among AI researchers and investors. It also seems relatively easy, in that it requires only regular human skills of networking, persuasion, et cetera.
You're probably right about safety promotion, but calling it "clear" may be an overstatement. A possible counterargument:
Existing AI researchers are likely predisposed to think that their AGI is likely to naturally be both safe and powerful. If they are exposed to arguments that it will instead naturally be both dangerous and very powerful (the latter half of the argument can't be easily omitted; the potential danger is in part because of the high potential power), would it not be a natural result of confirmation bias for the preconception-contradicting "dangerous" half of the argument to be disbelieved and the preconception-confirming "very powerful" half of the argument to be believed?
Half of the AI researcher interviews posted to LessWrong appear to be with people who believe that "Garbage In, Garbage Out" only applies to arithmetic, not to morality. If the end result of persuasion is that as many as half of them have that mistake corrected while the remainder are merely convinced that they should work even harder, that may not be a net win.
I've lost all disrespect for the "stealing" of generic ideas, and roughly 25% of the intended purpose of my personal quotes files is so that I can "rob everyone blind" if I ever try writing fiction again. Any aphorisms I come up with myself are free to be folded, spindled, and mutilated. I try to cite originators when format and poor memory permit, and receiving the same favor would be nice, but I certainly wouldn't mind seeing my ideas spread completely unattributed either.
I’m having trouble deciding where I should target my altruism. I’m basically just starting out in life, and I recognize that this is a really valuable opportunity that I won’t get in the future – I have no responsibilities and no sunk costs yet. For a long time, I’ve been looking into the idea of efficient charity, and I’ve been surfing places like 80K Hours and GiveWell. But I get the feeling that I might be too emotionally invested in the idea of donating to, say, disease prevention or inoculation or de-worming (found to be among the most efficient conventional charities) over, say, Friendly AI.
I think that given my skills, personality, etc, there are some good and bad reasons to go into either existential risk mitigation or health interventions, but it’s not like the balance is going to be exactly even. I need some help figuring out what to do – I suppose I could work out some “combination,” but it’s usually not a good idea to hedge your bets in this way because you aren’t making maximum impact.
Direct reasons for health interventions:
Lots of good data, an actual dollar amount per life has been calculated; low risk of failure, whereas for everything I’ve read I’m really not s
Sounds like you should ask for a call with the philanthropic career experts at 80,000 Hours if you haven't already.
When someone proposes what we should do, where by "we" he implicitly refers to a large group of people he has no real influence over (as in the banning AGI & hardware development proposal), I wonder what the value of this kind of speculation is, other than amusing oneself with a picture of "what would this button do" on a simulation of Earth under one's hands.
As I see it, there's no point in thinking about these kinds of "large scale" interventions that are closely interwoven with politics. Better to focus on what relatively small groups of people can do (this includes, e.g., influencing a few other AGI development teams to work on FAI), and in this context, I think our best hope is in deeply understanding the mechanics of intelligence and thus having at least a chance at creating FAI before some team that doesn't care in the least about safety dooms us all - and there will be such teams, regardless of what we do today, just take a look at some of the "risks from AI" interviews...
Suppose you think that reducing the risk of human extinction is the highest-value thing you can do. Or maybe you want to reduce "x-risk" because you're already a comfortable First-Worlder like me and so you might as well do something epic and cool, or because you like the community of people who are doing it already, or whatever.
I think this post is great: important, informative, concise, and well-referenced. However, my impression is that the opening paragraph trivializes the topic. If you were listing the things we could do to reduce or eliminate global poverty, would you preface your article by saying that "reducing global poverty is cool"? You probably wouldn't. Then why write that kind of preface when the subject is existential risk reduction, which is even more important?
Greetings. New to LessWrong, but particularly compelled by the discussion of existential risk.
It seems like one of the priorities would be to ease the path for people, once they're aware of existential risk, to move swiftly through doing meta work to doing strategy work and direct work. For myself, once I'd become aware of existential risk as a whole, it became an attractor for a whole range of prior ideas and I had to find a way towards direct work as soon as possible. That's easier said than done.
Yet it seems like the shortest path would be to catal...
Note that if you mostly contribute to meta work, you want to also donate a small sum (say, $15/mo) to strategy work or direct work.
I suspect the mechanism underlying this is a near/far glitch that's also responsible for procrastination. I've had the experience of putting important but deadline-free stuff off for months, sincerely believing that I would get to it at some point (no, I wasn't short on free time...)
It takes time to develop the skill of noticing that now is a good time to do that thing you've been telling yourself to do, and then actually biting the bullet and doing it.
I don't currently know of a group pushing (...) for banning AGI development. You could accelerate AGI by investing in AGI-related companies (...)
This is not meant as a criticism of the post, but it seems like we should be able to do better than having some of us give money to groups pushing for banning AGI development, and others invest in AGI-related companies to accelerate AGI, especially if both of these are altruists with a reasonably similar prior aiming to reduce existential risk...
(Both giving to strategic research instead seems like a reasonable alternative.)
Right... it's a bit like in 2004 when my friend insisted that we both waste many hours to go vote on the presidential election, even though we both knew we were voting for opposite candidates. It would have been wiser for us both to stay home and donate to something we both supported (e.g. campaign finance reform), in whatever amount reflected the value of the time we actually spent voting.
I should note that investing in an AGI company while also investing in AGI safety research need not be as contradictory as it sounds, if you can use your investment in the AGI company to bias its development work toward safety, as Legg once suggested. In fact, I know at least three individuals (that I shall not name) who appear to be doing exactly this.
So I read the title and thought you mean the risk of AI having existential crises... which is an interesting question, when you think about it.
How can I reduce existential risk from AI?
First answer that sprung to my mind: You could work to increase existential risk from other sources. If you make it less likely that an AI will ever be built you reduce the risk of AI. Start work on self-replicating nano-tech or biological weapons. Or even just blow an important building and make it look like an Arabic speaker did it.
That leads to the second solution: When working on AI (or with genies or problem solvers in general) take care to construct questions that are not lost purposes.
Regarding "making money" / "accumulating wealth": Why is wealth in my hands preferable to wealth in someone else's hands?
Because it's extremely unlikely that a random person will be at least as concerned with existential risk as you are.
But why is it likely that I'll be better at doing anything about it? Just because I try to be rational, doesn't mean I'm any good at it - especially at something where we have no idea what the correct actions are. How do I know that my efforts will even have a positive dot-product with the "do the right thing" vector?
I realize (and I'm probably not alone in this) that I've been implicitly using this {meta-work, strategy-work, direct-work} process to try and figure out where/how to contribute. Thanks for this guide/analysis.
I conjecture that one of the "community of people" links was meant to go somewhere other than where it currently does. (SIAI?)
The investment control software of large financial companies seems the most likely source of rogue AI.
Charlie Stross, in his 2005 novel Accelerando, implicitly suggested "financial device" AI as the most likely seed for rogue AI. Last year, David Brin independently and explicitly promoted the possibility. The idea seems more likely today.
With a multi-million dollar salary from an evil bank, you can contribute to species survival.
You're welcome.
And I'm dead serious.
Apparently, Brin first wrote a full outline of the idea here: http://ieet.org/i...
Lumosity panned in The Guardian: http://www.guardian.co.uk/science/2009/feb/26/brain-training-games-which
I know discussing politics on LW is discouraged, but is voting in elections a viable method of decreasing existential risk by making it more likely that those who are elected will take more action to decrease it? If so, what parties should be voted for? If this isn't something that should be discussed on LW, just say so and I can make a reddit post on it.
How important is trying to personally live longer for decreasing existential risk? IMO, it seems that most risk of existential catastrophes occurs sooner rather than later, so I doubt living much longer is extremely important. For example, Wikipedia says that a study at the Singularity Summit found that the median date for the singularity occurring is 2040, and one person gave 80% confidence intervals from 5 - 100 years. Nanotechnology seems to be predicted to come sooner rather than later as well. What does everyone else think?
Are there any decent arguments saying that working on trying to develop safe AGI would increase existential risk? I've found none, but I'd like to know because I'm considering developing AGI as a career.
Edit: What about AI that's not AGI?
If you only contribute to meta work for a while, the outside view (around SI, anyway) suggests there's a good chance you'll forget to ever do anything non-meta.
Keeping a to-do list may be a cheaper way of keeping yourself from forgetting.
I don't think it's actually a problem of "forgetting"; I should probably clarify that language. It's more about habit formation. If one takes up the habit of doing no direct work day after day, it may be difficult to break that habit later.
Suppose you think that reducing the risk of human extinction is the highest-value thing you can do. Or maybe you want to reduce "x-risk" because you're already a comfortable First-Worlder like me and so you might as well do something epic and cool, or because you like the community of people who are doing it already, or whatever.
Suppose also that you think AI is the most pressing x-risk, because (1) mitigating AI risk could mitigate all other existential risks, but not vice-versa, and because (2) AI is plausibly the first existential risk that will occur.
In that case, what should you do? How can you reduce AI x-risk?
It's complicated, but I get this question a lot, so let me try to provide some kind of answer.
Meta-work, strategy work, and direct work
When you're facing a problem and you don't know what to do about it, there are two things you can do:
1. Meta-work: Amass wealth and other resources. Build your community. Make yourself stronger. Meta-work of this sort will be useful regardless of which "direct work" interventions turn out to be useful for tackling the problem you face. Meta-work also empowers you to do strategic work.
2. Strategy work: Purchase a better strategic understanding of the problem you're facing, so you can see more clearly what should be done. Usually, this will consist of getting smart and self-critical people to honestly assess the strategic situation, build models, make predictions about the effects of different possible interventions, and so on. If done well, these analyses can shed light on which kinds of "direct work" will help you deal with the problem you're trying to solve.
When you have enough strategic insight to have discovered some interventions that you're confident will help you tackle the problem you're facing, then you can also engage in:
3. Direct work: Directly attack the problem you're facing, whether this involves technical research, political action, particular kinds of technological development, or something else.
Thinking with these categories can be useful even though the lines between them are fuzzy. For example, you might have to do some basic awareness-raising in order to amass funds for your cause, and then once you've spent those funds on strategy work, your strategy work might tell you that a specific form of awareness-raising is useful for political action that counts as "direct work." Also, some forms of strategy work can feel like direct work, depending on the type of problem you're tackling.
Meta-work for AI x-risk reduction
Make money. Become stronger. Build a community, an audience, a movement. Store your accumulated resources in yourself, in your community, in a donor-advised fund, or in an organization that can advance your causes better than you can as an individual.
1. Make money: In the past 10 years, many people have chosen to start businesses or careers that (1) will predictably generate significant wealth they can spend on AI x-risk reduction, (2) will be enjoyable enough to "stick with it," and (3) will not create large negative externalities. But certainly, the AI x-risk reduction community needs a lot more people to do this! If you want advice, the folks at 80,000 Hours are the experts on "ethical careers" of this sort.
2. Become stronger: Sometimes it makes sense to focus on improving your productivity, your research skills, your writing skills, your social skills, etc. before you begin using those skills to achieve your goals. Example: Vladimir Nesov has done some original research, but mostly he has spent the last few years improving his math skills before diving into original research full-time.
3. Build a community / a movement: Individuals can change the world, but communities and movements can do even better, if they're well-coordinated. Read What Psychology Can Teach Us About Spreading Social Change. Launch (or improve) a Less Wrong group. Join a THINK group. Help grow and improve the existing online communities that tend to have high rates of interest in x-risk reduction: LessWrong, Singularity Volunteers, and 80,000 Hours. Help write short primers on crucial topics. To reach a different (and perhaps wealthier, more influential) audience, maybe help with something like the Singularity Summit.
4. Develop related skills in humanity. In other words, "make humanity stronger in ways that are almost certainly helpful for reducing AI x-risk" (though strategic research may reveal they are not nearly the most helpful ways to reduce AI x-risk). This might include, for example, getting better at risk analysis with regard to other catastrophic risks, or improving our generalized forecasting abilities by making wider use of prediction markets.
5. Fund a person or organization doing (3) or (4) above. The Singularity Institute probably does more AI x-risk movement building than anyone, followed by the Future of Humanity Institute. There are lots of organizations doing things that plausibly fall under (4).
Note that if you mostly contribute to meta work, you want to also donate a small sum (say, $15/mo) to strategy work or direct work. If you only contribute to meta work for a while, an outside view (around SI, anyway) suggests there's a good chance you'll never manage to do anything non-meta. A perfect Bayesian agent might not optimize this way, but optimal philanthropy for human beings works differently.
Strategy work for AI x-risk reduction
How can we improve our ability to do long-term technological forecasting? Is AGI more likely to be safe if developed sooner (Goertzel & Pitt 2012) or later (Muehlhauser & Salamon 2012)? How likely is hard takeoff vs. soft takeoff? Could we use caged AGIs or WBEs to develop safe AGIs or WBEs? How might we reduce the chances of an AGI arms race (Shulman 2009)? Which interventions should we prioritize now, to reduce AI x-risk?
These questions and many others have received scant written analysis — unless you count the kind of written analysis that is (1) written with much vagueness and ambiguity, (2) written in the author's own idiosyncratic vocabulary, (3) written with few citations to related work, and is (4) spread across a variety of non-linear blog articles, forum messages, and mailing list postings. (The trouble with that kind of written analysis is that it is mostly impenetrable or undiscoverable to most researchers, especially the ones who are very busy because they are highly productive and don't have time to comb through 1,000 messy blog posts.)
Here, then, is how you might help with strategy work for AI x-risk reduction:
1. Consolidate and clarify the strategy work currently only available in a disorganized, idiosyncratic form. This makes it easier for researchers around the world to understand the current state of play, and build on it. Examples include Chalmers (2010), Muehlhauser & Helm (2012), Muehlhauser & Salamon (2012), Yampolskiy & Fox (2012), and (much of) Nick Bostrom's forthcoming scholarly monograph on machine superintelligence.
2. Write new strategic analyses. Examples include Yudkowsky (2008), Sotala & Valpola (2012), Shulman & Sandberg (2010), Shulman (2010), Shulman & Armstrong (2009), Bostrom (2012), Bostrom (2003), Omohundro (2008), Goertzel & Pitt (2012), Yampolskiy (2012), and some Less Wrong posts: Muehlhauser (2012), Yudkowsky (2012), etc. See here for a list of desired strategic analyses (among other desired articles).
3. Assist with (1) or (2), above. This is what SI's "remote researchers" tend to do, along with many SI volunteers. Often, there are "chunks" of research that can be broken off and handed to people who are not an article's core authors, e.g. "Please track down many examples of the 'wrong wish' trope so I can use a vivid example in my paper" or "Please review and summarize the part of the machine ethics literature that has to do with learning preferences from examples."
4. Provide resources and platforms that make it easier for researchers to contribute to strategy work. Things like my AI risk bibliography and list of forthcoming and desired articles on AI risk make it easier for researchers to find relevant work, and to know what projects would be helpful to take on. SI's public BibTeX file and Mendeley group make it easier for researchers to find relevant papers. The AGI conference, and volumes like Singularity Hypotheses, provide publishing venues for researchers in this fledgling field. Recent improvements to the Less Wrong wiki will hopefully make it easier for researchers to understand the (relatively new) concepts relevant to AI x-risk strategy work. A scholarly AI risk wiki would be even better. It would also help to find editors of prestigious journals who are open to publishing well-written AGI risk papers, so that university researchers can publish on these topics without hurting their chances to get tenure.
5. Fund a person or organization doing any of the above. Again, the most obvious choices are the Singularity Institute or the Future of Humanity Institute. Most of the articles and "resources" above were produced by either SI or FHI. SI offers more opportunities for (3). The AGI conference is organized by Ben Goertzel and others, who are of course always looking for sponsors for the AGI conference.
Direct work for AI x-risk reduction
We are still at an early stage in doing strategy work on AI x-risk reduction. Because of this, most researchers in the field feel pretty uncertain about which interventions would be most helpful for reducing AI x-risk. Thus, they focus on strategic research, so they can purchase more confidence about which interventions would be helpful.
Despite this uncertainty, I'll list some interventions that at least some people have proposed for mitigating AI x-risk, focusing on the interventions that are actionable today.
Safe AGI research? Many proposals have been made for developing AGI designs with internal motivations beneficial to humans — including Friendly AI (Yudkowsky 2008) and GOLEM (Goertzel 2010) — but researchers disagree about which approaches are most promising (Muehlhauser & Helm 2012; Goertzel & Pitt 2012).
AI boxing research? Many proposals have been made for confining AGIs (Yampolskiy 2012; Armstrong et al. 2012). But such research programs may end up being fruitless, since it may be that a superintelligence will always be able to think its way out of any confinement designed by a human-level intelligence.
AI safety promotion? One may write about the importance of AI safety, persuade AI researchers to take up a greater concern for safety, and so on.
Regulate AGI development? Or not? Hughes (2001) and Daley (2011) call for regulation of AGI development. To bring this about, citizens could petition their governments and try to persuade decision-makers. McGinnis (2010) and Goertzel & Pitt (2012), however, oppose AGI regulation.
Accelerate AGI? Or not? Muehlhauser & Salamon (2012) recommend accelerating AGI safety research relative to AGI capabilities research, so that the first AGIs have a better chance of being designed to be safe. Goertzel & Pitt (2012), in contrast, argue that "the pace of AGI progress is sufficiently slow that practical work towards human-level AGI is in no danger of outpacing associated ethical theorizing," and argue that AGI development will be safest if it happens sooner rather than later.
Accelerate WBE? Or not? Participants in a 2011 workshop concluded that accelerating WBE probably increases AI x-risk, but Koene (2012) argues that WBE is safer than trying to create safe AGI.
Ban AGI and hardware development? Or not? Joy (2000) famously advocated a strategy of relinquishment, and Berglas (2009) goes so far as to suggest we abandon further computing power development. Most people, of course, disagree. In any case, it is doubtful that groups with such views could overcome the economic and military incentives for further computing and AGI development.
Foster positive values? Kurzweil (2005) and others argue that one way to increase the odds that AGIs will behave ethically is to increase the chances that the particular humans who create them are moral. Thus, one might reduce AI x-risk by developing training and technology for moral enhancement (Persson and Savulescu 2008).
Cognitive enhancement? Some have wondered whether the problem of safe AGI is so difficult that it will require cognitive enhancement for humans to solve it. Meanwhile, others worry that cognitive enhancement will only accelerate the development of dangerous technologies (Persson and Savulescu 2008).
Besides engaging in these interventions directly, one may of course help to fund them. I don't currently know of a group pushing for AGI development regulations, or for banning AGI development. You could accelerate AGI by investing in AGI-related companies, or you could accelerate AGI safety research (and AI boxing research) relative to AGI capabilities research by funding SI or FHI, who also probably do the most AI safety promotion work. You could fund research on moral enhancement or cognitive enhancement by offering grants for such research. Or, if you think "low-tech" cognitive enhancement is promising, you could fund organizations like Lumosity (brain training) or the Center for Applied Rationality (rationality training).
Conclusion
This is a brief guide to what you can do to reduce existential risk from AI. A longer guide could describe the available interventions in more detail, and present the arguments for and against each one. But that is "strategic work," and requires lots of time (and therefore money) to produce.
My thanks to Michael Curzi for inspiring this post.