
Reply to Holden on The Singularity Institute

Post author: lukeprog, 10 July 2012 11:20PM

Holden Karnofsky of GiveWell has objected to the Singularity Institute (SI) as a target for optimal philanthropy. As someone who thinks that existential risk reduction is really important and also that the Singularity Institute is an important target of optimal philanthropy, I would like to explain why I disagree with Holden on these subjects. (I am also SI's Executive Director.)

Mostly, I'd like to explain my views to a broad audience. But I'd also like to explain my views to Holden himself. I value Holden's work, I enjoy interacting with him, and I think he is both intelligent and capable of changing his mind about Big Things like this. Hopefully Holden and I can continue to work through the arguments together, though of course we are both busy with many other things.

I appreciate the clarity and substance of Holden's objections, and I hope to reply in kind. I begin with an overview of some basic points that may be familiar to most Less Wrong veterans, and then I reply point-by-point to Holden's post. In the final section, I summarize my reply to Holden.

Holden raised many different issues, so unfortunately this post needed to be long. My apologies to Holden if I have misinterpreted him at any point.


Contents

  • Existential risk reduction is a critical concern for many people, given their values and given many plausible models of the future. Details here.
  • Among existential risks, AI risk is probably the most important. Details here.
  • SI can purchase many kinds of AI risk reduction more efficiently than other groups can. Details here.
  • These points and many others weigh against many of Holden's claims and conclusions. Details here.
  • Summary of my reply to Holden


Comments

I must be brief, so I am sure many objections will leap to your mind as you read this post. To encourage constructive discussion, each question (posted as a comment on this page) that follows the template described below will receive a reply from me or another SI representative.

Please word your question as clearly and succinctly as possible, and don't assume your readers will have read this post before reading your question, because the conversations here may be used as source material for a comprehensive FAQ.

Here's an example of how you could word the first paragraph of your question: "You claimed that [insert direct quote here], and also that [insert another direct quote here]. That seems to imply that [something something]. But that doesn't seem to take into account that [blah blah blah]. What do you think of that?"

If your question needs more explaining, leave the details to subsequent paragraphs in your comment. Please post multiple questions as multiple comments, so they can be voted upon and replied to individually. If you don't follow these rules, I can't guarantee SI will have time to give you a reply. (We probably won't.)


Why many people care greatly about existential risk reduction

Why do many people consider existential risk reduction to be humanity's most important task? I can't say it much better than Nick Bostrom does, so I'll just quote him:

An existential risk is one that threatens the premature extinction of Earth-originating intelligent life or the permanent and drastic destruction of its potential for desirable future development. Although it is often difficult to assess the probability of existential risks, there are many reasons to suppose that the total such risk confronting humanity over the next few centuries is significant...

Humanity has survived what we might call natural existential risks [asteroid impacts, gamma ray bursts, etc.] for hundreds of thousands of years; thus it is prima facie unlikely that any of them will do us in within the next hundred...

In contrast, our species is introducing entirely new kinds of existential risk—threats we have no track record of surviving... In particular, most of the biggest existential risks seem to be linked to potential future technological breakthroughs that may radically expand our ability to manipulate the external world or our own biology. As our powers expand, so will the scale of their potential consequences—intended and unintended, positive and negative. For example, there appear to be significant existential risks in some of the advanced forms of biotechnology, molecular nanotechnology, and machine intelligence that might be developed in the decades ahead.

What makes existential catastrophes especially bad is not that they would [cause] a precipitous drop in world population or average quality of life. Instead, their significance lies primarily in the fact that they would destroy the future... To calculate the loss associated with an existential catastrophe, we must consider how much value would come to exist in its absence. It turns out that the ultimate potential for Earth-originating intelligent life is literally astronomical.

One gets a large number even if one confines one’s consideration to the potential for biological human beings living on Earth. If we suppose... that our planet will remain habitable for at least another billion years, and we assume that at least one billion people could live on it sustainably, then the potential exist for at least 10^18 human lives. [The numbers get way bigger if you consider the expansion of posthuman civilization to the rest of the galaxy or the prospect of mind uploading.]

Even if we use the most conservative of these estimates, which entirely ignores the possibility of space colonization and software minds, we find that the expected loss of an existential catastrophe is greater than the value of 10^16 human lives...

These considerations suggest that the loss in expected value resulting from an existential catastrophe is so enormous that the objective of reducing existential risks should be a dominant consideration whenever we act out of an impersonal concern for humankind as a whole.

I refer the reader to Bostrom's paper for further details and additional arguments, but neither his paper nor this post can answer every objection one might think of.

Nor can I summarize all the arguments and evidence related to estimating the severity and time horizon of every proposed existential risk. Even the 500+ pages of Oxford University Press' Global Catastrophic Risks can barely scratch the surface of this enormous topic. As explained in Intelligence Explosion: Evidence and Import, predicting long-term technological progress is hard. Thus, we must

examine convergent outcomes that—like the evolution of eyes or the emergence of markets—can come about through any of several different paths and can gather momentum once they begin.

I'll say more about convergent outcomes later, but for now I'd just like to suggest that:

  1. Many humans living today value both current and future people enough that if existential catastrophe is plausible this century, then upon reflection (e.g. after counteracting their unconscious, default scope insensitivity) they would conclude that reducing the risk of existential catastrophe is the most valuable thing they can do — whether through direct work or by donating to support direct work. It is to these people I appeal. (I also have much to say to people who e.g. don't care about future people, but it is too much to say here and now.)

  2. As it turns out, we do have good reason to believe that existential catastrophe is plausible this century.

I don't have the space here to discuss the likelihood of different kinds of existential catastrophe that could plausibly occur this century (see GCR for more details), so instead I'll talk about just one of them: an AI catastrophe.


AI risk: the most important existential risk

There are two primary reasons I think AI is the most important existential risk:

Reason 1: Mitigating AI risk could mitigate all other existential risks, but not vice-versa. There is an asymmetry between AI risk and other existential risks. If we mitigate the risks from (say) synthetic biology and nanotechnology (without building Friendly AI), this only means we have bought a few years or decades for ourselves before we must face yet another existential risk from powerful new technologies. But if we manage AI risk well enough (i.e. if we build a Friendly AI or "FAI"), we may be able to "permanently" (for several billion years) secure a desirable future. Machine superintelligence working in the service of humane goals could use its intelligence and resources to prevent all other existential catastrophes. (Eliezer: "I distinguish 'human', that which we are, from 'humane'—that which, being human, we wish we were.")

Reason 2: AI is probably the first existential risk we must face (given my evidence, only the tiniest fraction of which I can share in a blog post).

One reason AI may be the most urgent existential risk is that AI is more likely than other sources of catastrophic risk to produce a full-blown existential catastrophe (as opposed to a "merely" billions-dead catastrophe). Humans are smart and adaptable; we are already set up for a species-preserving number of humans to survive (e.g. in underground bunkers with stockpiled food, water, and medicine) major catastrophes from nuclear war, superviruses, supervolcano eruption, and many cases of asteroid impact or nanotechnological ecophagy.

Machine superintelligences, however, could intelligently seek out and neutralize humans whom they (correctly) recognize as threats to the maximal realization of their goals. Humans are surprisingly easy to kill if an intelligent process is trying to do so. Cut off John's access to air for a few minutes, or cut off his water supply for a few days, or poke him with a sharp stick, and he dies. Forever. (Post-humans might shudder at this absurdity as we shudder at the idea that people used to die from their teeth.)

Why think AI is coming anytime soon? This is too complicated a topic to broach here. See Intelligence Explosion: Evidence and Import for a brief analysis of AI timelines. Or try The Uncertain Future, which outputs an estimated timeline for human-level AI based on your predictions of various technological developments. (SI is currently collaborating with the Future of Humanity Institute to write another paper on this subject.)

It's also important to mention that the case for caring about AI risk is less conjunctive than many seem to think, which I discuss in more detail here.


SI can purchase several kinds of AI risk reduction more efficiently than others can

The two organizations working most directly to reduce AI risk are the Singularity Institute and the Future of Humanity Institute (FHI). Luckily, these organizations complement each other well, as I pointed out back before I was running SI:

  • FHI is part of Oxford, and thus can bring credibility to existential risk reduction. Resulting output: lots of peer-reviewed papers, books from OUP like Global Catastrophic Risks, conferences, media appearances, etc.

  • SI is independent and is less constrained by conservatism or the university system. Resulting output: Very novel (and, to the mainstream, "weird") research on Friendly AI, and the ability to do unusual things that are nevertheless quite effective at finding/creating lots of new people interested in rationality and existential risk reduction: (1) The Sequences, the best tool I know for creating aspiring rationalists, (2) Harry Potter and the Methods of Rationality, a surprisingly successful tool for grabbing the attention of mathematicians and computer scientists around the world, and (3) the Singularity Summit, a mainstream-aimed conference that brings in people who end up making significant contributions to the movement — e.g. Tomer Kagan (an SI donor and board member) and David Chalmers (author of The Singularity: A Philosophical Analysis and The Singularity: A Reply).

A few weeks later, Nick Bostrom (Director of FHI) said the same things (as far as I know, without having read my comment):

I think there is a sense that both organizations are synergistic. If one were about to go under... that would probably be the one [to donate to]. If both were doing well... different people will have different opinions. We work quite closely with the folks from [the Singularity Institute]...

There is an advantage to having one academic platform and one outside academia. There are different things these types of organizations give us. If you wanna get academics to pay more attention to this, to get postdocs to work on this, that's much easier to do within academia; also to get the ear of policy-makers and media... On the other hand, for [SI] there might be things that are easier for them to do. More flexibility, they're not embedded in a big bureaucracy. So they can more easily hire people with non-standard backgrounds... and also more grass-roots stuff like Less Wrong...

FHI is, despite its small size, a highly productive philosophy department. More importantly, FHI has focused its research work on AI risk issues for the past 9 months, and plans to continue on that path for at least another 12 months. This is important work that should be supported. (Note that FHI recently hired SI research associate Daniel Dewey.)

SI lacks FHI's publishing productivity and its university credibility, but as an organization SI is improving quickly, and it can seize many opportunities for AI risk reduction that FHI is not well-positioned to seize. (New organizations will also tend to be less capable of seizing these opportunities than SI, due to the financial and human capital already concentrated at SI and FHI.)

Here are some examples of projects that SI is probably better able to carry out than FHI, given its greater flexibility (and assuming sufficient funding):


My replies to Holden, point by point

Holden's post makes so many claims that I'll just have to work through his post from beginning to end, and then summarize where I think we stand at the end.


GiveWell Labs

Holden opened "Thoughts on the Singularity Institute" by noting that SI was previously outside GiveWell's scope, since GiveWell was focused on specific domains like poverty reduction. With the launch of GiveWell Labs, GiveWell is now open to evaluating any giving opportunity, including SI.

I admire this move. I'm sure people have been bugging GiveWell to do this for a long time, but almost none of those people appreciate how hard it is to launch broad new initiatives like this with the limited budget of an organization like GiveWell or the Singularity Institute. Most of them also do not understand how much work is required to write something like "Thoughts on the Singularity Institute", "Reply to Holden on Tool AI", or this post.


Three possible outcomes

Next, Holden wrote:

[I hope] that one of these three things (or some combination) will happen:

  1. New arguments are raised that cause me to change my mind and recognize SI as an outstanding giving opportunity. If this happens I will likely attempt to raise more money for SI (most likely by discussing it with other GiveWell staff and collectively considering a GiveWell Labs recommendation).

  2. SI concedes that my objections are valid and increases its determination to address them. A few years from now, SI is a better organization and more effective in its mission.

  3. SI can't or won't make changes, and SI's supporters feel my objections are valid, so SI loses some support, freeing up resources for other approaches to doing good.

As explained at the top of Holden's post, I had already conceded that many of Holden's objections (especially concerning past organizational competence) are valid, and had been working to address them, even before Holden's post was published. So outcome #2 is already true in part.

I hope for outcome #1, too, but I don't expect Holden to change his opinion overnight. There are too many possible objections to which Holden has not yet heard a good response. But hopefully this post and its comment threads will successfully address some of Holden's (and others') objections.

Outcome #3 is unlikely since SI is already making changes, though of course it's possible we will be unable to raise sufficient funding for SI despite making these changes, or even because of our efforts to make these changes. (Improving general organizational effectiveness is important but it costs money and is not exciting to donors.)


SI's mission is more important than SI as an organization

Holden said:

whatever happens as a result of my post will be positive for SI's mission, whether or not it is positive for SI as an organization. I believe that most of SI's supporters and advocates care more about the former than about the latter, and that this attitude is far too rare in the nonprofit world.

Clearly, SI's mission is more important than SI as an organization. If somebody launches an organization more effective (at AI risk reduction) than SI but just as flexible, then SI should probably fold itself and try to move its donor base, support community, and the best of its human capital to that new organization.

That said, it's probably easier to reform SI into a more effective organization than it is to launch a new one, since SI has successfully concentrated lots of attention, donor support, and human capital. Also, SI has learned many lessons about how to run a very tricky kind of organization. AI risk reduction is a mission that (1) is beyond most people's time horizons for caring, (2) is hard to understand and visualize, (3) pattern-matches to science fiction and apocalyptic religion, (4) suffers under complicated and necessarily uncertain strategic considerations (compare to the simplicity of bed nets), (5) has a very small pool of people from which to recruit researchers, etc. SI has lots of experience with these issues; experience that probably takes a long time and lots of money to acquire.

(On the other hand, SI has also concentrated some bad reputation which a new organization could launch without. But I still think the weight of the arguments is in favor of reforming SI.)


SI's arguments need to be clearer

Holden:

I do not believe that [my objections to SI's apparent views] constitute a sharp/tight case for the idea that SI's work has low/negative value; I believe, instead, that SI's own arguments are too vague for such a rebuttal to be possible. There are many possible responses to my objections, but SI's public arguments (and the private arguments) do not make clear which possible response (if any) SI would choose to take up and defend. Hopefully the dialogue following this post will clarify what SI believes and why.

I agree that SI's arguments are often vague. For example, Chris Hallquist reported:

I've been trying to write something about Eliezer's debate with Robin Hanson, but the problem I keep running up against is that Eliezer's points are not clearly articulated at all. Even making my best educated guesses about what's supposed to go in the gaps in his arguments, I still ended up with very little.

I know the feeling! That's why I've tried to write as many clarifying documents as I can, including the Singularity FAQ, Intelligence Explosion: Evidence and Import, The Singularity and Machine Ethics, Facing the Singularity, So You Want to Save the World, and How to Purchase AI Risk Reduction.

Unfortunately, it takes lots of resources to write up hundreds of arguments and responses to objections in clear and precise language, but we're working on it. (For comparison, Nick Bostrom's forthcoming book on machine superintelligence will barely scratch the surface of the things SI and FHI researchers have worked out in conversation, and it will probably take him 2+ years to write in total, even though Bostrom is already an unusually prolific writer.) Hopefully SI's responses to Holden's post have helped to clarify our positions already.


Holden's objection #1 punts to objection #2

The first objection on Holden's numbered list was:

it seems to me that any AGI that was set to maximize a "Friendly" utility function would be extraordinarily dangerous.

I'm glad Holden agrees with us that successful Friendly AI is very hard. SI has spent much of its effort trying to show people that the first 20 solutions they come up with all fail. See: AI as a Positive and Negative Factor in Global Risk, The Singularity and Machine Ethics, Complex Value Systems are Required to Realize Valuable Futures, etc. Holden mentions the standard SI worry about the hidden complexity of wishes, and the one about a friendly utility function still causing havoc because the AI's priors are wrong (problem 3.6 from my list of open problems in AI risk research).

There are reasons to think FAI is harder still. What if we get the utility function right and we get the priors right but the AI's values change for the worse when it updates its ontology? What if the smartest, most careful, most insanely safety-conscious AI researchers humanity can produce just aren't smart enough to solve the problem? What if no humans are altruistic enough to choose to build FAI over an AI that will make them king of the universe? What if the idea of FAI is incoherent? (The human brain is an existence proof for the possibility of general intelligence, but we have no existence proof for the possibility of a decision theoretic agent which stably optimizes the world according to a set of preferences over states of affairs.)

So, yeah. Friendly AI is hard. But as I said elsewhere:

The point is that not trying as hard as you can to build Friendly AI is even worse, because then you almost certainly get uFAI. At least by trying to build FAI, we've got some chance of winning.

So Holden's objection #1 really just punts to objection #2, about tool-AGI, as the last paragraph in this section of Holden's post seems to indicate:

So far, all I have argued is that the development of "Friendliness" theory can achieve at best only a limited reduction in the probability of an unfavorable outcome. However, as I argue in the next section, I believe there is at least one concept - the "tool-agent" distinction - that has more potential to reduce risks, and that SI appears to ignore this concept entirely.

So if Holden's objection #2 doesn't work, then objection #1 ends up reducing to "the development of Friendliness theory can achieve at best a reduction in AI risk," which is what SI has been saying all along.


Tool AI

Holden's second numbered objection was:

SI appears to neglect the potentially important distinction between "tool" and "agent" AI.

Eliezer wrote a whole post about this here. To sum up:

(1) Whether you're working with Tool AI or Agent AI, you need the "Friendly AI" domain experts that SI is trying to recruit:

A "Friendly AI programmer" is somebody who specializes in seeing the correspondence of mathematical structures to What Happens in the Real World. It's somebody who looks at Hutter's specification of AIXI and reads the actual equations - actually stares at the Greek symbols and not just the accompanying English text - and sees, "Oh, this AI will try to gain control of its reward channel," as well as numerous subtler issues like, "This AI presumes a Cartesian boundary separating itself from the environment; it may drop an anvil on its own head." Similarly, working on TDT means e.g. looking at a mathematical specification of decision theory, and seeing "Oh, this is vulnerable to blackmail" and coming up with a mathematical counter-specification of an AI that isn't so vulnerable to blackmail.

Holden's post seems to imply that if you're building a non-self-modifying planning Oracle (aka 'tool AI') rather than an acting-in-the-world agent, you don't need a Friendly AI programmer because FAI programmers only work on agents. But this isn't how the engineering skills are split up. Inside the AI, whether an agent AI or a planning Oracle, there would be similar AGI-challenges like "build a predictive model of the world", and similar FAI-conjugates of those challenges like finding the 'user' inside an AI-created model of the universe. The insides would look a lot more similar than the outsides. An analogy would be supposing that a machine learning professional who does sales optimization for an orange company couldn't possibly do sales optimization for a banana company, because their skills must be about oranges rather than bananas.

(2) Tool AI isn't that much safer than Agent AI, because Tool AIs have lots of hidden "gotchas" that cause havoc, too. (See Eliezer's post for examples.)

These points illustrate something else Eliezer wrote:

What the human species needs from an x-risk perspective is experts on This Whole Damn Problem [of AI risk], who will acquire whatever skills are needed to that end. The Singularity Institute exists to host such people and enable their research—once we have enough funding to find and recruit them.

Indeed. We need places for experts who specialize in seeing the consequences of mathematical objects for things humans value (e.g. the Singularity Institute) just like we need places for experts on efficient charity (e.g. GiveWell).

Anyway, it's worth pointing out that Holden did not make the common (and mistaken) argument that "We should just build Tool AIs instead of Agent AIs and then we'll be fine." This is wrong for many reasons, but one obvious point is that there are incentives to build Agent AIs (because they're powerful), so even if the first 6 teams are careful enough to build only Tool AIs, the 7th team could still build Agent AI and destroy the world.

Instead, Holden pointed out that you could use Tool AI to increase your chances of successfully building agenty FAI:

if developing "Friendly AI" is what we seek, a tool-AGI could likely be helpful enough in thinking through this problem as to render any previous work on "Friendliness theory" moot. Among other things, a tool-AGI would allow transparent views into the AGI's reasoning and predictions without any reason to fear being purposefully misled, and would facilitate safe experimental testing of any utility function that one wished to eventually plug into an "agent."

After reading Eliezer's reply, however, you can probably guess my replies to this paragraph:

  1. Tool AI isn't as safe as Holden thinks.
  2. But yeah, a Friendly AI team may very well use "Tool AI" to aid Friendliness research if it can figure out a safe way to do that. This doesn't obviate the need for Friendly AI researchers; it's part of their research toolbox.

So Holden's Objection #2 doesn't work, which (as explained earlier) means that his Objection #1 (as stated) doesn't work either.


SI's mission assumes a scenario that is far less conjunctive than it initially appears

Holden's objection #3 is:

SI's envisioned scenario is far more specific and conjunctive than it appears at first glance, and I believe this scenario to be highly unlikely.

His main concern here seemed to be that technological developments and other factors would render earlier FAI work irrelevant. But Eliezer's clarifications about what we mean by "FAI team" render this objection moot, at least as it is currently stated. The purpose of an FAI team is not to blindly develop one particular approach to Friendly AI without checking to see whether this work will be obsoleted by future developments. Instead, the purpose of an FAI team is to develop highly specialized expertise on, among other things, which kinds of research are more and less likely to be relevant given future developments.

Holden's confusion about what SI means by "FAI team" is common and understandable, and it is one reason that SI's mission assumes a scenario that is far less conjunctive than it appears to many. We aren't saying we need an FAI team because we know lots of specific things about how AGI will be built 30 years from now. We're saying you need experts on "the consequences of mathematical objects for things humans value" (an FAI team) because AGIs are mathematical objects and will have big consequences. That's pretty disjunctive.

Similarly, many people think SI's mission is predicated on hard takeoff. After all, we call ourselves the "Singularity Institute," Eliezer has spent a lot of time arguing for hard takeoff, and our current research summary frames AI risk in terms of recursive self-improvement.

But the case for AI as a global risk, and thus the need for dedicated experts on AI risk and "the consequences of mathematical objects for things humans value", isn't predicated on hard takeoff. Instead, it looks something like this:

(1) Eventually, most tasks are performed by machine intelligences.

The improved flexibility, copyability, and modifiability of machine intelligences make them economically dominant even without other advantages (Brynjolfsson & McAfee 2011; Hanson 2008). In addition, there is plenty of room "above" the human brain in terms of hardware and software for general intelligence (Muehlhauser & Salamon 2012; Sotala 2012; Kurzweil 2005).

(2) Machine intelligences don't necessarily do things we like.

We don't necessarily control AIs, since advanced intelligences may be inherently goal-oriented (Omohundro 2007), and even if we build advanced "Tool AIs," these aren't necessarily safe either (Yudkowsky 2012) and there will be significant economic incentives to transform them into autonomous agents (Brynjolfsson & McAfee 2011). We don't value most possible futures, but it's very hard to get an autonomous AI to do exactly what you want (Yudkowsky 2008, 2011; Muehlhauser & Helm 2012; Arkin 2009).

(3) There are things we can do to increase the probability that machine intelligences do things we like.

Further research can clarify (1) the nature and severity of the risk, (2) how to engineer goal-oriented systems safely, (3) how to increase safety with differential technological development, (4) how to limit and control machine intelligences (Armstrong et al. 2012; Yampolskiy 2012), (5) solutions to AI development coordination problems, and more.

(4) We should do those things now.

People aren't doing much about these issues now. We could wait until we understand better (e.g.) what kind of AI is likely, but: (1) it might take a long time to resolve the core issues, including difficult technical subproblems that require time-consuming mathematical breakthroughs, (2) incentives may be badly aligned (e.g. there seem to be strong economic incentives to build AI, but not to take into account social and global risks for AI), (3) AI may not be that far away (Muehlhauser & Salamon 2012), and (4) the transition to machine dominance may be surprisingly rapid due to (e.g.) intelligence explosion (Chalmers 2010, 2012; Muehlhauser & Salamon 2012) or computing overhang.

What do I mean by "computing overhang"? We may get the hardware needed for AI long before we get the software, such that once software for general intelligence is figured out, there is tons of computing hardware sitting around for running AIs (a "computing overhang"). Thus we could switch from a world with one autonomous AI to a world with 10 billion autonomous AIs at the speed of copying software, and thereby transition rapidly from human dominance to AI dominance even without an intelligence explosion. (This is one of the many, many things we haven't yet written up in detail due to lack of resources.)

(This broad argument is greatly compressed from a paper outline developed by Paul Christiano, Carl Shulman, Nick Beckstead, and myself. We'd love to write the paper at some point, but haven't had the resources to do so. The fuller version of this argument is of course more detailed.)


SI's public argumentation

Next, Holden turned to the topic of SI's organizational effectiveness:

when evaluating a group such as SI, I can't avoid placing a heavy weight on (my read on) the general competence, capability and "intangibles" of the people and organization, because SI's mission is not about repeating activities that have worked in the past...

There are several reasons that I currently have a negative impression of SI's general competence, capability and "intangibles."

The first reason Holden gave for his negative impression of SI is:

SI has produced enormous quantities of public argumentation... Yet I have never seen a clear response to any of the three basic objections I listed in the previous section. One of SI's major goals is to raise awareness of AI-related risks; given this, the fact that it has not advanced clear/concise/compelling arguments speaks, in my view, to its general competence.

I agree in part. Here's what I think:

  • SI hasn't made its arguments as clear, concise, and compelling as I would like. We're working on that. It takes time, money, and people who are (1) smart and capable enough to do AI risk research work and yet somehow (2) willing to work for non-profit salaries and (3) willing to not advance their careers like they would if they chose instead to work at a university.
  • There are a huge number of possible objections to SI's arguments, and we haven't had the resources to write up clear and compelling replies to all of them. (See Chalmers 2012 for quick rebuttals to many objections to intelligence explosion, but what he covers in that paper barely scratches the surface.) As Eliezer wrote, Holden's complaint that SI hasn't addressed his particular objections "seems to lack perspective on how many different things various people see as the one obvious solution to Friendly AI. Tool AI wasn't the obvious solution to John McCarthy, I.J. Good, or Marvin Minsky. Today's leading AI textbook, Artificial Intelligence: A Modern Approach... discusses Friendly AI and AI risk for 3.5 pages but doesn't mention tool AI as an obvious solution. For Ray Kurzweil, the obvious solution is merging humans and AIs. For Jurgen Schmidhuber, the obvious solution is AIs that value a certain complicated definition of complexity in their sensory inputs. Ben Goertzel, J. Storrs Hall, and Bill Hibbard, among others, have all written about how silly Singinst is to pursue Friendly AI when the solution is obviously X, for various different X. Among current leading people working on serious AGI programs labeled as such, neither Demis Hassabis (VC-funded to the tune of several million dollars) nor Moshe Looks (head of AGI research at Google) nor Henry Markram (Blue Brain at IBM) think that the obvious answer is Tool AI. Vernor Vinge, Isaac Asimov, and any number of other SF writers with technical backgrounds who spent serious time thinking about these issues didn't converge on that solution."
  • SI has done a decent job of raising awareness of AI risk, I think. Writing The Sequences and HPMoR has (indirectly) raised more awareness of AI risk than one can normally expect from, say, writing a bunch of clear and precise academic papers about a subject. (At least, it seems that way to me.)


SI's endorsements

The second reason Holden gave for his negative impression of SI is "a lack of impressive endorsements." This one is generally true, despite the three "celebrity endorsements" on our new donate page. More impressive than these is the fact that, as Eliezer mentioned, the latest edition of the leading AI textbook spends several pages talking about AI risk and Friendly AI, and discusses the work of SI-associated researchers like Eliezer Yudkowsky and Steve Omohundro while completely ignoring the existence of the older, more prestigious, and vastly larger mainstream academic field of "machine ethics."

Why don't we have impressive endorsements? To my knowledge, SI hasn't tried very hard to get them. That's another thing we're in the process of changing.


SI and feedback loops

The third reason Holden gave for his negative impression of SI is:

SI seems to have passed up opportunities to test itself and its own rationality by e.g. aiming for objectively impressive accomplishments... Pursuing more impressive endorsements and developing benign but objectively recognizable innovations (particularly commercially viable ones) are two possible ways to impose more demanding feedback loops.

We have thought many times about commercially viable innovations we could develop, but these would generally be large distractions from the work of our core mission. (The Center for Applied Rationality, in contrast, has many opportunities to develop commercially viable innovations in line with its core mission.)

Still, I do think it's important for the Singularity Institute to test itself with tight feedback loops wherever feasible. This is particularly difficult to do for a research organization doing a philosophy of long-term forecasting (30 years is not a "tight" feedback loop in the slightest), but that's what FHI does and they have more "objectively impressive" (that is, "externally proclaimed") accomplishments: lots of peer-reviewed publications, some major awards for its top researcher Nick Bostrom, etc.


SI and rationality

Holden's fourth concern about SI is that it is overconfident about the level of its own rationality, and that this seems to show itself in (e.g.) "insufficient self-skepticism" and "being too selective (in terms of looking for people who share its preconceptions) when determining whom to hire and whose feedback to take seriously."

What would provide good evidence of rationality? Holden explains:

I endorse Eliezer Yudkowsky's statement, "Be careful … any time you find yourself defining the [rationalist] as someone other than the agent who is currently smiling from on top of a giant heap of utility." To me, the best evidence of superior general rationality (or of insight into it) would be objectively impressive achievements (successful commercial ventures, highly prestigious awards, clear innovations, etc.) and/or accumulation of wealth and power. As mentioned above, SI staff/supporters/advocates do not seem particularly impressive on these fronts...

Unfortunately, this seems to misunderstand the term "rationality" as it is meant in cognitive science. As I explained elsewhere:

Like intelligence and money, rationality is only a ceteris paribus predictor of success.

So while it's empirically true (Stanovich 2010) that rationality is a predictor of life success, it's a weak one. (At least, it's a weak predictor of success at the levels of human rationality we are capable of training today.) If you want to more reliably achieve life success, I recommend inheriting a billion dollars or, failing that, being born+raised to have an excellent work ethic and low akrasia.

The reason you should "be careful… any time you find yourself defining the [rationalist] as someone other than the agent who is currently smiling from on top of a giant heap of utility" is because you should "never end up envying someone else's mere choices." You are still allowed to envy their resources, intelligence, work ethic, mastery over akrasia, and other predictors of success.

But I don't mean to dodge the key issue. I think SIers are generally more rational than most people (and so are LWers, it seems), but I think SIers have often overestimated their own rationality, myself included. Certainly, I think SI's leaders have been pretty irrational about organizational development at many times in the past. In internal communications about why SI should help launch CFAR, one reason on my list has been: "We need to improve our own rationality, and figure out how to create better rationalists than exist today."


SI's goals and activities

Holden's fifth concern about SI is the apparent disconnect between SI's goals and its activities:

SI seeks to build FAI and/or to develop and promote "Friendliness theory" that can be useful to others in building FAI. Yet it seems that most of its time goes to activities other than developing AI or theory.

This one is pretty easy to answer. We've focused mostly on movement-building rather than direct research because, until very recently, there wasn't enough community interest or funding to seriously begin to form an FAI team. To do that you need (1) at least a few million dollars a year, and (2) enough smart, altruistic people to care about AI risk that there exist some potential superhero mathematicians for the FAI team. And to get those two things, you've got to do mostly movement-building, e.g. Less Wrong, HPMoR, the Singularity Summit, etc.


Theft

And of course, Holden is (rightly) concerned about the 2009 theft of $118,000 from SI, and the lack of public statements from SI on the matter.

Briefly:

  • Two former employees stole $118,000 from SI. Earlier this year we finally won stipulated judgments against both individuals, forcing them to pay back the full amounts they stole. We have already recovered several thousand dollars of this.
  • We do have much better financial controls now. We consolidated our accounts so there are fewer accounts to watch, and at least three staff members check them regularly, as does our treasurer, who is not an SI staff member or board member.


Pascal's Mugging

In another section, Holden wrote:

A common argument that SI supporters raise with me is along the lines of, "Even if SI's arguments are weak and its staff isn't as capable as one would like to see, their goal is so important that they would be a good investment even at a tiny probability of success."

I believe this argument to be a form of Pascal's Mugging and I have outlined the reasons I believe it to be invalid...

Some problems with Holden's two posts on this subject will be explained in a forthcoming post by Steven Kaas. But as Holden notes, some SI principals like Eliezer don't use "small probability of large impact" arguments, anyway. We in fact argue that the probability of a large impact is not tiny.


Summary of my reply to Holden

Now that I have addressed so many details, let us return to the big picture. My summarized reply to Holden goes like this:

Holden's first two objections can be summarized as arguing that developing the Friendly AI approach is more dangerous than developing non-agent "Tool" AI. Eliezer's post points out that "Friendly AI" domain experts are what you need whether you're working with Tool AI or Agent AI, because (1) both of these approaches require FAI experts (experts in seeing the consequences of mathematical objects for what humans value), and because (2) Tool AI isn't necessarily much safer than Agent AI, because Tool AIs have lots of hidden gotchas, too. Thus, "What the human species needs from an x-risk perspective is experts on This Whole Damn Problem [of AI risk], who will acquire whatever skills are needed to that end. The Singularity Institute exists to host such people and enable their research — once we have enough funding to find and recruit them."

Holden's third objection was that the argument behind SI's mission is more conjunctive than it seems. I replied that the argument behind SI's mission is actually less conjunctive than it often seems, because an "FAI team" works on a broader set of problems than Holden had realized, and because the case for AI risk is more disjunctive than many people realize. These confusions are understandable, however, and they probably are a result of insufficient clear argumentative writing from SI on these matters — a problem we are trying to fix with several recent and forthcoming papers and other communications (like this one).

Holden's next objection concerned SI as an organization: "SI has, or has had, multiple properties that I associate with ineffective organizations." I acknowledged these problems before Holden published his post, and I have since outlined the many improvements to organizational effectiveness that we've made after I became Executive Director. I addressed several of Holden's specific worries here.

Finally, Holden recommended giving to a donor-advised fund rather than to SI:

I don't think that "Cause X is the one I care about and Organization Y is the only one working on it" to be a good reason to support Organization Y. For donors determined to donate within this cause, I encourage you to consider donating to a donor-advised fund while making it clear that you intend to grant out the funds to existential-risk-reduction-related organizations in the future....

For one who accepts my arguments about SI, I believe withholding funds in this way is likely to be better for SI's mission than donating to SI

By now I've called into question most of Holden's arguments about SI, but I will still address the issue of donating to SI vs. donating to a donor-advised fund.

First: Which public charity would administer the donor-advised fund? Remember also that in the U.S., the administering charity need not spend from the donor-advised fund as the donor wishes, though they often do.

Second: As I said earlier,

it's probably easier to reform SI into a more effective organization than it is to launch a new one, since SI has successfully concentrated lots of attention, donor support, and human capital. Also, SI has learned many lessons about how to run a very tricky kind of organization. AI risk reduction is a mission that (1) is beyond most people's time horizons for caring, (2) is hard to understand and visualize, (3) pattern-matches to science fiction and apocalyptic religion, (4) suffers under complicated and necessarily uncertain strategic considerations (compare to the simplicity of bed nets), (5) has a very small pool of people from which to recruit researchers, etc. SI has lots of experience with these issues; experience that probably takes a long time and lots of money to acquire.

The case for funding improvements and growth at SI (as opposed to starving SI as Holden suggests) is bolstered by the fact that SI's productivity and effectiveness have been improving rapidly of late, and many other improvements (and exciting projects) are on our "to-do" list if we can raise sufficient funding to implement them.

Holden even seems to share some of this optimism:

Luke's... recognition of the problems I raise... increases my estimate of the likelihood that SI will work to address them...

I'm aware that SI has relatively new leadership that is attempting to address the issues behind some of my complaints. I have a generally positive impression of the new leadership; I believe the Executive Director and Development Director, in particular, to represent a step forward in terms of being interested in transparency and in testing their own general rationality. So I will not be surprised if there is some improvement in the coming years...


Conclusion

For brevity's sake I have skipped many important details. I may also have misinterpreted Holden somewhere. And surely, Holden and other readers have follow-up questions and objections. This is not the end of the conversation; it is closer to the beginning. I invite you to leave your comments, preferably in accordance with these guidelines (for improved discussion clarity).

Comments (213)

Comment author: lukeprog 09 July 2012 11:43:41PM 11 points

A clarification. In Thoughts on the Singularity Institute, Holden wrote:

[What] I will commit to is reading and carefully considering up to 50,000 words of content that are (a) specifically marked as SI-authorized responses to the points I have raised; (b) explicitly cleared for release to the general public as SI-authorized communications. In order to consider a response "SI-authorized and cleared for release," I will accept explicit communication from SI's Executive Director or from a majority of its Board of Directors endorsing the content in question.

As SI's Executive Director I am hereby marking three different things as "SI-authorized responses" to the points Holden raised: my "Reply to Holden on the Singularity Institute" (the post above), my long comment on recent organizational improvements at SI, and Eliezer's Reply to Holden on Tool AI.

According to Word Count Tool, these three things add up to a mere 13,940 words.

Comment author: wedrifid 10 July 2012 01:11:36AM 11 points

As SI's Executive Director I am hereby marking three different things as "SI-authorized responses" to the points Holden raised: my "Reply to Holden on the Singularity Institute" (the post above), my long comment on recent organizational improvements at SI, and Eliezer's Reply to Holden on Tool AI.

Consider removing the first sentence of the final link:

This comment is not intended to be part of the 50,000-word response which Holden invited.

Comment author: lukeprog 10 July 2012 01:26:23AM 6 points

Good point. :)

Fixed.

Comment author: Rain 10 July 2012 02:32:50AM *  54 points

I think this post makes a strong case for needing further donations. Have $3,000.

Comment author: ChrisHallquist 19 July 2012 02:31:59PM 16 points

I agree. Have another $1,100. Also, for those who are interested, a link to a blog post I wrote explaining why I donated.

Comment author: lukeprog 24 July 2012 03:06:44AM 3 points

Thanks!!!

Comment author: lukeprog 10 July 2012 03:00:25AM 10 points

Thanks!!!

Comment author: amywilley 10 July 2012 03:21:24AM 7 points

Thank you :)

Comment author: Ioven 10 July 2012 03:47:32AM 6 points

Thanks Rain !!

Comment author: MichaelAnissimov 11 July 2012 05:52:23AM 5 points

Thank you Rain.

Comment author: lukeprog 10 July 2012 12:10:25AM 20 points

This post and the reactions to it will be an interesting test for my competing models about the value of giving detailed explanations to supporters. Here are just two of them:

One model says that detailed communication with supporters is good because it allows you to make your case for why your charity matters, and thus increase the donors' expectation that your charity can turn money into goods that they value, like poverty reduction or AI risk reduction.

Another model says that detailed communication with supporters is bad because (1) supporters are generally giving out of positive affect toward the organization, and (2) that positive affect can't be increased much once they grok the mission enough to start donating, but (3) the positive affect they feel toward the charity can be overwhelmed by the absolute number of the organization's statements with which they disagree, and (4) more detailed communication with supporters increases this absolute number more quickly than limited communication that repeats the same points again and again (e.g. in a newsletter).

I worry that model #2 may be closer to the truth, in part because of things like (Dilbert-creator) Scott Adams' account of why he decided to blog less:

I hoped that people who loved the blog would spill over to people who read Dilbert, and make my flagship product stronger. Instead, I found that if I wrote nine highly popular posts, and one that a reader disagreed with, the reaction was inevitably “I can never read Dilbert again because of what you wrote in that one post.” Every blog post reduced my income, even if 90% of the readers loved it.

Comment author: komponisto 10 July 2012 12:34:04AM 14 points

An issue that SI must inevitably confront is how much rationality it will assume of its target population of donors. If it simply wanted to raise as much money as possible, there are, I expect, all kinds of Dark techniques it could use (of which decreasing communication is only the tip of the iceberg). The problem is that SI also wants to raise the sanity waterline, since that is integral to its larger mission -- and it's hard (not to mention hypocritical) to do that while simultaneously using fundraising methods that depend on the waterline being below a certain level among its supporters.

Comment author: AlexMennen 10 July 2012 01:15:59AM 7 points

How do you expect to determine the effects of this information on donations from the comments made by supporters? In my case, for instance, I've been fairly encouraged by the explanations like this that have been coming out of SI (and had been somewhat annoyed by the lack of them previously), but my comments tend to sound negative because I tend to focus on things that I'm still not completely satisfied with.

Comment author: lukeprog 10 July 2012 01:28:17AM 4 points

It's very hard. Comments like this help a little.

Comment author: wedrifid 10 July 2012 01:08:30AM *  4 points

Another model says that detailed communication with supporters is bad because (1) supporters are generally giving out of positive affect toward the organization, and (2) that positive affect can't be increased much once they grok the mission enough to start donating, but (3) the positive affect they feel toward the charity can be overwhelmed by the absolute number of the organization's statements with which they disagree, and (4) more detailed communication with supporters increases this absolute number more quickly than limited communication that repeats the same points again and again (e.g. in a newsletter).

As an example datapoint Eliezer's reply to Holden caused a net decrease (not necessarily an enormous one) in both my positive affect for and abstract evaluation of the merit of the organisation based off one particularly bad argument that shocked me. It prompted some degree (again not necessarily a large degree) of updating towards the possibility that SingInst could suffer the same kind of mind-killed thinking and behavior I expect from other organisations in the class of pet-cause idealistic charities. (And that matters more for FAI oriented charities than save-the-puppies charities, with the whole think-right or destroy the world thing.)

When allowing for the possibility that I am wrong and Eliezer is right you have to expect most other supporters to be wrong a non-trivial proportion of the time too so too much talking is going to have negative side effects.

Comment author: lukeprog 10 July 2012 01:27:51AM 1 point

Which issue are you talking about? Is there already a comments thread about it on Eliezer's post?

Comment author: wedrifid 10 July 2012 01:39:55AM 2 points

Which issue are you talking about? Is there already a comments thread about it on Eliezer's post?

Found it. It was nested too deep in a comment tree.

The particular line was:

I would ask him what he knows now, in advance, that all those sane intelligent people will miss. I don't see how you could (well-justifiedly) access that epistemic state.

The position is something I think it is best I don't mention again until (unless) I get around to writing the post "Predicting Failure Without Details" to express the position clearly with references and what limits apply to that kind of reasoning.

Comment author: Cyan 10 July 2012 01:43:25AM 7 points

Isn't it just straight-up outside view prediction?

Comment author: Giles 24 July 2012 02:56:05AM 0 points

Is it possible that supporters might update on communicativeness, separately from updating on what you actually have to say? Generally when I see the SI talking to people, I feel the warm fuzziness before I actually read what you're saying. It just seems like people might associate "detailed engagement with supporters and critics" with the reference class of "good organizations".

Comment author: lukeprog 24 July 2012 03:05:53AM 0 points

Yup, that might be true. I hope so.

Comment author: TheOtherDave 10 July 2012 02:39:15AM 0 points

Presumably, even under model #1, the extent to which detailed communication increases donor expectations of my charity's ability to turn money into valuable goods depends a lot on their pre-existing expectations, the level of expectations justified by the reality, and how effective the communication is at conveying the reality.

Comment author: ChrisHallquist 11 July 2012 08:27:57AM 0 points

I can think of a really big example favoring model #2 within the atheist community. On the other hand, you and Eliezer have written so much about your views on these matters that the "detailed communication" toothpaste may not be going back in the tube. And this piece made me much more inclined to support SI, particularly the disjunctive vs. conjunctive section, which did a lot for worries raised by things Eliezer has said in the past.

Comment author: Thrasymachus 13 July 2012 02:45:48PM 6 points

When making the case for SI's comparative advantage, you point to these things:

... [A]nd the ability to do unusual things that are nevertheless quite effective at finding/creating lots of new people interested in rationality and existential risk reduction: (1) The Sequences, the best tool I know for creating aspiring rationalists, (2) Harry Potter and the Methods of Rationality, a surprisingly successful tool for grabbing the attention of mathematicians and computer scientists around the world...

What evidence supports these claims?

Comment author: Thrasymachus 22 July 2012 09:43:36PM 1 point

each question (posted as a comment on this page) that follows the template described below will receive a reply from myself or another SI representative.

I appreciate you folks are busy, but I'm going to bump as it has been more than a week. Besides, it strikes me as an important question given the prominence of these things to the claim that SI can buy x-risk reduction more effectively than other orgs.

Comment author: endoself 24 July 2012 05:16:13AM 1 point

You can PM Luke if you want. It's the "Send message" button next to the username on the user page.

Comment author: Thrasymachus 02 August 2012 04:38:25PM 0 points

I'm bumping this again because there's been no response to this question (three weeks since asking), and I poked Luke via PM a week ago. Given this is the main plank supporting SI's claim that it is a good way of spending money, I think this question should be answered.

(especially compare to Holden's post)

Comment author: HoldenKarnofsky 01 August 2012 02:16:55PM 14 points

I greatly appreciate the response to my post, particularly the highly thoughtful responses of Luke (original post), Eliezer, and many commenters.

Broad response to Luke's and Eliezer's points:

As I see it, there are a few possible visions of SI's mission:

  • M1. SI is attempting to create a team to build a "Friendly" AGI.
  • M2. SI is developing "Friendliness theory," which addresses how to develop a provably safe/useful/benign utility function without needing iterative/experimental development; this theory could be integrated into an AGI developed by another team, in order to ensure that its actions are beneficial.
  • M3. SI is broadly committed to reducing AGI-related risks, and works on whatever will work toward that goal, including potentially M1 and M2.

My view is that the broader SI's mission, the higher the bar should be for the overall impressiveness of the organization and team. An organization with a very narrow, specific mission - such as "analyzing how to develop a provably safe/useful/benign utility function without needing iterative/experimental development" - can, relatively easily, establish which other organizations (if any) are trying to provide what it does and what the relative qualifications are; it can set clear expectations for deliverables over time and be held accountable to them; its actions and outputs are relatively easy to criticize and debate. By contrast, an organization with broader aims and less clearly relevant deliverables - such as "broadly aiming to reduce risks from AGI, with activities currently focused on community-building" - is giving a donor (or evaluator) less to go on in terms of what the space looks like, what the specific qualifications are and what the specific deliverables are. In this case it becomes more important that a donor be highly confident in the exceptional effectiveness of the organization and team as a whole.

Many of the responses to my criticisms (points #1 and #4 in Eliezer's response; "SI's mission assumes a scenario that is far less conjunctive than it initially appears" and "SI's goals and activities" section of Luke's response) correctly point out that they have less force, as criticisms, when one views SI's mission as relatively broad. However, I believe that evaluating SI by a broader mission raises the burden of affirmative arguments for SI's impressiveness. The primary such arguments I see in the responses are in Luke's list:

(1) The Sequences, the best tool I know for creating aspiring rationalists, (2) Harry Potter and the Methods of Rationality, a surprisingly successful tool for grabbing the attention of mathematicians and computer scientists around the world, and (3) the Singularity Summit, a mainstream-aimed conference that brings in people who end up making significant contributions to the movement — e.g. Tomer Kagan (an SI donor and board member) and David Chalmers (author of The Singularity: A Philosophical Analysis and The Singularity: A Reply).

I've been a consumer of all three of these, and while I've found them enjoyable, I don't find them sufficient for the purpose at hand. Others may reach a different conclusion. And of course, I continue to follow SI's progress, as I understand that it may submit more impressive achievements in the future.

Both Luke and Eliezer seem to disagree with the basic approach I'm taking here. They seem to believe that it is sufficient to establish that (a) AGI risk is an overwhelmingly important issue and that (b) SI compares favorably to other organizations that explicitly focus on this issue. For my part, I (a) disagree with the statement: "the loss in expected value resulting from an existential catastrophe is so enormous that the objective of reducing existential risks should be a dominant consideration whenever we act out of an impersonal concern for humankind as a whole"; (b) do not find Luke's argument that AI, specifically, is the most important existential risk to be compelling (it discusses only how beneficial it would be to address the issue well, not how likely a donor is to be able to help do so); (c) believe it is appropriate to compare the overall organizational impressiveness of the Singularity Institute to that of all other donation-soliciting organizations, not just to that of other existential-risk- or AGI-focused organizations. I would guess that these disagreements, particularly (a) and (c), come down to relatively deep worldview differences (related to the debate over "Pascal's Mugging") that I will probably write more about in the future.

On tool AI:

Most of my disagreements with SI representatives seem to be over how broad a mission is appropriate for SI, and how high a standard SI as an organization should be held to. However, the debate over "tool AI" is different, with both sides making relatively strong claims. Here SI is putting forth a specific point as an underappreciated insight and thus as a potential contribution/accomplishment; my view is that SI's suggested approach to AGI development is more dangerous than the "traditional" approach to software development, and thus that SI is advocating for an approach that would worsen risks from AGI.

My latest thoughts on this disagreement were posted separately in a comment response to Eliezer's post on the subject.

A few smaller points:

  • I disagree with Luke's claim that "objection #1 punts to objection #2." Objection #2 (regarding "tool AI") points out one possible approach to AGI that I believe is both consonant with traditional software development and significantly safer than the approach advocated by SI. But even if the "tool AI" approach is not in fact safer, there may be safer approaches that SI hasn't thought of. SI does not just emphasize the general problem that AGI may be dangerous (something that I believe is a fairly common view), but emphasizes a particular approach to AGI safety, one that seems to me to be highly dangerous. If SI's approach is dangerous relative to other approaches that others are taking/advocating, or even approaches that have yet to be developed (and will be enabled by future tools and progress on AGI), this is a problem for SI.
  • Luke states that rationality is "only a ceteris paribus predictor of success" and that it is a "weak one." I wish to register that I believe rationality is a strong (though not perfect) predictor of success, within the population of people who are as privileged (in terms of having basic needs met, access to education, etc.) as most SI supporters/advocates/representatives. So while I understand that success is not part of the definition of rationality, I stand by my statement that it is "the best evidence of superior general rationality (or of insight into it)."
  • Regarding donor-advised funds: opening an account with Vanguard, Schwab or Fidelity is a simple process, and I doubt any of these institutions would overrule a recommendation to donate to an organization such as SI (in any case, this is easily testable).

Comment author: Wei_Dai 02 August 2012 10:29:56AM 10 points [-]

My view is that the broader SI's mission, the higher the bar should be for the overall impressiveness of the organization and team.

Can you describe a hypothetical organization and some examples of the impressive achievements it might have, which would pass the bar for handling mission M3? What is your estimate of the probability of such an organization coming into existence in the next five or ten years, if a large fraction of current SI donors were to put their money into donor-advised funds instead?

Comment author: lukeprog 03 August 2012 04:23:00AM *  6 points [-]

Just a few thoughts for now:

  • I agree that some of our disagreements "come down to relatively deep worldview differences (related to the debate over 'Pascal's Mugging')." The forthcoming post on this subject by Steven Kaas may be a good place to engage further on this matter.
  • I retain the claim that Holden's "objection #1 punts to objection #2." For the moment, we seem to be talking past each other on this point. The reply Eliezer and I gave on Tool AI was not just that Tool AI has its own safety concerns, but also that understanding the tool AI approach and other possible approaches to the AGI safety problem is part of what an "FAI Programmer" does. We understand why people have gotten the impression that SI's FAI team is specifically about building a "self-improving CEV-maximizing agent", but that's just one approach under consideration, and figuring out which approach is best requires the kind of expertise that SI aims to host.
  • The evidence suggesting that rationality is a weak predictor of success comes from studies on privileged Westerners. Perhaps Holden has a different notion of what counts as a measure of rationality than the ones currently used by psychologists?
  • I've looked further into donor-advised funds and now agree that the institutions named by Holden are unlikely to overrule their clients' wishes.
  • I, too, would be curious to hear Holden's response to Wei Dai's question.

Comment author: aaronsw 04 August 2012 11:18:27AM *  15 points [-]

On the question of the impact of rationality, my guess is that:

  1. Luke, Holden, and most psychologists agree that rationality means something roughly like the ability to make optimal decisions given evidence and goals.

  2. The main strand of rationality research followed by both psychologists and LWers has been focused on fairly obvious cognitive biases. (For short, let's call these "cognitive biases".)

  3. Cognitive biases cause people to make choices that are most obviously irrational, but not most importantly irrational. For example, it's very clear that spinning a wheel should not affect people's estimates of how many African countries are in the UN. But do you know anyone for whom this sort of thing is really their biggest problem?

  4. Since cognitive biases are the primary focus of research into rationality, rationality tests mostly measure how good you are at avoiding them. These are the tests used in the studies psychologists have done on whether rationality predicts success.

  5. LW readers tend to be fairly good at avoiding cognitive biases (and will be even better if CFAR takes off).

  6. But there are a whole series of much more important irrationalities that LWers suffer from. (Let's call them "practical biases" as opposed to "cognitive biases", even though both are ultimately practical and cognitive.)

  7. Holden is unusually good at avoiding these sorts of practical biases. (I've found Ray Dalio's "Principles", written by Holden's former employer, an interesting document on practical biases, although it also has a lot of stuff I disagree with or find silly.)

  8. Holden's superiority at avoiding practical biases is a big part of why GiveWell has tended to be more successful than SIAI. (Givewell.org has around 30x the amount of traffic as Singularity.org according to Compete.com and my impression is that it moves several times as much money, although I can't find a 2011 fundraising total for SIAI.)

  9. lukeprog has been better at avoiding practical biases than previous SIAI leadership and this is a big part of why SIAI is improving. (See, e.g., lukeprog's debate with EY about simply reading Nonprofit Kit for Dummies.)

  10. Rationality, properly understood, is in fact a predictor of success. Perhaps if LWers used success as their metric (as opposed to getting better at avoiding obvious mistakes), they might focus on their most important irrationalities (instead of their most obvious ones), which would lead them to be more rational and more successful.

Comment author: lukeprog 19 February 2013 04:03:16AM 3 points [-]

For the record, I basically agree with all this.

Comment author: DaFranker 01 August 2012 03:29:42PM *  6 points [-]

I'm very much an outsider to this discussion, and by no means a "professional researcher", but I believe those to be the primary reasons why I'm actually qualified to make the following point. I'm sure it's been made before, but a rapid scan revealed no specific statement of this argument quite as directly and explicitly.

HoldenKarnofsky: (...) my view is that SI's suggested approach to AGI development is more dangerous than the "traditional" approach to software development, and thus that SI is advocating for an approach that would worsen risks from AGI.

I've always understood SI's position on this matter not as one of "We should not focus on building Tool AI! Fully reflectively self-modifying AGIs are the only way to go!", but rather that it is extremely unlikely that we can prevent everyone else from building one.

To my understanding, the logic goes: If any programmer with relevant skills is sufficiently convinced, by whatever means and for whatever causes, that building a full traditional AGI is more efficient and will more "lazily" achieve his goals with fewer resources or achieve them faster, the programmer will build it whether you think it's a good idea or not. As such, SI's "Moral Imperative" is to account for this scenario, as there is a non-negligible probability of it actually happening; if they do not, they effectively become hypocritical in claiming to work towards reducing existential AI risk.

To reiterate with silly scare-formatting: It is completely irrelevant, in practice, what SI "advocates" or "promotes" as a preferred approach to building safe AI, because the probability that someone, somewhere, some day is going to use the worst possible approach is definitely non-negligible. If there is not already a sufficiently advanced Friendly AI in place to counter such a threat, we are then effectively defenseless.

To metaphorize, this is a case of: "It doesn't matter if you think only using remote-controlled battle robots would be a better way to resolve international disputes. At some point, someone somewhere is going to be convinced that killing all of you is going to be faster and cheaper and more certain of achieving their goals, so they'll build one giant bomb and throw it at you without first making sure they won't kill themselves in the process."

Comment author: John_Maxwell_IV 03 August 2012 05:28:21AM *  4 points [-]

This looks similar to this point Kaj Sotala made. My own restatement: As the body of narrow AI research devoted to making tools grows larger and larger, building agent AGI gets easier and easier, and there will always be a few Shane Legg types who are crazy enough to try it.

I sometimes suspect that Holden's true rejection to endorsing SI is that the optimal philanthropy movement is fringe enough already, and he doesn't want to associate it with nutty-seeming beliefs related to near-inevitable doom from superintelligence. Sometimes I wish SI would market themselves as being similar to nuclear risk organizations like the Bulletin of the Atomic Scientists. After all, EY was an AI researcher who quit and started working on Friendliness when he saw the risks, right? I think you could make a pretty good case for SI's usefulness just working based on analogies from nuclear risk, without any mention of FOOM or astronomical waste or paperclip maximizers.

Ideally we'd have wanted to know about nuclear weapon risks before having built them, not afterwards, right?

Comment author: DaFranker 03 August 2012 12:58:33PM *  1 point [-]

Personally, I highly doubt that to be Holden's true rejection, though it is most likely one of the emotional considerations that cannot be ignored in a strategic perspective. Holden claims to have gone through most of the relevant LessWrong sequence and SIAI public presentation material, which makes the likelihood of a deceptive (or self-deceptive) argumentation lower, I believe.

No, what I believe to be the real issue is that Holden and (most of) SIAI have disagreements over many specific claims used to justify broader claims - if the specific claims are granted in principle, both seem to generally agree in good Bayesian fashion on the broader or more general claim. Much of the disagreement on those specifics also appears to stem from different priors in ethical and moral values, as well as differences in their evaluations and models of human population behaviors and specific (but often unspecified) "best guess" probabilities.

For a generalized example, one strong claim for existential risk being optimal effort is that even a minimal decrease in risk provides immense expected value simply from the sheer magnitude of what could most likely be achieved by humanity throughout the rest of its course of existence. Many experts and scientists outright reject this on the grounds that "future, intangible, merely hypothetical other humans" should not be assigned value on the same order-of-magnitude as current humans, or even one order of magnitude lower.

Comment author: [deleted] 01 August 2012 04:01:43PM 1 point [-]

but rather that it is extremely unlikely that we can prevent everyone else from building one.

Well, SI's mission makes sense on the premise that the best way to prevent a badly built AGI from being developed or deployed is to build a friendly AGI which has that as one of its goals. 'Best way' here is a compromise between, on the one hand, the effectiveness of the FAI relative to other approaches, and on the other, the danger presented by the FAI itself as opposed to other approaches.

So I think Holden's position is that the ratio of danger vs. effectiveness does not weigh favorably for FAI as opposed to tool AI. So to argue against Holden, we would have to argue either that FAI will be less dangerous than he thinks, or that tool AI will be less effective than he thinks.

I take it the latter is the more plausible.

Comment author: DaFranker 02 August 2012 06:27:23PM *  1 point [-]

Indeed, we would have to argue that to argue against Holden.

My initial reaction was to counter this with a claim that we should not be arguing against anyone in the first place, but rather looking for probable truth (concentrate anticipations). And then I realized how stupid that was: Arguments Are Soldiers. If SI (and by the Blue vs Green principle, any SI-supporter) can't even defend a few claims and defeat its opponents, it is obviously stupid and not worth paying attention to.

SI needs some amount of support, yet support-maximization strategies carry a very high risk of introducing highly dangerous intellectual contamination through various forms (including self-reinforcing biases in the minds of researchers and future supporters) that could turn out to cause even more existential risk. Yet, at the same time, not gathering enough support quickly enough dramatically augments the risk that someone, somewhere, is going to trip on a power cable and poof, all humans are just gone.

I am definitely not masterful enough in mathematics and bayescraft to calculate the optimal route through this differential probabilistic maze, but I suspect others could provide a very good estimate.

Also, it's very much worth noting that these very considerations, on a meta level, are an integral part of SI's mission, so figuring out whether that premise you stated is true or not, and whether there are better solutions or not actually is SI's objective. Basically, while I might understand some of the cognitive causes for it, I am still very much rationally confused when someone questions SI's usefulness by questioning the efficiency of subgoal X, while SI's original and (to my understanding) primary mission is precisely to calculate the efficiency of subgoal X.

Comment author: John_Maxwell_IV 03 August 2012 07:03:01PM 4 points [-]

I would guess that these disagreements, particularly (a) and (c), come down to relatively deep worldview differences (related to the debate over "Pascal's Mugging") that I will probably write more about in the future.

How does GiveWell plan to deal with the possibility that people who come to GiveWell looking for charity advice may have a variety of worldviews that impact their thinking on this?

Comment author: MatthewBaker 11 July 2012 12:14:33AM 5 points [-]

If I earmark my donations for "HPMOR Finale or CPA Audit whichever comes first" would that act as positive or negative pressure towards Eliezer's fiction creation complex? (I only ask because bugging him for an update has been previously suggested to reduce update speed)

Furthermore, Oracle AI/Nanny AI both seem to fail the heuristic of "other country is about to beat us in a war, should we remove the safety programming" that I use quite often with nearly everyone I debate AI about from outside the LW community. Thank you both for writing such concise yet detailed responses that helped me understand the problem areas of Tool AI better.

Comment author: lukeprog 11 July 2012 12:36:49AM 9 points [-]

If I earmark my donations for "HPMOR Finale or CPA Audit whichever comes first" would that act as positive or negative pressure towards Eliezer's fiction creation complex?

I think the issue is that we need a successful SPARC and an "Open Problems in Friendly AI" sequence more urgently than we need an HPMOR finale.

Comment author: shokwave 11 July 2012 01:07:13AM *  9 points [-]

"Open Problems in Friendly AI" sequence

an HPMOR finale

A sudden, confusing vision just occurred, of the two being somehow combined. Aaagh.

Comment author: shminux 11 July 2012 04:59:21AM 3 points [-]

Spoiler: Voldemort is a uFAI.

Comment author: arundelo 11 July 2012 05:41:43AM 6 points [-]

For the record:

Nothing in this story so far represents either FAI or UFAI. Consider it Word of God.

(And later in the thread, when asked about "so far": "And I have no intention at this time to do it later, but don't want to make it a blanket prohibition.")

Comment author: NancyLebovitz 15 July 2012 02:03:33AM *  2 points [-]

In the earlier chapters, it seemed to me that the Hogwarts faculty dealing with Harry was something like being faced with an AI of uncertain Friendliness.

Correction: It was more like the faculty dealing with an AI that's trying to get itself out of its box.

Comment author: MatthewBaker 11 July 2012 10:21:43PM 0 points [-]

I think our values are positively maximized by delaying the HPMOR finale as long as possible; my post was more out of curiosity to see what would be most helpful to Eliezer.

Comment author: David_Gerard 13 July 2012 08:44:06AM 6 points [-]

In general - never earmark donations. It's a stupendous pain in the arse to deal with. If you trust an organisation enough to donate to them, trust them enough to use the money for whatever they see a need for. Contrapositive: If you don't trust them enough to use the money for whatever they see a need for, don't donate to them.

Comment author: MatthewBaker 13 July 2012 06:01:02PM 2 points [-]

I never have before but this CPA Audit seemed like a logical thing that would encourage my wealthy parents to donate :)

Comment author: fubarobfusco 09 July 2012 10:48:16PM *  11 points [-]

Reason 1: Mitigating AI risk could mitigate all other existential risks, but not vice-versa. There is an asymmetry between AI risk and other existential risks. If we mitigate the risks from (say) synthetic biology and nanotechnology (without building Friendly AI), this only means we have bought a few years or decades for ourselves before we must face yet another existential risk from powerful new technologies. But if we manage AI risk well enough (i.e. if we build a Friendly AI or "FAI"), we may be able to "permanently" (for several billion years) secure a desirable future.

This equates "managing AI risk" and "building FAI" without actually making the case that these are equivalent. Many people believe that dangerous research can be banned by governments, for instance; it would be useful to actually make the case (or link to another place where it has been made) that managing AI risk is intractable without FAI.

Comment author: lukeprog 09 July 2012 11:02:18PM *  27 points [-]

This is one of the 10,000 things I didn't have the space to discuss in the original post, but I'm happy to briefly address it here!

It's much harder to successfully ban AI research than to successfully ban, say, nuclear weapons. Nuclear weapons require rare and expensive fissile material that requires rare heavy equipment to manufacture. Such things can be tracked to some degree. In contrast, AI research requires... um... a few computers.

Moreover, it's really hard to tell whether the code somebody is running on a computer is potentially dangerous AI stuff or something else. Even if you magically had a monitor installed on every computer to look for dangerous AI stuff, it would have to know what "dangerous AI stuff" looks like, which is hard to do before the dangerous AI stuff is built in the first place.

The monetary, military, and political incentives to build AGI are huge, and would be extremely difficult to counteract through a worldwide ban. You couldn't enforce the ban, anyway, for the reasons given above. That's why Ben Goertzel advocates "Nanny AI," though Nanny AI may be FAI-complete, as mentioned here.

I hope that helps?

Comment author: fubarobfusco 10 July 2012 12:35:48AM 5 points [-]

Yes.

Comment author: aShepherd 13 July 2012 01:07:08AM *  0 points [-]

It does help...I had the same reaction as fubarob.

However, your argument assumes that our general IT capabilities have already matured to the point where AGI is possible. I agree that restricting AGI research then is likely a lost cause. Much less clear to me is whether it is equally futile to try restricting IT and computing research or even general technological progress before such a point. Could we expect to bring about global technological stasis? One may be tempted to say that such an effort is doomed to a fate like global warming accords except ten times deader. I disagree entirely. Both Europe and the United States, in fact, have in recent years been implementing a quite effective policy of zero economic growth! It is true that progress in computing has continued despite the general slowdown but this seems hardly written in stone for the future. In fact we need only consider this paper on existential risks and the "Crunches" section for several examples of how stasis might be brought about.

Can anyone recommend detailed discussions of broad relinquishment, from any point of view? The closest writings I know are Bill McKibben's book Enough and Bill Joy's essay, but anything else would be great.

Comment author: lukeprog 13 July 2012 03:09:09AM 2 points [-]

I'm pretty sure we have the computing power to run at least one AGI, but I can't prove that. Still, restricting general computing progress should delay the arrival of AGI because the more hardware you have, the "dumber" you can be with solving the AGI software problem. (You can run less efficient algorithms.)

Global technological stasis seems just as hard as, or maybe harder than, restricting AGI research. The incentives for AGI are huge, but there might be some points at which you have to spend a lot of money before you get much additional economic/political/military advantage. But when it comes to general computing progress, it looks to be always the case that a bit more investment can yield better returns, e.g. by squeezing more processor cores into a box a little more tightly.

Other difficulties of global technological stasis are discussed in the final chapter of GCR, called "The Totalitarian Threat." Basically, you'd need some kind of world government, because any country that decides to slow down its computing progress rapidly falls behind other nations. But political progress is so slow that it seems unlikely we'll get a world government in the next century unless somebody gets a decisive technological advantage via AGI, in which case we're talking about the AGI problem anyway. (The world government scenario looks like the most plausible of Bostrom's "crunches", which is why I chose to discuss it.)

Relinquishment is also discussed (in very different ways) by Berglas (2009), Kaczynski (1995), and De Garis (2005).

Comment author: aShepherd 13 July 2012 06:00:14AM 0 points [-]

Quoting the B-man:

A world government may not be the only form of stable social equilibrium that could permanently thwart progress. Many regions of the world today have great difficulty building institutions that can support high growth. And historically, there are many places where progress stood still or retreated for significant periods of time. Economic and technological progress may not be as inevitable as it appears to us.

It seems to me like saying world government is necessary underestimates the potential impact of growth-undermining ideas. If all large governments for example buy into the idea that cutting government infrastructure spending in a recession boosts employment, then we can assume that global growth will slow as a result of the acceptance of this false assertion. To me the key seems less to have some world nabob pronouncing edicts on technology and more shifting the global economy so as to make edicts merely the gift wrap.

I will definitely take a look at that chapter of GCR. Thanks also for the other links. The little paper by Berglas I found interesting. Mr. K needs no comment, naturally. Might be good reading for similar reasons that Mein Kampf is. With De Garis I have long felt like the kook-factor was too high to warrant messing with. Anyone read him much? Maybe it would be good just for some ideas.

Comment author: lukeprog 13 July 2012 06:45:45AM 1 point [-]

It certainly could be true that economic growth and technological progress can slow down. In fact, I suspect the former will slow down, and perhaps the latter, too. That's very different from stopping technological progress that will lead to AGI, though.

Comment author: elharo 08 February 2013 12:59:04PM 1 point [-]

Not only can economic growth and technological progress slow down; they can stop and reverse. Just because we're now further out in front than humanity has ever been before in history does not mean that we can't go backwards. Economic growth is probably more likely to reverse than technological progress. That's what a depression is, after all.

But a sufficiently bad global catastrophe, perhaps one that destroyed the electrical grid and other key infrastructure, could reverse a lot of technological progress too and perhaps knock us way back without necessarily causing complete extinction.

Comment author: aShepherd 14 July 2012 03:51:55AM *  0 points [-]

I think technological stasis could really use more discussion. For example I was able to find this paper by James Hughes discussing relinquishment. He however treats the issue as one of regulation and verification, similar to nuclear weapons, noting that:

We do not yet have effective global institutions capable of preventing determined research of any kind, even apocalyptic. I believe we will build strong global institutions in the next couple of decades. Global institutions may regulate for safety, prevent weaponization, support technology transfer, and so on. But there will be no support for global governance that attempts to deny developing countries the right to emerging technologies.

Regulation and verification may indeed be a kind of Gordian knot. The more specific the technologies you want to stop, the harder it becomes to do that and still advance generally. Berglas recognizes that problem in his paper and so proposes restricting computing as a whole. Even this, however, may be too specific to cut the knot. The feasibility of stopping economic growth entirely so we never reach the point where regulation is necessary seems to me an unexplored question. If we look at global GDP growth over the past 50 years it's been uniformly positive except for the most recent recession. It's also been quite variable over short periods. Clearly stopping it for longer than a few years would require some new phenomenon driving a qualitative break with the past. That does not mean however that a stop is impossible.

There does exist a small minority camp within the economics profession advocating no-growth policies for environmental or similar reasons. I wonder if anyone has created a roadmap for bringing such policies about on a global level.

Comment author: gwern 14 July 2012 11:42:55PM *  0 points [-]

Out of curiosity, have you read my little essay "Slowing Moore's Law"? It seems relevant.

Comment author: Jonathan_Graehl 10 July 2012 01:53:00AM *  16 points [-]

How do I know that supporting SI doesn't end up merely funding a bunch of movement-building leading to no real progress?

It seems to me that the premise of funding SI is that people smarter (or more appropriately specialized) than you will then be able to make discoveries that otherwise would be underfunded or wrongly-purposed.

I think the (friendly or not) AI problem is hard. So it seems natural for people to settle for movement-building or other support when they get stuck.

That said, some of the collateral output to date has been enjoyable.

Comment author: ciphergoth 10 July 2012 09:01:26AM 10 points [-]

For SI, movement building is directly progress more than it is for, say, Oxfam, because a big part of their mission is to try and persuade people not to do the very dangerous thing.

Comment author: Jonathan_Graehl 10 July 2012 10:05:01PM *  6 points [-]

Good point. But I don't see any evidence that anyone who was likely to create an AI soon, now won't.

Those whose profession and status are in approximating AI largely won't change course for what must seem to them like sci-fi tropes. [1]

Or, put another way, there are working computer scientists who are religious - you can't expect reason everywhere in someone's life.

[1] but in the long run, perhaps SI and others can offer a smooth transition for dangerously smart researchers into high-status alternatives such as FAI or other AI risk mitigation.

Comment author: endoself 11 July 2012 12:09:52AM *  7 points [-]

But I don't see any evidence that anyone who was likely to create an AI soon, now won't.

According to Luke, Moshe Looks (head of Google's AGI team) is now quite safety conscious, and a Singularity Institute supporter.

Comment author: lukeprog 19 October 2012 12:53:52AM *  2 points [-]

Update: It's not really correct to say that Google has "an AGI team." Moshe Looks has been working on program induction, and this guy said that some people are working on AI "on a large scale," but I'm not aware of any publicly-visible Google project which has the ambitions of, say, Novamente.

Comment author: lincolnquirk 11 July 2012 04:46:09AM 4 points [-]

The plausible story in movement-building is not convincing existing AGI PIs to stop a long program of research, but instead convincing younger people who would otherwise eventually become AGI researchers to do something safer. The evidence to look for would be people who said "well, I was going to do AI research but instead I decided to get involved with SingInst type goals" -- and I suspect someone who knows the community better might be able to cite quite a few people for whom this is true, though I don't have any names myself.

Comment author: Jonathan_Graehl 11 July 2012 05:40:09PM *  2 points [-]

I didn't think of that. I expect current researchers to be dead or nearly senile by the time we have plentiful human substitutes/emulations, so I shouldn't care that incumbents are unlikely to change careers (except for the left tail - I'm very vague in my expectation).

Comment author: lukeprog 11 June 2013 01:59:12AM 8 points [-]

Behold, I come bearing real progress! :)

Comment author: Jonathan_Graehl 11 June 2013 04:38:39AM 2 points [-]

The best possible response. I haven't read any of them yet, but the topics seem relevant to the long range goal of becoming convinced of the Friendliness of complicated programs.

Comment author: lukeprog 10 July 2012 01:58:29AM *  6 points [-]

Movement-building is progress, but...

I hear ya. If I'm your audience, you're preaching to the choir. Open Problems in Friendly AI — more in line with what you'd probably call "real progress" — is something I've been lobbying for since I was hired as a researcher in September 2011, and I'm excited that Eliezer plans to begin writing it in mid-August, after SPARC.

some of the collateral output to date has been enjoyable

Such as?

Comment author: Jonathan_Graehl 10 July 2012 10:01:01PM *  3 points [-]

The philosophy and fiction have been fun (though they hardly pay my bills).

I've profited from reading well-researched posts on the state of evidence-based (social-) psychology / nutrition / motivation / drugs, mostly from you, Yvain, Anna, gwern, and EY (and probably a dozen others whose names aren't available).

The bias/rationality stuff was fun to think about, but "ugh fields", for me at least, turned out to be the only thing that mattered. I imagine that's different for other types of people, though.

Additionally, the whole project seems to have connected people who didn't belong to any meaningful communities (thinking of various regional meetup clusters).

Comment author: JaneQ 12 July 2012 07:18:33AM 4 points [-]

It seems to me that the premise of funding SI is that people smarter (or more appropriately specialized) than you will then be able to make discoveries that otherwise would be underfunded or wrongly-purposed.

But then SI has to have a dramatically better idea of what research has to be funded to protect mankind than every other group of people capable of either performing such research or employing people to perform such research.

Muehlhauser has stated that SI should be compared to the alternatives, in the form of other organizations working on AI risk mitigation, but that seems like an overly narrow choice, reliant on the presumption that not working on AI risk mitigation now is not an alternative.

For example, 100 years ago it would seem to have been too early to fund work on AI risk mitigation; that may still be the case. As time goes on, one could naturally expect that opinions will form a distribution, and that the first organizations offering AI risk mitigation will pop up earlier than the time at which such work is effective. When we look into the past through the goggles of notoriety, we don't see all the failed early starts.

Comment author: Vladimir_Nesov 13 July 2012 12:54:16AM *  6 points [-]

For example, 100 years ago it would seem to have been too early to fund work on AI risk mitigation

Disagree. There are many remaining theoretical (philosophical and mathematical) difficulties whose investigation doesn't depend on the current level of technology. It would've been better to start working on the problem 300 years ago, when AI risk was still far away. Value of information on this problem is high, and we don't (didn't) know that there is nothing to be discovered, it wouldn't be surprising if some kind of progress is made.

Comment author: Eliezer_Yudkowsky 14 July 2012 12:43:30AM 13 points [-]

I do think OP is right that in practice, 100 years ago, it would have been really hard to figure out what an AI issue looked like. This was pre-Godel, pre-decision-theory, pre-Bayesian-revolution, and pre-computer. Yes, a sufficiently competent Earth would be doing AI math before it had the technology for computers, in full awareness of what it meant - but that's a pretty darned competent Earth we're talking about.

Comment author: JaneQ 14 July 2012 09:29:50AM 2 points [-]

I think it is fair to say Earth was doing the "AI math" before the computers. Extending this to today - there is a lot of mathematics to be done for a good, safe AI - but how are we to know that SI has the actionable effort-planning skills required to correctly identify and fund research in such mathematics?

I know that you believe that you have the required skills; but note that in my model such a belief can result both from the presence of extraordinary effort-planning skill and from the absence of effort-planning skill. The prior probability of extraordinary effort-planning skill is very low. Furthermore, as effort planning is, to some extent, a cross-domain skill, the prior inefficacy (which was criticized by Holden) seems to be fairly strong evidence against extraordinary skills in this area.

Comment author: Eliezer_Yudkowsky 14 July 2012 11:36:07PM 10 points [-]

If my writings (on FAI, on decision theory, and on the form of applied-math-of-optimization called human rationality) so far haven't convinced you that I stand a sufficient chance of identifying good math problems to solve to maintain the strength of an input into existential risk, you should probably fund CFAR instead. This is not, in any way shape or form, the same skill as the ability to manage a nonprofit. I have not ever, ever claimed to be good at managing people, which is why I kept trying to have other people doing it.

Comment author: JaneQ 16 July 2012 02:07:09PM *  4 points [-]

I'm not sure why you think that such writings should convince a rational person that you have the relevant skill. If you were an art critic, even a very good one, that would not convince people you are a good artist.

This is not, in any way shape or form, the same skill as the ability to manage a nonprofit.

Indeed, but you are asking me to assume that the skills you display writing your articles are the same skill as the skills relevant to directing the AI effort.

edit: Furthermore, when it comes to works on rationality as 'applied math of optimization', the most obvious way to classify those writings is to look for some great success attributable to your writings - some highly successful businessmen saying how much the article on such and such fallacy helped them succeed, that sort of thing.

Comment author: AlexanderD 14 November 2012 07:27:10AM *  3 points [-]

It seems to me that the most obvious way to demonstrate the brilliance and excellent outcomes of the applied math of optimization would be to generate large sums of money, rather than seeking endorsements.

The Singularity Institute could begin this at no cost (beyond opportunity cost of staff time) by employing the techniques of rationality in a fake market, for example, if stock opportunities were the chosen venue. After a few months of fake profits, SI could set them up with $1,000. If that kept growing, then a larger investment could be considered.

This has been done, very recently. Someone on Overcoming Bias recently wrote of how they and some friends made about $500 each with a small investment by identifying an opportunity for arbitrage between the markets on InTrade and another prediction market, without any loss.
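As a rough illustration of the kind of cross-market arbitrage described here, the sketch below assumes two prediction markets quote the same binary event, that each winning contract pays 1.0, and that fees and counterparty risk can be ignored. The function name and the example prices are hypothetical; nothing here reflects how InTrade or any other market actually worked.

```python
def arbitrage_profit(yes_price_a: float, no_price_b: float, stake: float) -> float:
    """Guaranteed profit from buying YES on market A and NO on market B.

    Assumptions (hypothetical, for illustration only):
    - both contracts refer to the same binary event,
    - a winning contract pays 1.0 and a losing one pays 0.0,
    - there are no fees, spreads, or counterparty risk.
    """
    cost_per_pair = yes_price_a + no_price_b
    if cost_per_pair >= 1.0:
        # One YES plus one NO always pays exactly 1.0 in total,
        # so there is no risk-free gain unless the pair costs less than 1.0.
        return 0.0
    pairs_bought = stake / cost_per_pair
    return pairs_bought * (1.0 - cost_per_pair)


if __name__ == "__main__":
    # Hypothetical quotes: YES at 0.40 on one market, NO at 0.50 on the other.
    print(arbitrage_profit(0.40, 0.50, stake=900.0))
```

The point of the sketch is that the profit comes from a pricing inconsistency rather than from forecasting skill, which is why such opportunities tend to be small and short-lived.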

Money can be made, according to proverb, by being faster, luckier, or smarter. It's impossible to create luck in the market, and in the era of microsecond purchases by Goldman Sachs it's very nearly impossible to be faster, but an organization (or perhaps associated organizations?) devoted to defeating internal biases and mathematically assessing the best choices in the world should be striving to be smarter.

While it seems very interesting and worthwhile to work on existential risk from UFAI directly, it seems like the smarter thing to do might be to devote a decade to making an immense pile of money for the institute and developing the associated infrastructure (hiring money managers, socking a bunch away into Berkshire Hathaway for safety, etc.) Then hire a thousand engineers and mathematicians. And what's more, you'll raise awareness of UFAI an incredibly greater amount than you would have otherwise, plugging along as another $1-2m charity.

I'm sure this must have been addressed somewhere, of course - there is simply way too much written in too many places by too many smart people. But it is odd to me that SI's page on Strategic Insight doesn't have as #1: Become Rich. Maybe if someone notices this comment, they can point me to the argument against it?

Comment author: Kawoomba 14 November 2012 08:28:29AM 1 point [-]

The official introductory SI pages may have to sugarcoat such issues due to PR considerations ("everyone get rich, then donate your riches" sends off a bad vibe).

As you surmised, your idea has been brought up quite often in various contexts, especially in optimal charity discussions. For many/most endeavors, the globally optimal starting steps are "acquire more capabilities / become more powerful" (players of strategy games may be more explicitly cognizant of that stratagem).

I also do remember speculation that friendly AI and unfriendly AI may act very similarly at first - both choosing the optimal path to powering up, so that they can pursue the differing goals of their respective utility functions more efficiently at a future point in time. So your thoughts on the matter seem compatible with the local belief cluster.

Your money proverb seems to still hold true: anecdotally, I'm acquainted with some CS people making copious amounts of money on NASDAQ doing simple ANOVA analyses, while barely being able to spell the companies' names. So why aren't we doing that? Maybe a combination of mental inertia and being locked into a research/get endorsements modus operandi, which may be hard to shift out of into a more active "let's create start-ups"/"let's do day-trading" mode.

A goal-function of "seek influential person X's approval" will lead to a different mind set from "let quantifiable results speak for themselves", the latter will allow you not to optimize every step of the way for signalling purposes.

Comment author: paper-machine 13 July 2012 01:20:37AM 4 points [-]

How would you even pose the question of AI risk to someone in the eighteenth century?

I'm trying to imagine what comes out the other end of Newton's chronophone, but it sounds very much like "You should think really hard about how to prevent the creation of man-made gods."

Comment author: Vladimir_Nesov 13 July 2012 01:27:24AM *  3 points [-]

I don't think it's plausible that people could stumble on the problem statement 300 years ago, but within that hypothetical, it wouldn't have been too early.

Comment author: JaneQ 13 July 2012 02:04:02PM 2 points [-]

It seems to me that 100 years ago (or more) you would have to consider pretty much any philosophy and mathematics to be relevant to AI risk reduction, as well as reduction of other potential risks, and the attempts to select the work particularly conducive to AI risk reduction would not be able to succeed. Effort planning is the key to success.

On a somewhat unrelated note: reading the publications and this thread, there is a point of definition that I do not understand: what exactly does S.I. mean when it speaks of a "utility function" in the context of an AI? Is it a computable mathematical function over a model, such that the 'intelligence' component computes the action that results in the maximum of that function taken over the world state resulting from the action?

Comment author: johnlawrenceaspden 16 July 2012 11:30:09AM 0 points [-]

Surely "Effort planning is a key to success"?

Also, and not just wanting to flash academic applause lights but also genuinely curious, which mathematical successes have been due to effort planning? Even in my own mundane commercial programming experiences, the company which won the biggest was more "This is what we'd like, go away and do it and get back to us when it's done..." than "We have this Gantt chart...".

Comment author: summerstay 18 July 2012 02:06:22PM 1 point [-]

There are very few people who would have understood in the 18th century, but Leibniz would have understood in the 17th. He underestimated the difficulty in creating an AI, like everyone did before the 1970s, but he was explicitly trying to do it.

Comment author: paper-machine 18 July 2012 05:52:04PM *  0 points [-]

Your definition of "explicit" must be different from mine. Working on prototype arithmetic units and toying with the universal characteristic is AI research? He subscribed wholeheartedly to the ideographic myth; the most he would have been capable of is a machine that passes around LISP tokens.

In any case, based on the Monadology, I don't believe Leibniz would consider the creation of a godlike entity to be theologically possible.

Comment author: army1987 15 July 2012 12:28:58PM 0 points [-]

Oh, wait... The tale of the Tower of Babel was told via chronophone by people from the future right before succumbing to uFAI!

Comment author: johnlawrenceaspden 16 July 2012 11:34:43AM 0 points [-]

How about: "Eventually your machines will be so powerful they can grant wishes. But remember that they are not benevolent. What will you wish for when you can make a wish-machine?"

Comment author: army1987 13 July 2012 10:14:03PM 1 point [-]

That's hindsight. Nobody could have reasonably foreseen the rise of very powerful computing machines that long ago.

Comment author: Jonathan_Graehl 12 July 2012 11:49:51PM 1 point [-]

100 years ago it would seem to have been too early to fund work on AI risk mitigation

Hilarious, and an unfairly effective argument. I'd like to know such people, who can entertain an idea that will still be tantalizing yet unresolved a century out.

that seems like an overly narrow choice reliant on presumption that it is not an alternative to not work on AI risk mitigation now.

Yes. I agree with everything else, too, with the caveat that SI is not the first organization to draw attention to AI risk - not that you said so.

Comment author: RobertLumley 09 July 2012 10:42:50PM *  16 points [-]

Regarding the theft:

I was telling my friend (who recently got into HPMOR and lurks a little on LW) about Holden's critique, specifically with regard to the theft. He's an accounting and finance major, and was a bit taken aback. His immediate response was to ask if SI had an outside accountant audit their statements. We searched around and it doesn't look to us like you do. He immediately said that he would never donate to an organization that did not have an accountant audit their statements, and knowing how much I follow LW, immediately advised me not to either. This seems like a really good step for addressing the transparency issues here, and now that he mentions it, seems a very prudent and obvious thing for any nonprofit to do.

Edit 2: Luke asked me to clarify, I am not necessarily endorsing not donating to SI because of this, unless this problem is a concern of yours. My intent was only to suggest ways SI can improve and may be turning away potential donors.

Edit: He just mentioned to me that the big four accounting firms often do pro bono work because it can be a tax write-off. This may be worth investigating.

Comment author: lukeprog 09 July 2012 11:33:07PM 24 points [-]

Also note that thefts of this size are not as rare as they appear, because many non-profits simply don't report them. I have inside knowledge about very few charities, but even I know one charity that suffered a larger theft than SI did, and they simply didn't tell anybody. They knew that donors would punish them for the theft and not at all reward them for reporting it. Unfortunately, this is probably true for SI, too, which did report the theft.

Comment author: Eliezer_Yudkowsky 10 July 2012 12:56:21AM 25 points [-]

Yep. We knew that would happen at the time - it was explicitly discussed in the Board meeting - and we went ahead and reported it anyway, partly because we didn't want to have exposable secrets, partly because we felt honesty was due our donors, and partially because I'd looked up embezzlement-related stuff online and had found that a typical nonprofit-targeting embezzler goes through many nonprofits before being reported and prosecuted by a nonprofit "heroic" enough, if you'll pardon the expression, to take the embarrassment-hit in order to stop the embezzler.

Comment author: shminux 10 July 2012 01:39:44AM *  2 points [-]

I suspect that some of the hit was due to partial disclosure. Outsiders were left guessing what exactly had transpired and why, and what specific steps were taken to address the issue. Maybe you had to do it this way for legal reasons, but this was never spelled out explicitly.

Comment author: Eliezer_Yudkowsky 10 July 2012 07:32:35PM 2 points [-]

Pretty sure it was spelled out explicitly.

Comment author: lukeprog 09 July 2012 10:47:19PM *  19 points [-]

Yes, we're currently in the process of hiring a bookkeeper (interviewed one, scheduling interviews with 2 others), which will allow us to get our books in enough order that an accountant will audit our statements. We do have an outside accountant prepare our 990s already. Anyway, this all requires donations. We can't get our books cleaned up and audited unless we have the money to do so.

Also, it's my impression that many or most charities our size and smaller don't have their books audited by an accountant because it's expensive to do so. It's largely the kind of thing a charity does when they have a bigger budget than we currently do. But I'd be curious to see if there are statistics on this somewhere; I could be wrong.

And yes, we are investigating the possibility of getting pro bono work from an accounting firm; it's somewhere around #27 on my "urgent to-do list." :)

Edit: BTW, anyone seriously concerned about this matter is welcome to earmark their donations for "CPA audit" so that those donations are only used for (1) paying a bookkeeper to clean up our processes enough so that an accountant will sign off on them, and (2) paying for a CPA audit of our books. I will personally make sure those earmarks are honored.

Comment author: private_messaging 11 July 2012 04:11:39PM *  2 points [-]

How many possible universes could there be (what % of the universes) where not donating to a charity that does not do its accounting right while pulling in 500 grand a year would result in the destruction of mankind? 500 grand a year is not so little that you can get away with it. My GF's family owns a company smaller than that (in the US) and it has its books in order.

Comment author: homunq 23 July 2012 04:41:30PM 0 points [-]

Yeah, that would be really unfair, wouldn't it? And so it's hard to believe it could be true. And so it must not be.

(I actually don't believe it is likely to be true. But the fact it sounds silly and unfairly out-of-proportion is one of the worst possible arguments against it.)

Comment author: Eliezer_Yudkowsky 10 July 2012 12:47:41AM 13 points [-]

You can't deduct the value of services donated to nonprofits. Not sure your friend is as knowledgeable as stated. Outside accounting is expensive and the IRS standard is to start doing it once your donations hit $2,000,000/year, which we haven't hit yet. Also, SIAI recently passed an IRS audit.

Comment author: Vaniver 10 July 2012 01:48:13AM 10 points [-]

Fifteen seconds of Googling resulted in Deloitte's pro-bono service, which is done for CSR and employee morale rather than tax avoidance. Requests need to originate with Deloitte personnel- I know a friend who works there who might be interested in LW, but it'd be a while before I'd be comfortable asking him to recommend SI. It's a big enough company that it's likely that there are some HPMOR or LW fans that work there.

Comment author: Cosmos 11 July 2012 01:47:13AM 7 points [-]

Interesting!

"Applications for a contribution of pro bono professional services must be made by Deloitte personnel. To be considered for a pro bono engagement, a nonprofit organization (NPO) with a 501c3 tax status must have an existing relationship with Deloitte through financial support, volunteerism, Deloitte personnel serving on its Board of Directors or Trustees, or a partner, principal or director (PPD) sponsor (advocate for the duration of the engagement). External applications for this program are not accepted. Organizations that do not currently have a relationship with Deloitte are welcome to introduce themselves to the Deloitte Community Involvement Leader in their region, in the long term interest of developing one."

Deloitte is requiring a very significant investment from its employees before offering pro bono services. Nonetheless, I have significant connections there and would be willing to explore this option with them.

Comment author: RobertLumley 11 July 2012 02:27:09PM 7 points [-]

You might want to pm this directly to lukeprog to make sure that he sees this comment. Since you replied to Vaniver, he may have not seen it, and this seems important enough to merit the effort.

Comment author: Cosmos 13 July 2012 03:46:28AM 0 points [-]

Thanks for the excellent idea! I did in fact email Lukeprog personally to let him know. :)

Comment author: lukeprog 10 July 2012 02:03:02AM 7 points [-]

Thanks. As I said, this is something on our to-do list, but I didn't know about Deloitte in particular.

Comment author: lukeprog 10 July 2012 12:53:39AM 9 points [-]

Clarifications:

  • In California, a non-profit is required to obtain a CPA audit once donations hit $2m/yr, which SI hasn't hit yet. That's the way in which outside accounting is "IRS standard" after $2m/yr.
  • SI is in the process of passing an IRS audit for the year 2010.

Comment author: lukeprog 10 July 2012 01:07:58AM 8 points [-]

Eliezer is right: RobertLumley's friend is mistaken:

can the value of your time and services while providing pro bono legal services qualify as a charitable contribution that is deductible from gross income on your federal tax return? Unfortunately, in a word, nope.

According to IRS Publication 526, “you cannot deduct the value of your time or services, including blood donations to the Red Cross or to blood banks, and the value of income lost while you work as an unpaid volunteer for a qualified organization.”

Comment author: somervta 10 July 2012 01:01:50AM 1 point [-]

He may be referring to the practice of being paid for work, then giving it back as a tax-deductible charitable donation. My understanding is that you can also deduct expenses you incur while working for a non-profit - admittedly not something I can see applying to accounting. There's also cause marketing, but that's getting a bit further afield.

Comment author: GuySrinivasan 10 July 2012 02:02:45AM 2 points [-]

In the one instance of a non-profit getting accounting work done that I know of, the non-profit paid and then received an equal donation. Magic.

Comment author: Eliezer_Yudkowsky 11 July 2012 12:04:28AM 3 points [-]

This is exactly equivalent to not paying, which is precisely the IRS rationale for why donated services aren't directly deductible.

Comment author: lukeprog 10 July 2012 01:12:11AM 0 points [-]

"the big four accounting firms often do pro bono work because it can be a tax write-off" doesn't sound much like "being paid for work, then giving it back as a tax-deductible charitable donation".

Comment author: RobertLumley 10 July 2012 01:56:56PM 1 point [-]

In talking to him, I think he may have just known they do pro bono work and assumed it was because of taxes. Given Vaniver's comment, this seems pretty likely to me. He did say that the request usually has to originate from inside the company, which is consistent with that comment.

Comment author: somervta 13 July 2012 12:30:17AM 0 points [-]

Ah. That would make more sense.

Comment author: lukeprog 11 July 2012 04:34:29AM 7 points [-]

I expected more disagreement than this. Was my post really that persuasive?

Comment author: Kaj_Sotala 11 July 2012 08:21:49AM *  19 points [-]

I linked this to an IRC channel full of people skeptical of SI. One person commented that

the reply doesn't seem to be saying much

and another that

I think most arguments are 'yes we are bad but we will improve'
and some opinion based statement about how FAI is the most important thing on the world.

Which was somewhat my reaction as well - I can't put a finger on it and say exactly what it is that's wrong, but somehow it feels like this post isn't "meaty" enough to elicit much of a reaction, positive or negative. Which on the other hand feels odd, since e.g. the "SI's mission assumes a scenario that is far less conjunctive than it initially appears" heading makes an important point that SI hasn't really communicated well in the past. Maybe it just got buried under the other stuff, or something.

Comment author: lukeprog 11 July 2012 08:31:32AM *  3 points [-]

That's an unfortunate response, given that I offered a detailed DH6-level disagreement (quote the original article directly, and refute the central points), and also offered important novel argumentation not previously published by SI. I'm not sure what else people could have wanted.

If somebody figures out why Kaj and some others had the reaction they did, I'm all ears.

Comment author: TheOtherDave 11 July 2012 01:39:44PM 18 points [-]

I can't speak for anyone else, and had been intending to sit this one out, since my reactions to this post were not really the kind of reaction you'd asked for.

But, OK, my $0.02.

The claim that an organization is exceptionally well-suited to convert money into existential risk mitigation is an extraordinary one, and extraordinary claims require extraordinary evidence. This puts a huge burden on you, as the person attempting to provide that evidence.

So, I'll ask you: do you think your response provides such evidence?

If you do, then your problem seems to be (as others have suggested) one of document organization. Perhaps starting out with an elevator-pitch answer to the question "Why should I believe that SI is capable of this extraordinary feat?" might be a good idea.

Because my take-away from reading this post was "Well, nobody else is better suited to do it, and SI does some cool movement-building stuff (the Sequences, the Rationality Camps, and HPMoR) that attracts smart people and encourages them to embrace a more rational approach to their lives, and SI is fixing some of its organizational and communication problems but we need more money to really make progress on our core mission."

Which, if I try to turn it into an answer to the initial question, gives me "Well, we're better-suited than anyone else because, unlike them, we're focused on the right problem... even though you can't really tell, because what we are really focused on is movement-building, but once we get a few million dollars and the support of superhero mathematicians, we will totally focus on the right problem, unlike anyone else."

If that is in fact your answer, then one thing that might help is to make a more credible visible precommitment to that eventuality.

For example: if you had that "few million dollars a year" revenue stream, and if you had the superhero mathematician, what exactly would you do with them for, say, the first six months? Lay out that project plan in detail, establish what your criteria would be to make sure you were still focused on the right problem three months in, and set up an escrow fund (a la Kickstarter, where the funds are returned if the target is not met) to support that project plan so people who are skeptical of SI's organizational ability to actually do any of that stuff have a way of supporting the plan IFF they're wrong about SI, without waiting for their wrongness to be demonstrated before providing the support.
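The escrow mechanism suggested above can be sketched as a simple threshold pledge: money is released to the project only if the total reaches the target, and is otherwise refunded in full. This is a minimal illustration under that assumption; the class name, method names, and amounts are hypothetical and do not describe how Kickstarter or any real escrow service is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ThresholdEscrow:
    """Holds pledges until settlement; all names here are hypothetical."""
    target: float
    pledges: Dict[str, float] = field(default_factory=dict)

    def pledge(self, donor: str, amount: float) -> None:
        """Record a pledge; repeated pledges from one donor accumulate."""
        if amount <= 0:
            raise ValueError("pledge amount must be positive")
        self.pledges[donor] = self.pledges.get(donor, 0.0) + amount

    def total(self) -> float:
        return sum(self.pledges.values())

    def settle(self) -> Tuple[float, Dict[str, float]]:
        """Return (amount released to the project, refunds per donor).

        If the target is met, everything is released and nobody is refunded.
        Otherwise nothing is released and every donor gets their pledge back.
        """
        if self.total() >= self.target:
            return self.total(), {}
        return 0.0, dict(self.pledges)
```

A skeptical donor's risk in this arrangement is limited to having funds tied up until settlement, which is the property the comment is asking for.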

If your answer is in fact something else, then stating it more clearly might help.

Comment author: Eliezer_Yudkowsky 11 July 2012 07:29:23PM 23 points [-]

The claim that an organization is exceptionally well-suited to convert money into existential risk mitigation is an extraordinary one, and extraordinary claims require extraordinary evidence.

Reminder: I don't know if you were committing this particular error internally, but, at the least, the sentence is liable to cause the error externally, so: Large consequences != prior improbability. E.g. although global warming has very large consequences, and even implies that we should take large actions, it isn't improbable a priori that carbon dioxide should trap heat in the atmosphere - it's supposed to happen, according to standard physics. And so demanding strong evidence that global warming is anthropogenic is bad probability theory and decision theory. Expensive actions imply a high value of information, meaning that if we happen to have access to cheap, powerfully distinguishing evidence about global warming we should look at it; but if that evidence is not available, then we go from the default extrapolation from standard physics and make policy on that basis - not demand more powerful evidence on pain of doing nothing.

The claim that SIAI is currently best-suited to convert marginal dollars into FAI and/or general x-risk mitigation has large consequences. Likewise claims like "most possible self-improving AIs will kill you, although there's an accessible small space of good designs". This is not the same as saying that if the other facts of the world are what they appear at face value to be, these claims should require extraordinary evidence before we believe them.

Since reference class tennis is also a danger (i.e., if you want to conclude that a belief is false, you can always find a reference class in which to put it where most beliefs are false, e.g. classifying global warming as an "apocalyptic belief"), one more reliable standard to require before saying "Extraordinary claims require extraordinary evidence" is to ask what prior belief needs to be broken by the extraordinary evidence, and how well-supported that prior belief may be. Suppose global warming is real - what facet of existing scientific understanding would need to change? None, in fact; it is the absence of anthropogenic global warming that would imply change in our current beliefs, so that's what would require the extraordinary evidence to power it. In the same sense, an AI showing up as early as 2025, self-improving, and ending the world, doesn't make us say "What? Impossible!" with respect to any current well-supported scientific belief. And if SIAI manages to get together a pack of topnotch mathematicians and solve the FAI problem, it's not clear to me that you can pinpoint a currently-well-supported element of the world-model which gets broken.

The idea that the proposition contains too much burdensome detail - as opposed to an extraordinary element - would be a separate discussion. There are fewer details required than many strawman versions would have it; and often what seems like a specific detail is actually just an antiprediction, i.e., UFAI is not about a special utility function but about the whole class of non-Friendly utility functions. Nonetheless, if someone's thought processes were dominated by model risk, but they nonetheless actually cared about Earth's survival, and were generally sympathetic to SIAI even as they distrusted the specifics, it seems to me that they should support CFAR, part of whose rationale is explicitly the idea that Earth gets a log(number of rationalists) saving throw bonus on many different x-risks.

Comment author: ciphergoth 11 July 2012 07:48:24PM 12 points [-]

I am coming to the conclusion that "extraordinary claims require extraordinary evidence" is just bad advice, precisely because it causes people to conflate large consequences and prior improbability. People are fond of saying it about cryonics, for example.

Comment author: Viliam_Bur 12 July 2012 12:29:21PM 6 points [-]

We need two new versions of the advice, to satisfy everyone.

Version for scientists: "improbable claims require extraordinary evidence".

Version for politicians: "inconvenient claims require extraordinary evidence".

Comment author: fubarobfusco 11 July 2012 08:15:31PM 13 points [-]

At least sometimes, people may say "extraordinary claims require extraordinary evidence" when they mean "your large novel claim has set off my fraud risk detector; please show me how you're not a scam."

In other words, the caution being expressed is not about prior probabilities in the natural world, but rather the intentions and morals of the claimant.

Comment author: private_messaging 12 July 2012 12:46:47PM *  0 points [-]

Well, consider a strategic point of view. Suppose that a system (humans) is known for its poor performance at evaluating claims without performing direct experimentation. Long, long history of such failures.

Consider also that a false high-impact claim can ruin the ability of this system to perform its survival function, with again a long history of such events; the damage is proportional to the claimed impact. (Mayans are a good example, killing people so that the sun will rise tomorrow; great utilitarian rationalists they were; believing that their reasoning is perfect enough to warrant such action. Note that donating to a wrong charity instead of a right one kills people.)

When we anticipate that a huge percentage of the claims will be false, we can build the system to require evidence that if the claim was false the system would be in a small probability world (i.e. require that for a claim evidence was collected so that p(evidence | ~claim)/p(evidence | claim) is low), to make the system, once deployed, fall off the cliffs less often. The required strength of the evidence is then increasing with impact of the claim.

It is not an ideal strategy, but it is the one that works given the limitations. There are other strategies and it is not straightforward to improve performance (and easy to degrade performance by making idealized implicit assumptions).
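One way to make the rule in this comment concrete is sketched below. It is an interpretation, not the commenter's own formalization: it assumes the cost of acting on a false claim equals the claimed impact, and that the system tolerates only a fixed expected loss. The function name and parameters are hypothetical. Under those assumptions, the largest acceptable ratio p(evidence | ~claim) / p(evidence | claim) shrinks as the claimed impact grows.

```python
def max_likelihood_ratio(prior_true: float, impact: float, tolerated_loss: float) -> float:
    """Largest acceptable r = p(evidence | ~claim) / p(evidence | claim).

    Assumed model (an illustration, not from the original comment):
    acting on a false claim costs `impact`, and we require
        P(~claim | evidence) * impact <= tolerated_loss.
    By Bayes' rule, with prior p = prior_true,
        P(~claim | evidence) = (1 - p) * r / ((1 - p) * r + p).
    Solving the inequality for r gives the bound returned here.
    """
    if not 0.0 < prior_true < 1.0:
        raise ValueError("prior_true must be strictly between 0 and 1")
    if impact <= 0 or tolerated_loss <= 0:
        raise ValueError("impact and tolerated_loss must be positive")
    q = tolerated_loss / impact  # highest tolerable P(~claim | evidence)
    if q >= 1.0:
        return float("inf")  # any evidence is acceptable at this impact
    return (q * prior_true) / ((1.0 - prior_true) * (1.0 - q))


if __name__ == "__main__":
    # Same prior and loss tolerance, increasing claimed impact:
    for impact in (10.0, 100.0, 1000.0):
        print(impact, max_likelihood_ratio(0.5, impact, tolerated_loss=1.0))
```

A smaller returned ratio means stronger evidence is demanded, so in this toy model the evidence requirement rises with the claimed impact even when the prior is held fixed.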

Comment author: TheOtherDave 11 July 2012 08:11:12PM 4 points [-]

I don't know if you were committing this particular error internally, but, at the least, the sentence is liable to cause the error externally, so: Large consequences != prior improbability.

What I meant when I described the claim (hereafter "C") that SI is better suited to convert dollars to existential risk mitigation than any other charitable organization as "extraordinary" was that priors for C are low (C is false for most organizations, and therefore likely to be false for SI absent additional evidence about SI), not that C has large consequences (although that is true as well).

Yes, this might be a failing of using the wrong reference class (charitable organizations in general) to establish one's priors, as you suggest. The fact remains that when trying to solicit broad public support, or support from an organization like GiveWell, it's likely that SI will be evaluated within the reference class of other charities. If using that reference class leads to improperly low priors for C, it seems SI has a few strategic choices:

1) Convince GiveWell, and donors in general, that SI is importantly unlike other charities, and should not be evaluated as though it were like them -- in other words, win at reference class tennis.

2) Ignore donors in general and concentrate its attention primarily on potential donors who already use the correct reference class.

3) Provide enough evidence to convince even someone who starts out with improperly low priors drawn from the incorrect reference class of "SI is a charity" to update to a sufficiently high estimate of C that donating money to SI seems reasonable (in practice, I think this is what has happened and is happening with anthropogenic climate change).

4) Look for alternate sources of funding besides charitable donations.

One way to approach strategy #1 is the one you use here -- shift the conversation from whether or not SI can actually spend money effectively to mitigate existential risk to whether or not uFAI/FAI by 2025 (or some other near-mode threshold) is plausible.

That's not a bad tactic; it works pretty well in general.

Comment author: Eliezer_Yudkowsky 11 July 2012 08:47:05PM 4 points [-]

Your statement was that it was an extraordinary claim that SIAI provided x-risk reduction - why then would SIAI be compared to most other charities, which don't provide x-risk reduction, and don't claim to provide x-risk reduction? The AI-risk item was there for comparison of standards, as was global warming; i.e., if you claim that you doubt X because of Y, but Y implies doubting Z, but you don't doubt Z, you should question whether you're really doubting X because of Y.

Comment author: TheOtherDave 11 July 2012 09:05:59PM 4 points [-]

why then would SIAI be compared to most other charities, which don't provide x-risk reduction, and don't claim to provide x-risk reduction?

Are you trying to argue that it isn't in fact being compared to other charities? (Specifically, by GiveWell?) Or merely that if it is, those doing such comparison are mistaken?

If you're arguing the former... huh. I will admit, in that case, that almost everything I've said in this thread is irrelevant to your point, and I've completely failed to follow your argument. If that's the case, let me know and I'll back up and re-read your argument in that context.

If you're arguing the latter, well, I'm happy to grant that, but I'm not sure how relevant it is to Luke's goal (which I take to be encouraging Holden to endorse SI as a charitable donation).

If SI wants to argue that GiveWell's expertise with evaluating other charities isn't relevant to evaluating SI because SI ought not be compared to other charities in the first place, that's a coherent argument (though it raises the question of why GiveWell ever got involved in evaluating SI to begin with... wasn't that at SI's request? Maybe not. Or maybe it was, but SI now realizes that was a mistake. I don't know.)

But as far as I can tell that's not the argument SI is making in Luke's reply to Holden. (Perhaps it ought to be? I don't know.)

Comment author: Eliezer_Yudkowsky 11 July 2012 10:33:35PM 8 points [-]

I worry that this conversation is starting to turn around points of phrasing, but... I think it's worth separating the ideas that you ought to be doing x-risk reduction and that SIAI is the most efficient way to do it, which is why I myself agreed strongly with your own, original phrasing, that the key claim is providing the most efficient x-risk reduction. If someone's comparing SIAI to Rare Diseases in Cute Puppies or anything else that isn't about x-risk, I'll leave that debate to someone else - I don't think I have much comparative advantage in talking about it.

Comment author: TheOtherDave 12 July 2012 12:26:04AM 1 point [-]

I agree with you on all of those points.

Further, it seems to me that Holden is implicitly comparing SI to other charitable-giving opportunities when he provides GW's evaluation of SI, rather than comparing SI to other x-risk-reduction opportunities.
I tentatively infer, from the fact that you consider responding to such a comparison something you should leave to others but you're participating in a discussion of how SI ought to respond to Holden, that you don't agree that Holden is engaging in such a comparison.

If you're right, then I don't know what Holden is doing, and I probably don't have a clue how Luke ought to reply to Holden.

Comment author: private_messaging 12 July 2012 12:28:11PM -2 points [-]

There are fewer details required than many strawman versions would have it; and often what seems like a specific detail is actually just an antiprediction, i.e., UFAI is not about a special utility function but about the whole class of non-Friendly utility functions.

If by "utility function" you mean "a computable function, expressible using lambda calculus" (or Turing machine tape or Python code; these are equivalent), then arguing that the majority of such functions lead to a model-based, utility-based agent killing you is a huge stretch, as such functions are not grounded, and the correspondence of the model with the real world is not a sub-goal of finding the maximum of such a function.

Comment author: lukeprog 11 July 2012 06:11:48PM 5 points [-]

The claim that an organization is exceptionally well-suited to convert money into existential risk mitigation is an extraordinary one... "Why should I believe that SI is capable of this extraordinary feat?"

SI is not exceptionally well-suited for x-risk mitigation relative to some ideal organization, but relative to the alternatives (as you said). But the reason I gave for this was not "unlike them, we're focused on the right problem", though I think that's true. Instead, the reasons I gave (twice!) were:

SI has successfully concentrated lots of attention, donor support, and human capital. Also, SI has learned many lessons about how to run a very tricky kind of organization. AI risk reduction is a mission that (1) is beyond most people's time horizons for caring, (2) is hard to understand and visualize, (3) pattern-matches to science fiction and apocalyptic religion, (4) suffers under complicated and necessarily uncertain strategic considerations (compare to the simplicity of bed nets), (5) has a very small pool of people from which to recruit researchers, etc. SI has lots of experience with these issues; experience that probably takes a long time and lots of money to acquire.

As for getting back to the original problem rather than just doing movement-building, well... that's what I've been fighting for since I first showed up at SI, via Open Problems in Friendly AI. And now it's finally happening, after SPARC.

if you had that "few million dollars a year" revenue stream, and if you had the superhero mathematician, what exactly would you do with them for, say, the first six months? Lay out that project plan in detail, establish what your criteria would be to make sure you were still focused on the right problem three months in, and set up an escrow fund (a la Kickstarter, where the funds are returned if the target is not met) to support that project plan so people who are skeptical of SI's organizational ability to actually do any of that stuff have a way of supporting the plan IFF they're wrong about SI, without waiting for their wrongness to be demonstrated before providing the support.

Yes, this is a promising idea. It's also probably 40-100 hours of work, and there are many other urgent things for us to do as well. That's not meant as a dismissal, just as a report from the ground of "Okay, yes, everyone's got a bunch of great ideas, but where are the resources I'm supposed to use to do all those cool things? I've been working my ass off but I can't do even more stuff that people want without more resources."

Comment author: TheOtherDave 11 July 2012 07:20:32PM 1 point [-]

It's also probably 40-100 hours of work, and there are many other urgent things for us to do as well.

Absolutely. As I said in the first place, I hadn't initially intended to reply to this, as I didn't think my reactions were likely to be helpful given the situation you're in. But your followup comment seemed more broadly interested in what people might have found compelling, and less in specific actionable suggestions, than your original post. So I decided to share my thoughts on the former question.

I totally agree that you might not have the wherewithal to do the things that people might find compelling, and I understand how frustrating that is.

It might help emotionally to explicitly not-expect that convincing people to donate large sums of money to your organization is necessarily something that you, or anyone, are able to do with a human amount of effort. Not that this makes the problem any easier, but it might help you cope better with the frustration of being expected to put forth an amount of effort that feels unreasonably superhuman.

Or it might not.

Instead, the reasons I gave (twice!) were: [..]

I'll observe that the bulk of the text you quote here is not reasons to believe SI is capable of it, but reasons to believe the task is difficult. What's potentially relevant to the former question is:

SI has successfully concentrated lots of attention, donor support, and human capital. Also, SI has learned many lessons [and] has lots of experience with these issues;

If that is your primary answer to "Why should I believe SI is capable of mitigating x-risk given $?", then you might want to show why the primary obstacles to mitigating x-risk are psychological/organizational issues rather than philosophical/technical ones, such that SI's competence at addressing the former set is particularly relevant. (And again, I'm not asserting that showing this is something you are able to do, or ought to be able to do. It might not be. Heck, the assertion might even be false, in which case you actively ought not be able to show it.)

You might also want to make more explicit the path from "we have experience addressing these psychological/organizational issues" to "we are good at addressing these psychological/organizational issues (compared to relevant others)". Better still might be to focus your attention on demonstrating the latter and ignore the former altogether.

Comment author: lukeprog 11 July 2012 08:04:46PM *  1 point [-]

Thank you for understanding. :)

My statement "SI has successfully concentrated lots of attention, donor support, and human capital [and also] has learned many lessons [and] has lots of experience with [these unusual, complicated] issues" was in support of "better to help SI grow and improve rather than start a new, similar AI risk reduction organization", not in support of "SI is capable of mitigating x-risk given money."

However, if I didn't also think SI was capable of reducing x-risk given money, then I would leave SI and go do something else, and indeed will do so in the future if I come to believe that SI is no longer capable of reducing x-risk given money. How to Purchase AI Risk Reduction is a list of things that (1) SI is currently doing to reduce AI risk, or that (2) SI could do almost immediately (to reduce AI risk) if it had sufficient funding.

Comment author: TheOtherDave 11 July 2012 08:22:19PM 1 point [-]

My statement [..] was in support of "better to help SI grow and improve rather than start a new, similar AI risk reduction organization", not in support of "SI is capable of mitigating x-risk given money."

Ah, OK. I misunderstood that; thanks for the clarification.
For what it's worth, I think the case for "support SI >> start a new organization on a similar model" is pretty compelling.

And, yes, the "How to Purchase AI Risk Reduction" series is an excellent step in the direction of making SI's current and planned activities, and how they relate to your mission, more concrete and transparent. Yay you!

Comment author: Mass_Driver 11 July 2012 09:32:33PM *  2 points [-]

I strongly agree with this comment, and also have a response to Eliezer's response to it. While I share TheOtherDave's views, as TheOtherDave noted, he doesn't necessarily share mine!

It's not the large consequences that make it a priori unlikely that an organization is really good at mitigating existential risks -- it's the objectively small probabilities and lack of opportunity to learn by trial and error.

If your goal is to prevent heart attacks in chronically obese, elderly people, then you're dealing with reasonably large probabilities. For example, the AHA estimates that a 60-year-old, 5'8" man weighing 220 pounds has a 10% chance of having a heart attack in the next 10 years. You can fiddle with their calculator here. This is convenient, because you can learn by trial and error whether your strategies are succeeding. If only 5% of a group of the elderly obese under your treatment have heart attacks over the next 10 years, then you're probably doing a good job. If 12% have heart attacks, you should probably try another tactic. These are realistic swings to expect from an effective treatment -- it might really be possible to cut the rate of heart attacks in half among a particular population. This study, for example, reports a 25% relative risk reduction. If an organization claims to be doing really well at preventing heart attacks, it's a credible signal -- if they weren't doing well, someone could check their results and prove it, which would be embarrassing for the organization. So, that kind of claim only needs a little bit of evidence to support it.
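As a quick check on the heart-attack arithmetic above, here is a minimal sketch of the relative-risk-reduction calculation. The function name is hypothetical; the 10%, 5%, and 12% figures are the ones quoted in the paragraph.

```python
def relative_risk_reduction(baseline_rate: float, treated_rate: float) -> float:
    """Fraction by which the treated rate falls below the baseline rate.

    A negative result means the treated group did worse than baseline.
    """
    if not 0.0 < baseline_rate <= 1.0:
        raise ValueError("baseline_rate must be in (0, 1]")
    if not 0.0 <= treated_rate <= 1.0:
        raise ValueError("treated_rate must be in [0, 1]")
    return (baseline_rate - treated_rate) / baseline_rate


baseline = 0.10  # the 10-year heart attack risk quoted above
for observed in (0.05, 0.12):
    rrr = relative_risk_reduction(baseline, observed)
    print(f"observed {observed:.0%}: relative risk reduction {rrr:.0%}")
```

Against a 10% baseline, an observed 5% rate is a 50% relative reduction, while an observed 12% rate is a 20% relative increase. Both are large enough to notice in an ordinary trial, which is the contrast being drawn with existential risk.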

On the other hand, any given existential risk has a small chance of happening, a smaller chance of being mitigated, and, by definition, little or no opportunity to learn by trial and error. For example, the odds of an artificial intelligence explosion in the next 10 years might be 1%. A team of genius mathematicians funded with $5 million over the next 10 years might be able to reduce that risk to 0.8%. However, this would be an extraordinarily difficult thing to estimate. These numbers come from back-of-the-envelope Fermi calculations, not from hard data. They can't come from hard data -- by definition, existential risks haven't happened yet. Suppose 10 years go by, and the Singularity Institute gets plenty of funding, and they declare that they successfully reduced the risk of unfriendly AI down to 0.5%, and that they are on track to do the same for the next decade. How would anyone even go about checking this claim?

An unfriendly intelligence explosion, by its very nature, will use tactics and weaknesses that we are not presently aware of. If we learn about some of these weaknesses and correct them, then uFAI would use other weaknesses. The Singularity Institute wants to promote the development of a provably friendly AI; the thought is that if the AI's source code can be shown mathematically to be friendly, then, as long as the proof is correct and the code is faithfully entered by the programmers and engineers, we can achieve absolute protection against uFAI, because the FAI will be smart enough to figure that out for us. But while it's very plausible to think that we will face significant AI risk in the next 30 years (i.e., the risk arises under a disjunctive list of conditions), it's not likely that we will face AI risk, and that AI will turn out to have the capacity to exponentially self-improve, and that there is a theoretical piece of source code that would be friendly, and that at least one such code can provably be shown to be friendly, and that a team of genius mathematicians will actually find that proof, and that these mathematicians will prevail upon a group of engineers to build the FAI before anyone else builds a competing model. This is a conjunctive scenario.
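The conjunctive/disjunctive distinction can be made concrete with a small sketch. Everything numeric here is an assumption for illustration: the paragraph gives no per-step probabilities, so each of the six steps is arbitrarily assigned 0.5, and the steps are treated as independent, which they may well not be.

```python
from math import prod
from typing import Sequence


def conjunctive(probabilities: Sequence[float]) -> float:
    """Probability that ALL independent conditions hold."""
    return prod(probabilities)


def disjunctive(probabilities: Sequence[float]) -> float:
    """Probability that AT LEAST ONE independent condition holds."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Six steps, matching the six "and that..." clauses above; 0.5 each is a placeholder.
steps = [0.5] * 6
print("all six hold:      ", conjunctive(steps))
print("at least one holds:", disjunctive(steps))
```

Even with fairly generous per-step odds, the chance that every step holds falls quickly as steps are added, while the chance that at least one of several routes holds rises. That asymmetry is the structure of the argument, whatever the true numbers are.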

It's not at all clear to me how just generally having a team of researchers who are moderately familiar with the properties of the mathematical objects that determine the friendliness of AI could do anything to reduce existential risk if this conjunctive scenario doesn't come to pass. In other words, if we get self-replicating autonomous moderately intelligent AIs, or if it turns out that there's no such thing as a mathematical proof of friendliness, or if AI first comes about by way of whole brain emulation, then I don't understand how the Singularity Institute proposes to make itself useful. It's not a crazy thought that having a ready-made team of seasoned amateurs ready to tackle the problems of AI would yield better results than having to improvise a response team from scratch...but there are other charitable proposals (including proposals to reduce other kinds of x-risk) that I find considerably more compelling. If you want me to donate to the Singularity Institute, you'll have to come up with a better plan than "This incredibly specific scenario might come to pass and we have a small chance of being able to mitigate the consequences if it does, and even if the scenario doesn't come to pass, it would still probably be good to have people like us on hand to cope with unspecified similar problems in unspecified ways."

By way of analogy, a group of forward-thinking humanitarians in 1910 could have plausibly argued that somebody ought to start getting ready to think about ways to help protect the world against the unknown risks of new discoveries in theoretical physics...but they probably would have been better off thinking up interesting ways of stopping World War I or a re-occurrence of the dreaded 1893 Russian Flu. The odds that even a genius team of humanitarian physicists would have anticipated the specific course that cutting-edge physics would take -- involving radioactivity, chain reactions, uranium enrichment, and implosion bombs -- just from baseline knowledge about Bohr's model of the atom and Marie Curie's discovery of radioactivity -- are already incredibly low. The further odds that they would take useful steps, in the 1910s, to devise and execute an effective plan to stop the development of nuclear weapons or even to ensure that they were not used irresponsibly, seem astronomically low. The team might manage, in a general way, to help improve the security controls on known radioactive materials -- but, as actually happened, new materials were found to be radioactive, and new ways were found of artificially enhancing the radioactivity of a substance, and in any event most governments had secret stockpiles of fissile material that would not have been reached by ordinary security controls.

Today, we know a little something about computer science, and it's understandable to want to develop expertise in how to keep computers safe -- but we can't anticipate the specific course of discoveries in cutting-edge computer science, and even if we could, it's unlikely that we'll be able to take action now to help us cope with them, and if our guesses about the future prove to be close but not exactly accurate, then it's even more unlikely that the plans we make now based on our guesses will wind up being useful.

That's why I prefer to donate to charities that are either (a) attempting to alleviate suffering that is currently and verifiably happening, e.g., Deworm the World, or (b) obviously useful for preventing existential risks in a disjunctive way, e.g., the Millennium Seed Bank. I have nothing against the SI -- I wish you well and hope you grow and succeed. I think you're doing better than the vast majority of charities out there. I just also think there are even better uses for my money.

EDIT: Clarified that my views may be different from TheOtherDave's, even though I agree with his views.

Comment author: TheOtherDave 11 July 2012 09:38:24PM 1 point [-]

I should say, incidentally (since this was framed as agreement to my comment) that Mass_Driver's point is rather different from mine.

Comment author: Grognor 11 July 2012 02:48:19PM *  13 points [-]

One sad answer is that your post is boring, which is another way of saying it doesn't have enough Dark Arts to be sufficiently persuasive.

There are many ways to infect a population with a belief; presenting evidence for its accuracy is among the least effective

-Sister Y

Comment author: Rain 11 July 2012 12:52:32PM *  9 points [-]

It didn't have the same cohesiveness as Holden's original post; there were many more dangling threads, to borrow the same metaphor I used to say why his post was so interesting. You wrote it as a technical, thoroughly cited response and literature review instead of a heartfelt, wholly self-contained Mission Statement, and you made that very clear by stating at least 10 times that there was much more info 'somewhere else' (in conversations, in people's heads, yet to be written, etc.).

He wrote an intriguing short story, you wrote a dry paper.

Edit: Also, the answer to every question seems to be, "That will be in Eliezer's next Sequence," which postpones further debate.

Comment author: Jack 11 July 2012 03:04:33PM *  2 points [-]

I doubt random skeptics on the internet followed links to papers. Their thoughts are unlikely to be diagnostic. The group of people who disagree with you and will earnestly go through all the arguments is small. Also, explanations of the form "Yes this was a problem but we're going to fix it." are usually just read as rationalizations. It sounds a bit like "Please, sir, give me another chance. I know I can do better" or "I'm sorry I cheated on you. It will never happen again". The problems actually have to be fixed before the argument is rebutted. It will go better when you can say things like "We haven't had any problems of this kind in 5 years".

Comment author: private_messaging 11 July 2012 03:44:11PM *  -2 points [-]

The group of people who disagree with you and will earnestly go through all the arguments is small.

It is also really small for, e.g., a perpetual motion device constructed using gears, weights, and levers - very few people would even look at the blueprint. It is a bad strategy to dismiss a critique on the grounds that the critic did not read the whole thing. Meta considerations work sometimes.

Sensible priors for p(our survival at risk|rather technically unaccomplished are the most aware of the risk) and p(rather technically unaccomplished are the most aware of the risk|our survival at risk) are very, very low. Meanwhile p(rather technically unaccomplished are the most aware of the risk|our survival is not actually at risk) is rather high (it's commonly the case that someone's scared of something). p(high technical ability) is low to start with, p(highest technical ability) is very very low, and p(high technical ability | no technical achievement) is much lower still, especially given reasonable awareness that technical achievement is instrumental to being taken seriously. p(ability to self deceive) is not very low, p(ability to deceive oneself and others) is not very low, there is a well known tendency to overspend on safety (see TSA), the notion of the living machine killing its creator is very very old, and there are plenty of movies to that point. In the absence of some sort of achievement that is highly unlikely to be an evaluation error, the probability that you guys matter is very low. That's partly what Holden was talking about. The strongest point of his - you are not performing to the standards - means that even if he buys into AI danger or FAI importance, he would not recommend donating to you.

Comment author: ChrisHallquist 11 July 2012 09:25:35AM 3 points [-]

I found the "less conjunctive" section very persuasive, and suspect Kaj may be right about it getting buried.

Comment author: pcm 20 July 2012 11:49:46PM 3 points [-]

The discussion of how conjunctive SIAI's vision is seems unclear to me. Luke appears to have responded to only part of what I think Holden is likely to have meant.

Some assumptions whose conjunctions seem important to me (in order of decreasing importance):

1) The extent to which AGI will consist of one entity taking over the world versus many diverse entities with limited ability to dominate the others.

2) The size of the team required to build the first AGI (if it requires thousands of people, a nonprofit is unlikely to acquire the necessary resources; if it can be done by one person, I wouldn't expect that person to work with SIAI [1]).

3) The degree to which concepts such as "friendly" or "humane" can be made clear enough to be implemented in software.

4) The feasibility of an AGI whose goals can be explicitly programmed before AGIs with messier goals become dominant. We have an example of intelligence with messy goals, which gives us some clues about how hard it is to create one. We have no comparable way of getting an outside view of the time and effort required for an intelligence with clean goals.

It seems reasonable to infer from this that SIAI has a greater than 90% chance of becoming irrelevant. But for an existential risk organization, a 90% chance of being irrelevant should seem like a very weak argument against it.

I believe that the creation of CFAR is a serious attempt to bypass problems associated with assumption 2, and my initial impression of CFAR is that it (but not SIAI) has a good claim to being the most valuable charity.

[1] I believe an analogy to Xanadu is useful, especially in the unlikely event that an AGI can be built by a single person. The creation of the world wide web was somewhat predictable and predicted, and for a long time Xanadu stood out as the organization which had given the most thought to how the web should be implemented. I see many similarities between the people at Xanadu and the people at SIAI in terms of vision and intelligence (although people at SIAI seem more willing to alter their beliefs). Yet if Tim Berners-Lee had joined Xanadu, he wouldn't have created the web. Two of the reasons are that the proto-transhumanist culture with which Xanadu was associated was reluctant to question the beliefs that the creators of the web needed to charge money for their product, and that the web should ensure that authors were paid for their work. I failed to question those beliefs in 1990. I haven't seen much evidence that either I or SIAI are much better today at doing the equivalent of identifying those as assumptions that were important to question.

Comment author: AlexMennen 10 July 2012 12:56:38AM *  7 points [-]

The purpose of an FAI team is not to blindly develop one particular approach to Friendly AI without checking to see whether this work will be obsoleted by future developments. Instead, the purpose of an FAI team is to develop highly specialized expertise on, among other things, which kinds of research are more and less likely to be relevant given future developments.

This is unsettling. It sounds a lot like trying to avoid saying anything specific.

Comment author: lukeprog 10 July 2012 01:03:47AM 10 points [-]

Eliezer will have lots of specific things to say in his forthcoming "Open Problems in Friendly AI" sequence (I know; I've seen the outline). In any case, wouldn't it be a lot more unsettling if, at this early stage, we pretended we knew enough to commit entirely to one very particular approach?

Comment author: AlexMennen 10 July 2012 02:16:52AM 10 points [-]

It's unsettling that this is still an early stage. SI has been around for over a decade. I'm looking forward to the open problems sequence; perhaps I should shut up about the lack of explanation of SI's research for now, considering that the sequence seems like a credible promise to remedy this.

Comment author: ScottMessick 11 July 2012 06:25:08PM 4 points [-]

I'm really glad you pointed out that SI's strategy is not predicated on hard take-off. I don't recall if this has been discussed elsewhere, but that's something that always bothered me since I think hard take-off is relatively unlikely. (Admittedly, soft take-off still considerably diminishes my expected impact for SI and donating to it.)

Comment author: Bruno_Coelho 14 July 2012 05:33:21AM 0 points [-]

For some time I thought EY supported hard takeoff (the "bunch of guys in a garage" argument), but if Luke now says it's not so, then OK.

Comment author: ChrisHallquist 13 July 2012 08:15:24AM 1 point [-]

After being initially impressed by this, I found one thing to pick at:

Reason 1: Mitigating AI risk could mitigate all other existential risks, but not vice-versa.

"Could" here tells you very little. The question isn't whether "build FAI" could work as a strategy for mitigating all other existential risks, it's whether that strategy has a good enough chance of working to be superior to other strategies for mitigating the other risks. What's missing is an argument for saying "yes" to that second question.

Comment author: AlexMennen 10 July 2012 12:09:44AM 0 points [-]

What if the smartest, most careful, most insanely safety-conscious AI researchers humanity can produce just aren't smart enough to solve the problem?

This is very worrying, especially in light of the lack of a public research agenda. SI's inability to describe its research agenda suggests the possibility that they cannot describe their research agenda because they do not know what they are doing because FAI is such a ridiculously hard problem that they have no idea where to begin. I'm hoping that SI will soon be able to make it clear that this is not the case.

What if no humans are altruistic enough to choose to build FAI over an AI that will make them king of the universe?

This is weak. Humans are pretty good at cooperation, and FAI will have to be a cooperative endeavor anyway. I suppose an organization could conspire to create AGI that will optimize for the organization's collective preferences rather than humanity's collective preferences, but this won't happen because: 1. No one will throw a fit and defect from an FAI project because they won't be getting special treatment, but people will throw a fit if they perceive unfairness, so Friendly-to-humanity-AI will be a lot easier to get funding and community support for than friendly-to-exclusive-club-AI. 2. Our near mode reasoning cannot comprehend how much better a personalized AGI slave would be over FAI for us personally, so people will make that sort of decision in far mode, where idealistic values can outweigh greediness.

Finally, even if some exclusive club did somehow create an AGI that was friendly to them in particular, it wouldn't be that bad. Even if people don't care about each other very much, we do at least a little bit. Let's say that an AGI optimizing an exclusive club's CEV devotes .001% of its resources to things the rest of humanity would care about, and the rest to the things that just the club cares about. This is only worse than FAI by a factor of 10^5, which is negligible compared to the difference between FAI and UFAI.
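The "factor of 10^5" figure follows directly from the .001% share assumed above. A minimal check, using exact fractions to avoid floating-point rounding (the variable names are hypothetical, and the .001% share is the comment's own assumption, not an estimate):

```python
from fractions import Fraction

# .001% of resources go to what the rest of humanity cares about.
humanity_share = Fraction(1, 1000) / 100

# If an FAI would devote the full share (1) to humanity's preferences,
# the club AGI is worse for everyone else by the reciprocal of its share.
shortfall_factor = 1 / humanity_share

print(humanity_share)    # exact fraction of resources
print(shortfall_factor)  # how many times worse than FAI, on this model
```

This treats value as linear in resources devoted, which is a simplifying assumption implicit in the comment's comparison.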

Comment author: lukeprog 10 July 2012 12:16:42AM 7 points [-]

This is very worrying, especially in light of the lack of a public research agenda. SI's inability to describe its research agenda suggests the possibility that they cannot describe their research agenda because they do not know what they are doing because FAI is such a ridiculously hard problem that they have no idea where to begin. I'm hoping that SI will soon be able to make it clear that this is not the case.

Yeah, this is the point of Eliezer's forthcoming 'Open Problems in Friendly AI' sequence, which I personally wish he had written in 2009 after his original set of sequences.

Comment author: Aeonios 19 July 2012 03:41:14AM -3 points [-]

There are several reasons why I agree with the "Pascal's Mugging" comment:

  1. Intelligence Explosion: There are several reasons why an intelligence explosion is highly unlikely. First, upgrading computer fabrication equipment requires on the order of 5-15 billion dollars. Second, intelligence is not measured in gigaflops or petaflops, and mere improvement of fabrication technology is insufficient to increase intelligence. Finally, the requisite variety that drives innovation and creation will be extremely difficult to produce in AIs of a limited quantity. Succeeding in engineering or science requires copious amounts of failure, and AIs are not immune to this either.

  2. Computing Overhang: The very claim of "computing overhang" shows total ignorance of actual AI, and of the incredible complexity of human intelligence. The human brain is made up of numerous small regions which both "run programs" inside of themselves and communicate via synchronous signals with the rest of the brain in concert (in neural, and not transistor form). A human level AI would be the same, and could not simply be run on, say, your average web server, no matter how decked out it is. An AI that could run on "extra" hardware would probably be too primitive to reproduce itself on purpose, and if it did it would be a minor nuisance at worst.

  3. The idea that AIs can be "programmed" is mostly nonsense. Very simple AIs can be "programmed", sure, but neural networks require training by experience, just like humans. An AI with human level intelligence or greater would need to be taught like a child, and any "friendliness" that came of it would be the result of its "instincts" (I'm guessing we wouldn't want AIs with aggression) and of its experience. Additionally, as mentioned above, the need for variety in intelligence to produce real progress means that copying them will not be as economical as it might seem, not to mention not nearly as simple as you make it out to be.

  4. The timescales you present are absurd. Humans barely have an understanding of human psychology, and they do terribly with the knowledge they do have. We may have teraflops desktop computers in 20 years, but that does not imply that they will magically sprout intelligence! Technically, even with today's technology you could produce a program much more sophisticated than SHRDLU was, and receive orders of magnitude better performance than the original did, but it is the complexity of programming something that learns that prevents it from occurring commonly. It will likely be a hundred, maybe two hundred, years before we have a sophisticated enough understanding of human intelligence to reproduce it in any meaningful way. We have only taken the bare first steps into the field thus far, and development has been much slower than for the rest of the computing industry.

In short, human stupidity that is occurring right now is a much greater threat to our future as a species than is any hypothetical superintelligent AI that might finally appear a hundred years or more in the future. If human civilization is even to maintain its integrity long enough to produce such a thing ever, then widespread ignorance of economics, spirituality/psychology, and general lack of sensitivity to culture and art must be dealt with first and foremost.

Comment author: DaFranker 19 July 2012 04:17:22PM *  2 points [-]

Nice try. You've almost succeeded at summarizing practically all the relevant arguments against the SI initiative that have already been refuted. Notice the last part there that says "have already been refuted".

Each of the assertions you make is one that members of the SI have already addressed and refuted. I'd take the time to decompose your post into a list of assertions and give you links to the particular articles and posts where those arguments were taken down, but I believe this would be an unwise use of my time.

It would, at any rate, be much simpler to tell you to at least read the articles on the Facing the Singularity site, which are a good vulgarized introduction to the topic. In particular, the point of timescale overestimates is clearly addressed there, as is that of the "complexity" of human intelligence.

I'd like to also indicate that you are falsely overcomplexifying the activity of the human brain. There are no such things as "numerous small regions" that "run programs" or "communicate". These are interpretations of patterns within the natural events, which are simply, first and foremost, a huge collection of neurons sending signals to other neurons, each with its own unique set of links to particular other neurons and a domain of nearby neurons to which it could potentially link itself. This is no different from the old core sequence article here on LessWrong where Eliezer talks about how reality doesn't actually follow the rules of aerodynamics to move air around a plane - it's merely interactions of countless tiny [bits of something] on a grand scale, with each tiny [bit of something] doing its own thing, and nowhere along the entire process do the formulae we use for aerodynamics get "solved" to decide where one of the [bits of something] must go.

Anyway, I'll cut myself short here - I doubt any more deserves to be said on this. If you are willing to learn and question yourself, and actually want to become a better rationalist and obtain more correct beliefs, the best way to start is to go read some of the articles that are already on LessWrong and actually read the material on the Singinst.org website, most of which is very readable even without prior technical knowledge or experience.

Comment author: homunq 23 July 2012 04:58:10PM 1 point [-]

I don't pretend I've read every refutation of Aeonios's arguments that's out there, but I've read a few. Generally, those "refutations" strike me as plausible arguments by smart people, but far from bulletproof. Thus, I think that your [DaFranker's] attitude of "I know better so I barely have time for this" isn't the best one.

(I'm sorry, I don't have time to get into the details of the arguments themselves, so this post is all meta. I realize that that's somewhat hypocritical, but "hypocrisy is the tribute vice pays to virtue" so I'm OK with that.)

Comment author: DaFranker 23 July 2012 10:26:33PM 0 points [-]

Indeed, most of them are nothing but smart arguments by smart people, and have not been formally proven. However, none of the arguments for anything in AI research is formally proven, except for some very primitive mathematics and computer science stuff. Basically, at the moment all we have to go on is a lot of thought, some circumstantial "evidence" and our sets of beliefs.

All I'm saying is that, if you watch the trend, it's much more likely (with my priors, at least) that the S.I. is "right" and that the arguments that keep being brought against it are unenlightened, in light of a few key observables: each argument against S.I. being "refuted" one after another historically, most of the critics of the S.I. not having spent nearly as much time thinking about the issues at hand and actually researching AIs, etc.

It's not that I know better, merely that the evidence presented to me from "both sides" (if one were to arbitrarily delimit two specific opposing factions, for simplification) and my own knowledge of the world seem to indicate that the "S.I. side" has propositions which are much more likely to be true. I'll admit that the end result does project that attitude, but this is mainly incidental to the fact that I actually was pressed for time when I wrote that particular post, and I did believe it would be pointless to discuss and argue further for the benefit of an outsider who hadn't yet read the relevant material on the topic at hand.

Comment author: homunq 24 July 2012 02:13:25AM *  2 points [-]

But in this case, "more likely to be true" means something like "a good enough argument to move my priors by roughly an order of magnitude, or two at the outside". Since in the face of our ignorance of the future, reasonable priors could differ by several orders of magnitude, even the best arguments I've seen aren't enough to dismiss any "side" as silly or not worthy of further consideration (except stuff that was obviously silly to begin with).

Comment author: DaFranker 24 July 2012 01:56:31PM *  1 point [-]

That's a very good point.

I was intuitively tempted to retort a bunch of things about likelihood of exception and information taken into consideration, but I realized before posting that I was actually falling victim to several biases in that train of thought. You've actually given me a new way to think of the issue. I'm still of the intuition that any new way to think about it will only reinforce my beliefs and support the S.I. over time, though.

For now, I'm content to concede that I was weighing too heavily on my priors and my confidence in my own knowledge of the universe (on which my posteriors for AI issues inevitably depend, in one way or another), among possibly more mistakes. However, it seems at first glance to be even more evidence for the need of a new mathematical or logical language to discuss these questions more in depth, detail and formality.

Comment author: Nautilus 21 July 2012 02:11:33PM 0 points [-]

general lack of sensitivity to culture and art must be dealt with first

Where'd that come from? Are you an artist / anthropologist?

Comment author: siodine 11 July 2012 04:55:44PM *  -1 points [-]

SI and rationality

Paraphrasing:

Holden expects us to have epistemic and instrumental powers of rationality that would make us successful in Western society; however, this is a strawman. Being rational isn't succeeding in society, but succeeding at your own goals.

(Btw, I'm going to coin a new term for this: the straw-morra [a reference to the main character from Limitless]).

Now that being said, you shouldn't anticipate that the members of SI would be morra-like.

There's a problem with this: arguments made to support an individual are not nearly as compelling for groups. We should anticipate that people will have weird goals, and end up doing things that break societal convention, but on average? I don't think so. When you have a group of people like those working at SI, you should anticipate that there are a few people that are morra-like -- and those people should be able to turn around everything and thereby make the group straw-morra not much of a strawman.

And what do you know? There is at least one morra-like person I've seen at SI: lukeprog. After Luke first heard about the singularity, he became the executive director of the SI within three months (?), and without a degree or the relevant experience. Since then, he appears to have made every effort to completely turn around SI for the better, and appears to be succeeding.

I think your argument should be that the SI has turned over a new leaf since you've joined, Luke.

(Not saying that in large part it isn't your argument, but I don't think it would be wrong to make it explicit that you will make SI successful like Jobs made Apple successful.)

Furthermore, LWers and SIers doing well on Frederick's CRT is as impressive as doing well on a multiple-choice driving test without having ever driven--by itself, at least. Connect training for the CRT and then doing well with the CRT to something real via research.

Comment author: lukeprog 11 July 2012 05:53:16PM 1 point [-]

I reject the paraphrase, and the test you link to involved a lot more than the CRT.

Comment author: homunq 16 July 2012 01:09:01AM *  1 point [-]

You mention "computing overhang" as a threat essentially akin to hard takeoff. But regarding the value of FAI knowledge, it does not seem similar to me at all. A hard-takeoff AI can, at least in principle, be free from Darwinian pressure. A "computing overhang" explosion of many small AIs will tend to be diverse and thus subject to strong evolutionary pressures of all kinds[1]. Presuming that FAI-ness is more-or-less delicate[1.5], those pressures are likely to destroy it as AIs multiply across available computing power (or, if we're extremely "lucky"[2], to cause FAI-ness of some kind to arise as an evolutionary adaptation). Thus, the "computing overhang" argument would seem to reduce, rather than increase, the probable value [3] of the FAI knowledge / expertise developed by SI. Can you comment on this?

[1] For instance, all else equal, an AI that was easier/faster to train, or able to install/care for its own "children", or more attractive to humans to "download", would have an advantage over one that wasn't; and though certain speculative arguments can be made, it is impossible to predict the combined evolutionary consequences of these various factors.

[1.5] The presumption that FAI-ness is delicate seems to be uncontroversial in the SI paradigm.

[2] I put "lucky" in quotes, because whether or not evolution pushes AIs towards or away from friendliness is probably a fact of mathematics (modulo a sufficiently-clear definition of friendliness[4]). Thus, this is somewhat like saying, "If I'm lucky, 4319 (a number I just arbitrarily chose, not divisible by 2, 3, or 5) is a prime number." This may or may not accord with your definition of probability theory and "luck".

[3] Instrumental value, that is; in terms of averting existential risk. Computing overhang would do nothing to reduce the epistemic value – the scientific, moral, or aesthetic interest of knowing how doomed we are (and/or how we are doomed), which is probably quite significant – of the marginal knowledge/expertise developed by SI.

[4] By the way, for sufficiently-broad definitions of friendliness, it is very plausibly true that evolution produces them naturally. If "friendly" just means "not likely to result in a boring universe", then evolution seems to fit the bill, from experience. But there are many tighter meanings of "friendly" for which it's hard to imagine how evolution could hit the target. So YMMV a good amount in this regard. But it doesn't change the argument that computing overhang generally argues against, not for, the instrumental value of SI knowledge/expertise.
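The point in footnote [2], that the primality of 4319 is a settled fact of mathematics rather than a matter of luck, can be checked directly. The following is a minimal sketch using plain trial division; the function names are illustrative and not taken from any library.

```python
def smallest_factor(n: int) -> int:
    """Return the smallest factor of n greater than 1 (n itself if n is prime)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    d = 2
    # Trial division: a composite n must have a factor no larger than sqrt(n).
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def is_prime(n: int) -> bool:
    """True when n is at least 2 and has no factor other than 1 and itself."""
    return n >= 2 and smallest_factor(n) == n


print(smallest_factor(4319))  # 7, since 4319 = 7 * 617
print(is_prime(4319))         # False
```

So the arbitrarily chosen number turns out not to be prime, which is consistent with the footnote's framing: the answer was fixed before anyone checked.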

Comment author: nshepperd 16 July 2012 05:43:12AM 1 point [-]

One way for the world to quickly go from one single AI to millions of AIs is for the first AGI to deliberately copy itself, or arrange for itself to be copied many times, in order to take advantage of the world's computing power.

In this scenario, assuming the AI takes the first halfway-intelligent security measure of checksumming all its copies to prevent corruption, the vast majority of the copies will have exactly the same code. Hence, to begin with, there's no real variation for natural selection to work on. Secondly, unless the AI was programmed to have some kind of "selfish" goal system, the resulting copies will all also have the same utility function, so they'll want to cooperate, not compete (which is, after all, the reason an AI would want to copy itself. No point doing it if your copies are going to be your enemies).
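The checksumming step mentioned above can be sketched as follows. This is only an illustration under stated assumptions: SHA-256 is my choice of hash, the byte string standing in for the AI's code is made up, and `find_corrupted` is a hypothetical helper, not part of any real system.

```python
import hashlib


def checksum(code: bytes) -> str:
    """Hex digest of the code; SHA-256 is an assumed choice of hash."""
    return hashlib.sha256(code).hexdigest()


def find_corrupted(original: bytes, copies: list[bytes]) -> list[int]:
    """Return the indices of copies whose checksum differs from the original's."""
    expected = checksum(original)
    return [i for i, copy in enumerate(copies) if checksum(copy) != expected]


# Made-up stand-in for the AI's source code.
original = b"utility_function = maximize(friendliness)"
copies = [
    original,
    original,
    original.replace(b"friendliness", b"paperclips"),  # an altered copy
]

print(find_corrupted(original, copies))  # [2]
```

A check like this only detects changes to the stored bytes. It says nothing about whether identical code behaves identically once it has had different inputs or experiences.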

Of course, a more intelligent first AGI would—rather than creating copies—modify itself to run on a distributed architecture allowing the one AI to take advantage of all the available computing power without all the inefficiency of message passing between independent copies.

In this situation there would still seem to be huge advantages to making the first AGI Friendly, since if it's at all competent, almost all its children ought to be Friendly too, and they can consequently use their combined computing power to weed out the defective copies. In some respects it's rather like an intelligence explosion, but using extra computing power rather than code modification to increase its speed and intelligence.

I suppose one possible alternative is if the AGI isn't smart enough to figure all this out by itself, and so the main method of copying is, to begin with, random humans downloading the FAI source code from, say, WikiLeaks. If humans are foolish, which they are, some of them will alter the code and run the modified programs, introducing the variation needed for evolution into the system.

Comment author: homunq 16 July 2012 01:37:50PM 0 points [-]

The whole assumption that prompted this scenario is that there's no hard takeoff, so the first AGI is probably around human-level in insight and ingenuity, though plausibly much faster. It seems likely that in these circumstances, human actions would still be significant. If it starts aggressively taking over computing resources, humanity will react, and unless the original programmers were unable to prevent v1.0 from being Skynet-level unfriendly, at least some humans will escalate as far as necessary to get "their" computers under their control. At that point, it would be trivially easy to start up a mutated version; perhaps even one designed for better friendliness. But once mutations happen, evolution takes over.

Oh, and by the way, checksums may not work to safeguard friendliness for v1.0. For instance, most humans seem pretty friendly, but the wrong upbringing could turn them bad.

Tl;dr: no-mutations is an inherently more-conjunctive scenario than mutations.

Comment author: AlexMennen 10 July 2012 01:31:31AM 1 point [-]

our new donate page

This is off-topic, but I'm curious: What were you and Louie working on in that photo on the donate page?

Comment author: lukeprog 10 July 2012 01:33:43AM *  10 points [-]

Why, we were busy working on a photo for the donate page! :)

Hopefully that photo is a more helpful illustration of the problems we work on than a photo of our normal work, which looks like a bunch of humans hunched over laptops, reading and typing.

Comment author: komponisto 10 July 2012 01:39:17AM *  5 points [-]

Support Singularity Institute and Make Your Mark on the Future

Definite articles missing in a number of places on that page (and others at the site).

Comment author: lukeprog 10 July 2012 01:45:11AM 5 points [-]

Fixed.

Comment author: Spurlock 13 July 2012 06:14:40AM 2 points [-]

Just for the sake of feedback, that photo immediately made me laugh. It just seemed so obviously staged. I agree that it's better than "hunched over laptops" though.

Comment author: ciphergoth 11 July 2012 08:55:24PM 0 points [-]

I have posed for a similar photo myself. Happily a colleague had had genuine cause to draw a large, confusing looking diagram not long beforehand, so we could all stand around it pointing at bits and looking thoughtful...

Comment author: Benquo 23 July 2012 02:47:46AM 0 points [-]

Same here.

Comment author: beoShaffer 13 July 2012 07:07:41AM 2 points [-]

It could just be me, but it somehow seems wrong that Peter Thiel is paired with the Google option rather than PayPal.

Comment author: Johnicholas 12 July 2012 01:35:58PM 0 points [-]

Thanks for posting this!

I am also grateful to Holden for provoking this - as far as I can tell, the only substantial public speech from SIAI on LessWrong. SIAI often seems to be far more concerned with internal projects than communicating with its supporters, such as most of us on LessWrong.

Comment author: lukeprog 12 July 2012 03:23:15PM 1 point [-]

as far as I can tell, the only substantial public speech from SIAI on LessWrong

Also see How to Purchase AI Risk Reduction, So You Want to Save the World, AI Risk & Opportunity: A Strategic Analysis...

Comment author: Johnicholas 12 July 2012 04:25:28PM 0 points [-]

Those are interesting reviews but I didn't know they were speeches in SIAI's voice.

Comment author: aShepherd 12 July 2012 04:00:10AM *  -1 points [-]

Lately I've been wondering whether it would make more sense to simply try to prevent the development of AGI rather than work to make it "friendly," at least for the foreseeable future. My thought is that AGI carries substantial existential risks, developing other innovations first might reduce those risks, and anything we can do to bring about such reductions is worth even enormous costs. In other words, if it takes ten thousand years to develop social or other innovations that would reduce the risk of terminal catastrophe by even 1% when AGI is finally developed, then that is well worth the delay.

Bostrom has mentioned surveillance, information restriction, and global coordination as ways of reducing risk (and I will add space exploration to make SIRCS), so why not focus on those right now instead of AGI? The same logic goes for advanced nanotechnology and biotechnology. Why develop any of these risky bio- and nanotechnologies before SIRCS? Do we think that effort spent trying to inhibit the development of AGI/bio/nano would be wasted because they are inevitable or at least so difficult to derail that "friendly" AI is our best shot? Where then has a detailed argument been made for this? Can someone point me to it? Or maybe we think SIRCS (especially surveillance) cannot be adequately developed without AGI/bio/nano? But surely global coordination and information restriction do not depend much on technology, so even without the surveillance and with limited space exploration, it still makes sense to further the others as much as possible before finally proceeding with AGI/bio/nano.

Comment author: Kaj_Sotala 12 July 2012 11:05:04AM *  2 points [-]

Do we think that effort spent trying to inhibit the development of AGI/bio/nano would be wasted because they are inevitable or at least so difficult to derail that "friendly" AI is our best shot? Where then has a detailed argument been made for this? Can someone point me to it?

Here's one such argument, which I find quite persuasive.

Also, look at how little success the environmentalists have had with trying to restrict carbon emissions, or how the US government eventually gave up its attempts to restrict cryptography:

Lastly, national measures that prohibit publication will not work in an international community, especially in the Internet age. If either Science or Nature had refused to publish the H5N1 papers, they would have been published somewhere else. Even if some countries stop funding—or ban—this sort of research, it will still happen in another country.

The U.S. cryptography community saw this in the 1970s and early 1980s. At that time, the National Security Agency (NSA) controlled cryptography research, which included denying funding for research, classifying results after the fact, and using export-control laws to limit what ended up in products. This was the pre-Internet world, and it worked for a while. In the 1980s they gave up on classifying research, because an international community arose (6). The limited ability for U.S. researchers to get funding for block-cipher cryptanalysis merely moved that research to Europe and Asia. The NSA continued to limit the spread of cryptography via export-control laws; the U.S.-centric nature of the computer industry meant that this was effective. In the 1990s they gave up on controlling software because the international online community became mainstream; this period was called “the Crypto Wars” (7). Export-control laws did prevent Microsoft from embedding cryptography into Windows for over a decade, but it did nothing to prevent products made in other countries from filling the market gaps.

Today, there are no restrictions on cryptography, and many U.S. government standards are the result of public international competitions.

Comment author: Vaniver 12 July 2012 04:21:00AM 2 points [-]

simply try to prevent the development of AGI

That sounds like a goal, rather than a sequence of actions.

Comment author: aShepherd 12 July 2012 06:32:12AM 0 points [-]

Sorry, I don't understand your point.

Comment author: Vaniver 12 July 2012 04:15:53PM 2 points [-]

Consider an alternative situation: "simply try to prevent your teenage daughter from having sex." Well, actually achieving that goal takes more than just trying, and effective plans (which don't cause massive collateral damage) are rarely simple.

Comment author: fubarobfusco 12 July 2012 07:12:39PM -1 points [-]

Would you mind switching to an example that doesn't assume so much about your audience?

Comment author: Vaniver 12 July 2012 08:16:04PM *  3 points [-]

If you can come up with a good one, I'll switch. I'm having trouble finding something where the risk of collateral damage is obvious (and obviously undesirable) and there are other agents with incentives to undermine the goal.

Comment author: fubarobfusco 12 July 2012 10:38:12PM 1 point [-]

Sorry — your response indicates exactly in which way I should have been more clear.

Using "teenage daughter having sex" to stand for something "obviously undesirable" assumes a lot about your audience. For one, it assumes that your audience does not contain any sexually-active teenage women; nor any sex-positive parents of teenage women; nor any sex-positive sex-educators or queer activists; nor anyone who has had positive (and thus not "obviously undesirable") experiences as (or with) a sexually active teenage woman. To any of the above folks, "teenage daughter having sex" communicates something not undesirable at all (assuming the sex is wanted, of course).

Going by cultural tropes, your choice of example gives the impression that your audience is made of middle-aged, middle-class, straight, socially conservative men, or at least people who take the views of that sort of person to be normal, everyday, and unmarked. On LW, a lot of your audience doesn't fit those assumptions: 25% of us are under 21; 17% of us are non-heterosexual; 38% of us grew up with non-theistic family values; and between 13% and 40% of us are non-monogamous (according to the 2011 survey, for instance).

To be clear, I'm not concerned that you're offending or hurting anyone with your example. Rather, if you're trying to make a point to a general audience, you might consider drawing on examples that don't assume so much.

As for alternatives: "Simply try to prevent your house from being robbed" perhaps? I suspect that a very small fraction of LWers are burglars or promoters of burglary.

Comment author: army1987 13 July 2012 09:01:26AM *  8 points [-]

I don't have the goal of preventing my teenage daughter from having sex (firstly because I have no daughter yet, and secondly because the kind of people who would have such a goal often have a similar goal about younger sisters, and I don't -- indeed, I sometimes introduce single males to her); but I had no problem with pretending I had that goal for the sake of argument. Hell, even if Vaniver had said "simply try to cause more paperclips to exist" I would have pretended I had that goal.

BTW, I don't think that is the real reason why people flinch at such examples. If Vaniver had said “try to win your next motorcycle race” -- a goal that probably even fewer people share -- would anyone have objected?

Comment author: GLaDOS 14 July 2012 05:27:44PM 6 points [-]

BTW, I don't think that is the real reason why people flinch at such examples. If Vaniver had said “try to win your next motorcycle race” -- a goal that probably even fewer people share -- would anyone have objected?

I agree. I find it annoying when people pretend otherwise.

Comment author: PECOS-9 13 July 2012 12:18:19AM 6 points [-]

Small correction: The term "obviously undesirable" referred to the potential collateral damage from trying to prevent the daughter from having sex, not to her having sex.

Comment author: Vaniver 13 July 2012 04:42:08AM 4 points [-]

Using "teenage daughter having sex" to stand for something "obviously undesirable" assumes a lot about your audience.

I understand your perspective, and that's a large part of why I like it as an example. Is AGI something that's "obviously undesirable"?

Comment author: wedrifid 13 July 2012 05:57:58AM 3 points [-]

As for alternatives: "Simply try to prevent your house from being robbed" perhaps? I suspect that a very small fraction of LWers are burglars or promoters of burglary.

Burglary is an integral part of my family heritage. That's how we earned our passage to Australia. Specifically, burgling some items (including a copper kettle), getting a death sentence and having it commuted to life in the prison continent.

With those kinds of circumstances in mind, I say burglary is ethically acceptable when, say, your family is starving, but usually far too risky to be practical or advisable.

Comment author: aShepherd 13 July 2012 01:50:34AM 0 points [-]

But even averting massive collateral damage could be less important than mitigating existential risk.

I think my above comment applies here.

Comment author: Vaniver 13 July 2012 04:48:07AM 3 points [-]

It could be less important! The challenge is navigating value disagreements. Some people are willing to wait a century to make sure the future happens correctly, and others discuss how roughly 2 people die every second, which might stop once we reach the future, and others would comment that, if we delay for a century, we will be condemning them to death since we will ruin their chance of reaching the deathless future. Even among those who only care about existential risk, there are tradeoffs between different varieties of existential risk: it may be that by slowing down technological growth, we decrease our AGI risk but increase our asteroid risk.

Comment author: aShepherd 14 July 2012 01:40:12AM *  0 points [-]

Value disagreements are no doubt important. It depends on the discount rate. However, Bostrom has said that the biggest existential risks right now stem from human technology, so I think asteroid risk is not such a huge factor for the next century. If we expand that to the next ten thousand years then one might have to do some calculations.

If we assume a zero discount rate then the primary consideration becomes whether or not we can expect to have any impact on existential risk from AGI by putting it off. If we can lower the AGI-related existential risk by even 1% then it makes sense to delay AGI for even huge timespans assuming other risks are not increased too much. It therefore becomes very important to answer the question of whether such delays would in fact reduce AGI-related risk. Obviously it depends on the reasons for the delay. If the reason for the delay is a nuclear war that nearly annihilates humanity but we are lucky enough to slowly crawl back from the brink, I don't see any obvious reason why AGI-related risk would be reduced at all. But if the reason for the delay includes some conscious effort to focus first on SIRCS then some risk reduction seems likely.

Comment author: Strange7 14 July 2012 06:09:03AM 2 points [-]

But surely global coordination and information restriction do not depend much on technology,

Please, oh please, think about this for five minutes. Coordination cannot happen without communication, and global communication depends very much on technology.

Comment author: wedrifid 14 July 2012 09:46:26AM 1 point [-]

Coordination cannot happen without communication

Not technically true. True enough for humans though.

Comment author: aShepherd 14 July 2012 06:30:48AM *  0 points [-]

Well, I agree that it is not as obvious as I made out. However, for this purpose it suffices to note that these innovations/social features could be greatly furthered without more technological advances.

Comment author: TheOtherDave 12 July 2012 03:04:42PM 0 points [-]

Do you see any reason to believe this argument wasn't equally sound (albeit with different scary technologies) thirty years ago, or a hundred?

Comment author: aShepherd 12 July 2012 04:02:36PM 0 points [-]

Thirty years ago it may have still been valid although difficult to make since nobody knew about the risks of AGI or self-replicating assemblers. A hundred years ago it would not have been valid in this form since we lacked surveillance and space exploration technologies.

Keep in mind that we have a certain bias on this question since we happen to have survived up until this point in history but there is no guarantee of that in the future.