This post and the reactions to it will be an interesting test for my competing models about the value of giving detailed explanations to supporters. Here are just two of them:
One model says that detailed communication with supporters is good because it allows you to make your case for why your charity matters, and thus increase the donors' expectation that your charity can turn money into goods that they value, like poverty reduction or AI risk reduction.
Another model says that detailed communication with supporters is bad because (1) supporters are generally giving out of positive affect toward the organization, and (2) that positive affect can't be increased much once they grok the mission enough to start donating, but (3) the positive affect they feel toward the charity can be overwhelmed by the absolute number of the organization's statements with which they disagree, and (4) more detailed communication with supporters increases this absolute number more quickly than limited communication that repeats the same points again and again (e.g. in a newsletter).
I worry that model #2 may be closer to the truth, in part because of things like (Dilbert-creator) Scott Adams' account of w...
An issue that SI must inevitably confront is how much rationality it will assume of its target population of donors. If it simply wanted to raise as much money as possible, there are, I expect, all kinds of Dark techniques it could use (of which decreasing communication is only the tip of the iceberg). The problem is that SI also wants to raise the sanity waterline, since that is integral to its larger mission -- and it's hard (not to mention hypocritical) to do that while simultaneously using fundraising methods that depend on the waterline being below a certain level among its supporters.
Regarding the theft:
I was telling my friend (who recently got into HPMOR and lurks a little on LW) about Holden's critique, specifically with regard to the theft. He's an accounting and finance major, and was a bit taken aback. His immediate response was to ask whether SI had an outside accountant audit its statements. We searched around, and it doesn't look to us like you do. He said he would never donate to an organization that did not have an accountant audit its statements, and, knowing how much I follow LW, advised me not to either. An outside audit seems like a really good step for addressing the transparency issues here, and, now that he mentions it, a very prudent and obvious thing for any nonprofit to do.
Edit 2: Luke asked me to clarify: I am not necessarily endorsing withholding donations from SI because of this, unless this particular problem is a concern of yours. My intent was only to suggest ways SI can improve and to point out something that may be turning away potential donors.
Edit: He just mentioned to me that the Big Four accounting firms often do pro bono work because it can be a tax write-off. This may be worth investigating.
Also note that thefts of this size are not as rare as they appear, because many non-profits simply don't report them. I have inside knowledge about very few charities, but even I know one charity that suffered a larger theft than SI did, and they simply didn't tell anybody. They knew that donors would punish them for the theft and not at all reward them for reporting it. Unfortunately, this is probably true for SI, too, which did report the theft.
Yep. We knew that would happen at the time - it was explicitly discussed in the Board meeting - and we went ahead and reported it anyway, partly because we didn't want to have exposable secrets, partly because we felt honesty was due our donors, and partly because I'd looked up embezzlement-related stuff online and had found that a typical nonprofit-targeting embezzler goes through many nonprofits before being reported and prosecuted by a nonprofit "heroic" enough, if you'll pardon the expression, to take the embarrassment-hit in order to stop the embezzler.
Yes, we're currently in the process of hiring a bookkeeper (interviewed one, scheduling interviews with 2 others), which will allow us to get our books in enough order that an accountant will audit our statements. We do have an outside accountant prepare our 990s already. Anyway, this all requires donations. We can't get our books cleaned up and audited unless we have the money to do so.
Also, it's my impression that many or most charities our size and smaller don't have their books audited by an accountant because it's expensive to do so. It's largely the kind of thing a charity does when it has a bigger budget than we currently do. But I'd be curious to see if there are statistics on this somewhere; I could be wrong.
And yes, we are investigating the possibility of getting pro bono work from an accounting firm; it's somewhere around #27 on my "urgent to-do list." :)
Edit: BTW, anyone seriously concerned about this matter is welcome to earmark their donations for "CPA audit" so that those donations are only used for (1) paying a bookkeeper to clean up our processes enough so that an accountant will sign off on them, and (2) paying for a CPA audit of our books. I will personally make sure those earmarks are honored.
You can't deduct the value of services donated to nonprofits. Not sure your friend is as knowledgeable as stated. Outside accounting is expensive and the IRS standard is to start doing it once your donations hit $2,000,000/year, which we haven't hit yet. Also, SIAI recently passed an IRS audit.
Fifteen seconds of Googling turned up Deloitte's pro bono service, which is done for corporate social responsibility and employee morale rather than tax avoidance. Requests need to originate with Deloitte personnel. I have a friend who works there who might be interested in LW, but it'd be a while before I'd be comfortable asking him to recommend SI. It's a big enough company that there are likely some HPMOR or LW fans working there.
Interesting!
"Applications for a contribution of pro bono professional services must be made by Deloitte personnel. To be considered for a pro bono engagement, a nonprofit organization (NPO) with a 501c3 tax status must have an existing relationship with Deloitte through financial support, volunteerism, Deloitte personnel serving on its Board of Directors or Trustees, or a partner, principal or director (PPD) sponsor (advocate for the duration of the engagement). External applications for this program are not accepted. Organizations that do not currently have a relationship with Deloitte are welcome to introduce themselves to the Deloitte Community Involvement Leader in their region, in the long term interest of developing one."
Deloitte is requiring a very significant investment from its employees before offering pro bono services. Nonetheless, I have significant connections there and would be willing to explore this option with them.
Clarifications:
Eliezer is right; RobertLumley's friend is mistaken:
can the value of your time and services while providing pro bono legal services qualify as a charitable contribution that is deductible from gross income on your federal tax return? Unfortunately, in a word, nope.
According to IRS Publication 526, “you cannot deduct the value of your time or services, including blood donations to the Red Cross or to blood banks, and the value of income lost while you work as an unpaid volunteer for a qualified organization.”
Certainly the fact that some really awful charities are untruthful doesn't mean SI shouldn't be held accountable merely because it managed to tell the truth.
I think you're missing Luke's implied argument that not merely 'some' charities are untruthful, but quite a lot of them. The situation is the same as with, say, corporations getting hacked: they have no incentive to report it because only bad things will happen, and this leads to systematic underreporting, which reinforces the equilibrium as anyone reporting honestly will be seen as an outlier (as indeed they are) and punished. A vicious circle.
(Given the frequency of corporations having problems, and the lack of market discipline for nonprofits and how they depend on patrons, I could well believe that nonprofits routinely have problems with corruption, embezzlement, self-dealing, etc.)
How do I know that supporting SI doesn't end up merely funding a bunch of movement-building leading to no real progress?
It seems to me that the premise of funding SI is that people smarter (or more appropriately specialized) than you will then be able to make discoveries that otherwise would be underfunded or wrongly-purposed.
I think the (friendly or not) AI problem is hard. So it seems natural for people to settle for movement-building or other support when they get stuck.
That said, some of the collateral output to date has been enjoyable.
For SI, movement building is more directly progress than it is for, say, Oxfam, because a big part of SI's mission is to try to persuade people not to do the very dangerous thing.
But I don't see any evidence that anyone who was likely to create an AI soon, now won't.
According to Luke, Moshe Looks (head of Google's AGI team) is now quite safety conscious, and a Singularity Institute supporter.
I do think the OP is right that in practice, 100 years ago, it would have been really hard to figure out what an AI issue looked like. This was pre-Gödel, pre-decision-theory, pre-Bayesian-revolution, and pre-computer. Yes, a sufficiently competent Earth would be doing AI math before it had the technology for computers, in full awareness of what it meant - but that's a pretty darned competent Earth we're talking about.
If my writings (on FAI, on decision theory, and on the form of applied-math-of-optimization called human rationality) so far haven't convinced you that I stand a sufficient chance of identifying good math problems to solve to maintain the strength of an input into existential risk, you should probably fund CFAR instead. This is not, in any way shape or form, the same skill as the ability to manage a nonprofit. I have not ever, ever claimed to be good at managing people, which is why I kept trying to have other people doing it.
I greatly appreciate the response to my post, particularly the highly thoughtful responses of Luke (original post), Eliezer, and many commenters.
Broad response to Luke's and Eliezer's points:
As I see it, there are a few possible visions of SI's mission:
My view is that the broader SI's mission, the higher the bar should be for the overall impressiveness of the organization and team. An organization with a very narrow, specific mission - such as "analyzing how to develop a provably safe/useful/benign utility function without needing iterative/experimental development" - can, relatively easily, establish which other organizations (if any) are trying to provid...
My view is that the broader SI's mission, the higher the bar should be for the overall impressiveness of the organization and team.
Can you describe a hypothetical organization and some examples of the impressive achievements it might have, which would pass the bar for handling mission M3? What is your estimate of the probability of such an organization coming into existence in the next five or ten years, if a large fraction of current SI donors were to put their money into donor-advised funds instead?
I'm very much an outsider to this discussion, and by no means a "professional researcher", but I believe those to be the primary reasons why I'm actually qualified to make the following point. I'm sure it's been made before, but a rapid scan revealed no statement of this argument quite this direct and explicit.
HoldenKarnofsky: (...) my view is that SI's suggested approach to AGI development is more dangerous than the "traditional" approach to software development, and thus that SI is advocating for an approach that would worsen risks from AGI.
I've always understood SI's position on this matter not as one of "We should not focus on building Tool AI! Fully reflectively self-modifying AGIs are the only way to go!", but rather that it is extremely unlikely that we can prevent everyone else from building one.
To my understanding, the logic goes: If any programmer with relevant skills is sufficiently convinced, by whatever means and for whatever causes, that building a full traditional AGI is more efficient and will more "lazily" achieve his goals with fewer resources or achieve them faster, the programmer will build it whether you think...
On the question of the impact of rationality, my guess is that:
Luke, Holden, and most psychologists agree that rationality means something roughly like the ability to make optimal decisions given evidence and goals.
The main strand of rationality research followed by both psychologists and LWers has been focused on fairly obvious cognitive biases. (For short, let's call these "cognitive biases".)
Cognitive biases cause people to make choices that are most obviously irrational, but not most importantly irrational. For example, it's very clear that spinning a wheel should not affect people's estimates of how many African countries are in the UN. But do you know anyone for whom this sort of thing is really their biggest problem?
Since cognitive biases are the primary focus of research into rationality, rationality tests mostly measure how good you are at avoiding them. These are the tests used in the studies psychologists have done on whether rationality predicts success.
LW readers tend to be fairly good at avoiding cognitive biases (and will be even better if CFAR takes off).
But there are a whole series of much more important irrationalities that LWers suffer from.
Reason 1: Mitigating AI risk could mitigate all other existential risks, but not vice-versa. There is an asymmetry between AI risk and other existential risks. If we mitigate the risks from (say) synthetic biology and nanotechnology (without building Friendly AI), this only means we have bought a few years or decades for ourselves before we must face yet another existential risk from powerful new technologies. But if we manage AI risk well enough (i.e. if we build a Friendly AI or "FAI"), we may be able to "permanently" (for several billion years) secure a desirable future.
This equates "managing AI risk" and "building FAI" without actually making the case that these are equivalent. Many people believe that dangerous research can be banned by governments, for instance; it would be useful to actually make the case (or link to another place where it has been made) that managing AI risk is intractable without FAI.
This is one of the 10,000 things I didn't have the space to discuss in the original post, but I'm happy to briefly address it here!
It's much harder to successfully ban AI research than to successfully ban, say, nuclear weapons. Nuclear weapons require rare and expensive fissile material that requires rare heavy equipment to manufacture. Such things can be tracked to some degree. In contrast, AI research requires... um... a few computers.
Moreover, it's really hard to tell whether the code somebody is running on a computer is potentially dangerous AI stuff or something else. Even if you magically had a monitor installed on every computer to look for dangerous AI stuff, it would have to know what "dangerous AI stuff" looks like, which is hard to do before the dangerous AI stuff is built in the first place.
The monetary, military, and political incentives to build AGI are huge, and would be extremely difficult to counteract through a worldwide ban. You couldn't enforce the ban, anyway, for the reasons given above. That's why Ben Goertzel advocates "Nanny AI," though Nanny AI may be FAI-complete, as mentioned here.
I hope that helps?
Certainly the fact that some really awful charities are untruthful doesn't mean SI shouldn't be held accountable merely because it managed to tell the truth.
I didn't mean that SI shouldn't be held accountable for the theft. I was merely lamenting my expectation that it will probably be punished for reporting it.
A clarification. In Thoughts on the Singularity Institute, Holden wrote:
[What] I will commit to is reading and carefully considering up to 50,000 words of content that are (a) specifically marked as SI-authorized responses to the points I have raised; (b) explicitly cleared for release to the general public as SI-authorized communications. In order to consider a response "SI-authorized and cleared for release," I will accept explicit communication from SI's Executive Director or from a majority of its Board of Directors endorsing the content in question.
As SI's Executive Director I am hereby marking three different things as "SI-authorized responses" to the points Holden raised: my "Reply to Holden on the Singularity Institute" (the post above), my long comment on recent organizational improvements at SI, and Eliezer's Reply to Holden on Tool AI.
According to Word Count Tool, these three things add up to a mere 13,940 words.
As SI's Executive Director I am hereby marking three different things as "SI-authorized responses" to the points Holden raised: my "Reply to Holden on the Singularity Institute" (the post above), my long comment on recent organizational improvements at SI, and Eliezer's Reply to Holden on Tool AI.
Consider removing the first sentence of the final link:
This comment is not intended to be part of the 50,000-word response which Holden invited.
I linked this to an IRC channel full of people skeptical of SI. One person commented that
the reply doesn't seem to be saying much
and another that
I think most arguments are 'yes we are bad but we will improve'
and some opinion-based statements about how FAI is the most important thing in the world.
Which was somewhat my reaction as well - I can't put a finger on it and say exactly what it is that's wrong, but somehow it feels like this post isn't "meaty" enough to elicit much of a reaction, positive or negative. Which on the other hand feels odd, since e.g. the "SI's mission assumes a scenario that is far less conjunctive than it initially appears" heading makes an important point that SI hasn't really communicated well in the past. Maybe it just got buried under the other stuff, or something.
I can't speak for anyone else, and had been intending to sit this one out, since my reactions to this post were not really the kind of reaction you'd asked for.
But, OK, my $0.02.
The claim that an organization is exceptionally well-suited to convert money into existential risk mitigation is an extraordinary one, and extraordinary claims require extraordinary evidence. This puts a huge burden on you, as the person attempting to provide that evidence.
So, I'll ask you: do you think your response provides such evidence?
If you do, then your problem seems to be (as others have suggested) one of document organization. Perhaps starting out with an elevator-pitch answer to the question "Why should I believe that SI is capable of this extraordinary feat?" might be a good idea.
Because my take-away from reading this post was "Well, nobody else is better suited to do it, and SI does some cool movement-building stuff (the Sequences, the Rationality Camps, and HPMoR) that attracts smart people and encourages them to embrace a more rational approach to their lives, and SI is fixing some of its organizational and communication problems but we need more money to really make progress...
The claim that an organization is exceptionally well-suited to convert money into existential risk mitigation is an extraordinary one, and extraordinary claims require extraordinary evidence.
Reminder: I don't know if you were committing this particular error internally, but, at the least, the sentence is liable to cause the error externally, so: Large consequences != prior improbability. E.g. although global warming has very large consequences, and even implies that we should take large actions, it isn't improbable a priori that carbon dioxide should trap heat in the atmosphere - it's supposed to happen, according to standard physics. And so demanding strong evidence that global warming is anthropogenic is bad probability theory and decision theory. Expensive actions imply a high value of information, meaning that if we happen to have access to cheap, powerfully distinguishing evidence about global warming we should look at it; but if that evidence is not available, then we go from the default extrapolation from standard physics and make policy on that basis - not demand more powerful evidence on pain of doing nothing.
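To make the distinction concrete, here is a minimal decision-theoretic sketch of the point, in my own notation rather than anything from Eliezer's comment. The stakes enter the calculation through the utility function, while the plausibility of a hypothesis enters through the prior:

\[
a^{*} \;=\; \arg\max_{a} \sum_{h} P(h)\, U(a, h),
\]

so a hypothesis with large consequences is not thereby a priori improbable. What large stakes do raise is the value of information from further evidence \(E\):

\[
\mathrm{VOI}(E) \;=\; \mathbb{E}_{E}\!\left[\max_{a} \sum_{h} P(h \mid E)\, U(a, h)\right] \;-\; \max_{a} \sum_{h} P(h)\, U(a, h) \;\ge\; 0,
\]

which scales with the size of \(U\). So if cheap, strongly discriminating evidence is available, an expensive decision makes it worth acquiring; if it is not, the rational move is to act on the default extrapolation rather than to demand extraordinary evidence before acting at all.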
The claim that SIAI is currently best-suited to convert ...
I am coming to the conclusion that "extraordinary claims require extraordinary evidence" is just bad advice, precisely because it causes people to conflate large consequences and prior improbability. People are fond of saying it about cryonics, for example.
At least sometimes, people may say "extraordinary claims require extraordinary evidence" when they mean "your large novel claim has set off my fraud risk detector; please show me how you're not a scam."
In other words, the caution being expressed is not about prior probabilities in the natural world, but rather the intentions and morals of the claimant.
We need two new versions of the advice, to satisfy everyone.
Version for scientists: "improbable claims require extraordinary evidence".
Version for politicians: "inconvenient claims require extraordinary evidence".
I worry that this conversation is starting to turn around points of phrasing, but... I think it's worth separating the ideas that you ought to be doing x-risk reduction and that SIAI is the most efficient way to do it, which is why I myself agreed strongly with your own, original phrasing, that the key claim is providing the most efficient x-risk reduction. If someone's comparing SIAI to Rare Diseases in Cute Puppies or anything else that isn't about x-risk, I'll leave that debate to someone else - I don't think I have much comparative advantage in talking about it.
Holden is comparing SI to other giving opportunities, not just to giving opportunities that may reduce x-risk. That's not a part of the discussion Eliezer feels he should contribute to, though. I tried to address it in the first two sections of my post above, and then in part 3 I talked about why both FHI and SI contribute unique and important value to the x-risk reduction front.
In other words: I tried to explain that for many people, x-risk is Super Duper Important, and so for those people, what matters is which charities among those reducing x-risk they should support. And then I went on to talk about SI's value for x-risk reduction in particular.
Much of the debate over x-risk as a giving opportunity in general has to do with Holden's earlier posts about expected value estimates, and SI's post on that subject (written by Steven Kaas) is still under development.
It didn't have the same cohesiveness as Holden's original post; there were many more dangling threads, to borrow the same metaphor I used to say why his post was so interesting. You wrote it as a technical, thoroughly cited response and literature review instead of a heartfelt, wholly self-contained Mission Statement, and you made that very clear by stating at least 10 times that there was much more info 'somewhere else' (in conversations, in people's heads, yet to be written, etc.).
He wrote an intriguing short story, you wrote a dry paper.
Edit: Also, the answer to every question seems to be, "That will be in Eliezer's next Sequence," which postpones further debate.
The purpose of an FAI team is not to blindly develop one particular approach to Friendly AI without checking to see whether this work will be obsoleted by future developments. Instead, the purpose of an FAI team is to develop highly specialized expertise on, among other things, which kinds of research are more and less likely to be relevant given future developments.
This is unsettling. It sounds a lot like trying to avoid saying anything specific.
Eliezer will have lots of specific things to say in his forthcoming "Open Problems in Friendly AI" sequence (I know; I've seen the outline). In any case, wouldn't it be a lot more unsettling if, at this early stage, we pretended we knew enough to commit entirely to one very particular approach?
It's unsettling that this is still an early stage. SI has been around for over a decade. I'm looking forward to the open problems sequence; perhaps I should shut up about the lack of explanation of SI's research for now, considering that the sequence seems like a credible promise to remedy this.
When making the case for SI's comparative advantage, you point to these things:
... [A]nd the ability to do unusual things that are nevertheless quite effective at finding/creating lots of new people interested in rationality and existential risk reduction: (1) The Sequences, the best tool I know for creating aspiring rationalists, (2) Harry Potter and the Methods of Rationality, a surprisingly successful tool for grabbing the attention of mathematicians and computer scientists around the world...
What evidence supports these claims?
I'm really glad you pointed out that SI's strategy is not predicated on hard take-off. I don't recall if this has been discussed elsewhere, but that's something that always bothered me, since I think hard take-off is relatively unlikely. (Admittedly, a soft take-off still considerably diminishes my estimate of the expected impact of SI and of donating to it.)
If I earmark my donations for "HPMOR Finale or CPA Audit whichever comes first" would that act as positive or negative pressure towards Eliezer's fiction creation complex? (I only ask because bugging him for an update has been previously suggested to reduce update speed)
Furthermore, Oracle AI and Nanny AI both seem to fail the heuristic of "the other country is about to beat us in a war; should we remove the safety programming?" that I use quite often with nearly everyone outside the LW community I debate AI with. Thank you both for writing such concise yet detailed responses that helped me understand the problem areas of Tool AI better.
If I earmark my donations for "HPMOR Finale or CPA Audit whichever comes first" would that act as positive or negative pressure towards Eliezer's fiction creation complex?
I think the issue is that we need a successful SPARC and an "Open Problems in Friendly AI" sequence more urgently than we need an HPMOR finale.
"Open Problems in Friendly AI" sequence
an HPMOR finale
A sudden, confusing vision just occurred, of the two being somehow combined. Aaagh.
In general - never earmark donations. It's a stupendous pain in the arse to deal with. If you trust an organisation enough to donate to them, trust them enough to use the money for whatever they see a need for. Contrapositive: If you don't trust them enough to use the money for whatever they see a need for, don't donate to them.
The discussion of how conjunctive SIAI's vision is seems unclear to me. Luke appears to have responded to only part of what I think Holden is likely to have meant.
Some assumptions whose conjunctions seem important to me (in order of decreasing importance):
1) The extent to which AGI will consist of one entity taking over the world versus many diverse entities with limited ability to dominate the others.
2) The size of the team required to build the first AGI (if it requires thousands of people, a nonprofit is unlikely to acquire the necessary resources; if i...
After being initially impressed by this, I found one thing to pick at:
Reason 1: Mitigating AI risk could mitigate all other existential risks, but not vice-versa.
"Could" here tells you very little. The question isn't whether "build FAI" could work as a strategy for mitigating all other existential risks, it's whether that strategy has a good enough chance of working to be superior to other strategies for mitigating the other risks. What's missing is an argument for saying "yes" to that second question.
This is off-topic, but I'm curious: What were you and Louie working on in that photo on the donate page?
Why, we were busy working on a photo for the donate page! :)
Hopefully that photo is a more helpful illustration of the problems we work on than a photo of our normal work, which looks like a bunch of humans hunched over laptops, reading and typing.
You mention "computing overhang" as a threat essentially akin to hard takeoff. But regarding the value of FAI knowledge, it does not seem similar to me at all. A hard-takeoff AI can, at least in principal, be free from darwinian pressure. A "computing overhang" explosion of many small AIs will tend to be diverse and thus subject to strong evolutionary pressures of all kinds[1]. Presuming that FAI-ness is more-or-less delicate[1.5], those pressures are likely to destroy it as AIs multiply across available computing power (or, if we're ex...
Thanks for posting this!
I am also grateful to Holden for provoking this - as far as I can tell, the only substantial public speech from SIAI on LessWrong. SIAI often seems to be far more concerned with internal projects than communicating with its supporters, such as most of us on LessWrong.
What if the smartest, most careful, most insanely safety-conscious AI researchers humanity can produce just aren't smart enough to solve the problem?
This is very worrying, especially in light of the lack of a public research agenda. SI's inability to describe its research agenda suggests the possibility that it cannot do so because it does not know what it is doing, FAI being such a ridiculously hard problem that it has no idea where to begin. I'm hoping that SI will soon be able to make it clear that this is not the...
Lately I've been wondering whether it would make more sense to simply try to prevent the development of AGI rather than work to make it "friendly," at least for the foreseeable future. My thought is that AGI carries substantial existential risks, that developing other innovations first might reduce those risks, and that anything we can do to bring about such reductions is worth even enormous costs. In other words, if it takes ten thousand years to develop social or other innovations that would reduce the risk of terminal catastrophe by even 1% when AGI is ...
I don't have the goal of preventing my teenage daughter from having sex (firstly because I have no daughter yet, and secondly because the kind of people who would have such a goal often have a similar goal about younger sisters, and I don't -- indeed, I sometimes introduce single males to her); but I had no problem with pretending I had that goal for the sake of argument. Hell, even if Vaniver had said "simply try to cause more paperclips to exist" I would have pretended I had that goal.
BTW, I don't think that is the real reason why people flinch at such examples. If Vaniver had said “try to win your next motorcycle race” -- a goal that probably even fewer people share -- would anyone have objected?
Small correction: The term "obviously undesirable" referred to the potential collateral damage from trying to prevent the daughter from having sex, not to her having sex.
SI and rationality
Paraphrasing:
Holden expects us to have epistemic and instrumental powers of rationality that would make us successful in Western society; however, this is a straw man. Being rational isn't succeeding in society, but succeeding at your own goals.
(Btw, I'm going to coin a new term for this: the straw-morra [a reference to the main character from Limitless]).
Now that being said, you shouldn't anticipate that the members of SI would be morra-like.
There's a problem with this: arguments made to support an individual are not nearly as c...
It occurs to me that Holden's actual reasoning (never mind what he said) is perhaps not about rationality per se and instead may be along these lines: "Since SI staff haven't already accumulated wealth and power, they probably suffer from something like insufficient work ethic or high akrasia or not-having-inherited-billions, and thus will probably be ineffective at achieving the kind of extremely-ambitious goals they have set for themselves."
Holden Karnofsky of GiveWell has objected to the Singularity Institute (SI) as a target for optimal philanthropy. As someone who thinks that existential risk reduction is really important and also that the Singularity Institute is an important target of optimal philanthropy, I would like to explain why I disagree with Holden on these subjects. (I am also SI's Executive Director.)
Mostly, I'd like to explain my views to a broad audience. But I'd also like to explain my views to Holden himself. I value Holden's work, I enjoy interacting with him, and I think he is both intelligent and capable of changing his mind about Big Things like this. Hopefully Holden and I can continue to work through the arguments together, though of course we are both busy with many other things.
I appreciate the clarity and substance of Holden's objections, and I hope to reply in kind. I begin with an overview of some basic points that may be familiar to most Less Wrong veterans, and then I reply point-by-point to Holden's post. In the final section, I summarize my reply to Holden.
Holden raised many different issues, so unfortunately this post needed to be long. My apologies to Holden if I have misinterpreted him at any point.
Contents
Comments
I must be brief, so while reading this post I am sure many objections will leap to your mind. To encourage constructive discussion on this post, each question (posted as a comment on this page) that follows the template described below will receive a reply from myself or another SI representative.
Please word your question as clearly and succinctly as possible, and don't assume your readers will have read this post before reading your question (because: the conversations here may be used as source material for a comprehensive FAQ).
Here's an example of how you could word the first paragraph of your question: "You claimed that [insert direct quote here], and also that [insert another direct quote here]. That seems to imply that [something something]. But that doesn't seem to take into account that [blah blah blah]. What do you think of that?"
If your question needs more explaining, leave the details to subsequent paragraphs in your comment. Please post multiple questions as multiple comments, so they can be voted upon and replied to individually. If you don't follow these rules, I can't guarantee SI will have time to give you a reply. (We probably won't.)
Why many people care greatly about existential risk reduction
Why do many people consider existential risk reduction to be humanity's most important task? I can't say it much better than Nick Bostrom does, so I'll just quote him:
I refer the reader to Bostrom's paper for further details and additional arguments, but neither his paper nor this post can answer every objection one might think of.
Nor can I summarize all the arguments and evidence related to estimating the severity and time horizon of every proposed existential risk. Even the 500+ pages of Oxford University Press' Global Catastrophic Risks can barely scratch the surface of this enormous topic. As explained in Intelligence Explosion: Evidence and Import, predicting long-term technological progress is hard. Thus, we must
I'll say more about convergent outcomes later, but for now I'd just like to suggest that:
Many humans living today value both current and future people enough that if existential catastrophe is plausible this century, then upon reflection (e.g. after counteracting their unconscious, default scope insensitivity) they would conclude that reducing the risk of existential catastrophe is the most valuable thing they can do — whether through direct work or by donating to support direct work. It is to these people I appeal. (I also have much to say to people who e.g. don't care about future people, but it is too much to say here and now.)
As it turns out, we do have good reason to believe that existential catastrophe is plausible this century.
I don't have the space here to discuss the likelihood of different kinds of existential catastrophe that could plausibly occur this century (see GCR for more details), so instead I'll talk about just one of them: an AI catastrophe.
AI risk: the most important existential risk
There are two primary reasons I think AI is the most important existential risk:
Reason 1: Mitigating AI risk could mitigate all other existential risks, but not vice-versa. There is an asymmetry between AI risk and other existential risks. If we mitigate the risks from (say) synthetic biology and nanotechnology (without building Friendly AI), this only means we have bought a few years or decades for ourselves before we must face yet another existential risk from powerful new technologies. But if we manage AI risk well enough (i.e. if we build a Friendly AI or "FAI"), we may be able to "permanently" (for several billion years) secure a desirable future. Machine superintelligence working in the service of humane goals could use its intelligence and resources to prevent all other existential catastrophes. (Eliezer: "I distinguish 'human', that which we are, from 'humane'—that which, being human, we wish we were.")
Reason 2: AI is probably the first existential risk we must face (given my evidence, only the tiniest fraction of which I can share in a blog post).
One reason AI may be the most urgent existential risk is that it's more likely for AI (compared to other sources of catastrophic risk) to be a full-blown existential catastrophe (as opposed to a merely billions dead catastrophe). Humans are smart and adaptable; we are already set up for a species-preserving number of humans to survive (e.g. in underground bunkers with stockpiled food, water, and medicine) major catastrophes from nuclear war, superviruses, supervolcano eruption, and many cases of asteroid impact or nanotechnological ecophagy.
Machine superintelligences, however, could intelligently seek out and neutralize humans which they (correctly) recognize as threats to the maximal realization of their goals. Humans are surprisingly easy to kill if an intelligent process is trying to do so. Cut off John's access to air for a few minutes, or cut off his water supply for a few days, or poke him with a sharp stick, and he dies. Forever. (Post-humans might shudder at this absurdity like we shudder at the idea that people used to die from their teeth.)
Why think AI is coming anytime soon? This is too complicated a topic to broach here. See Intelligence Explosion: Evidence and Import for a brief analysis of AI timelines. Or try The Uncertain Future, which outputs an estimated timeline for human-level AI based on your predictions of various technological developments. (SI is currently collaborating with the Future of Humanity Institute to write another paper on this subject.)
It's also important to mention that the case for caring about AI risk is less conjunctive than many seem to think, which I discuss in more detail here.
SI can purchase several kinds of AI risk reduction more efficiently than others can
The two organizations working most directly to reduce AI risk are the Singularity Institute and the Future of Humanity Institute (FHI). Luckily, these organizations complement each other well, as I pointed out back before I was running SI:
FHI is part of Oxford, and thus can bring credibility to existential risk reduction. Resulting output: lots of peer-reviewed papers, books from OUP like Global Catastrophic Risks, conferences, media appearances, etc.
SI is independent and is less constrained by conservatism or the university system. Resulting output: Very novel (and, to the mainstream, "weird") research on Friendly AI, and the ability to do unusual things that are nevertheless quite effective at finding/creating lots of new people interested in rationality and existential risk reduction: (1) The Sequences, the best tool I know for creating aspiring rationalists, (2) Harry Potter and the Methods of Rationality, a surprisingly successful tool for grabbing the attention of mathematicians and computer scientists around the world, and (3) the Singularity Summit, a mainstream-aimed conference that brings in people who end up making significant contributions to the movement — e.g. Tomer Kagan (an SI donor and board member) and David Chalmers (author of The Singularity: A Philosophical Analysis and The Singularity: A Reply).
A few weeks later, Nick Bostrom (Director of FHI) said the same things (as far as I know, without having read my comment):
FHI is, despite its small size, a highly productive philosophy department. More importantly, FHI has focused its research work on AI risk issues for the past 9 months, and plans to continue on that path for at least another 12 months. This is important work that should be supported. (Note that FHI recently hired SI research associate Daniel Dewey.)
SI lacks FHI's publishing productivity and its university credibility, but as an organization SI is improving quickly, and it can seize many opportunities for AI risk reduction that FHI is not well-positioned to seize. (New organizations will also tend to be less capable of seizing these opportunities than SI, due to the financial and human capital already concentrated at SI and FHI.)
Here are some examples of projects that SI is probably better able to carry out than FHI, given its greater flexibility (and assuming sufficient funding):
My replies to Holden, point by point
Holden's post makes so many claims that I'll just have to work through his post from beginning to end, and then summarize where I think we stand at the end.
GiveWell Labs
Holden opened "Thoughts on the Singularity Institute" by noting that SI was previously outside Givewell's scope, since GiveWell was focused on specific domains like poverty reduction. With the launch of GiveWell Labs, GiveWell is now open to evaluating any giving opportunity, including SI.
I admire this move. I'm sure people have been bugging GiveWell to do this for a long time, but almost none of those people appreciate how hard it is to launch broad new initiatives like this with the limited budget of an organization like GiveWell or the Singularity Institute. Most of them also do not understand how much work is required to write something like "Thoughts on the Singularity Institute", "Reply to Holden on Tool AI", or this post.
Three possible outcomes
Next, Holden wrote:
As explained at the top of Holden's post, I had already conceded that many of Holden's objections (especially concerning past organizational competence) are valid, and had been working to address them, even before Holden's post was published. So outcome #2 is already true in part.
I hope for outcome #1, too, but I don't expect Holden to change his opinion overnight. There are too many possible objections to which Holden has not yet heard a good response. But hopefully this post and its comment threads will successfully address some of Holden's (and others') objections.
Outcome #3 is unlikely since SI is already making changes, though of course it's possible we will be unable to raise sufficient funding for SI despite making these changes, or even because of our efforts to make these changes. (Improving general organizational effectiveness is important but it costs money and is not exciting to donors.)
SI's mission is more important than SI as an organization
Holden said:
Clearly, SI's mission is more important than SI as an organization. If somebody launches an organization more effective (at AI risk reduction) than SI but just as flexible, then SI should probably fold itself and try to move its donor base, support community, and the best of its human capital to that new organization.
That said, it's probably easier to reform SI into a more effective organization than it is to launch a new one, since SI has successfully concentrated lots of attention, donor support, and human capital. Also, SI has learned many lessons about how to run a very tricky kind of organization. AI risk reduction is a mission that (1) is beyond most people's time horizons for caring, (2) is hard to understand and visualize, (3) pattern-matches to science fiction and apocalyptic religion, (4) suffers under complicated and necessarily uncertain strategic considerations (compare to the simplicity of bed nets), (5) has a very small pool of people from which to recruit researchers, etc. SI has lots of experience with these issues; experience that probably takes a long time and lots of money to acquire.
(On the other hand, SI has also concentrated some bad reputation which a new organization could launch without. But I still think the weight of the arguments is in favor of reforming SI.)
SI's arguments need to be clearer
Holden:
I agree that SI's arguments are often vague. For example, Chris Hallquist reported:
I know the feeling! That's why I've tried to write as many clarifying documents as I can, including the Singularity FAQ, Intelligence Explosion: Evidence and Import, The Singularity and Machine Ethics, Facing the Singularity, So You Want to Save the World, and How to Purchase AI Risk Reduction.
Unfortunately, it takes lots of resources to write up hundreds of arguments and responses to objections in clear and precise language, and we're working on it. (For comparison, Nick Bostrom's forthcoming book on machine superintelligence will barely scratch the surface of the things SI and FHI researchers have worked out in conversation, and it will probably take him 2+ years to write in total, and Bostrom is already an unusually prolific writer.) Hopefully SI's responses to Holden's post have helped to clarify our positions already.
Holden's objection #1 punts to objection #2
The first objection on Holden's numbered list was:
I'm glad Holden agrees with us that successful Friendly AI is very hard. SI has spent much of its effort trying to show people that the first 20 solutions they come up with all fail. See: AI as a Positive and Negative Factor in Global Risk, The Singularity and Machine Ethics, Complex Value Systems are Required to Realize Valuable Futures, etc. Holden mentions the standard SI worry about the hidden complexity of wishes, and the one about a friendly utility function still causing havoc because the AI's priors are wrong (problem 3.6 from my list of open problems in AI risk research).
There are reasons to think FAI is harder still. What if we get the utility function right and we get the priors right but the AI's values change for the worse when it updates its ontology? What if the smartest, most careful, most insanely safety-conscious AI researchers humanity can produce just aren't smart enough to solve the problem? What if no humans are altruistic enough to choose to build FAI over an AI that will make them king of the universe? What if the idea of FAI is incoherent? (The human brain is an existence proof for the possibility of general intelligence, but we have no existence proof for the possibility of a decision theoretic agent which stably optimizes the world according to a set of preferences over states of affairs.)
So, yeah. Friendly AI is hard. But as I said elsewhere:
So Holden's objection #1 really just punts to objection #2, about tool-AGI, as the last paragraph in this section of Holden's post seems to indicate:
So if Holden's objection #2 doesn't work, then objection #1 ends up reducing to "the development of Friendliness theory can achieve at best a reduction in AI risk," which is what SI has been saying all along.
Tool AI
Holden's second numbered objection was:
Eliezer wrote a whole post about this here. To sum up:
(1) Whether you're working with Tool AI or Agent AI, you need the "Friendly AI" domain experts that SI is trying to recruit:
(2) Tool AI isn't that much safer than Agent AI, because Tool AIs have lots of hidden "gotchas" that cause havoc, too. (See Eliezer's post for examples.)
These points illustrate something else Eliezer wrote:
Indeed. We need places for experts who specialize in seeing the consequences of mathematical objects for things humans value (e.g. the Singularity Institute) just like we need places for experts on efficient charity (e.g. GiveWell).
Anyway, it's worth pointing out that Holden did not make the common (and mistaken) argument that "We should just build Tool AIs instead of Agent AIs and then we'll be fine." This is wrong for many reasons, but one obvious point is that there are incentives to build Agent AIs (because they're powerful), so even if the first 6 teams are careful enough to build only Tool AIs, the 7th team could still build Agent AI and destroy the world.
Instead, Holden pointed out that you could use Tool AI to increase your chances of successfully building agenty FAI:
After reading Eliezer's reply, however, you can probably guess my replies to this paragraph:
So Holden's Objection #2 doesn't work, which (as explained earlier) means that his Objection #1 (as stated) doesn't work either.
SI's mission assumes a scenario that is far less conjunctive than it initially appears.
Holden's objection #3 is:
His main concern here seemed to be that technological developments and other factors would render earlier FAI work irrelevant. But Eliezer's clarifications about what we mean by "FAI team" render this objection moot, at least as it is currently stated. The purpose of an FAI team is not to blindly develop one particular approach to Friendly AI without checking to see whether this work will be obsoleted by future developments. Instead, the purpose of an FAI team is to develop highly specialized expertise on, among other things, which kinds of research are more and less likely to be relevant given future developments.
Holden's confusion about what SI means by "FAI team" is common and understandable, and it is one reason that SI's mission assumes a scenario that is far less conjunctive than it appears to many. We aren't saying we need an FAI team because we know lots of specific things about how AGI will be built 30 years from now. We're saying you need experts on "the consequences of mathematical objects for things humans value" (an FAI team) because AGIs are mathematical objects and will have big consequences. That's pretty disjunctive.
Similarly, many people think SI's mission is predicated on hard takeoff. After all, we call ourselves the "Singularity Institute," Eliezer has spent a lot of time arguing for hard takeoff, and our current research summary frames AI risk in terms of recursive self-improvement.
But the case for AI as a global risk, and thus the need for dedicated experts on AI risk and "the consequences of mathematical objects for things humans value", isn't predicated on hard takeoff. Instead, it looks something like this:
(1) Eventually, most tasks are performed by machine intelligences.
The improved flexibility, copyability, and modifiability of machine intelligences make them economically dominant even without other advantages (Brynjolfsson & McAfee 2011; Hanson 2008). In addition, there is plenty of room "above" the human brain in terms of hardware and software for general intelligence (Muehlhauser & Salamon 2012; Sotala 2012; Kurzweil 2005).
(2) Machine intelligences don't necessarily do things we like.
We don't necessarily control AIs, since advanced intelligences may be inherently goal-oriented (Omohundro 2007), and even if we build advanced "Tool AIs," these aren't necessarily safe either (Yudkowsky 2012) and there will be significant economic incentives to transform them into autonomous agents (Brynjolfsson & McAfee 2011). We don't value most possible futures, but it's very hard to get an autonomous AI to do exactly what you want (Yudkowsky 2008, 2011; Muehlhauser & Helm 2012; Arkin 2009).
(3) There are things we can do to increase the probability that machine intelligences do things we like.
Further research can clarify (1) the nature and severity of the risk, (2) how to engineer goal-oriented systems safely, (3) how to increase safety with differential technological development, (4) how to limit and control machine intelligences (Armstrong et al. 2012; Yampolskiy 2012), (5) solutions to AI development coordination problems, and more.
(4) We should do those things now.
People aren't doing much about these issues now. We could wait until we understand better (e.g.) what kind of AI is likely, but: (1) it might take a long time to resolve the core issues, including difficult technical subproblems that require time-consuming mathematical breakthroughs, (2) incentives may be badly aligned (e.g. there seem to be strong economic incentives to build AI, but not to take into account the social and global risks of AI), (3) AI may not be that far away (Muehlhauser & Salamon 2012), and (4) the transition to machine dominance may be surprisingly rapid due to (e.g.) intelligence explosion (Chalmers 2010, 2012; Muehlhauser & Salamon 2012) or computing overhang.
What do I mean by "computing overhang"? We may get the hardware needed for AI long before we get the software, such that once software for general intelligence is figured out, there is tons of computing hardware sitting around for running AIs (a "computing overhang"). Thus we could switch from a world with one autonomous AI to a world with 10 billion autonomous AIs at the speed of copying software, and thereby transition rapidly from human dominance to AI dominance even without an intelligence explosion. (This is one of the many, many things we haven't yet written up in detail due to lack of resources.)
(This broad argument is greatly compressed from a paper outline developed by Paul Christiano, Carl Shulman, Nick Beckstead, and myself. We'd love to write the paper at some point, but haven't had the resources to do so. The fuller version of this argument is of course more detailed.)
SI's public argumentation
Next, Holden turned to the topic of SI's organizational effectiveness:
The first reason Holden gave for his negative impression of SI is:
I agree in part. Here's what I think:
SI's endorsements
The second reason Holden gave for his negative impression of SI is "a lack of impressive endorsements." This one is generally true, despite the three "celebrity endorsements" on our new donate page. More impressive than these is the fact that, as Eliezer mentioned, the latest edition of the leading AI textbook spends several pages talking about AI risk and Friendly AI, and discusses the work of SI-associated researchers like Eliezer Yudkowsky and Steve Omohundro while completely ignoring the existence of the older, more prestigious, and vastly larger mainstream academic field of "machine ethics."
Why don't we have impressive endorsements? To my knowledge, SI hasn't tried very hard to get them. That's another thing we're in the process of changing.
SI and feedback loops
The third reason Holden gave for his negative impression of SI is:
We have thought many times about commercially viable innovations we could develop, but these would generally be large distractions from the work of our core mission. (The Center for Applied Rationality, in contrast, has many opportunities to develop commercially viable innovations in line with its core mission.)
Still, I do think it's important for the Singularity Institute to test itself with tight feedback loops wherever feasible. This is particularly difficult to do for a research organization doing a philosophy of long-term forecasting (30 years is not a "tight" feedback loop in the slightest), but that's what FHI does and they have more "objectively impressive" (that is, "externally proclaimed") accomplishments: lots of peer-reviewed publications, some major awards for its top researcher Nick Bostrom, etc.
SI and rationality
Holden's fourth concern about SI is that it is overconfident about the level of its own rationality, and that this seems to show itself in (e.g.) "insufficient self-skepticism" and "being too selective (in terms of looking for people who share its preconceptions) when determining whom to hire and whose feedback to take seriously."
What would provide good evidence of rationality? Holden explains:
Unfortunately, this seems to misunderstand the term "rationality" as it is meant in cognitive science. As I explained elsewhere:
But I don't mean to dodge the key issue. I think SIers are generally more rational than most people (and so are LWers, it seems), but I think SIers have often overestimated their own rationality, myself included. Certainly, I think SI's leaders have been pretty irrational about organizational development at many times in the past. In internal communications about why SI should help launch CFAR, one reason on my list has been: "We need to improve our own rationality, and figure out how to create better rationalists than exist today."
SI's goals and activities
Holden's fifth concern about SI is the apparent disconnect between SI's goals and its activities:
This one is pretty easy to answer. We've focused mostly on movement-building rather than direct research because, until very recently, there wasn't enough community interest or funding to seriously begin to form an FAI team. To do that you need (1) at least a few million dollars a year, and (2) enough smart, altruistic people who care about AI risk that there exist some potential superhero mathematicians for the FAI team. And to get those two things, you've got to do mostly movement-building, e.g. Less Wrong, HPMoR, the Singularity Summit, etc.
Theft
And of course, Holden is (rightly) concerned about the 2009 theft of $118,000 from SI, and the lack of public statements from SI on the matter.
Briefly:
Pascal's Mugging
In another section, Holden wrote:
Some problems with Holden's two posts on this subject will be explained in a forthcoming post by Steven Kaas. But as Holden notes, some SI principals like Eliezer don't use "small probability of large impact" arguments, anyway. We in fact argue that the probability of a large impact is not tiny.
Summary of my reply to Holden
Now that I have addressed so many details, let us return to the big picture. My summarized reply to Holden goes like this:
Holden's first two objections can be summarized as arguing that developing the Friendly AI approach is more dangerous than developing non-agent "Tool" AI. Eliezer's post points out that "Friendly AI" domain experts are what you need whether you're working with Tool AI or Agent AI, because (1) both of these approaches require FAI experts (experts in seeing the consequences of mathematical objects for what humans value), and because (2) Tool AI isn't necessarily much safer than Agent AI, because Tool AIs have lots of hidden gotchas, too. Thus, "What the human species needs from an x-risk perspective is experts on This Whole Damn Problem [of AI risk], who will acquire whatever skills are needed to that end. The Singularity Institute exists to host such people and enable their research — once we have enough funding to find and recruit them."
Holden's third objection was that the argument behind SI's mission is more conjunctive than it seems. I replied that the argument behind SI's mission is actually less conjunctive than it often seems, because an "FAI team" works on a broader set of problems than Holden had realized, and because the case for AI risk is more disjunctive than many people realize. These confusions are understandable, however, and they probably are a result of insufficiently clear argumentative writing from SI on these matters — a problem we are trying to fix with several recent and forthcoming papers and other communications (like this one).
Holden's next objection concerned SI as an organization: "SI has, or has had, multiple properties that I associate with ineffective organizations." I acknowledged these problems before Holden published his post, and have since outlined the many improvements we've made to organizational effectiveness since I was made Executive Director. I addressed several of Holden's specific worries here.
Finally, Holden recommended giving to a donor-advised fund rather than to SI:
By now I've called into question most of Holden's arguments about SI, but I will still address the issue of donating to SI vs. donating to a donor-advised fund.
First: Which public charity would administer the donor-advised fund? Remember also that in the U.S., the administering charity need not spend from the donor-advised fund as the donor wishes, though they often do.
Second: As I said earlier,
The case for funding improvements and growth at SI (as opposed to starving SI as Holden suggests) is bolstered by the fact that SI's productivity and effectiveness have been improving rapidly of late, and many other improvements (and exciting projects) are on our "to-do" list if we can raise sufficient funding to implement them.
Holden even seems to share some of this optimism:
Conclusion
For brevity's sake I have skipped many important details. I may also have misinterpreted Holden somewhere. And surely, Holden and other readers have follow-up questions and objections. This is not the end of the conversation; it is closer to the beginning. I invite you to leave your comments, preferably in accordance with these guidelines (for improved discussion clarity).