DaFranker comments on Reply to Holden on The Singularity Institute - Less Wrong

46 Post author: lukeprog 10 July 2012 11:20PM

Comment author: HoldenKarnofsky 01 August 2012 02:16:55PM 14 points

I greatly appreciate the response to my post, particularly the highly thoughtful responses of Luke (original post), Eliezer, and many commenters.

Broad response to Luke's and Eliezer's points:

As I see it, there are a few possible visions of SI's mission:

  • M1. SI is attempting to create a team to build a "Friendly" AGI.
  • M2. SI is developing "Friendliness theory," which addresses how to develop a provably safe/useful/benign utility function without needing iterative/experimental development; this theory could be integrated into an AGI developed by another team, in order to ensure that its actions are beneficial.
  • M3. SI is broadly committed to reducing AGI-related risks, and works on whatever will advance that goal, including potentially M1 and M2.

My view is that the broader SI's mission, the higher the bar should be for the overall impressiveness of the organization and team. An organization with a very narrow, specific mission - such as "analyzing how to develop a provably safe/useful/benign utility function without needing iterative/experimental development" - can, relatively easily, establish which other organizations (if any) are trying to provide what it does and what the relative qualifications are; it can set clear expectations for deliverables over time and be held accountable to them; its actions and outputs are relatively easy to criticize and debate. By contrast, an organization with broader aims and less clearly relevant deliverables - such as "broadly aiming to reduce risks from AGI, with activities currently focused on community-building" - is giving a donor (or evaluator) less to go on in terms of what the space looks like, what the specific qualifications are and what the specific deliverables are. In this case it becomes more important that a donor be highly confident in the exceptional effectiveness of the organization and team as a whole.

Many of the responses to my criticisms (points #1 and #4 in Eliezer's response; "SI's mission assumes a scenario that is far less conjunctive than it initially appears" and "SI's goals and activities" section of Luke's response) correctly point out that they have less force, as criticisms, when one views SI's mission as relatively broad. However, I believe that evaluating SI by a broader mission raises the burden of affirmative arguments for SI's impressiveness. The primary such arguments I see in the responses are in Luke's list:

(1) The Sequences, the best tool I know for creating aspiring rationalists, (2) Harry Potter and the Methods of Rationality, a surprisingly successful tool for grabbing the attention of mathematicians and computer scientists around the world, and (3) the Singularity Summit, a mainstream-aimed conference that brings in people who end up making significant contributions to the movement — e.g. Tomer Kagan (an SI donor and board member) and David Chalmers (author of The Singularity: A Philosophical Analysis and The Singularity: A Reply).

I've been a consumer of all three of these, and while I've found them enjoyable, I don't find them sufficient for the purpose at hand. Others may reach a different conclusion. And of course, I continue to follow SI's progress, as I understand that it may submit more impressive achievements in the future.

Both Luke and Eliezer seem to disagree with the basic approach I'm taking here. They seem to believe that it is sufficient to establish that (a) AGI risk is an overwhelmingly important issue and that (b) SI compares favorably to other organizations that explicitly focus on this issue. For my part, I (a) disagree with the statement: "the loss in expected value resulting from an existential catastrophe is so enormous that the objective of reducing existential risks should be a dominant consideration whenever we act out of an impersonal concern for humankind as a whole"; (b) do not find Luke's argument that AI, specifically, is the most important existential risk to be compelling (it discusses only how beneficial it would be to address the issue well, not how likely a donor is to be able to help do so); (c) believe it is appropriate to compare the overall organizational impressiveness of the Singularity Institute to that of all other donation-soliciting organizations, not just to that of other existential-risk- or AGI-focused organizations. I would guess that these disagreements, particularly (a) and (c), come down to relatively deep worldview differences (related to the debate over "Pascal's Mugging") that I will probably write more about in the future.

On tool AI:

Most of my disagreements with SI representatives seem to be over how broad a mission is appropriate for SI, and how high a standard SI as an organization should be held to. However, the debate over "tool AI" is different, with both sides making relatively strong claims. Here SI is putting forth a specific point as an underappreciated insight and thus as a potential contribution/accomplishment; my view is that SI's suggested approach to AGI development is more dangerous than the "traditional" approach to software development, and thus that SI is advocating for an approach that would worsen risks from AGI.

My latest thoughts on this disagreement were posted separately in a comment response to Eliezer's post on the subject.

A few smaller points:

  • I disagree with Luke's claim that "objection #1 punts to objection #2." Objection #2 (regarding "tool AI") points out one possible approach to AGI that I believe is both consonant with traditional software development and significantly safer than the approach advocated by SI. But even if the "tool AI" approach is not in fact safer, there may be safer approaches that SI hasn't thought of. SI does not just emphasize the general problem that AGI may be dangerous (something that I believe is a fairly common view); it emphasizes a particular approach to AGI safety, one that seems to me to be highly dangerous. If SI's approach is dangerous relative to other approaches that others are taking/advocating, or even approaches that have yet to be developed (and will be enabled by future tools and progress on AGI), this is a problem for SI.
  • Luke states that rationality is "only a ceteris paribus predictor of success" and that it is a "weak one." I wish to register that I believe rationality is a strong (though not perfect) predictor of success, within the population of people who are as privileged (in terms of having basic needs met, access to education, etc.) as most SI supporters/advocates/representatives. So while I understand that success is not part of the definition of rationality, I stand by my statement that it is "the best evidence of superior general rationality (or of insight into it)."
  • Regarding donor-advised funds: opening an account with Vanguard, Schwab or Fidelity is a simple process, and I doubt any of these institutions would overrule a recommendation to donate to an organization such as SI (in any case, this is easily testable).
Comment author: DaFranker 01 August 2012 03:29:42PM 7 points

I'm very much an outsider to this discussion, and by no means a "professional researcher", but I believe those are precisely the reasons why I'm actually qualified to make the following point. I'm sure it's been made before, but a rapid scan revealed no statement of this argument quite as direct and explicit.

HoldenKarnofsky: (...) my view is that SI's suggested approach to AGI development is more dangerous than the "traditional" approach to software development, and thus that SI is advocating for an approach that would worsen risks from AGI.

I've always understood SI's position on this matter not as one of "We should not focus on building Tool AI! Fully reflectively self-modifying AGIs are the only way to go!", but rather that it is extremely unlikely that we can prevent everyone else from building one.

To my understanding, the logic goes: if any programmer with the relevant skills is sufficiently convinced, by whatever means and for whatever causes, that building a full traditional AGI is more efficient and will more "lazily" achieve his goals with fewer resources or achieve them faster, the programmer will build it whether you think it's a good idea or not. As such, SI's "Moral Imperative" is to account for this scenario, since there is a non-negligible probability of it actually happening; if they do not, they effectively become hypocritical in claiming to work toward reducing existential AI risk.

To reiterate with silly scare-formatting: It is completely irrelevant, in practice, what SI "advocates" or "promotes" as a preferred approach to building safe AI, because the probability that someone, somewhere, some day is going to use the worst possible approach is definitely non-negligible. If there is not already a sufficiently advanced Friendly AI in place to counter such a threat, we are then effectively defenseless.

To metaphorize, this is a case of: "It doesn't matter if you think only using remote-controlled battle robots would be a better way to resolve international disputes. At some point, someone somewhere is going to be convinced that killing all of you is going to be faster and cheaper and more certain of achieving their goals, so they'll build one giant bomb and throw it at you without first making sure they won't kill themselves in the process."

Comment author: John_Maxwell_IV 03 August 2012 05:28:21AM 5 points

This looks similar to this point Kaj Sotala made. My own restatement: As the body of narrow AI research devoted to making tools grows larger and larger, building agent AGI gets easier and easier, and there will always be a few Shane Legg types who are crazy enough to try it.

I sometimes suspect that Holden's true rejection of SI is that the optimal philanthropy movement is fringe enough already, and he doesn't want to associate it with nutty-seeming beliefs about near-inevitable doom from superintelligence. Sometimes I wish SI would market themselves as being similar to nuclear risk organizations like the Bulletin of the Atomic Scientists. After all, EY was an AI researcher who quit and started working on Friendliness when he saw the risks, right? I think you could make a pretty good case for SI's usefulness based purely on analogies from nuclear risk, without any mention of FOOM or astronomical waste or paperclip maximizers.

Ideally we'd have wanted to know about nuclear weapon risks before having built them, not afterwards, right?

Comment author: DaFranker 03 August 2012 12:58:33PM 1 point

Personally, I highly doubt that to be Holden's true rejection, though it is most likely one of the emotional considerations that cannot be ignored from a strategic perspective. Holden claims to have gone through most of the relevant LessWrong sequences and SIAI public presentation material, which I believe lowers the likelihood of deceptive (or self-deceptive) argumentation.

No, what I believe to be the real issue is that Holden and (most of) SIAI have disagreements over many specific claims used to justify broader claims - when the specific claims are granted in principle, both seem to generally agree, in good Bayesian fashion, on the broader or more general claim. Many of the disagreements on those specifics also appear to stem from different priors in ethical and moral values, as well as from differences in their evaluations and models of human population behaviors and in specific (but often unspecified) "best guess" probabilities.

For a generalized example, one strong argument for existential-risk reduction being the optimal use of effort is that even a minimal decrease in risk provides immense expected value, simply from the sheer magnitude of what humanity could most likely achieve throughout the rest of its existence. Many experts and scientists outright reject this on the grounds that "future, intangible, merely hypothetical" other humans should not be assigned value on the same order of magnitude as current humans, or even one order of magnitude lower.
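To make the shape of that expected-value argument concrete, here is a toy calculation. Every number in it (future population size, achievable risk reduction, the discount applied to hypothetical future people) is a purely illustrative placeholder, not an estimate anyone in this thread endorses:

```python
# Toy sketch of the expected-value argument above.
# All constants are illustrative assumptions, not real estimates.
FUTURE_LIVES = 1e16     # hypothetical count of future people if humanity survives
RISK_REDUCTION = 1e-6   # hypothetical decrease in extinction probability
FUTURE_DISCOUNT = 0.1   # value of a hypothetical future person vs. a current one

# Expected lives saved = (lives at stake) x (probability shift) x (discount).
expected_value = FUTURE_LIVES * RISK_REDUCTION * FUTURE_DISCOUNT
print(expected_value)  # 1000000000.0 - enormous despite the tiny risk shift
```

The disagreement described above maps onto FUTURE_DISCOUNT: the critics' position amounts to pushing that factor down by many orders of magnitude, which is what it takes to make the product stop dominating other philanthropic options.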

Comment author: [deleted] 01 August 2012 04:01:43PM 1 point

but rather that it is extremely unlikely that we can prevent everyone else from building one.

Well, SI's mission makes sense on the premise that the best way to prevent a badly built AGI from being developed or deployed is to build a friendly AGI which has that as one of its goals. 'Best way' here is a compromise between, on the one hand, the effectiveness of the FAI relative to other approaches, and on the other, the danger presented by the FAI itself as opposed to other approaches.

So I think Holden's position is that the ratio of danger vs. effectiveness does not weigh favorably for FAI as opposed to tool AI. So to argue against Holden, we would have to argue either that FAI will be less dangerous than he thinks, or that tool AI will be less effective than he thinks.

I take it the latter is the more plausible.

Comment author: DaFranker 02 August 2012 06:27:23PM 1 point

Indeed, we would have to argue that to argue against Holden.

My initial reaction was to counter this with a claim that we should not be arguing against anyone in the first place, but rather looking for probable truth (concentrate anticipations). And then I realized how stupid that was: Arguments Are Soldiers. If SI (and by the Blue vs Green principle, any SI-supporter) can't even defend a few claims and defeat its opponents, it is obviously stupid and not worth paying attention to.

SI needs some amount of support, yet support-maximization strategies carry a very high risk of introducing dangerous intellectual contamination in various forms (including self-reinforcing biases in the minds of researchers and future supporters) that could turn out to cause even more existential risk. Yet, at the same time, not gathering enough support quickly enough dramatically increases the risk that someone, somewhere, is going to trip on a power cable and poof, all humans are just gone.

I am definitely not masterful enough in mathematics and bayescraft to calculate the optimal route through this differential probabilistic maze, but I suspect others could provide a very good estimate.

Also, it's very much worth noting that these very considerations are, on a meta level, an integral part of SI's mission, so figuring out whether the premise you stated is true, and whether there are better solutions, actually is SI's objective. Basically, while I might understand some of the cognitive causes for it, I am still very much rationally confused when someone questions SI's usefulness by questioning the efficiency of subgoal X, while SI's original and (to my understanding) primary mission is precisely to calculate the efficiency of subgoal X.