Followup to: Outline of possible Singularity scenarios (that are not completely disastrous)

Given that the Singularity and being strategic are popular topics around here, it's surprising there hasn't been more discussion on how to answer the question "In what direction should we nudge the future, to maximize the chances and impact of a positive Singularity?" ("We" meaning the SIAI/FHI/LW/Singularitarian community.)

(Is this an appropriate way to frame the question? It's how I would instinctively frame it, but perhaps we ought to discuss alternatives first. For example, one might be "What quest should we embark upon to save the world?", which seems to be the frame that Eliezer instinctively prefers. But I worry that thinking in terms of a "quest" favors the part of the brain that is built mainly for signaling instead of planning. Another alternative would be "What strategy maximizes expected utility?", but that seems too technical for human minds to grasp on an intuitive level, and we don't have the tools to answer the question formally.)

Let's start by assuming that humanity will want to build at least one Friendly superintelligence sooner or later, either from scratch, or by improving human minds, because without such an entity, it's likely that eventually either a superintelligent, non-Friendly entity will arise, or civilization will collapse. The current state of affairs, in which there is no intelligence greater than baseline-human level, seems unlikely to be stable over the billions of years of the universe's remaining life. (Nor does that seem particularly desirable even if it is possible.)

Whether to push for (or personally head towards) de novo AI directly, or IA/uploading first, depends heavily on the expected (or more generally, subjective probability distribution of) difficulty of building a Friendly AI from scratch, which in turn involves a great deal of logical and philosophical uncertainty. (For example, if it's known that it actually takes a minimum of 10 people with IQ 200 to build a Friendly AI, then there is clearly little point in pushing for de novo AI first.)

Besides the expected difficulty of building FAI from scratch, another factor that weighs heavily in the decision is the risk of accidentally building an unFriendly AI (or contributing to others building UFAIs) while trying to build FAI. Taking this into account also involves lots of logical and philosophical uncertainty. (But it seems safe to assume that this risk, if plotted against the intelligence of the AI builders, forms an inverted U shape.)
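To make the shape of that assumption concrete, here is a toy numerical sketch (every function and parameter below is a made-up illustrative assumption, not an estimate): the risk of accidentally producing a UFAI can be thought of as the chance that the builders can produce a superintelligent AGI at all (which rises with intelligence) times the chance that they then get Friendliness wrong (which falls with intelligence), and the product of a rising curve and a falling curve gives the inverted U.

    import math

    def p_can_build_agi(iq, threshold=160.0, scale=10.0):
        # Chance a team at this intelligence level can build any superintelligent AGI (rises with IQ).
        # The threshold and scale are arbitrary made-up numbers.
        return 1.0 / (1.0 + math.exp(-(iq - threshold) / scale))

    def p_friendliness_failure(iq, threshold=200.0, scale=10.0):
        # Chance such a team gets Friendliness wrong, given that it builds an AGI (falls with IQ).
        return 1.0 / (1.0 + math.exp((iq - threshold) / scale))

    def p_accidental_ufai(iq):
        # Risk of accidentally producing a UFAI: able to build AGI, unable to make it Friendly.
        return p_can_build_agi(iq) * p_friendliness_failure(iq)

    for iq in range(120, 241, 20):
        print("IQ", iq, "-> toy risk", round(p_accidental_ufai(iq), 2))

Under these made-up numbers the risk peaks around IQ 180 and falls off on both sides, which is all the inverted-U claim amounts to.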

Since we don't have good formal tools for dealing with logical and philosophical uncertainty, it seems hard to do better than to make some incremental improvements over gut instinct. One idea is to train our intuitions to be more accurate, for example by learning about the history of AI and philosophy, or by learning known cognitive biases and doing debiasing exercises. But this seems insufficient to bridge the gap between the widely differing intuitions people have on these questions.

My own feeling is that the chance of success of building FAI, assuming the current human intelligence distribution, is low (even given unlimited financial resources), while the risk of unintentionally building or contributing to UFAI is high. I think I can explicate part of my intuition this way: there must be a minimum level of intelligence below which the chance of successfully building an FAI is negligible. We humans seem at best just barely smart enough to build a superintelligent UFAI. Wouldn't it be surprising if the intelligence thresholds for building UFAI and FAI turned out to be the same?

Given that there are known ways, such as cloning or embryo selection, to significantly increase the number of geniuses (i.e., von Neumann level, or IQ 180 and greater), an obvious alternative Singularity strategy is to invest directly or indirectly in these technologies, and to try to mitigate existential risks (for example by attempting to delay all significant AI efforts) until they mature and bear fruit (in the form of adult genius-level FAI researchers). Other strategies in the same vein are to pursue cognitive/pharmaceutical/neurosurgical approaches to increasing the intelligence of existing humans, or to push for brain emulation first, followed by intelligence enhancement of human minds in software form.

Social/PR issues aside, these alternatives make more intuitive sense to me. The chances of success seem higher, and if disaster does occur as a result of the intelligence amplification effort, we're more likely to be left with a future that is at least partly influenced by human values. (Of course, in the final analysis, we also have to consider social/PR problems, but all Singularity approaches seem to have similar problems, which can be partly ameliorated by the common sub-strategy of "raising the general sanity level".)

I'm curious what others think. What does your intuition say about these issues? Are there good arguments in favor of any particular strategy that I've missed? Is there another strategy that might be better than the ones mentioned above?

I was informed by Justin Shovelain that recently he independently circulated a document arguing for "IA first", and that most of the two dozen people he showed it to agreed with it, or nearly so.

I'm a bit surprised there haven't been more people arguing (or at least stating their intuition) that "AI first" is the better strategy.

But I did find that Eliezer had written an argument explaining why he chose the "AI first" strategy in Artificial Intelligence as a Positive and Negative Factor in Global Risk (pages 31-35). Here's the conclusion from that section:

I would be pleasantly surprised if augmented humans showed up and built a Friendly AI before anyone else got the chance. But someone who would like to see this outcome will probably have to work hard to speed up intelligence enhancement technologies; it would be difficult to convince me to slow down. If AI is naturally far more difficult than intelligence enhancement, no harm done; if building a 747 is naturally easier than inflating a bird, then the wait could be fatal. There is a relatively small region of possibility within which deliberately not working on Friendly AI could possibly help, and a large region within which it would be either irrelevant or harmful. Even if human intelligence enhancement is possible, there are real, difficult safety considerations; I would have to seriously ask whether we wanted Friendly AI to precede intelligence enhancement, rather than vice versa.

I do not assign strong confidence to the assertion that Friendly AI is easier than human augmentation, or that it is safer. There are many conceivable pathways for augmenting a human. Perhaps there is a technique which is easier and safer than AI, which is also powerful enough to make a difference to existential risk. If so, I may switch jobs. But I did wish to point out some considerations which argue against the unquestioned assumption that human intelligence enhancement is easier, safer, and powerful enough to make a difference.

If AI is naturally far more difficult than intelligence enhancement, no harm done

I should probably write a more detailed response to Eliezer's argument at some point. But for now it seems worth pointing out that if UFAI is of comparable difficulty to IA, but FAI is much harder (as seems plausible), then attempting to build FAI would cause harm by diverting resources away from IA and by otherwise increasing the likelihood of UFAI coming first.

What if UFAI (of the dangerous kind) is incredibly difficult compared to harmless but usable AI, such as a system that can analytically (not by mere brute force) find the inputs to any computable function that maximize its output, and which, for example, understands ODEs?

We could use it to cure every disease, including mortality; to improve the system itself; and to design the machinery for mind uploading - all with comparatively little effort, since it would take over much of the cognitive workload. But it won't help us make the 'utility function' in the SI sense (paperclips, etc.), as that is a problem of definition.

I feel that the term 'unfriendly AI' is a clever rhetorical technique. The above-mentioned math AI is not friendly, but neither is it unfriendly. Several units could probably be combined to cobble together a natural language processing system as well. Nothing like 'hearing a statement then adopting a real-world goal to the general gist of it', though.

Cousin_it (who took a position similar to yours) and Nesov had a discussion about this, and I tend to agree with Nesov. But perhaps this issue deserves a more extensive discussion. I will give it some thought and maybe write a post.

The discussion you link is purely ideological: pessimistic, narrow-minded cynicism about the human race (on Nesov's side) versus the normal view, without any justification whatsoever for either.

The magical optimizer allows for space colonization (probably), cures for every disease, solutions to energy problems, and so on. We have much less room for intelligent improvement when it comes to destroying ourselves - the components for deadly diseases come pre-made by evolution, nuclear weapons have already been invented, etc. The capacity for destruction is bounded by what we have to lose (and we already have the capacity to lose everything), while the capacity for growth is bounded by the much larger value of what we might gain.

Sure, the magical friendly AI is better than anything else. So is a flying carpet better than a car.

When you focus so much on the notion that others are stupid, you forget how hostile the very universe we live in is, and you neglect how important it is to save ourselves from external-ish factors. As long as viruses like the common cold and flu can exist and be widespread, it is only a matter of time until there is a terrible pandemic killing an enormous number of people (and potentially crippling the economy). We haven't even gotten rid of dangerous parasites yet; we're not even top of the food chain, really, if you count parasites. We are also stuck on a rock hurtling through space full of rocks, and we can't go anywhere.

What if, as I suspect, UFAI is much easier than IA, where IA is at the level you're hoping for? Moreover, what evidence can you offer that researchers of von Neumann's intelligence face a significantly smaller difficulty gap between UFAI and FAI than those of mere high intelligence? For some determinacy, let "significantly smaller difficulty gap" mean that von Neumann level intelligence gives at least twice the probability of FAI, conditional on GAI.
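Spelling that definition out as an inequality (the notation is mine, not part of the definition): P(FAI | GAI, von Neumann-level researchers) >= 2 * P(FAI | GAI, merely highly intelligent researchers).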

Basically, I think you overestimate the value of intelligence.

Which is not to say that a parallel track of IA might not be worth a try.

What if, as I suspect, UFAI is much easier than IA, where IA is at the level you're hoping for?

I had a post about this.

Moreover, what evidence can you offer that researchers of von Neumann's intelligence face a significantly smaller difficulty gap between UFAI and FAI than those of mere high intelligence? For some determinacy, let "significantly smaller difficulty gap" mean that von Neumann level intelligence gives at least twice the probability of FAI, conditional on GAI.

If it's the case that even researchers of von Neumann's intelligence cannot attempt to build FAI without creating unacceptable risk, then I expect they would realize that (assuming they are not less rational than we are) and find even more indirect ways of building FAI (or optimizing the universe for humane values in general), like for example building an MSI-2.

informed by Justin Shovelain that recently he independently circulated a document

Is this a super-secret document or can we ask Justin to share?

Sorry, I should have said that it's a draft document. I didn't see any particularly sensitive information in it, so presumably Justin will release it when it's ready. But the argument is basically along the same lines as my OP.

I think that some organization should be seriously planning how to leverage possible uploading or intelligence improvement technologies for building FAI (e.g. try to be the first to run an accelerated uploaded FAI research team; or develop better textbooks on FAI theory that improve the chances that future people, uploaded or with improved intelligence, get it right), and tracking new information that impacts such plans. Perhaps SIAI should form a team that works on that, or create another organization with that mission (right now it doesn't look like a top priority, but it could be, so this deserves some thought; it will be a clear priority in another 20 years, as relevant technologies move closer, but by then it might be too late to do some specific important-in-retrospect thing).

What do you think about the idea that people who are currently interested in doing FAI research limit their attention to topics that are not "dual use" (i.e., equally applicable to building UFAI)? For example, metaethics and metaphilosophy seem "single use", whereas logical uncertainty, and to a lesser extent, decision theory, seem "dual use". Of course we could work on dual use topics and try to keep the results secret, but it seems unlikely that we'd be good enough at keeping secrets that such work won't leak out pretty quickly.

We humans seem at best just barely smart enough to build a superintelligent UFAI.

I haven't unpacked the intuition, but I feel humans are pretty far over the threshold of being smart enough to build general AI, given no giant setbacks/existential catastrophes and no significant intelligence enhancement within the next 500 years. In the last 500 years we gained probability theory, in the last 100 utility theory, in the last 50 formal causality theory, and it seems we're making some progress on new decision theories now. (I don't claim to know whether those are the most important theoretical inventions or anything, but they seem likely to be very important.) And the progress we've made on mechanical integration of sensory input from the world feels like it's not going to be the bottleneck... so give me, conservatively, another one fundamental breakthrough per century, and I'm not at all comfortable saying that after five of those humans would be just barely smart enough to build UFAI.

Did you have a much shorter time horizon, or do you know what other thing it is that fuels your just-barely-smart-enough intuition?

If AGI takes longer than 100 years to become possible, "AI first" isn't a relevant strategic option, since an upload- or IA-driven Singularity will probably occur within that time frame even without any specific push from Singularitarians. So it seems reasonable to set a time horizon of 100 years at most.

Ah okay, so we're talking about a "humans seem just barely smart enough to build a superintelligent UFAI within the next 100 years" intuition. Talking about that makes sense, and that intuition feels much more plausible to me.

I'd give it 150 years. Civilization might get a setback, actual implementation of fast-running uploads might be harder than it looks, and intelligence improvement might take too long to become an important force. Plans can fail.

We humans seem at best just barely smart enough to build a superintelligent UFAI.

Humans seem just barely smart enough to do many things, especially in the period when doing those things is just becoming possible.

That sentence makes more sense if you consider it within a time horizon in which strategic considerations are relevant. See my reply to GuySrinivasan. (I probably should have made that clearer in the post.)

Wouldn't it be surprising if the intelligence thresholds for building UFAI and FAI turned out to be the same?

Probably not, since you use knowledge to build things, not just intelligence, and knowledge accumulates. For example, humans can build (invent) both windmills and space shuttles, and lower intelligence would probably bar humans from building (inventing) windmills.

Again, that whole section makes more sense if we assume a limited time horizon. But I'm curious: don't you think it's likely that there is some distribution of intelligence at which humanity would have invented windmills but then stagnated before reaching space shuttles? It seems to me that building windmills requires only some physical/mechanical intuitions plus trial and error, whereas space shuttles need higher mathematics and computers, which can't be understood (much less invented) below a certain IQ.

I wrote that comment before reading the clarification about there being an intended context of a time horizon. I agree that it'd take longer to develop FAI than to develop a UFAI, all else equal. (Of course, all else is not equal, and you did write about an intelligence threshold and not time.)

I think outliers in intelligence gave us a significant speedup in the time it took to build space shuttles, compared to how long it would take a species of uniform ability that was only barely smart enough to invent windmills. But I expect they'd get there, barring defeaters (to use Chalmers' term), if they managed to stumble upon a science-generating social institution and avoid the collapse of their civilization for long enough to accumulate the necessary knowledge. People who are not that smart can do science and learn science, just as they can learn language or any other skill; it just takes them much longer to learn it or to find useful abstract insights, and each individual can master less. We (as a civilization) don't use people of low ability for this purpose only because there is a much more efficient alternative.

We humans seem at best just barely smart enough to build a superintelligent UFAI. Wouldn't it be surprising if the intelligence thresholds for building UFAI and FAI turned out to be the same?

I think people who would contest the direction of this post (probably Eli and Nesov) would point out that if humanity is over the intelligence threshold for UFAI, economic-political-psychological forces will drive it to be built within a timeframe of a few decades. Anything that does not address this directly will destroy the future. Building smarter humans is likely not fast enough (plus, who says smarter humans will not be driven by the same forces to build UFAI?).

The problem is that building FAI is also likely not fast enough, given that UFAI looks significantly easier than FAI. And there are additional unique downsides to attempting to build FAI: since many humans are naturally competitive, it provides additional psychological motivation for others to build AGI; unless the would-be FAI builders have near perfect secrecy and security, they will leak ideas and code to AGI builders not particularly concerned with Friendliness; the FAI builders may themselves accidentally build UFAI; it's hard to do anti-AI PR/politics (to delay UFAI) while you're trying to build an AI yourself.

ETA: Also, the difficulty of building smarter humans seems logically independent of the difficulty of building UFAI, whereas the difficulty of building FAI is surely at least as great as the difficulty of building UFAI. So it seems the likelihood that building smarter humans is fast enough is higher.
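Here's a toy Monte Carlo sketch of that last argument (all distributions and numbers below are made up purely for illustration, not estimates of anything): if FAI's difficulty is UFAI's difficulty plus a non-negative gap, while IA's difficulty is drawn independently, then under the same assumed head start the IA route comes in under the wire more often than the FAI route.

    import random

    TRIALS = 100_000
    HEAD_START = 1.5  # assumed resource/time advantage of the safety-oriented effort (made up)

    ia_wins = fai_wins = 0
    for _ in range(TRIALS):
        ufai = random.lognormvariate(0.0, 1.0)        # difficulty of UFAI (arbitrary units)
        fai = ufai + random.lognormvariate(0.0, 1.0)  # FAI is at least as hard as UFAI
        ia = random.lognormvariate(0.0, 1.0)          # IA difficulty, independent of UFAI's
        ia_wins += ia < HEAD_START * ufai
        fai_wins += fai < HEAD_START * ufai

    print("P(IA ready before UFAI)  ~", round(ia_wins / TRIALS, 2))
    print("P(FAI ready before UFAI) ~", round(fai_wins / TRIALS, 2))

With these particular made-up distributions the IA route is ready in time roughly twice as often as the FAI route; the exact numbers are meaningless, but the asymmetry comes from the independence of IA's difficulty combined with the non-negative FAI-over-UFAI gap.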

plus who says smarter humans will not be driven by the same forces to build UFAI?

Smarter humans will see the difficulty gap between FAI and UFAI as smaller, so they'll be less motivated to "save time and effort" by not taking safety/Friendliness seriously. The danger of UFAI will also be more obvious to them.

Well, let me take one more step back here, and ask the question, why do we want to have a singularity at all?

I realize that might cause a knee-jerk reaction from some people, but think about it for a second; you have to know what your goals are before trying to figure out what strategy to use to get there.

If the answer is something like "So we can cure old age, live much longer, happier lives, develop a post-scarcity economy, colonize space, prevent environmental collapse, avoid existential risks, and expand into the universe", then I wonder: what if IA is sufficient to do all of that without developing a GAI at all (at least in the near future)? I'm not saying that an IA human is going to be as smart as a GAI computer; but what if a group of IA humans with a 250 IQ or so are smart enough to accomplish whatever our goals are?

I'm never exactly sure how to respond to a question like that.

I mean, let's up the ante a little: what if a group of IA humans with "a 250 IQ or so" are not only sufficient to accomplish our goals, but optimal... what if it turns out that for every unit of intelligence above that threshold, our chance of achieving our goals goes down?

Well, in that case we would do best to have such a group, and to not develop greater intelligence than that.

That said, if I found myself tomorrow confident that a proposition like that was true, I would strongly encourage myself to review how I had arrived at such confidence as rigorously as I could before actually making any decisions based on that confidence, because it's just so damned implausible that any halfway legitimate process would arrive at such a conclusion.

Well, the way I defined those goals, I think those are all things that I'm pretty confident even unenhanced humans would eventually be able to do; intelligence enhancement here is more about speeding things up than about doing the impossible.

I think that curing old age and creating an indefinite lifespan is something we would eventually manage even without enhancement, through some combination of SENS, cloning, organ bioprinting, cybernetics, genetic engineering, etc. Notice, however, that if I had defined the goal as "preventing all death, including accidental death", then we get into a realm where I become less sure of unenhanced humans solving that problem. Like I said, how you set your goals is pretty vital here.

Same for the other goals I mentioned: a "basic" post-scarcity system (everyone able to get all the necessities and most consumer goods at will, with almost no one needing to work unless they want to) is something we should be able to achieve with just some fairly modest advances in narrow AI and robotics, combined with a new economic/political system; that should be within our theoretical capabilities at some point. Space colonization, again, doesn't seem like an impossible task based on what we know now.

That's not even to say that we won't develop GAI, or that we shouldn't... but if we can solve those basic types of problems without it, then maybe that at least lets us take our time, develop GAI one small step at a time, and make sure we get it right. You know: have AIs with 100 IQ help us develop Friendly AI theory for a while before we start building AIs with 150 IQ, and then have those help refine our design of AIs with 180 IQ, etc., instead of a risky, rapid, self-modifying foom.

I certainly agree that, had you set more difficult goals, lower confidence that we can achieve them at all would be appropriate.
I'm not as confident as you sound that we can plausibly rely on continuing unbounded improvements in narrow AI, robotics, economics, politics, space exploration, genetic engineering, etc., though it would certainly be nice.
And, sure, if we're confident enough in our understanding of underlying theory that we have some reasonable notion of how to safely do "one step at a time and make sure we get it right," that's a fine strategy. I'm not sure such confidence is justified, though.

It's not so much that I have confidence that we can have unbounded improvements in any one of those fields, like genetic engineering, for example; it's more that there are at least 5 or 6 major research paths right now that could conceivably cure aging, all of which seem promising and all of which are areas where significant progress is being made. So even if one or two of them prove impossible or impractical, I think the odds of at least some of them eventually working are quite high.

(nods) Correction noted.

It's not obvious to me that creating super smart people would have a net positive effect because motivating them to decrease AI risk is itself an alignment problem. What if they instead decide to accelerate AI progress or do nothing at all?

My own feeling is that the chance of success of building FAI, assuming the current human intelligence distribution, is low (even given unlimited financial resources), while the risk of unintentionally building or contributing to UFAI is high. I think I can explicate part of my intuition this way: there must be a minimum level of intelligence below which the chance of successfully building an FAI is negligible. We humans seem at best just barely smart enough to build a superintelligent UFAI. Wouldn't it be surprising if the intelligence thresholds for building UFAI and FAI turned out to be the same?

What will construct advanced intelligent machines is slightly less advanced intelligent machines, in a symbiotic relationship with humans. It doesn't much matter if those humans are genetically identical to the ones who barely managed to make flint axe heads, since they are not working on this task alone.