One "anti-foom" factor is the observation that in the early stages we can make progress partly by cribbing from nature - and simply copying it. After roughly "human level" is reached, that short-cut is no longer available - so progress may require more work after that.
Eliezer seems quite worried about the possibility that someone will develop a FOOMing unFriendly AI before Friendly AI can get off the ground, but is anything being done about this besides just rushing to finish Friendly AI?
Carl Shulman has written about that topic. See:
Shulman, Carl M., "Arms Control and Intelligence Explosions." Proceedings of the European Conference on Computing and Philosophy. Universitat Autònoma de Barcelona, Barcelona, Spain. 4 July 2009.
An AI that "valued" keeping the world looking roughly the way it does now, that was specifically instructed never to seize control of more than X number of each of several thousand different kinds of resources, and whose principal intended activity was to search for, hunt down, and destroy AIs that seemed to be growing too powerful too quickly might be an acceptable risk.
This would not be acceptable to me, since I hope to be one of those AIs.
The morals of FAI theory don't mesh well at all with the morals of transhumanism. This is surprising, ...
Is that Phil Goetz's CEV vs. all humans' CEV, or Phil Goetz's current preferences or behaviour-function vs. the average of all humans' current preferences or behaviour-functions? In the former scenario I'd prefer the global CEV (if I were confident that it would work as stated). In the latter, even without remembering much about Phil and his views beyond that he appears to be an intelligent, educated Westerner who can be expected to be fairly reliably careful about potentially world-changing actions, I'd probably feel safer with him as world dictator than with a worldwide direct democracy that automatically polled everyone in the world on what to do, considering the kinds of humans who currently make up a large majority of the population.
Perhaps we need some kind of mini-FOOMing marginally Friendly AI whose only goal is to ensure that nothing seizes control of the world's computing resources until SIAI can figure out how to get CEV to work.
You mean: rather like the Monopolies and Mergers commission?
We already have organisations like that. One question is whether they will be enough. So far, they have hampered - but evidently not yet completely crippled - the Microsoft machine intelligence - and now show signs of switching their attention to Google:
http://techcrunch.com/2010/02/23/eu-antitrust-google-microsoft/
I don't think the lack of an earth-shattering ka-FOOM changes much of the logic of FAI. Smart enough to take over the world is enough to make human existence way better, or end it entirely.
It's quite tricky to ensure that your superintelligent AI does anything like what you wanted it to. I don't share the intuition that creating a "homeostasis" AI is any easier than an FAI. I think one move Eliezer is making in his "Creating Friendly AI" strategy is to minimize the goals you're trying to give the machine: just CEV.
I think this makes...
The shield AI bothers me. One would need to be very careful to specify how a shield AI would function so that it would (a) not try to halt human research and development generally in its attempt to prevent the development of AIs, and (b) allow humans to turn it off when we think we've got Friendly AIs, even as it doesn't let FOOMing AIs turn it off. Specifying these issues might be quite difficult. Edit: Sorry, (b) is only an issue for a slightly different form of shield AI, since you seem to want a shield AI which is actually not given a specific method of being turned o...
I agree that there is a serious problem about what to do in possible futures where we have an AI that’s smart enough to be dangerous, but not powerful enough to implement something like CEV. Unfortunately I don’t think this analysis offers any help.
Of your list of ways to avoid a FOOM, option 1 isn’t really relevant (since if we’re all dead we don’t have to worry about how to program an AI). Option 2 is already ruled out with very high probability, because you don't need any exotic physics to reach mind-boggling levels of hardware performance. For instance...
This is a brief summary of what I consider the software based non-fooming scenario.
Terminology
Impact - How much the agent can make the world how it wants (hit a small target in search space, etc.)
Knowledge - Correct programming.
General Outlook/philosophy
Do not assume that an agent knows everything; assume that it has to start off with no program to run. Try to figure out where and how it gets the information for that program, and whether it can be misled by its sources of information.
General Suppositions
High impact requires that someone (either the
I do not understand encryption well, and so it is possible that some plausible level of investment in computer security could, contrary to my assumptions, actually manage to protect human control over individual computers for the foreseeable future.
Encryption that an AI couldn't break is the easy part. Just don't do something dumb like with WEP. The hard part is not the encryption but all the stuff that is supposed to keep things safely hidden behind the encryption. Think "airtight blastproof shield held in place with pop rivets, duct tape, and some guy from marketing plugging one of the holes with his finger."
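To put a rough number on "couldn't break", here is a back-of-the-envelope sketch in Python; the attacker's guessing speed is an arbitrary, deliberately generous assumption, not an estimate of anything real:

```python
# Back-of-the-envelope: exhaustive key search against a 128-bit cipher.
# The guesses-per-second figure is an arbitrary, generous assumption.
KEYSPACE = 2 ** 128          # number of keys in a 128-bit cipher (e.g. AES-128)
GUESSES_PER_SECOND = 1e25    # hypothetical attacker speed (assumption)
SECONDS_PER_YEAR = 3.15e7

years = KEYSPACE / GUESSES_PER_SECOND / SECONDS_PER_YEAR
print(f"Exhaustive search of the keyspace: ~{years:.0e} years")  # ~1e+06 years
```

Even at that absurd speed the search takes on the order of a million years, which is why the weak points are the systems around the cipher rather than the cipher itself.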
Although the question of the effect of non-FOOMing AI is interesting, this particular article is full of sloppy thinking and cargo cult estimates. (Right now I'm not in the mood to break it down, but I'd guess many of the regulars are qualified to do that.)
Option 1 seems unlikely, and options 2 and 3 seem silly to me. An associated problem of unknown scale is the wirehead problem. Some think that this won't be a problem - but we don't really know that yet. It probably would not slow down machine intelligence very much until way past human level - but we don't yet know for sure what its effects will be.
Thank you for saying that non-FOOM has nonzero probability and should be considered!
Another case I'd like to see considered more is: "if we can't/shouldn't control the AIs, what can we do to still have influence over them?"
Close, but the tricky part is that the universe can expand at greater than the speed of light. Nothing that can carry causal influence (photons, for example) can travel faster than c, but the fabric of spacetime itself can expand faster than that. Looking at the (models of) the first 10^-30 seconds highlights this to an extreme degree. Even now, some of the galaxies visible to us are receding from us at more than a light year per year. That means the light they are currently emitting (if any) will never reach us.
To launch an AI out of our future light cone you must send it past a point that the expansion of the universe is carrying away from us at c. At that moment it will be on the edge of our future light cone, and beyond it the AI can never touch us.
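For a rough sense of the distances involved, here is an order-of-magnitude sketch using Hubble's law (v = H0 * d) with an approximate value for H0; it ignores how the expansion rate changes over time:

```python
# Hubble's law: recession velocity v = H0 * d, so points beyond d = c / H0
# (the Hubble radius) recede from us faster than light.
H0_KM_S_PER_MPC = 70.0        # Hubble constant, km/s per megaparsec (approximate)
C_KM_S = 299_792.458          # speed of light, km/s
LY_PER_MPC = 3.262e6          # light-years per megaparsec

hubble_radius_mpc = C_KM_S / H0_KM_S_PER_MPC
hubble_radius_gly = hubble_radius_mpc * LY_PER_MPC / 1e9
print(f"Hubble radius: ~{hubble_radius_gly:.0f} billion light-years")  # ~14
```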
The thing with hardware limits is that you can keep on building more computers, even if they're slow, and create massive parallel processors. We know that they can at the very least match the human brain. Not to mention that they can optimize the process for building them, possibly with many different designs, including biologically inspired ones that can, with a large enough energy source, consume organic material (FYI, the earth is made of organic material -- rocks) and grow at exponential rates. As the system grows exponentially larger it would ...
whose principal intended activity was to search for, hunt down, and destroy AIs that seemed to be growing too powerful too quickly might be an acceptable risk.
This AI would prevent anyone, including SIAI, from developing any sort of an AI.
An AI that "valued" keeping the world looking roughly the way it does now, that was specifically instructed never to seize control of more than X number of each of several thousand different kinds of resources,
Define 'seize control.' Wouldn't such an AI be motivated to understate its effective resources, or create other nominally independent AIs with identical objectives and less restraint, or otherwise circumvent that factor?
Intro
This article seeks to explore possible futures in a world where artificial intelligence turns out NOT to be able to quickly, recursively self-improve so as to influence our world with arbitrarily large strength and subtlety, i.e., "go FOOM." Note that I am not arguing that AI won't FOOM. Eliezer has made several good arguments for why AI probably will FOOM, and I don't necessarily disagree. I am simply calling attention to the non-zero probability that it won't FOOM, and then asking what we might do to prepare for a world in which it doesn't.
Failure Modes
I can imagine three different ways in which AI could fail to FOOM in the next 100 years or so. Option 1 is a "human fail." Option 1 means we destroy ourselves or succumb to some other existential risk before the first FOOM-capable AI boots up. I would love to hear in the comments section about (a) which existential risks people think are most likely to seriously threaten us before the advent of AI, and (b) what, if anything, a handful of people with moderate resources (i.e., people who hang around on Less Wrong) might do to effectively combat some of those risks.
Option 2 is a "hardware fail." Option 2 means that Moore's Law turns out to have an upper bound; if physics doesn't show enough complexity beneath the level of quarks, or if quantum-sized particles are so irredeemably random as to be intractable for computational purposes, then it might not be possible for even the most advanced intelligence to significantly improve on the basic hardware design of the supercomputers of, say, the year 2020. This would limit the computing power available per dollar, and so the level of computing power required for a self-improving AI might not be affordable for generations, if ever. Nick Bostrom has some interesting thoughts along these lines, ultimately guessing (as of 2008) that the odds of a super-intelligence forming by 2033 were less than 50%.
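As a toy illustration of Option 2 (every number here is a placeholder assumption, not an estimate): computing power per dollar doubling on a Moore's-law schedule until it runs into an assumed physical ceiling.

```python
# Toy model of Option 2: ops per dollar doubles every two years until it
# hits an assumed hard physical ceiling. All numbers are illustrative only.
OPS_PER_DOLLAR_START = 1e9    # assumed 2020 starting point
DOUBLING_TIME_YEARS = 2.0
HARD_CAP = 1e15               # assumed physical ceiling on ops per dollar

ops, year = OPS_PER_DOLLAR_START, 2020.0
while ops < HARD_CAP:
    ops = min(ops * 2, HARD_CAP)
    year += DOUBLING_TIME_YEARS
print(f"Under these assumptions the ceiling is hit around {year:.0f}")  # ~2060
```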
Option 3 is a "software fail." Option 3 means that *programming* efficiency turns out to have an upper bound; if there are natural information-theoretical limits on how efficiently a set number of operations can be used to perform an arbitrary task, then it might not be possible for even the most advanced intelligence to significantly improve on its basic software design; the supercomputer would be more than 'smart' enough to understand itself and to re-write itself, but there would simply not *be* an alternate script for the source code that was actually more effective.
These three options are not necessarily exhaustive; they are just the possibilities that have immediately occurred to me, with some help from User: JoshuaZ.
"Superintelligent Enough" AI
An important point to keep in mind is that even if self-improving AI faces hard limits before becoming arbitrarily powerful, AI might still be more than powerful enough to effortlessly dominate future society. I am sure my numbers are off by many orders of magnitude, but by way of illustration only, suppose that current supercomputers run at a speed of roughly 10^20 ops/second, and that successfully completing Eliezer's coherent extrapolated volition project would require a processing speed of roughly 10^36 ops/second. There is obviously quite a lot of space here for a miniature FOOM. If one of today's supercomputers starts to go FOOM and then hits hard limits at 10^25 ops/second, it wouldn't be able to identify humankind's CEV, but it might be able to, e.g., take over every electronic device capable of receiving transmissions, such as cars, satellites, and first-world factories. If this happens around the year 2020, a mini-FOOMed AI might also be able to take over homes, medical prosthetics, robotic soldiers, and credit cards.
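Keeping to the placeholder exponents above, a quick sketch of just how large that "quite a lot of space" is:

```python
# The post's illustrative exponents: how far a mini-FOOM gets before the
# assumed hard limit, and how far short of CEV-level computation it stops.
CURRENT_OPS = 1e20    # assumed current supercomputer speed, ops/second
HARD_LIMIT = 1e25     # assumed ceiling the mini-FOOM runs into
CEV_NEEDED = 1e36     # assumed requirement for computing humanity's CEV

print(f"Gain from the mini-FOOM: {HARD_LIMIT / CURRENT_OPS:.0e}x")   # 1e+05x
print(f"Shortfall relative to CEV: {CEV_NEEDED / HARD_LIMIT:.0e}x")  # 1e+11x
```

A hundred-thousand-fold jump in capability is world-changing even though it leaves the system eleven orders of magnitude short of the CEV threshold assumed here.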
Sufficient investments in security and encryption might keep such an AI out of some corners of our economy, but right now, major operating systems aren't even proof against casual human trolls, let alone a dedicated AI thinking at faster-than-human speeds. I do not understand encryption well, and so it is possible that some plausible level of investment in computer security could, contrary to my assumptions, actually manage to protect human control over individual computers for the foreseeable future. Even if key industrial resources were adequately secured, though, a moderately super-intelligent AI might be capable of modeling the politics of current human leaders well enough to manipulate them into steering Earth onto a path of its choosing, as in Isaac Asimov's "The Evitable Conflict."
If enough superintelligences develop at close enough to the same moment in time and have different enough values, they might in theory reach some sort of equilibrium that does not involve any one of them taking over the world. As Eliezer has argued (scroll down to 2nd half of the linked page), though, the stability of a race between intelligent agents should mostly be expected to *decrease* as those agents swallow their own intellectual and physical supply chains. If a supercomputer can take over larger and larger chunks of the Internet as it gets smarter and smarter, or if a supercomputer can effectively control what happens in more and more factories as it gets smarter and smarter, then there's less and less reason to think that supercomputing empires will "grow" at roughly the same pace -- the first empire to grow to a given size is likely to grow faster than its rivals until it takes over the world. Note that this could happen even if the AI is nowhere near smart enough to start mucking about with uploaded "ems" or nanoreplicators. Even in a boringly normal near-future scenario, a computer with even modest self-improvement and self-aggrandizement capabilities might be able to take over the world. Imagine something like the ending to David Brin's Earth, stripped of the mystical symbolism and the egalitarian optimism.
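A toy model of that instability (with arbitrary parameters, purely for illustration): if an agent's growth rate rises with the share of resources it already controls, even a small head start compounds into dominance.

```python
# Toy model of the unstable race: each agent's growth rate rises with the
# share of resources it already controls. Parameters are arbitrary.
def run_race(shares, steps=100, feedback=0.2):
    """Return resource shares after `steps` rounds of compounding growth."""
    for _ in range(steps):
        shares = [s * (1 + feedback * s) for s in shares]  # bigger grows faster
        total = sum(shares)
        shares = [s / total for s in shares]               # renormalise the pie
    return shares

print(run_race([0.51, 0.49]))  # the 2-point head start compounds into a rout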
Ensuring a "Nice Place to Live"
I don't know what Eliezer's timeline is for attempting to develop provably Friendly AI, but it might be worthwhile to attempt to develop a second-order stopgap. Eliezer's CEV is supposed to function as a first-order stopgap; it won't achieve all of our goals, but it will ensure that we all get to grow up in a Nice Place to Live while we figure out what those goals are. Of course, that only happens if someone develops a CEV-capable AI. Eliezer seems quite worried about the possibility that someone will develop a FOOMing unFriendly AI before Friendly AI can get off the ground, but is anything being done about this besides just rushing to finish Friendly AI?
Perhaps we need some kind of mini-FOOMing marginally Friendly AI whose only goal is to ensure that nothing seizes control of the world's computing resources until SIAI can figure out how to get CEV to work. Although no "utility function" can be specified for a general AI without risking paper-clip tiling, it might be possible to formulate a "homeostatic function" at relatively low risk. An AI that "valued" keeping the world looking roughly the way it does now, that was specifically instructed *never* to seize control of more than X number of each of several thousand different kinds of resources, and whose principal intended activity was to search for, hunt down, and destroy AIs that seemed to be growing too powerful too quickly might be an acceptable risk. Even if such a "shield AI" were not provably friendly, it might pose a smaller risk of tiling the solar system than the status quo, since the status quo is full of irresponsible people who like to tinker with seed AIs.
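A deliberately naive sketch of what such a "homeostatic function" with hard resource caps might look like; every name and number below is hypothetical, and nothing in it touches the real difficulty of specifying "the world looking roughly the way it does now":

```python
# Deliberately naive sketch of a "homeostatic function" with hard resource
# caps. Every name and number is hypothetical; the hard problem of specifying
# "the world looking roughly the way it does now" is not addressed here.
RESOURCE_CAPS = {"compute_nodes": 1_000, "factories": 10, "bandwidth_gbps": 500}

def homeostatic_score(world_state, baseline, resources_held):
    # Hard constraint: never hold more than the cap of any resource type.
    for name, cap in RESOURCE_CAPS.items():
        if resources_held.get(name, 0) > cap:
            return float("-inf")
    # Soft objective: penalise squared deviation from a baseline world snapshot.
    deviation = sum((world_state[k] - baseline[k]) ** 2 for k in baseline)
    return -deviation
```

The sketch only shows the shape of the idea; whether any such objective could be specified safely is exactly the open question raised in the comments above.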
An interesting side question is whether this would be counterproductive in a world where Failure Mode 2 (hard limits on hardware) or Failure Mode 3 (hard limits on software) were serious concerns. Assuming a provably Friendly AI can eventually be developed, then several years after that it's likely that millions of people could be convinced that activating it would be really good, and humans might be able to dedicate enough resources to specifically overcome the second-order stopgap "shield AI" that was knocking out other people's un-provably Friendly AIs. But if the shield AI worked too well and got too close to the hard upper bound on the power of an AI, then it might not be possible to unmake the shield, even with added resources and with no holds barred.