For a very long time I assumed the first strong AI would be neutral (and I thought that hoping for a friendly AI first was both unrealistic and unnecessary). Now I'm unsure. Of course I'm pretty ignorant so you should take what I say with a grain of salt.
The most obvious objection is that almost all optimism about the power of an AGI comes from its potential ability to understand the world well enough to construct a better AI and then do so — an ability that the notion of safety automatically rules out.
Moreover, as far as I can tell the most difficult problem facing society now is managing to build an AI which is smart for some reason you can understand (a prerequisite to being either safe or friendly) before we accidentally build an AI which is smart for some reason we can't understand (which is therefore likely to be unfriendly if you believe the SIAI).
Entirely agreed, but to nitpick, an AI that's smart for some reason you understand is no more likely to be Friendly if you don't try to make it Friendly — it just allows you to try with a decent hope of success.
Unfriendly AI has goal conflicts with us. Friendly AI (roughly speaking) shares our goals. How about an AI with no goals at all?
I'll call this "neutral AI". Cyc is a neutral AI. It has no goals, no motives, no desires; it is inert unless someone asks it a question. It then has a set of routines it uses to try to answer the question. It executes these routines, and terminates, whether the question was answered or not. You could say that it had the temporary goal of answering the question. We then have two important questions: Can a neutral AI be intelligent? And would a neutral AI be safe?
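The Cyc-like architecture described above can be sketched in a few lines. This is purely illustrative — the knowledge base, the triple format, and the routine names are all invented, not Cyc's actual design:

```python
# A minimal sketch of a "neutral AI": it holds a knowledge base, is inert
# until queried, runs its fixed answer routines, and terminates whether or
# not it found an answer. No standing goals, no state kept between questions.

KNOWLEDGE = {
    ("aspirin", "treats", "headache"),
    ("aspirin", "is_a", "nsaid"),
}

def lookup_fact(question):
    """One programmer-supplied routine: direct knowledge-base lookup."""
    return question if question in KNOWLEDGE else None

def answer(question):
    """Try each routine in order; return the first answer, or give up."""
    routines = [lookup_fact]          # a fixed, pre-built set of routines
    for routine in routines:
        result = routine(question)
        if result is not None:
            return result
    return None                       # terminate even when unanswered

print(answer(("aspirin", "treats", "headache")))  # found
print(answer(("aspirin", "causes", "headache")))  # None: it gives up and halts
```

The point of the sketch is the control flow: nothing runs except in response to a question, and failure ends in halting rather than in acquiring new subgoals.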
Many people have answered the first question "no". This would probably include Hubert Dreyfus (based on a Heideggerian analysis of semantics, which was actually very good but I would say misguided in its conclusions, because Dreyfus mistook "what AI researchers do today" for "what is possible using a computer"), Phil Agre, Rodney Brooks, and anyone who describes their work as "reactive", "behavior-based", or "embodied cognition". We could also point to the analogous linguistic divide. There are two general approaches to natural language understanding. One descends from generative grammars and symbolic AI, is embodied by James Allen's book Natural Language Understanding, and belongs to the "program in the knowledge" camp that would answer the first question "yes". The other has more kinship with construction grammars and machine learning, is embodied by Manning & Schütze's Foundations of Statistical Natural Language Processing, and its practitioners would be more likely to answer the first question "no". (Eugene Charniak is noteworthy for having been prominent in both camps.)
The second question, I think, hinges on two sub-questions: Can we keep a neutral AI from using more resources than we intended? And if it does make very big mistakes, are those mistakes recoverable?
The Jack Williamson story "With Folded Hands" (1947) tells how humanity was enslaved by robots that were ordered to protect humans and became... overprotective. Or suppose a physicist asked an AI, "Does the Higgs boson exist?" You don't want it to use the Earth to build a supercollider. These are cases of using more resources than intended to carry out an order.
You may be able to build a Cyc-like question-answering architecture that would have no risk of doing any such thing. It may be as simple as placing resource limitations on every question. The danger is that if the AI is given a very thorough knowledge base that includes, for instance, an understanding of human economics and motivations, it may syntactically construct a plan to find the answer to a question that is technically within the resource limitations imposed — for instance, by manipulating humans in ways that don't tweak its cost function. This could lead to very big mistakes; but it isn't the kind of mistake that builds on itself, like a FOOM scenario. The question is whether any of these very big mistakes would be irreversible. My intuition is that there would be a power-law distribution of mistake sizes, with a small number of irreversible mistakes. We might then figure out a reasonable way of determining our risk level.
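The resource-limitation idea can be made concrete. Here is a hedged sketch, with an invented inference setup, where every question gets a hard budget of inference steps and the search halts when the budget runs out, answer or no answer:

```python
# Hypothetical sketch of per-question resource limits: breadth-first
# inference with a hard cap on the number of facts expanded.

def answer_within_budget(start, goal, rules, budget=1000):
    """Search from `start` toward `goal`, expanding at most `budget` facts.
    `rules(fact)` returns the facts derivable from `fact` in one step."""
    frontier, seen, steps = [start], {start}, 0
    while frontier and steps < budget:
        fact = frontier.pop(0)
        steps += 1
        if fact == goal:
            return True, steps
        for derived in rules(fact):
            if derived not in seen:
                seen.add(derived)
                frontier.append(derived)
    return False, steps   # budget exhausted (or no facts left): stop, don't escalate

chain = lambda n: [n + 1] if n < 10 else []       # toy rule: n implies n+1, up to 10
print(answer_within_budget(0, 5, chain))              # (True, 6)
print(answer_within_budget(0, 50, chain, budget=10))  # (False, 10): gave up
```

Note the gap described above, though: the cap only bounds what the cost function counts. A plan whose expensive steps happen outside the metered search — manipulating humans, say — slips under the same budget.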
If the answer to the second sub-question is "yes", then we probably don't need to fear a FOOM from neutral AI.
The short answer is, Yes, there are "neutral AI architectures" that don't currently have the risk either of harvesting too many resources, or of attempting to increase their own intelligence. Many existing AI architectures are examples. (I'm thinking specifically of "hierarchical task-network planning", which I don't consider true planning; it only allows the piecing together of plan components that were pre-built by the programmer.) But they can't do much. There's a power / safety tradeoff. The question is how much power you can get in the "completely safe" region, and where the sweet spots are in that tradeoff outside the completely safe region.
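The "piecing together pre-built components" point is easy to illustrate. This toy HTN-style decomposer (all task names invented) can only expand a task into decompositions the programmer wrote down, so it cannot invent novel actions — which is exactly why it is safe, and exactly why it can't do much:

```python
# A toy hierarchical task-network (HTN) style planner: tasks expand only
# into programmer-supplied decompositions, bottoming out in primitives.

METHODS = {
    "make_tea": [["boil_water", "steep_tea"]],
    "boil_water": [["fill_kettle", "heat_kettle"]],
}  # task -> list of possible decompositions (one each, for brevity)

PRIMITIVES = {"fill_kettle", "heat_kettle", "steep_tea"}

def decompose(task):
    """Expand a task into a flat sequence of primitive actions."""
    if task in PRIMITIVES:
        return [task]
    plan = []
    for subtask in METHODS[task][0]:   # take the first method, for brevity
        plan.extend(decompose(subtask))
    return plan

print(decompose("make_tea"))
# ['fill_kettle', 'heat_kettle', 'steep_tea']
```

Every plan this produces is a rearrangement of human-authored pieces, which is the "completely safe" end of the power/safety tradeoff.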
If you could build an AI that did nothing but parse published articles to answer the question, "Has anyone said X?", that would be very useful, and very safe. I worked on such a program (SemRep) at NIH. It works pretty well within the domain of medical journal articles. If it could take one step more, and ask, "Can you find a set of one to four statements that, taken together, imply X?", that would be a huge advance in capability, with little if any additional risk. (I added that capability to SemRep, but no one has ever used it, and it isn't accessible through the web interface.)
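The two capabilities — "has anyone said X?" and "find a few statements that together imply X" — can be sketched over extracted subject-relation-object triples. SemRep is a real NLM tool, but the triples, relation names, and the crude chaining rule below are invented for illustration; SemRep's actual predicates and inference are different:

```python
# Invented literature triples, standing in for predications extracted
# from journal articles.
TRIPLES = {
    ("fish_oil", "reduces", "blood_viscosity"),
    ("blood_viscosity", "aggravates", "raynauds"),
}

def has_anyone_said(triple):
    """Direct lookup: has this exact statement appeared in the literature?"""
    return triple in TRIPLES

def implied_by_chain(subj, obj, max_links=4):
    """Find at most `max_links` statements chaining `subj` to `obj`.
    Crude illustrative rule: any connected chain counts as implication."""
    def search(node, path):
        if node == obj and path:
            return path
        if len(path) == max_links:
            return None
        for (s, r, o) in TRIPLES:
            if s == node and (s, r, o) not in path:
                found = search(o, path + [(s, r, o)])
                if found:
                    return found
        return None
    return search(subj, [])

print(has_anyone_said(("fish_oil", "helps", "raynauds")))  # False: never stated directly
print(implied_by_chain("fish_oil", "raynauds"))            # a two-statement chain
```

The capability jump is visible even in the toy: no single article connects fish oil to Raynaud's, but two statements taken together do — which is the Swanson-style literature-based discovery that the one-step-more question enables.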