Ben Goertzel and Joel Pitt: Nine Ways to Bias Open-Source AGI Toward Friendliness. Journal of Evolution and Technology, Vol. 22, Issue 1, February 2012, pp. 116-141.

Abstract

While it seems unlikely that any method of guaranteeing human-friendliness (“Friendliness”) on the part of advanced Artificial General Intelligence (AGI) systems will be possible, this doesn’t mean the only alternatives are throttling AGI development to safeguard humanity, or plunging recklessly into the complete unknown. Without denying the presence of a certain irreducible uncertainty in such matters, it is still sensible to explore ways of biasing the odds in a favorable way, such that newly created AI systems are significantly more likely than not to be Friendly. Several potential methods of effecting such biasing are explored here, with a particular but non-exclusive focus on those that are relevant to open-source AGI projects, and with illustrative examples drawn from the OpenCog open-source AGI project. Issues regarding the relative safety of open versus closed approaches to AGI are discussed and then nine techniques for biasing AGIs in favor of Friendliness are presented:

1. Engineer the capability to acquire integrated ethical knowledge.

2. Provide rich ethical interaction and instruction, respecting developmental stages.

3. Develop stable, hierarchical goal systems.

4. Ensure that the early stages of recursive self-improvement occur relatively slowly and with rich human involvement.

5. Tightly link AGI with the Global Brain.

6. Foster deep, consensus-building interactions between divergent viewpoints.

7. Create a mutually supportive community of AGIs.

8. Encourage measured co-advancement of AGI software and AGI ethics theory.

9. Develop advanced AGI sooner not later.

In conclusion, and related to the final point, we advise the serious co-evolution of functional AGI systems and AGI-related ethical theory as soon as possible, before we have so much technical infrastructure that parties relatively unconcerned with ethics are able to rush ahead with brute force approaches to AGI development.

I'd say it's worth a read - they make a pretty convincing case against the possibility of regulating AGI (section 3). I don't think their approach will work if there's a hard takeoff or a serious hardware overhang, though it could work if there isn't. It might also work if a hard takeoff were possible but did not happen immediately after the first AGI systems are developed.


Here's their regulation criticism:

3 - The (unlikely) prospect of government controls on AGI development

Given the obvious long-term risks associated with AGI development, is it feasible that governments might enact legislation intended to stop AI from being developed? Surely government regulatory bodies would slow down the progress of AGI development in order to enable measured development of accompanying ethical tools, practices, and understandings? This however seems unlikely, for the following reasons.

Let us consider two cases separately. First, there is the case of banning AGI research and development after an “AGI Sputnik” moment has occurred. We define an AGI Sputnik moment as a technological achievement that makes the short- to medium-term possibility of highly functional and useful human-level AGI broadly evident to the public and policy makers, bringing it out of the realm of science fiction to reality. Second, we might choose to ban it before such a moment has happened.

After an AGI Sputnik moment, even if some nations chose to ban AI technology due to the perceived risks, others would probably proceed eagerly with AGI development because of the wide-ranging perceived benefits. International agreements are difficult to reach and enforce, even for extremely obvious threats like nuclear weapons and pollution, so it’s hard to envision that such agreements would come rapidly in the case of AGI. In a scenario where some nations ban AGI while others do not, it seems the slow speed of international negotiations would contrast with the rapid speed of development of a technology in the midst of revolutionary breakthrough. While worried politicians sought to negotiate agreements, AGI development would continue, and nations would gain increasing competitive advantage from their differential participation in it.

The only way it seems feasible for such an international ban to come into play, would be if the AGI Sputnik moment turned out to be largely illusory because the path from the moment to full human-level AGI turned out to be susceptible to severe technical bottlenecks. If AGI development somehow slowed after the AGI Sputnik moment, then there might be time for the international community to set up a system of international treaties similar to what we now have to control nuclear weapons research. However, we note that the nuclear weapons research ban is not entirely successful – and that nuclear weapons development and testing tend to have large physical impacts that are remotely observable by foreign nations. On the other hand, if a nation decides not to cooperate with an international AGI ban, this would be much more difficult for competing nations to discover.

An unsuccessful attempt to ban AGI research and development could end up being far riskier than no ban. An international R&D ban that was systematically violated in the manner of current international nuclear weapons bans would shift AGI development from cooperating developed nations to “rogue nations,” thus slowing down AGI development somewhat, but also perhaps decreasing the odds of the first AGI being developed in a manner that is concerned with ethics and Friendly AI.

Thus, subsequent to an AGI Sputnik moment, the overall value of AGI will be too obvious for AGI to be effectively banned, and monitoring AGI development would be next to impossible.

The second option is an AGI R&D ban earlier than the AGI Sputnik moment – before it’s too late. This also seems infeasible, for the following reasons:

  • Early stage AGI technology will supply humanity with dramatic economic and quality of life improvements, as narrow AI does now. Distinguishing narrow AI from AGI from a government policy perspective would also be prohibitively difficult.
  • If one nation chose to enforce such a slowdown as a matter of policy, the odds seem very high that other nations would explicitly seek to accelerate their own progress on AI/AGI, so as to reap the ensuing differential economic benefits.

To make the point more directly, the prospect of any modern government seeking to put a damper on current real-world narrow-AI technology seems remote and absurd. It’s hard to imagine the US government forcing a roll-back from modern search engines like Google and Bing to more simplistic search engines like 1997 AltaVista, on the basis that the former embody natural language processing technology that represents a step along the path to powerful AGI.

Wall Street firms (that currently have powerful economic influence on the US government) will not wish to give up their AI-based trading systems, at least not while their counterparts in other countries are using such systems to compete with them on the international currency futures market. Assuming the government did somehow ban AI-based trading systems, how would this be enforced? Would a programmer at a hedge fund be stopped from inserting some more-effective machine learning code in place of the government-sanctioned linear regression code? The US military will not give up their AI-based planning and scheduling systems, as otherwise they would be unable to utilize their military resources effectively. The idea of the government placing an IQ limit on the AI characters in video games, out of fear that these characters might one day become too smart, also seems absurd. Even if the government did so, hackers worldwide would still be drawn to release “mods” for their own smart AIs inserted illicitly into games; and one might see a subculture of pirate games with illegally smart AI.

“Okay, but all these examples are narrow AI, not AGI!” you may argue. “Banning AI that occurs embedded inside practical products is one thing; banning autonomous AGI systems with their own motivations and self-autonomy and the ability to take over the world and kill all humans is quite another!” Note though that the professional AI community does not yet draw a clear border between narrow AI and AGI. While we do believe there is a clear qualitative conceptual distinction, we would find it hard to embody this distinction in a rigorous test for distinguishing narrow AI systems from “proto-AGI systems” representing dramatic partial progress toward human-level AGI. At precisely what level of intelligence would you propose to ban a conversational natural language search interface, an automated call center chatbot, or a house-cleaning robot? How would you distinguish rigorously, across all areas of application, a competent non-threatening narrow-AI system from something with sufficient general intelligence to count as part of the path to dangerous AGI?

A recent workshop of a dozen AGI experts, oriented largely toward originating such tests, failed to come to any definitive conclusions (Adams et al. 2010), recommending instead that a looser mode of evaluation be adopted, involving qualitative synthesis of multiple rigorous evaluations obtained in multiple distinct scenarios. A previous workshop with a similar theme, funded by the US Naval Research Office, came to even less distinct conclusions (Laird et al. 2009). The OpenCog system is explicitly focused on AGI rather than narrow AI, but its various learning modules are also applicable as narrow AI systems, and some of them have largely been developed in this context. In short, there’s no rule for distinguishing narrow AI work from proto-AGI work that is sufficiently clear to be enshrined in government policy, and the banning of narrow AI work seems infeasible as the latter is economically and humanistically valuable, tightly interwoven with nearly all aspects of the economy, and nearly always non-threatening in nature. Even in the military context, the biggest use of AI is in relatively harmless-sounding contexts such as back-end logistics systems, not in frightening applications like killer robots.

Surveying history, one struggles to find good examples of advanced, developed economies slowing down development of any technology with a nebulous definition, obvious wide-ranging short to medium term economic benefits, and rich penetration into multiple industry sectors, due to reasons of speculative perceived long-term risks. Nuclear power research is an example where government policy has slowed things down, but here the perceived economic benefit is relatively modest, the technology is restricted to one sector, the definition of what’s being banned is very clear, and the risks are immediate rather than speculative. More worryingly, nuclear weapons research and development continued unabated for years, despite the clear threat it posed.

In summary, we submit that, due to various aspects of the particular nature of AGI and its relation to other technologies and social institutions, it is very unlikely to be explicitly banned, either before or after an AGI Sputnik moment. If one believes the creation of AGI to be technically feasible, then the more pragmatically interesting topic becomes how to most effectively manage and guide its development.

I like their coining of "AGI Sputnik moment."

Surveying history, one struggles to find good examples of advanced, developed economies slowing down development of any technology with a nebulous definition, obvious wide-ranging short to medium term economic benefits, and rich penetration into multiple industry sectors, due to reasons of speculative perceived long-term risks.

Government restrictions on cryptography are surely the nearest example within IT.

The government also restricts basic "intellectual development" activities, such as "copying stuff" and "inventing stuff".

Neither of those examples presents a resounding success story. Restrictions on cryptography proved infeasible, and were abandoned; copyright prohibits duplication of others' work only when it's for profit, and is only sporadically effective even at that.

There's also the Monopolies and Mergers Commission. The government doesn't foster the development of big and powerful agents that might someday compete with it.

There are seeds of some good ideas in Ben's paper, like having the goal of the AGI system maintained in a distributed peer-to-peer system like BitTorrent or Bitcoin, preventing it from getting too corrupted. That partially addresses one of my concerns with Friendliness: a cosmic ray coming in, flipping one important register, and the AI turning Evil instead of Good. (Laugh all you want, I was genuinely worried about this once upon a time.)

My idea follows (warning: some familiarity with the Bitcoin protocol may be needed). An open morality project to provide the goal for a future AGI could begin with a community effort to understand goodness, praise goodness, and reward goodness. The community's reputation points/karma could be maintained in a Bitcoin-like distributed ledger, practically unhackable once the community has run unmolested for two years or so.

Modifying the highest goal ("Be friendly") would be impossible. Modifying the weights of lower-level subgoals would require karma, and more karma the closer the subgoal sits to the highest goal. Changing weights drastically or adding new subgoals would require still more karma, and there could be allowances both for goals/behaviors strongly supported or opposed by a few and for goals/behaviors supported by vast numbers.
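To make the karma-gating concrete, here is a minimal sketch in Python. It is purely my own illustration, not anything from the paper, from OpenCog, or from a real Bitcoin-style ledger; the Goal class, the karma_required and propose_weight_change functions, and the scaling constants are all hypothetical, chosen only to show how the cost of a change could grow with its size and with the subgoal's proximity to the immutable top goal.

```python
# Illustrative sketch only: a karma-gated goal hierarchy where the top goal is
# immutable and changes to subgoal weights cost karma proportional to the size
# of the change and the subgoal's closeness to the top. All names and constants
# are hypothetical.

from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    level: int                      # 0 = immutable top goal; higher = more specific subgoal
    weight: float = 1.0
    subgoals: list = field(default_factory=list)


def karma_required(goal: Goal, weight_delta: float, max_level: int) -> float:
    """Karma cost of a proposed weight change (placeholder scaling)."""
    if goal.level == 0:
        return float("inf")          # the top goal can never be modified
    proximity = max_level - goal.level + 1
    return 100.0 * proximity * abs(weight_delta)


def propose_weight_change(goal: Goal, weight_delta: float,
                          proposer_karma: float, max_level: int) -> bool:
    """Apply the change only if the proposer holds enough karma; otherwise reject."""
    cost = karma_required(goal, weight_delta, max_level)
    if proposer_karma < cost:
        return False
    goal.weight += weight_delta
    return True


# Example: a two-level hierarchy under an immutable top goal.
top = Goal("Be friendly", level=0)
honesty = Goal("Be honest", level=1, weight=0.8)
small_talk = Goal("Make pleasant small talk", level=2, weight=0.3)
top.subgoals = [honesty]
honesty.subgoals = [small_talk]

print(propose_weight_change(small_talk, +0.1, proposer_karma=15, max_level=2))  # cost 10 -> allowed
print(propose_weight_change(honesty, +0.1, proposer_karma=15, max_level=2))     # cost 20 -> rejected
print(propose_weight_change(top, -1.0, proposer_karma=1e9, max_level=2))        # cost inf -> always rejected
```

In a real version the karma balances and the history of weight changes would live on the distributed ledger rather than in local variables, so no single party could fake the cost of a change or quietly rewrite the goal hierarchy.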

Instead of paying money to a foundation subject to future corruption, future philanthropists can just back the "coin" with extra money, strengthening the hand of all members in the community and further strengthening the goal.

In fact, existing communities could transfer their karma/reputation points onto the initial distribution and then begin from there, with future karma coming only from the peer-to-peer network. The coin distribution could be limited or slowly growing, depending on the best estimates of the coders.

Section 9 seems a little shaky, and it coincides with Ben's interests: he'd benefit if this recommendation were followed.