I agree with Allen and Wallach here. We don't know what an AGI is going to look like. Maybe the idea of a utility maximizer is infeasible, and the AGIs we are capable of building end up operating in a fundamentally different way (more like a human brain, perhaps). Maybe morality compatible with our own desires can only exist in a fuzzy form at a very high level of abstraction, effectively precluding mathematically precise statements about the system's behavior (as with a human brain).
These possibilities don't seem trivial to me, and either would undermine results from friendliness theory. Why not instead develop a sub-superintelligent AI first (perhaps one intentionally less intelligent than a human), so that we can observe directly what the system looks like before we attempt to redesign it for greater safety?
The problem there is twofold. Firstly, a lot of aspects would not necessarily scale up to a smarter system, and it's sometimes hard to tell what generalizes and what doesn't. Secondly, it's very, very hard to pinpoint the "intelligence" of a program without running it; if we make one too smart, it may be smart (and nasty) enough to feed us misleading data so that our final AI will not share moral values with humans. It's what I'd do if some aliens tried to dissect my mind in order to force their morality on humanity.
Firstly, a lot of aspects would not necessarily scale up to a smarter system, and it's sometimes hard to tell what generalizes and what doesn't.
I agree, but trying to solve the problem without any hands-on knowledge is certainly more difficult.
Secondly, it's very, very hard to pinpoint the "intelligence" of a program without running it
I agree: there is a risk that the first AGI we build will be intelligent enough to skillfully manipulate us, though I think the chances are quite small. I find it difficult to imagine skipping dog-level and human-level intelligence entirely and jumping straight to superhuman intelligence, but it is certainly possible.
It must be great working full-time on this stuff. I had the book on preorder and haven't had time to crack it open yet.
Colin Allen and Wendell Wallach, who wrote Moral Machines (MM) for OUP in 2009, address the problem of Friendly AI in their recent chapter for Robot Ethics (MIT Press). Their chapter is a précis of MM and a response to objections, one of which is:
Their brief response to this objection is:
Meh. Not much to this. I suppose "The Singularity and Machine Ethics" is another plank in the bridge between the two communities.
The most interesting chapter in the book is, imo, Anthony Beavers' "Moral Machines and the Threat of Ethical Nihilism."