Friendly AI Research and Taskification

multifoliaterose

14th Dec 2010

Eliezer has written a great deal about the concept of Friendly AI, for example in a document from 2001 titled Creating Friendly AI 1.0. The new SIAI overview states that:

SIAI's primary approach to reducing AI risks has thus been to promote the development of AI with benevolent motivations which are reliably stable under self-improvement, what we call “Friendly AI” [22].

The SIAI Research Program lists under its Research Areas:

Mathematical Formalization of the "Friendly AI" Concept. Proving theorems about the ethics of AI systems, an important research goal, is predicated on the possession of an appropriate formalization of the notion of ethical behavior on the part of an AI. And, this formalization is a difficult research question unto itself.

Despite the enormous value that the construction of a Friendly AI would have, at present I'm not convinced that researching the Friendly AI concept is a cost-effective way of reducing existential risk. My main reason for doubt is that, as far as I can tell, the problem of building a Friendly AI has not been taskified to a sufficiently fine degree for it to be possible to make systematic progress toward obtaining a solution. I'm open-minded on this point and quite willing to change my position subject to incoming evidence.

The Need For Taskification

In The First Step is to Admit That You Have a Problem Alicorn wrote:

If you want a peanut butter sandwich, and you have the tools, ingredients, and knowhow that are required to make a peanut butter sandwich, you have a task on your hands. If you want a peanut butter sandwich, but you lack one or more of those items, you have a problem [...] treating problems like tasks will slow you down in solving them. You can't just become immortal any more than you can just make a peanut butter sandwich without any bread.

In Let them eat cake: Interpersonal Problems vs Tasks HughRistik wrote:

Similarly, many straight guys or queer women can't just find a girlfriend, and many straight women or queer men can't just find a boyfriend, any more than they can "just become immortal."

We know that the problems of making a peanut butter sandwich and of finding a romantic partner can (often) be taskified because many people have succeeded in solving them. It's less clear that a given problem that has never been solved can be taskified. Some problems are in principle unsolvable, whether because they are mathematically undecidable or because physical law provides an obstruction to their solution. Other currently unsolved problems have solutions in the abstract but lack solutions that are accessible to humans. That taskification is in principle possible is not a sufficient condition for solving a problem, but it is a necessary condition.

The Difficulty of Unsolved Problems

There's a long historical precedent of unsolved problems being solved. Humans have succeeded in building cars and skyscrapers, have succeeded in understanding the chemical composition of far away stars and of our own DNA, have determined the asymptotic distribution of the prime numbers and have given an algorithm to determine whether a given polynomial equation is solvable in radicals, have created nuclear bombs and have landed humans on the moon. All of these things seemed totally out of reach at one time.

Looking over the history of human achievement gives one a sense of optimism as to the feasibility of accomplishing a goal. And yet, there's a strong selection effect at play: successes are more interesting than failures and we correspondingly notice and remember successes more than failures. One need only page through a book like Richard Guy's Unsolved Problems in Number Theory to get a sense for how generic it is for a problem to be intractable. The question, inherited from the ancient Greeks, of whether there are infinitely many perfect numbers remains out of reach for the best mathematicians of today. The success of human research efforts has been as much a product of wisdom in choosing one's battles as it has been a product of ambition.
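To make the perfect number example concrete, here is a minimal Python sketch (the function names are my own, chosen for illustration). Checking whether a particular number is perfect is a task in the sense above: the steps are known and mechanical. Nothing in the sketch bears on whether the list of perfect numbers ever ends, which is the part that remains a problem.

```python
def proper_divisor_sum(n):
    """Return the sum of the divisors of n that are smaller than n."""
    if n < 2:
        return 0
    total = 1  # 1 divides every n >= 2
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            partner = n // d
            if partner != d:  # avoid counting a square root twice
                total += partner
        d += 1
    return total


def perfect_numbers_below(limit):
    """List every perfect number n with 2 <= n < limit by brute force."""
    return [n for n in range(2, limit) if proper_divisor_sum(n) == n]


print(perfect_numbers_below(10000))  # prints [6, 28, 496, 8128]
```

Raising the limit turns up more examples only slowly, and no amount of searching of this kind can settle whether infinitely many exist.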

The Case of Friendly AI

My present understanding is that there are potential avenues for researching AGI. Richard Hollerith was kind enough to briefly describe Monte Carlo AIXI to me last month and I could sort of see how it might be in principle possible to program a computer to do Bayesian induction according to an approximation to a universal prior and equip the computer with a decision-making apparatus based on its epistemological state at a given time. Some people have suggested to me that the amount of computer power and memory needed to implement human-level Monte Carlo AIXI is prohibitively large, but (in my current, very ill-informed state; by analogy with things that I've seen in computational complexity theory) I could imagine ingenious tricks yielding an approximation to Monte Carlo AIXI which uses much less computing power/memory and which is a sufficiently close approximation to serve as a substitute for practical purposes. This would point to a potential taskification of the problem of building an AGI. I could also imagine that there are presently no practically feasible AGI research programs; I know too little about the state of strong artificial intelligence research to have anything but a very unstable opinion on this matter.
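As a rough illustration of what "Bayesian induction according to an approximation to a universal prior" could mean, here is a toy Python sketch. It is not Monte Carlo AIXI and not a genuine universal prior: the hypothesis class (repeating bit patterns), the weighting of 4^-n for a pattern of length n, and all of the names are assumptions made up for this example. The decision-making part is omitted entirely.

```python
from fractions import Fraction
from itertools import product


def hypotheses(max_len):
    """Yield (pattern, prior_weight) for each repeating bit pattern.

    Assumption for illustration: a pattern of length n gets weight 4**-n,
    a crude stand-in for "simpler hypotheses are more probable a priori".
    """
    for n in range(1, max_len + 1):
        for pattern in product((0, 1), repeat=n):
            yield pattern, Fraction(1, 4 ** n)


def predict_next(observed, max_len=4):
    """Return the posterior probability that the next bit is 1.

    Each hypothesis is deterministic, so conditioning on the data just
    discards every pattern that contradicts an observed bit.
    """
    t = len(observed)
    total = Fraction(0)
    ones = Fraction(0)
    for pattern, weight in hypotheses(max_len):
        n = len(pattern)
        if all(observed[i] == pattern[i % n] for i in range(t)):
            total += weight
            if pattern[t % n] == 1:
                ones += weight
    if total == 0:
        return None  # no hypothesis in the class fits the data
    return ones / total


print(predict_next([1], max_len=2))  # prints 5/6
```

After observing 0, 1, 0, 1, 0, 1 with the default max_len of 4, every surviving pattern predicts 0 next, so predict_next returns 0. Even this toy version enumerates a number of hypotheses that grows exponentially in max_len, which gives some feeling for why the computing power needed for the real thing is a concern.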

As Eliezer has said, the problem of creating a Friendly AI is inherently more difficult than that of creating an AGI, and may be much more difficult. At present, the Friendliness aspect of a Friendly AI seems to me to strongly resist taskification. In his poetic Mirrors and Paintings Eliezer gives the most detailed description of what a Friendly AI should do that I've seen, but the gap between concept and implementation here seems so staggeringly huge that it doesn't suggest to me any fruitful lines of Friendly AI research. As far as I can tell, Eliezer's idea of a Friendly AI is at this point not significantly more fleshed out (relative to the magnitude of the task) than Freeman Dyson's idea of a Dyson sphere. In order to build a Friendly AI, beyond conceiving of what a Friendly AI should be in the abstract, one has to convert one's intuitive understanding of friendliness into computer code in a formal programming language.

I don't even see how one would start to research the problem of getting a hypothetical AGI to recognize humans as distinguished beings. Solving this problem would seem to require as a prerequisite an understanding of the makeup of the hypothetical AGI, something which people don't seem to have a clear grasp of at the moment. Even if one does have a model for a hypothetical AGI, writing code conducive to it recognizing humans as distinguished beings seems like an intractable task. And even with a relatively clear understanding of how one would implement a hypothetical AGI with the ability to recognize humans as distinguished beings, one is still left with the problem of making such a hypothetical AGI Friendly toward such beings.

In view of all this, working toward stable whole-brain emulation of a trusted and highly intelligent person concerned about human well-being seems to me like a more promising strategy for reducing existential risk at the present time than researching Friendly AI. Quoting a comment by Carl Shulman:

Emulations could [...] enable the creation of a singleton capable of globally balancing AI development speeds and dangers. That singleton could then take billions of subjective years to work on designing safe and beneficial AI. If designing safe AI is much, much harder than building AI at all, or if knowledge of AI and safe AI are tightly coupled, such a singleton might be the most likely route to a good outcome.

There are various things that could go wrong with whole-brain emulation, and it would be good to have a better option, but Friendly AI research doesn't seem to me to be one, in light of an apparent total absence of even the outlines of a viable Friendly AI research program.

But I feel like I may have missed something here. I'd welcome any clarifications of what people who are interested in Friendly AI research mean by Friendly AI research. In particular, is there a conjectural taskification of the problem?

Comments

Will_Newsome

I'm not speaking for SIAI as this is more of a Visiting Fellows thing than an SIAI thing, but there are people working on Friendliness, and creating a Friendliness roadmap. We have lists of hundreds of problems, and lists of potentially relevant fields or concepts. Work is getting started on combining these lists into a real roadmap despite the uncertainty and difference of emphasis among researchers. Obviously we'd rather not release things for the public to see unless there were rather good reasons for doing so -- less output means less chance for screwing up public relations, which is important because SIAI Visiting Fellows output is easy to conflate with SIAI output in ways that might be misleading. I've started a blog where I'll put my own thoughts on something-like-Friendliness that I feel are not at all dangerous, and I might encourage other Friendliness researchers to do so as well. I'll link to my blog in a discussion post once I have a few more posts seeded. At some point you might see summaries of collaborative research somewhere. But until we have a better idea of who our audience is and what security precautions are sane, we'd like to work quietly. Again, I'm mostly speaking for myself, kind of speaking for a group of partially-SIAI-affiliated folk, and not at all for SIAI as an organization.

(There aren't that many people that can speak for SIAI, unfortunately. Like, two maybe. If you're an Oppenheimer (strong rationality and remarkable ability to get uber-nerds to work like a well-oiled machine), please consider applying for Visiting Fellowship. We're a bright group, but that has more to do with being bright than it has to do with being a group, and we'd like to change that.)

Kaj_Sotala

I'm not speaking for SIAI as this is more of a Visiting Fellows thing than an SIAI thing, but there are people working on Friendliness, and creating a Friendliness roadmap. We have lists of hundreds of problems, and lists of potentially relevant fields or concepts.

Meh. Now I'm a bit annoyed in that I did try to poke people into a direction where they'd do something like that when I was there as a Visiting Fellow, but mostly the reaction seemed to be "we should leave all thinking about Friendliness to Eliezer". But upon reflection, I realize th...

multifoliaterose

1. I'm encouraged by what you say here. The doubt as to the value of Friendliness research that I express above is doubt as to the value of researching Friendly AI without a taskification rather than doubt as to the value of researching what a taskification might look like.

2. If you haven't done so, I think that it would be worthwhile to ask the SIAI staff whether they might be comfortable with classifying (some of?) the output of the SIAI Visiting Fellows as part of SIAI's output. As I said in response to a comment by WrongBot, I've gathered that the SIAI Visiting Fellows program is a good thing, but there's been relatively little public documentation of what the SIAI Visiting Fellows have been doing. I would guess that a policy of such public documentation would improve SIAI's credibility.

3. While I didn't read your comment in the way that cousin_it did, I can see why he would do so. I've gotten a vague impression from talking to a number of people loosely or directly connected with SIAI that SIAI has been keeping their research secret on the grounds that releases to the public could be dangerous on account of speeding unfriendly AI research. In view of how primitive the study of AGI looks, the apparent infeasibility of SIAI unilaterally building the first AGI, and the fact that Friendliness research would not seem to significantly speed the creation of unfriendly AI, such a policy seems highly dubious to me. So I was happy to hear that you and your collaborators are planning on putting some of what you've been doing out in the open in roughly a few months.

4. Thanks for the link to your blog posts.

cousin_it

You have hundreds of subproblems that need to be solved? And you're making a special effort to keep them secret from people on LW? Just... wow. How dumb would you have to be? Excuse me while I beat my head against the wall for awhile.
