AI: A Modern Approach is by far the dominant textbook in the field. It is used in 1200 universities, and is the 25th most-cited publication in computer science. If you're going to learn AI, this is how you learn it.

Luckily, the concepts of AGI and Friendly AI get pretty good treatment in the 3rd edition, released in 2009.

The Singularity is mentioned in the first chapter on page 12. Both AGI and Friendly AI are also mentioned in the first chapter, on page 27:

[Many leaders in the field] believe AI should return to its roots of striving for, in Simon's words, "machines that think, that learn and that create." They call the effort human-level AI or HLAI: their first symposium was in 2004 (Minsky et al. 2004)...

A related idea is the subfield of Artificial General Intelligence or AGI (Goertzel and Pennachin, 2007), which held its first conference and organized the Journal of Artificial General Intelligence in 2008. AGI looks for a universal algorithm for learning and acting in any environment, and has its roots in the work of Ray Solomonoff (1965), one of the attendees of the original 1956 Dartmouth conference. Guaranteeing that what we create is really Friendly AI is also a concern (Yudkowsky, 2008; Omohundro, 2008), one we will return to in Chapter 26.

Chapter 26 is about the philosophy of AI, and section 26.3 is "The Ethics and Risks of Developing Artificial Intelligence." The risks it discusses are:

  1. People might lose their jobs to automation.
  2. People might have too much (or too little) leisure time.
  3. People might lose their sense of being unique.
  4. AI systems might be used toward undesirable ends.
  5. The use of AI systems might result in a loss of accountability.

Each of those gets one or two paragraphs. The final risk takes up 3.5 pages: (6) The Success of AI might mean the end of the human race. Here's a snippet:

The question is whether an AI system poses a bigger risk than traditional software. We will look at three sources of risk. First, the AI system's state estimation may be incorrect, causing it to do the wrong thing. For example... a missile defense system might erroneously detect an attack and launch a counterattack, leading to the death of billions...
Second, specifying the right utility function for an AI system to maximize is not so easy. For example, we might propose a utility function designed to minimize human suffering, expressed as an additive reward function over time as in Chapter 17. Given the way humans are, however, we'll always find a way to suffer even in paradise; so the optimal decision for the AI system is to terminate the human race as soon as possible - no humans, no suffering...
Third, the AI system's learning function may cause it to evolve into a system with unintended behavior. This scenario is the most serious, and is unique to AI systems, so we will cover it in more depth. I.J. Good wrote (1965),
Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an "intelligence explosion," and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control. The "intelligence explosion" has also been called the technological singularity by... Vernor Vinge...
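
As an aside, the "additive reward function over time" mentioned in that snippet is just Chapter 17's discounted sum of per-state rewards, and the failure mode Norvig and Russell describe is easy to sketch. Here is a toy Python illustration of my own (the state structure and numbers are invented, not from the book): a utility function that only counts human suffering is maximized by any trajectory that contains no humans at all.

    def utility(trajectory, gamma=0.99):
        """Additive, discounted reward over a sequence of states (Chapter 17 style)."""
        return sum((gamma ** t) * reward(state) for t, state in enumerate(trajectory))

    def reward(state):
        # Naive proposal: reward is minus the total human suffering in this state.
        return -sum(person["suffering"] for person in state["humans"])

    # A world where humans exist and, being humans, suffer a little at every step:
    humans_exist = [{"humans": [{"suffering": 1.0}]}] * 100

    # A world where the agent has removed all humans: no humans, no suffering.
    no_humans = [{"humans": []}] * 100

    print(utility(humans_exist))  # about -63.4
    print(utility(no_humans))     # 0.0 -- the "optimal" outcome under this utility

The point is not that anyone would write this exact function; it's that the optimum of an innocently worded objective can sit somewhere nobody intended.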

Then they mention Moravec, Kurzweil, and transhumanism, before returning to a more concerned tone about AI. They cover Asimov's three laws of robotics, and then:

Yudkowsky (2008) goes into more detail about how to design a Friendly AI. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design - to define a mechanism for evolving AI systems under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes. We can't just give a program a static utility function, because circumstances, and our desired responses to circumstances, change over time. For example, if technology had allowed us to design a super-powerful AI agent in 1800 and endow it with the prevailing morals of the time, it would be fighting today to reestablish slavery and abolish women's right to vote. On the other hand, if we build an AI agent today and tell it how to evolve its utility function, how can we assure that it won't read that "Humans think it is moral to kill annoying insects, in part because insect brains are so primitive. But human brains are primitive compared to my powers, so it must be moral for me to kill humans."
Omohundro (2008) hypothesizes that even an innocuous chess program could pose a risk to society. Similarly, Marvin Minsky once suggested that an AI program designed to solve the Riemann Hypothesis might end up taking over all the resources of Earth to build more powerful supercomputers to help achieve its goal. The moral is that even if you only want your program to play chess or prove theorems, if you give it the capability to learn and alter itself, you need safeguards.

It's good this work is getting such mainstream coverage!


Comments (27)

This is great news! FAI just got a huge increase in legitimacy - in fact, this is the biggest such boost I can think of.

provided that the machine is docile enough to tell us how to keep it under control

I must have been sleeping through all the other quotations of this. It's the first time I noticed this was a part of the original text.

It was left off: http://singinst.org/summit/overview/whatisthesingularity/

It's left off the Wikipedia entry that references it: http://en.wikipedia.org/wiki/Technological_singularity

And this other random high Google hit: http://www.committedsardine.com/blogpost.cfm?blogID=1771

I guess one upshot is that I pulled up the original article to verify (and no, the comment about Vernor Vinge was not in the original). Scholarship!

Is Luke paying attention, though? Good could not have been quoting Vinge!

Marvin Minsky once suggested that an AI program designed to solve the Riemann Hypothesis might end up taking over all the resources of Earth to build more powerful supercomputers to help achieve its goal.

I would like to know why he doesn't actively work on FAI, or voice his concerns more loudly. I might ask him about it in an e-mail, if nobody else wants to do it instead.

One thing I will note is that I'm not sure why they say AGI has its roots in Solomonoff's induction paper. There is such a huge variety in approaches to AGI... what do they all have to do with that paper?

AIXI is based on Solomonoff, and to the extent that you regard all other AGIs as approximations to AIXI...

Or to look at it another way, Solomonoff induction was the first mathematical specification of a system that could, in principle if not in the physical universe, learn anything learnable by a computable system.

I think the interesting feature of Solomonoff induction is that it does no worse than any other object from the same class (lower-semicomputable semimeasures), not just objects from a lower class (computable humans). I'm currently trying to solve a related problem where it's easy to devise an agent that beats all humans, but difficult to devise one that's optimal in its own class.
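
To make that concrete, the standard statement (my summary, not something spelled out in AIMA) is that Solomonoff's prior M gives each finite string x the total weight of the programs whose output starts with x, and it multiplicatively dominates every lower-semicomputable semimeasure, losing at most a constant that depends on the competitor's complexity:

    M(x) = \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}

    -\log_2 M(x) \le -\log_2 \mu(x) + K(\mu) + O(1) \quad \text{for every lower-semicomputable semimeasure } \mu

Here U is a universal monotone machine, \ell(p) is the length of program p, and K(\mu) is the complexity of an index for \mu. Computable predictors (including "computable humans") sit strictly inside this class, which is why beating them is the easier problem.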

That paragraph is simply wrong.

Well, on the other hand, if AGI is defined as truly universal, Solomonoff seems quite rooty indeed. It's only if you think of "general" to mean "general relative to a beaver's brain" that a wide variety of approaches become acceptable.

I estimate brains spend about 80% of their time doing inductive inference (the rest is evaluation, tree-pruning, etc). Solomonoff's induction is a general theory of inductive inference. Thus the connection.

A while ago I took a similar (very quick) look at another AI text, Nils Nilsson's AI history, The Quest for Artificial Intelligence.

Norvig & Russell: Yudkowsky (2008) goes into more detail about how to design a Friendly AI. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design - to define a mechanism for evolving AI systems under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.

"Mechanism design"? "Checks and balances"? Do you know what they mean by "Yudkowsky (2008)" and where I can find a copy? I'd like to see this for myself.

OK, so am I misreading Yudkowsky, or are Norvig and Russell misreading Yudkowsky, or am I misreading Norvig and Russell? Because if you take "mechanism design" and "checks and balances" to have the obvious economic and political meanings in the theories of multi-agent systems, then I am pretty sure that Yudkowsky does not claim that this is where the challenge lies.

This is an introductory textbook for students who haven't been exposed to these ideas before. The paragraph makes a lot more sense under that assumption than under the assumption that they are trying to be technically correct down to every term they use.

Perhaps. But considering that we are talking about chapter 26 of a 27 chapter textbook, and that the authors spent 5 pages explaining the concept of "mechanism design" back in section 17.6, and also considering that every American student learns about the political concept of "checks and balances" back in high school, I'm going to stick with the theory that they either misunderstood Yudkowsky, or decided to disagree with him without calling attention to the fact.

ETA: Incidentally, if the authors are inserting their own opinion and disagreeing with Yudkowsky, I tend to agree with them. In my (not yet informed) opinion, Eliezer dismisses the possibility of a multi-agent solution too quickly.

Eliezer dismisses the possibility of a multi-agent solution too quickly.

A multi-machine solution? Is that so very different from one machine with a different internal architecture?

I favor a multi-agent solution which includes both human and machine agents. But, yes, a multi-machine solution may well differ from a unified artificial rational agent. For one thing, the composite will not be itself a rational agent (it may split its charitable contributions between two different charities, for example. :)

ETA: More to the point, a singleton must self-modify to 'grow' in power and intelligence, and will strive to preserve its utility function (values) in the course of doing so. A coalition, on the other hand, grows in power by creating or enlisting new members. So, for example, rogue AI's can be incorporated into a coalition, whereas a singleton will have to defeat and destroy them. Furthermore, the political balance within a coalition may shift over time, as agents who are willing to delay gratification gain in power, and agents who demand instant gratification lose relative power. And as the political balance shifts, so does the effective composite utility function.

More to the point, a singleton must self-modify to 'grow' in power and intelligence, and will strive to preserve its utility function (values) in the course of doing so. A coalition, on the other hand, grows in power by creating or enlisting new members. So, for example, rogue AI's can be incorporated into a coalition, whereas a singleton will have to defeat and destroy them. Furthermore, the political balance within a coalition may shift over time, as agents who are willing to delay gratification gain in power, and agents who demand instant gratification lose relative power. And as the political balance shifts, so does the effective composite utility function.

It sounds as though you are thinking about the early days.

ISTM that a single creature could grow in the manner you describe a coalition growing - by assimilation and compromise. It might not naturally favour behaving in that way - but it is possible to make an agent with whatever values you like.

More to the point, if a single creature forms from a global government, or the internet, it will probably start off in a pretty inclusive state. Only the terrorists will be excluded. There is no monopolies and mergers commission at that level, just a hangover from past, fragmented times.

The section on Bayes' rule in Chapter 13 might appeal to some of those here too.
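
Not from the book, but as a quick reminder of what that section is about, here is Bayes' rule as a tiny Python snippet (the test numbers are made up for illustration):

    def posterior(prior, p_e_given_h, p_e_given_not_h):
        """Bayes' rule: P(H|E) = P(E|H) P(H) / P(E)."""
        p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)
        return p_e_given_h * prior / p_e

    # A 1% base rate, a test with 90% sensitivity and a 5% false-positive rate:
    print(posterior(prior=0.01, p_e_given_h=0.9, p_e_given_not_h=0.05))  # about 0.15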

Is there any post on LW giving a LWer guidelines for reading AIMA?

Meaning: which chapters are more or less relevant for those whose needs are of an abstract and intellectual kind, not those who need to actually do the AI stuff?

An AIMA for non-AI people thing.


Read the first two chapters (part I). Skim the introductory paragraphs and summary section and read the bibliography section of each chapter in parts II - VII. Read chapters 26 and 27 (part VIII).

Yes, I realize that's basically the introductory, high level summary, and epilogue materials. AIMA is a technical book on AI. If you're not an AI person then I'm not sure what the point would be in reading it...

Nifty. I've been looking forward to reading AIMA. I get most of my textbooks online these days, but this looks like the sort of thing I'd like to have on my bookshelf next to PT:LOS, "Causality", "Introduction to Algorithms", etc. I just wish it weren't $109...

There are electronic copies floating around, but the hardcover is worth the price.

I would like to share some thoughts on this topic in general in terms of AI, and Singularity.

I am a speculator, and I find that a single right decision typically does not exist. A decision is more like a judgement: a selection of the better alternative. When making top decisions, most executives need to rely on opinion more than fact, especially when high amounts of uncertainty are involved.

In many cases, outcomes do not come out as intended.

This bears on the AI/Singularity matter. In an effort to create a potential hedge of protection for the good of mankind, we consider the idea of creating AI machines that are intended to be human-friendly before any other AI machines are made.

This may be the last invention man will ever make...

Please consider:

  1. Good intentions typically have unintended consequences,
  2. Law of opposites must be considered,
  3. To achieve optimal Market Standing, rather than Total Dominance, and
  4. To realize we have an illusion of control.

These are a few of the many factors that require analysis and consideration. Determining the right questions to ask is another hard part.

This post will not even attempt to solve this problem.

I hope this adds value to the discussion; if not here, then I hope it can be directed to the place where it will add the most value to the decision-making process.