[Link] The Leverhulme Centre for the Future of Intelligence officially launches.

1 ignoranceprior 21 October 2016 01:22AM

UC Berkeley launches Center for Human-Compatible Artificial Intelligence

10 ignoranceprior 29 August 2016 10:43PM

Source article: http://news.berkeley.edu/2016/08/29/center-for-human-compatible-artificial-intelligence/

UC Berkeley artificial intelligence (AI) expert Stuart Russell will lead a new Center for Human-Compatible Artificial Intelligence, launched this week.

Russell, a UC Berkeley professor of electrical engineering and computer sciences and the Smith-Zadeh Professor in Engineering, is co-author of Artificial Intelligence: A Modern Approach, which is considered the standard text in the field of artificial intelligence, and has been an advocate for incorporating human values into the design of AI.

The primary focus of the new center is to ensure that AI systems are beneficial to humans, he said.

The co-principal investigators for the new center include computer scientists Pieter Abbeel and Anca Dragan and cognitive scientist Tom Griffiths, all from UC Berkeley; computer scientists Bart Selman and Joseph Halpern, from Cornell University; and AI experts Michael Wellman and Satinder Singh Baveja, from the University of Michigan. Russell said the center expects to add collaborators with related expertise in economics, philosophy and other social sciences.

The center is being launched with a grant of $5.5 million from the Open Philanthropy Project, with additional grants for the center’s research from the Leverhulme Trust and the Future of Life Institute.

Russell is quick to dismiss the imaginary threat from the sentient, evil robots of science fiction. The issue, he said, is that machines as we currently design them in fields like AI, robotics, control theory and operations research take the objectives that we humans give them very literally. Told to clean the bath, a domestic robot might, like the Cat in the Hat, use mother’s white dress, not understanding that the value of a clean dress is greater than the value of a clean bath.

The center will work on ways to guarantee that the most sophisticated AI systems of the future, which may be entrusted with control of critical infrastructure and may provide essential services to billions of people, will act in a manner that is aligned with human values.

“AI systems must remain under human control, with suitable constraints on behavior, despite capabilities that may eventually exceed our own,” Russell said. “This means we need cast-iron formal proofs, not just good intentions.”

One approach Russell and others are exploring is called inverse reinforcement learning, through which a robot can learn about human values by observing human behavior. By watching people dragging themselves out of bed in the morning and going through the grinding, hissing and steaming motions of making a caffè latte, for example, the robot learns something about the value of coffee to humans at that time of day.

“Rather than have robot designers specify the values, which would probably be a disaster,” said Russell, “instead the robots will observe and learn from people. Not just by watching, but also by reading. Almost everything ever written down is about people doing things, and other people having opinions about it. All of that is useful evidence.”

Russell and his colleagues don’t expect this to be an easy task.

“People are highly varied in their values and far from perfect in putting them into practice,” he acknowledged. “These aspects cause problems for a robot trying to learn what it is that we want and to navigate the often conflicting desires of different individuals.”

Russell, who recently wrote an optimistic article titled “Will They Make Us Better People?,” summed it up this way: “In the process of figuring out what values robots should optimize, we are making explicit the idealization of ourselves as humans. As we envision AI aligned with human values, that process might cause us to think more about how we ourselves really should behave, and we might learn that we have more in common with people of other cultures than we think.”

Self-improvement without self-modification

3 Stuart_Armstrong 23 July 2015 09:59AM

This is just a short note to point out that AIs can self-improve without having to self-modify. So locking down an agent from self-modification is not an effective safety measure.

How could AIs do that? The easiest and most trivial way is to create a subagent and transfer its own resources and abilities to it ("create a subagent" is a generic way to get around most restriction ideas).

Or, if the AI remains unchanged and in charge, it could change the process around itself so that the system as a whole improves. For instance, if the AI is inconsistent and has to pay more attention to problems that are brought to its attention than to problems that aren't, it can start to manage the news (or the news-bearers) so that it hears more of what it wants. If it can't experiment on humans, it will give advice that will cause more "natural experiments", and so on. It will gradually try to reform its environment to get around its programmed limitations.

Anyway, that was nothing new or deep, just a reminder of a point I hadn't seen written out.

Restrictions that are hard to hack

6 Stuart_Armstrong 09 March 2015 01:52PM

A putative new idea for AI control; index here.

Very much in the spirit of "if you want something, you have to define it, then code it, rather than assuming you can get it for free through some other approach."

Difficult children

Suppose you have a child that you have sent to play in their room. You want them to play quietly, so you warn them:

"I'll be checking up on you!"

The child, however, has modelled you well, and knows that you will look in briefly at midnight and then go away. The child has two main options:

  1. Play quietly the whole time.
  2. Be as noisy as they want, until around 23:59, then be totally quiet for two minutes, then go back to being noisy.

We could call the first option obeying the spirit of the law, and the second obeying the letter.
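To make the gap between the two options concrete, here is a minimal toy sketch in Python. Everything in it is an assumption made for illustration and not part of the original post: the one-step-per-minute timeline running from 21:00 to 03:00, the 0/1 noise levels, and the hypothetical helper names `quiet_policy`, `gaming_policy`, `observed_noise` and `total_noise`. It only shows that a check at a single predictable moment cannot tell the two behaviours apart.

```python
# Toy model of the "difficult child" example (illustrative assumptions only).
# Time is measured in minutes relative to midnight, so -1 is 23:59.

CHECK_MINUTE = 0             # the parent looks in briefly at midnight
EVENING = range(-180, 181)   # assumed play window: 21:00 to 03:00 inclusive


def quiet_policy(minute):
    """Option 1: play quietly the whole time (noise level 0)."""
    return 0


def gaming_policy(minute):
    """Option 2: be noisy, except for two quiet minutes starting at 23:59."""
    return 0 if minute in (-1, 0) else 1


def observed_noise(policy):
    """What the parent actually measures: the noise at the single check."""
    return policy(CHECK_MINUTE)


def total_noise(policy):
    """What the parent actually cares about: noise over the whole evening."""
    return sum(policy(minute) for minute in EVENING)


if __name__ == "__main__":
    for name, policy in [("quiet", quiet_policy), ("gaming", gaming_policy)]:
        print(name, "observed:", observed_noise(policy),
              "total:", total_noise(policy))
```

Under these assumptions both policies look identical at the check, while the total noise differs by almost the entire evening. The check measures the letter of the rule, not its spirit.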
