Daniel Dewey, 'Learning What to Value'
Abstract: I.J. Good's theory of an "intelligence explosion" predicts that ultraintelligent agents will undergo a process of repeated self-improvement. In the wake of such an event, how well our values are fulfilled will depend on whether these ultraintelligent agents continue to act desirably and as intended. We examine several design approaches, based on AIXI, that could be used to create ultraintelligent agents. In each case, we analyze the design conditions required for a successful, well-behaved ultraintelligent agent to be created. Our main contribution is an examination of value-learners, agents that learn a utility function from experience. We conclude that the design conditions on value-learners are in some ways less demanding than those on other design approaches.
How was it received at the conference?
Well, overall.
I think most people understood the basic argument: powerful reinforcement learners would behave badly, and we need to look for other frameworks. Pushing that idea was my biggest goal at the conference. I didn't get much further than that with most people I talked to.
Unfortunately, almost nobody seemed convinced that it was an urgent issue, or one that could be solved, so I don't expect many people to start working on FAI because of me. Hopefully repeated exposure to SI's ideas will convince people gradually.
Common responses I got when I faile... (read more)