Disappointingly, this paper is still pre-UDT thinking. An AIXI-like agent doesn't understand that it lives within the universe that it's trying to affect, so it can unwittingly destroy its own hardware with its mining claws (to borrow a phrase from Tim Tyler).
It looks like the newer version of the paper tries to deal with this explicitly, by introducing the concept of "agent implementation". But I can't verify whether the solution actually works, since the probability function P is left undefined.
I think the paper suffers from what may be a common failure mode in AI design: problems with the overa...
Daniel Dewey, 'Learning What to Value'
Abstract: I.J. Good's theory of an "intelligence explosion" predicts that ultraintelligent agents will undergo a process of repeated self-improvement. In the wake of such an event, how well our values are fulfilled will depend on whether these ultraintelligent agents continue to act desirably and as intended. We examine several design approaches, based on AIXI, that could be used to create ultraintelligent agents. In each case, we analyze the design conditions required for a successful, well-behaved ultraintelligent agent to be created. Our main contribution is an examination of value-learners, agents that learn a utility function from experience. We conclude that the design conditions on value-learners are in some ways less demanding than those on other design approaches.
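To make the idea of a value-learner concrete, here is a toy sketch in Python. It is not the paper's formalism: the outcome names, the two candidate utility functions, the probabilities, and the helpers `expected_value`, `choose_action`, and `bayes_update` are all hypothetical, and it assumes (as a simplification) that the agent's beliefs about outcomes are independent of its beliefs about which utility function is correct. The agent keeps a probability distribution over candidate utility functions, picks the action with the highest expected utility under that distribution, and revises the distribution when evidence arrives.

```python
from typing import Callable, Dict, List, Tuple

Outcome = str
Utility = Callable[[Outcome], float]
# A belief over utility functions: (candidate utility, probability) pairs.
UtilityBelief = List[Tuple[Utility, float]]


def expected_value(outcome_probs: Dict[Outcome, float],
                   belief: UtilityBelief) -> float:
    """Average utility over outcomes AND over candidate utility functions."""
    return sum(
        p_outcome * p_utility * utility(outcome)
        for outcome, p_outcome in outcome_probs.items()
        for utility, p_utility in belief
    )


def choose_action(actions: Dict[str, Dict[Outcome, float]],
                  belief: UtilityBelief) -> str:
    """Pick the action with the highest expected value under the belief."""
    return max(actions, key=lambda a: expected_value(actions[a], belief))


def bayes_update(belief: UtilityBelief,
                 likelihoods: List[float]) -> UtilityBelief:
    """Reweight each candidate by how well it explains the new evidence."""
    weighted = [(u, p * l) for (u, p), l in zip(belief, likelihoods)]
    total = sum(w for _, w in weighted)
    return [(u, w / total) for u, w in weighted]


# Hypothetical candidate utility functions (assumptions for illustration).
def likes_paperclips(outcome: Outcome) -> float:
    return 1.0 if outcome == "paperclips" else 0.0


def likes_gardens(outcome: Outcome) -> float:
    return 1.0 if outcome == "gardens" else 0.0


# Hypothetical world model: each action's distribution over outcomes.
actions = {
    "build_factory": {"paperclips": 0.9, "gardens": 0.1},
    "plant_seeds": {"paperclips": 0.2, "gardens": 0.8},
}

prior = [(likes_paperclips, 0.6), (likes_gardens, 0.4)]
first_choice = choose_action(actions, prior)

# Evidence that is six times likelier if the gardens utility is correct.
posterior = bayes_update(prior, [0.1, 0.6])
second_choice = choose_action(actions, posterior)

print(first_choice, second_choice)
```

The point of the sketch is only that the chosen action can change as the belief over utility functions shifts, without anyone rewriting a hard-coded goal. It says nothing about the objection raised in the comment above, since the toy agent has no model of its own hardware at all.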