I saw this go by on arXiv, and thought it deserved a discussion here.

Decision-theoretic agents predict and evaluate the results of their actions using a model, or ontology, of their environment. An agent's goal, or utility function, may also be specified in terms of the states of, or entities within, its ontology. If the agent may upgrade or replace its ontology, it faces a crisis: the agent's original goal may not be well-defined with respect to its new ontology. This crisis must be resolved before the agent can make plans towards achieving its goals. We discuss in this paper which sorts of agents will undergo ontological crises and why we may want to create such agents. We present some concrete examples, and argue that a well-defined procedure for resolving ontological crises is needed. We point to some possible approaches to solving this problem, and evaluate these methods on our examples.

I'll post my analysis and opinion of this paper in a comment after I've taken some time to digest it.


I agree this is an important problem, but choosing an ontology and upgrading from one to another seem to involve so many hard philosophical problems that the approach taken by the paper, of trying to find specific algorithms for upgrading ontologies, seems doomed or at least highly premature. I wonder if the author takes that approach seriously, or is only using it for pedagogical purposes.

I've (tried to) read it several times. While I agree with the basic idea of finding isomorphisms by looking at bisimulations or bijections, and minimizing differences sounds like a good idea inasmuch as it follows Occam's razor, a lot of it seems unmotivated and unexplained.

Take the use of the Kullback-Leibler divergence, for example. Why that, specifically? Is it just that obvious and desirable? It has some properties that don't seem especially useful, such as not being symmetric (so would an AI using it exhibit non-monotonic behavior when changing ontologies?), and these don't seem to be discussed.
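
To make the asymmetry concrete, here is a minimal sketch in Python. The two distributions `p` and `q` and the helper `kl_divergence` are made up for illustration and are not taken from the paper; the only point is that D_KL(p || q) and D_KL(q || p) generally differ, so which ontology's predictions count as the "reference" matters.

```python
import math

def kl_divergence(p, q):
    """Return D_KL(p || q) in nats for two discrete distributions.

    p and q are equal-length lists of probabilities. Terms with p_i == 0
    contribute nothing, by the usual convention. Assumes q_i > 0 wherever
    p_i > 0 (otherwise the divergence is infinite).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical predictions from an "old" and a "new" ontology over two outcomes.
p = [0.9, 0.1]
q = [0.5, 0.5]

forward = kl_divergence(p, q)  # old measured against new
reverse = kl_divergence(q, p)  # new measured against old

print(f"D_KL(p || q) = {forward:.3f}")
print(f"D_KL(q || p) = {reverse:.3f}")
```

The two numbers come out different, so an agent minimizing one direction can prefer a different mapping than an agent minimizing the other. Whether that leads to path-dependent behavior across successive ontology changes is exactly the question I'd like to see addressed.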