Vladimir_Nesov comments on AIXI and Existential Despair - Less Wrong

Post author: paulfchristiano 08 December 2011 08:03PM




Comment author: Vladimir_Nesov 10 December 2011 02:07:16PM  0 points

It can't be taught to be just any algorithm, since it has to maximize reward; that part is fixed. My point was mostly unrelated to yours (though inspired by it): the set of correct program-models, i.e. those that won't be falsified, includes UDT-like agents that reason about the logical dependence of reward on their decisions, not just explicit dependence schemes (see the "against explicit dependence" section of this post) that presuppose some kind of Cartesian magic. (By "correct program-models" I mean ones more like Model 1, which listen to hypothetical outputs; that is, I'm stipulating that what you discuss in the post doesn't happen or doesn't matter.)
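For reference, the fixed reward-maximizing part is AIXI's expectimax action selection over a Solomonoff mixture of environment programs. A rough sketch in Hutter's notation (with U a universal monotone Turing machine, ℓ(q) the length of program q, and horizon m) looks like:

```latex
\dot{a}_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
\big[\, r_k + \cdots + r_m \,\big]
\sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

Everything that can vary between "configurations" is the observation/reward history fed into this expression; the argmax over expected reward itself is not up for negotiation.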

That is, there probably exists an AIXI configuration (a state of knowledge, or an initial training prefix of observations and rewards) that turns AIXI into a competent agent that gets all sorts of decision problems right: one that can reason about itself and about cooperation with its own copies, care about counterfactuals, and so on. That's an unexpected result, quite different from what I previously believed (though it doesn't assert that this happens on its own, and I can't really tell whether it would).