V_V comments on The Problem with AIXI - Less Wrong

24 Post author: RobbBB 18 March 2014 01:55AM

Comment author: V_V 13 March 2014 02:45:07PM 1 point

But nowhere in AIXI's hypothesis space is a reasoner that is a native representation of AIXI as 'me', as the agent doing the hypothesizing.

I disagree: among all the world-programs in AIXI's model space, there are some in which, after AIXI performs one action, all its future actions are ignored and control passes to a subroutine "AGENT" within the program. In principle, AIXI can reason that if the last action it performs damages AGENT, e.g. by dropping an anvil on its head, then the reward signal, computed by some reward subroutine in the world-program, will no longer be maximized.
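To make the shape of such a world-program concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the names `world_program`, `agent_policy`, `best_first_action`, the action labels "wait" and "drop_anvil", and the reward rule are all made up for this example, and nothing here is a real AIXI implementation.

```python
from typing import Iterable, List


def agent_policy(observation: int) -> int:
    """Hypothetical stand-in for the AGENT subroutine: it always picks action 1."""
    return 1


def world_program(first_action: str, later_actions: List[int], steps: int = 5) -> int:
    """Toy world-program in which only the first external action matters.

    After `first_action`, the externally supplied `later_actions` are ignored
    and control passes to AGENT, unless the first action destroyed it.
    Returns the total reward accumulated over `steps` time steps.
    """
    agent_intact = first_action != "drop_anvil"
    total_reward = 0
    observation = 0
    for _ in range(steps):
        if agent_intact:
            action = agent_policy(observation)
        else:
            action = 0  # a destroyed AGENT does nothing useful
        reward = 1 if action == 1 else 0  # toy reward subroutine
        total_reward += reward
        observation = reward
    return total_reward


def best_first_action(candidates: Iterable[str]) -> str:
    """Pick the first action whose world-program rollout yields the most reward."""
    return max(candidates, key=lambda a: world_program(a, later_actions=[]))
```

Under this toy model, `best_first_action(["drop_anvil", "wait"])` picks "wait", because the rollout after "drop_anvil" earns no reward no matter what `later_actions` contains.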

Of course there are the usual computability issues: the true AIXI is uncomputable, so what AIXI models as AGENT would actually be a complexity-weighted mixture of computable approximations of itself. AIXItl would have the same issue w.r.t. the resource bounds t and l.
I'm not sure this is necessarily a severe issue. Anyway, I suppose that AIXItl could be modified in some UDT-like way to include its own quined source code and recognize copies of itself inside the world-programs.
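As a rough illustration of what a "complexity-weighted mixture" of approximations could look like, the sketch below weights each candidate by 2^(-length in bits), in the style of a Solomonoff prior, and normalizes over a finite set. The finite set, the program lengths, and the helper names `complexity_weights` and `mixture_action_probability` are assumptions made for this example only.

```python
from fractions import Fraction
from typing import Callable, Dict


def complexity_weights(lengths: Dict[str, int]) -> Dict[str, Fraction]:
    """Weight each candidate program by 2^(-length in bits), then normalize.

    Normalizing over a finite candidate set is an assumption of this sketch.
    """
    raw = {name: Fraction(1, 2 ** bits) for name, bits in lengths.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}


def mixture_action_probability(
    weights: Dict[str, Fraction],
    policies: Dict[str, Callable[[int], int]],
    observation: int,
    action: int,
) -> Fraction:
    """Total weight of the approximations that would take `action` here."""
    return sum(
        (w for name, w in weights.items() if policies[name](observation) == action),
        Fraction(0),
    )
```

For instance, with hypothetical lengths of 3, 4 and 5 bits the normalized weights come out as 4/7, 2/7 and 1/7, so the shortest approximation dominates the mixture's prediction of what AGENT does next.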

The other issue is how AIXI would learn to assign high weights to these world-programs in a non-ergodic environment. Humans seem to manage this through a combination of innate priors and tutoring. I suppose that something similar is, in principle, applicable to AIXI.