V_V comments on Botworld: a cellular automaton for studying self-modifying agents embedded in their environment - Less Wrong

Post author: So8res, 12 April 2014 12:56AM (50 points)




Comment author: V_V 02 May 2014 04:02:36PM * 0 points

Quite frankly, it seems that you have completely misunderstood what AIXI is.
AIXI (and its computable variants) is a reinforcement learning agent. You can't expect it to perform well in a fixed-duration, one-shot problem.

The thing that you describe as AIXI in your comment doesn't do any learning and therefore is not AIXI. I'm not sure what you have in mind, but you seem to describe some sort of expected utility maximizer that operates on an explicit model of the world and, for some (possibly erroneous) reason, iterates over Turing machines rather than actions. AIXI iterates over Turing machines in order to perform Solomonoff induction; this thing doesn't perform any induction, so why bother with Turing machines? Maybe you are thinking of something like UDT, but it is not clear.

But in any case, your model is broken: if the agent simulates a Turing machine which performs the action "Rewrite self into smallest Turing machine which does nothing ever." and copies the content of the simulated machine's output tape to the agent's output channel, then the rewrite is carried out not in the simulation inside the agent but in the real world. The agent therefore gets rewritten, and the player reaps their reward.

Comment author: So8res 02 May 2014 05:11:08PM 2 points

Yes, yes, I was implicitly assuming that the AIXI has already been trained up on the game. The technical argument (which allows for training and explores possibilities like "allow the AIXI to choose the agent machine") is somewhat more nuanced, and will be explored in depth in an upcoming post. (I was hoping that readers could see the problem from the sketch above, but I suppose if you can't see AIXI's problems from Robby's posts then you'll probably want to wait for the fully explicit argument.)

Comment author: V_V 02 May 2014 06:52:40PM 0 points

Ok, I'll wait.