AlexMennen comments on Failures of an embodied AIXI - LessWrong

29 Post author: So8res 15 June 2014 06:29PM

Comment author: AlexMennen 10 June 2014 08:33:43PM * 1 point

Just look at the AIXI equation itself:

$$a_t := \arg\max_{a_t} \sum_{o_t r_t} \ldots \max_{a_m} \sum_{o_m r_m} [r_t + \ldots + r_m] \sum_{q \,:\, U(q, a_1 \ldots a_m) = o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

$o_i$ (observations) and $r_i$ (rewards) are the signals sent from the environment to AIXI, and $a_i$ (actions) are AIXI's outputs. Notice that future $a_i$ are predicted by picking the one that would maximize expected reward through timestep $m$, just like AIXI does, and there is no summation over possible ways that the environment could make AIXI output actions computed some other way, like there is for $o_i$ and $r_i$.
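To make the asymmetry concrete, here is a minimal sketch of the same max/sum alternation over a tiny finite class of environments. It is not AIXI itself, which is uncomputable: the hand-picked prior weights stand in for the program-length prior, and the names `consistent_weight`, `value`, `best_action`, `ENVS`, `ACTIONS`, and `PERCEPTS` are all hypothetical, chosen only for illustration. It also plans from an empty history, so the reward sum starts at the first timestep. The point is that actions only ever appear under a `max`, while percepts (observation and reward pairs) only ever appear under a `sum`.

```python
from typing import Callable, Sequence, Tuple

Action = int
Percept = Tuple[int, float]                    # (observation, reward)
Env = Callable[[Tuple[Action, ...]], Percept]  # action history -> latest percept
WeightedEnv = Tuple[float, Env]                # (prior weight, environment)


def consistent_weight(envs: Sequence[WeightedEnv],
                      actions: Tuple[Action, ...],
                      percepts: Tuple[Percept, ...]) -> float:
    """Total prior weight of the environments that reproduce `percepts`
    when fed `actions` (a finite stand-in for the innermost sum)."""
    total = 0.0
    for weight, env in envs:
        if all(env(actions[: k + 1]) == percepts[k] for k in range(len(actions))):
            total += weight
    return total


def value(envs: Sequence[WeightedEnv],
          action_set: Sequence[Action],
          percept_set: Sequence[Percept],
          actions: Tuple[Action, ...],
          percepts: Tuple[Percept, ...],
          horizon: int) -> float:
    """Unnormalized expectimax value of a history, planning up to `horizon`."""
    if len(actions) == horizon:
        # Leaf: total reward times the weight of the consistent environments.
        return sum(r for _, r in percepts) * consistent_weight(envs, actions, percepts)
    # Future actions are maximized over; future percepts are summed over.
    return max(
        sum(value(envs, action_set, percept_set,
                  actions + (a,), percepts + (p,), horizon)
            for p in percept_set)
        for a in action_set
    )


def best_action(envs: Sequence[WeightedEnv],
                action_set: Sequence[Action],
                percept_set: Sequence[Percept],
                horizon: int) -> Action:
    """First action from an empty history: the outermost arg max."""
    def score(a: Action) -> float:
        return sum(value(envs, action_set, percept_set, (a,), (p,), horizon)
                   for p in percept_set)
    return max(action_set, key=score)


# Toy setup (an assumption for illustration): one environment rewards the
# latest action being 1, the other rewards it being 0.
ENVS = [
    (0.75, lambda acts: (0, float(acts[-1] == 1))),
    (0.25, lambda acts: (0, float(acts[-1] == 0))),
]
ACTIONS = [0, 1]
PERCEPTS = [(0, 0.0), (0, 1.0)]
```

With a horizon of 2, `best_action(ENVS, ACTIONS, PERCEPTS, 2)` picks action 1. Nowhere in `value` is there a weighted sum over alternative ways the agent's own future actions might be produced: the recursion simply assumes they come from the same `max`.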