Squark comments on How to make AIXI-tl incapable of learning - Less Wrong

4 Post author: itaibn0 27 January 2014 12:05AM

Comment author: Squark 27 January 2014 07:49:15AM * 1 point

AIXI-tl should handle your problem OK, since the relationship between h_n and s_n is irrelevant to the relationship between b_n and r_n. If it failed here, it would also fail on a problem in which h_n and s_n are random noise. However, it might be possible to construct a more complex example where the fact that SHA512(s_n) = h_n is important for optimizing the reward...
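To make the relation SHA512(s_n) = h_n concrete, here is a minimal sketch in Python. It assumes s_n and h_n are raw byte strings; the post does not fix an encoding, and the helper name `is_consistent` and the sample values are hypothetical. It only shows that the relation is cheap to check once s_n is known, and says nothing about how AIXI-tl itself represents it.

```python
import hashlib


def is_consistent(s_n: bytes, h_n: bytes) -> bool:
    """Return True if h_n is the SHA-512 digest of s_n (assumed byte encoding)."""
    return hashlib.sha512(s_n).digest() == h_n


# Hypothetical round: a seed s_n and the hash h_n derived from it.
s_n = b"example seed"
h_n = hashlib.sha512(s_n).digest()

print(is_consistent(s_n, h_n))        # True: h_n was computed from s_n
print(is_consistent(s_n, bytes(64)))  # False: 64 zero bytes are not the digest of s_n
```

The check runs in a single hash evaluation when s_n is available. Deciding whether h_n is a valid hash without s_n is a different question, and this sketch does not attempt it.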

Comment author: itaibn0 28 January 2014 12:22:40AM 0 points

I don't think so. Imagine there were a way to determine whether a string is a valid hash, and it turned out that h_n was not a valid hash. Then that information would be relevant: you would know that you are no longer playing the same game as in previous rounds, and so you would no longer use the same strategy. From the perspective of AIXI-tl, being unable to prove that h_n is a valid hash is treated the same as having a reliable indicator tell it that h_n is not a valid hash, and it would act accordingly.