You're looking at Less Wrong's discussion board. This includes all posts, including those that haven't been promoted to the front page yet. For more information, see About Less Wrong.

V_V comments on Versions of AIXI can be arbitrarily stupid - Less Wrong Discussion

15 Post author: Stuart_Armstrong 10 August 2015 01:23PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (59)

You are viewing a single comment's thread. Show more comments above.

Comment author: Stuart_Armstrong 14 August 2015 02:51:55PM 1 point [-]

In theory, changing the exploration rate and changing the prior are equivalent. I think that it might be easier to decide upon an exploration rate that gives a good result for generic priors, than to be sure that generic priors have good exploration rates. But this is just an impression.

Comment author: V_V 17 August 2015 10:26:26AM 1 point [-]

In theory, changing the exploration rate and changing the prior are equivalent.

Not really. Standard AIXI is completely deterministic, while the usual exploration strategies for reinforcement learning, such as ɛ-greedy and soft-max, are stochastic.