Anja comments on A utility-maximizing varient of AIXI - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (20)
True, the U(program, action sequence) framework can be implemented within the U(action/observation sequence) framework, although you forgot to multiply by 2^-l(q) when describing how. I also don't really like the finite look-ahead (until m_k) method, since it is dynamically inconsistent.
Not sure what you mean by that.
I think then you would count that twice, wouldn't you? Because my original formula already contains the Solomonoff probability...
Oh right. But you still want the probability weighting to be inside the sum, so you would actually need=\frac{1}{\xi\left(\dot{y}\dot{x}_{%3Ck}y\underline{x}_{k:m_{k}}\right)}\sum_{q:q(y_{1:m_k})=x_{1:m_k}}%20U(q,y_{1:m_k})2%5E{-\ell\left(q\right)}%0A)
True. :)