Anja comments on A utility-maximizing variant of AIXI - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (20)
I like how you specify utility directly over programs; it describes very neatly how someone who sat down and wrote a utility function would do it: first determine how the observation could have been computed by the environment, and then evaluate that situation. This is a special case of the framework I wrote down in the cited article; you can always set
U\left(\dot{y}\dot{x}_{<k}y\underline{x}_{k:m_{k}}\right)=\sum_{q:q(y_{1:m_k})=x_{1:m_k}}U(q,y_{1:m_k})
This solves wireheading only if we can specify which environments contain wireheaded (non-dualistic) agents, delusion boxes, etc.
True, the U(program, action sequence) framework can be implemented within the U(action/observation sequence) framework, although you forgot to multiply by 2^-l(q) when describing how. I also don't really like the finite look-ahead (until m_k) method, since it is dynamically inconsistent.
Not sure what you mean by that.
I think then you would count that twice, wouldn't you? Because my original formula already contains the Solomonoff probability...
Oh right. But you still want the probability weighting to be inside the sum, so you would actually need
U\left(\dot{y}\dot{x}_{<k}y\underline{x}_{k:m_{k}}\right)=\frac{1}{\xi\left(\dot{y}\dot{x}_{<k}y\underline{x}_{k:m_{k}}\right)}\sum_{q:q(y_{1:m_k})=x_{1:m_k}}U(q,y_{1:m_k})\,2^{-\ell\left(q\right)}
True. :)
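
The normalized mixture the thread converges on can be sketched numerically on a toy program space. This is only an illustrative sketch, not AIXI itself: the "programs" below are hypothetical lookup entries standing in for Turing machines, and the names `programs` and `history_utility` are my own.

```python
# Toy sketch of the corrected formula:
#   U(history) = (1/xi(history)) * sum_{q consistent} 2^{-ell(q)} * U(q, y)
# where xi(history) = sum_{q consistent} 2^{-ell(q)}.

programs = [
    # hypothetical entries: (ell(q), output q produces on the fixed
    # action sequence y, utility U(q, y) assigned to that program)
    {"length": 2, "output": "01", "utility": 1.0},
    {"length": 3, "output": "01", "utility": 0.5},
    {"length": 3, "output": "10", "utility": 0.0},
]

def history_utility(observed, progs):
    """Average U(q, y) over the programs that reproduce `observed`,
    weighted by 2^{-ell(q)} and normalized by xi, the total weight
    of the consistent programs."""
    consistent = [p for p in progs if p["output"] == observed]
    xi = sum(2.0 ** -p["length"] for p in consistent)
    if xi == 0.0:
        raise ValueError("no program reproduces the observation")
    return sum(2.0 ** -p["length"] * p["utility"] for p in consistent) / xi

# For "01" the consistent programs have weights 1/4 and 1/8, so the
# utility is (1/4 * 1.0 + 1/8 * 0.5) / (3/8) = 5/6.
print(history_utility("01", programs))
```

Note that without the 1/ξ normalization the result would shrink as program lengths grow even when every consistent program agrees on the utility, which is why the weighting has to sit inside a normalized sum.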