Anja comments on A utility-maximizing variant of AIXI - Less Wrong

15 Post author: AlexMennen 17 December 2012 03:48AM

Comments (20)

Comment author: AlexMennen 19 December 2012 08:13:03PM 1 point

True, the U(program, action sequence) framework can be implemented within the U(action/observation sequence) framework, although you forgot to multiply by 2^-l(q) when describing how. I also don't really like the finite look-ahead (until m_k) method, since it is dynamically inconsistent.

"This solves wireheading only if we can specify which environments contain wireheaded (non-dualistic) agents, delusion boxes, etc."

Not sure what you mean by that.

Comment author: Anja 20 December 2012 06:29:57PM 2 points

"you forgot to multiply by 2^-l(q)"

I think then you would count that twice, wouldn't you? Because my original formula already contains the Solomonoff probability...
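To make the double-counting worry concrete, here is a minimal toy sketch. It is not the formula from the original post: the three programs, their lengths and their utilities are made up, and the function names are hypothetical. It only compares a sum in which each program's utility is weighted by 2^-l(q) once with a sum in which the same weight is applied a second time.

```python
from fractions import Fraction

# Hypothetical toy data: (program length l(q) in bits, utility U(q)).
# These values are invented for illustration only.
programs = [(1, Fraction(1)), (2, Fraction(0)), (3, Fraction(1))]


def prior(length):
    """Weight 2^-l(q) for a program of the given length."""
    return Fraction(1, 2 ** length)


def weighted_once(progs):
    # Each program's utility is weighted by 2^-l(q) exactly once.
    return sum(prior(l) * u for l, u in progs)


def weighted_twice(progs):
    # The same weight applied again, as if 2^-l(q) were multiplied
    # onto a formula that already contains it.
    return sum(prior(l) ** 2 * u for l, u in progs)


once = weighted_once(programs)
twice = weighted_twice(programs)
print(once, twice)
```

In this toy case the 1-bit program outweighs the 3-bit program by 4:1 under single weighting and by 16:1 under double weighting, so applying the factor twice changes the relative influence of the programs and not just the overall scale.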

Comment author: AlexMennen 20 December 2012 09:18:30PM 1 point

Oh right. But you still want the probability weighting to be inside the sum, so you would actually need

Comment author: Anja 20 December 2012 09:36:35PM 2 points

True. :)