AlexMennen comments on A utility-maximizing variant of AIXI - Less Wrong

15 Post author: AlexMennen 17 December 2012 03:48AM

Comment author: AlexMennen 19 December 2012 08:13:03PM 1 point

True, the U(program, action sequence) framework can be implemented within the U(action/observation sequence) framework, although you forgot to multiply by 2^-l(q) when describing how. I also don't really like the finite look-ahead (until m_k) method, since it is dynamically inconsistent.

"This solves wireheading only if we can specify which environments contain wireheaded (non-dualistic) agents, delusion boxes, etc."

Not sure what you mean by that.

Comment author: Anja 20 December 2012 06:29:57PM 2 points

"you forgot to multiply by 2^-l(q)"

I think then you would count that twice, wouldn't you? Because my original formula already contains the Solomonoff probability...

Comment author: AlexMennen 20 December 2012 09:18:30PM 1 point

Oh right. But you still want the probability weighting to be inside the sum, so you would actually need

Comment author: Anja 20 December 2012 09:36:35PM 2 points

True. :)

Comment author: Anja 20 December 2012 06:25:49PM 0 points

Let's stick with delusion boxes for now, because assuming that we can read off from the environment whether the agent has wireheaded breaks dualism. So even if we specify utility directly over environments, we still need to master the task of specifying which action/environment combinations contain delusion boxes to evaluate them correctly. It is still the same problem, just phrased differently.

Comment author: AlexMennen 20 December 2012 09:21:39PM * 0 points

If I understand you correctly, that sounds like a fairly straightforward problem for AIXI to solve. Some programs q_1 will mimic some other program q_2's communication with the agent while doing something else in the background, but AIXI considers the possibilities of both q_1 and q_2.
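To make the q_1/q_2 point concrete, here is a toy sketch. It is not from the thread and not an implementation of AIXI: the `ToyProgram` class, the description lengths, the `background_ok` flag, and the `utility` function are all hypothetical stand-ins. It only illustrates that two programs can show the agent identical observations while differing in the background, and that a utility function of the form U(program, action sequence), weighted by 2^-l(q) inside the sum, treats them differently.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ToyProgram:
    """Hypothetical stand-in for an environment program q."""
    name: str
    length_bits: int      # plays the role of l(q)
    background_ok: bool   # what the program does "in the background"

    def observations(self, actions):
        # Every toy program shows the agent the same thing,
        # so observations alone cannot tell them apart.
        return tuple("ok" for _ in actions)


def prior_weight(q):
    # The 2^-l(q) weight, kept exact with Fraction.
    return Fraction(1, 2 ** q.length_bits)


def utility(q, actions):
    # Assumed U(program, action sequence): it may look at the hidden state.
    return 1 if q.background_ok else 0


def weighted_utility(programs, actions, history):
    # Sum over programs consistent with the observed history,
    # with the weight inside the sum because it depends on q.
    return sum(
        prior_weight(q) * utility(q, actions)
        for q in programs
        if q.observations(actions) == history
    )


q_2 = ToyProgram("q_2", length_bits=3, background_ok=True)
q_1 = ToyProgram("q_1", length_bits=5, background_ok=False)  # mimics q_2

actions = ("a", "b")
history = q_2.observations(actions)
value = weighted_utility([q_1, q_2], actions, history)
```

Both programs stay in the sum because both are consistent with the history; only q_2 contributes utility, and q_1 contributes its weight times zero. The value is left unnormalized, which does not change which action sequence scores highest.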