V_V comments on Bridge Collapse: Reductionism as Engineering Problem - Less Wrong

Post author: RobbBB 18 February 2014 10:03PM




Comment author: V_V 19 February 2014 02:46:51PM 2 points

I think you are conflating two different problems:

  • How to learn by reinforcement in an unknown non-ergodic environment (e.g. one where it is possible to drop an anvil on your head)

  • How to make decisions that take into account future reward, in a non-ergodic environment, where actions may modify the agent.

The first problem is well known in the reinforcement learning community, and in fact it is also mentioned in the first AIXI papers, but there it is sidestepped with an ergodicity assumption rather than addressed.
I don't think there can be a truly general solution to this problem: you need some environment-specific prior or supervision.
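To make the non-ergodicity point concrete, here is a minimal toy sketch (the environment, numbers, and function names are my own invention, not from the comment or the AIXI papers): a two-action bandit where one action is an unrecoverable "anvil". Any exploration scheme that keeps sampling all actions, such as epsilon-greedy, eventually takes the fatal action, so pure trial-and-error learning is self-defeating without prior knowledge of which actions are irreversible.

```python
import random

# Toy non-ergodic environment with two actions:
#   action 0 ("safe"):  reward 1, episode continues
#   action 1 ("anvil"): absorbing failure; no recovery, no further learning

def run_epsilon_greedy(epsilon=0.1, steps=1000, seed=0):
    """Return the step at which the agent died, or None if it survived."""
    rng = random.Random(seed)
    q = [0.0, 0.0]        # action-value estimates
    counts = [0, 0]
    for t in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(2)          # exploratory action
        else:
            a = 0 if q[0] >= q[1] else 1  # greedy action
        if a == 1:
            return t                      # anvil dropped: learning ends here
        counts[a] += 1
        q[a] += (1.0 - q[a]) / counts[a]  # incremental mean update
    return None

# With any epsilon > 0 the agent tries the fatal action with probability
# approaching 1 as steps grow; only an agent that never explores (or one
# told in advance which actions are fatal) survives indefinitely.
```

A purely greedy agent (`epsilon=0.0`) happens to survive here only because the safe action is tried first; that is luck, not a solution, which is why some environment-specific prior or supervision seems unavoidable.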

The second problem doesn't seem as hard as the first one.
AIXI itself, of course, can't model self-modification, because it is incomputable and can only reason about computable environments; but computable variants of AIXI (Schmidhuber's Gödel machine, perhaps?) can easily represent themselves as part of the environment.