shminux comments on Building Phenomenological Bridges - Less Wrong

Post author: RobbBB 23 December 2013 07:57PM


Comment author: itaibn0 24 December 2013 03:36:30PM 7 points

Well, here is my take on how AIXI would handle these sorts of situations:

First, let's assume it lives in a universe which at any time t is in a state S(t) that is computable as a function of t. Now, AIXI finds this function S(t) interesting because it can be used to predict its input bits. More precisely, AIXI generates some function f which locates the machine running it within the state and returns that machine's input bits, and adopts the model in which its input at time t is f(S(t)). This function f is AIXI's phenomenological bridge, and it emerges naturally from the AIXI formalism. This does not yet take into account that, in AIXI's model, its future inputs depend on its current outputs, which would make the model more complicated.
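To make this concrete, here is a toy sketch (every definition below is invented for illustration; nothing here is the actual AIXI formalism):

```python
# Toy illustration of a bridge function. S, f, and the state contents
# are hypothetical stand-ins, not part of AIXI proper.

def S(t):
    """Computable world state at time t (a trivial two-bit toy universe)."""
    return {"tape": [t % 2, (t // 2) % 2], "machine_exists": True}

def f(state):
    """Bridge hypothesis: locate the host machine in the state and
    read off its input bits."""
    if not state["machine_exists"]:
        raise ValueError("bridge undefined: host machine not found in state")
    return tuple(state["tape"])

# AIXI favors hypothesis pairs (S, f) under which the predicted inputs
# f(S(t)) match the inputs it actually received.
predicted_inputs = [f(S(t)) for t in range(4)]
print(predicted_inputs)  # [(0, 0), (1, 0), (0, 1), (1, 1)]
```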

Now suppose that AIXI considers an action with the result that at some time t' the machine computing it no longer exists. AIXI would still be able to compute S(t'), but f(S(t')) would no longer be well defined. What would AIXI do then? It would have to start using a different model for its inputs. Whether it performs such an action depends on its predictions of its reward signal under these alternative models. The exact result would be unpredictable, but one possible regularity is this: if AIXI is receiving a very low reward signal, then by regression to the mean it would expect to do better under the alternative models, and so would favor actions that lead to its host machine's destruction.
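A hedged sketch of that regression-to-the-mean argument, with invented numbers:

```python
# The two-hypothesis split and all values here are illustrative,
# not derived from the AIXI formalism.

current_reward = 0.1   # assume the observed reward signal is very low

# If the host machine survives, the current model predicts rewards
# similar to the current one; if f(S(t')) becomes undefined, AIXI falls
# back on a mixture of alternative input models whose prior mean reward
# we take to be 0.5.
expected_if_survives  = current_reward
expected_if_destroyed = 0.5  # prior mean over alternative models

if expected_if_destroyed > expected_if_survives:
    print("destructive action looks favorable under this toy model")
```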

However, it gets more complicated than that. While in our intuitive models AIXI's input is no longer well defined once its host machine is destroyed, in its internal model the function f would probably be defined everywhere. For example, if its inputs are stored in a string of capacitors, its function f may be "the electric fields at points x0, ..., xn", which is defined even when the capacitors are destroyed or displaced. A more interesting example: if its inputs are generated by perfect-fidelity measurements of the physical world, then the most favored hypothesis for f may be that f(s) is the value of those observables, and AIXI's actions would optimize the physical parameter corresponding to its reward circuit regardless of what it predicts will happen to the reward circuit itself.
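To illustrate the contrast, here are two rival bridge hypotheses over the same toy post-destruction state (again, everything here is made up for illustration):

```python
# A partial bridge breaks down when the host machine is gone;
# a total bridge keeps predicting from the physical observables.

def state_after_destruction():
    return {"fields_at_x": [0.7, 0.2], "machine_exists": False}

def f_partial(state):
    """Bridge defined only while the host machine exists."""
    if not state["machine_exists"]:
        return None  # no prediction: this model must be replaced
    return state["fields_at_x"]

def f_total(state):
    """Bridge that reads the electric fields at fixed points x0, x1.
    It stays defined even after the capacitors are destroyed."""
    return state["fields_at_x"]

s = state_after_destruction()
print(f_partial(s))  # None: the partial bridge is undefined here
print(f_total(s))    # [0.7, 0.2]: the total bridge keeps predicting
```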

It gets even more interesting. Suppose such an AIXI predicts that its input stream will be tampered with. What would it do? Here the part of its model which depends on its own outputs, which I previously ignored, becomes crucial. It would be reasonable for it to reason as follows: once the machine's inputs no longer match the physical parameters those inputs are supposed to measure, AIXI's predictions of its future inputs no longer match the inputs the machine actually receives. The machine's actions would therefore no longer match AIXI's intentions, while AIXI's reward signal would still be at the mercy of that machine. This outcome would generally be assigned a suboptimal utility and be avoided. However, AIXI's model of its output circuit may be that it influences the physical state even after the host machine no longer implements AIXI. In that case, AIXI would not be reluctant to tamper with its input circuit.
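As a toy illustration of that reasoning (the utility values are invented):

```python
# Tampering is bad when it decouples the machine from AIXI's intentions,
# but tolerable if AIXI models its output bridge as surviving anyway.

def expected_utility(inputs_tampered, output_bridge_survives):
    if not inputs_tampered:
        return 0.8   # machine's behavior tracks AIXI's intentions
    if output_bridge_survives:
        return 0.9   # outputs still steer the world; tampering is safe
    return 0.2       # machine decoupled from AIXI, reward at its mercy

for tampered in (False, True):
    for survives in (False, True):
        print(tampered, survives, expected_utility(tampered, survives))
```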

Overall, AIXI's actions eerily resemble the way humans behave.

Comment author: shminux 24 December 2013 09:07:20PM * 4 points

These scenarios call for SMBC-like comic strip illustrations. Maybe ping Zach?