So8res comments on Failures of an embodied AIXI - LessWrong
Hooray! I was hoping we'd be able to find some common ground.
My concerns with AIXI and AIXItl are largely of the form "this model doesn't quite capture what I'm looking for" and "AGI is not reduced to building better and better approximations of AIXI". These seem like fairly weak and obvious claims to me, so I'm glad they are not contested.
(From some of the previous discussion, I was not sure that we agreed here.)
Cool. I make no such claim here.
Since you seem really interested in talking about Solomonoff induction and its ability to deal with these situations, I'll say a few things that I think we can both agree upon. Correct me if I'm wrong:
Does that mean they are dumb? No, of course not. Nothing that you can do inside the universe is going to give you the optimality principle of an AIXI that is actually sitting outside the universe using a hypercomputer. You can't get a perfect model of the universe from inside the universe, and it's unreasonable to expect that you should.
While doing Solomonoff induction inside the universe can never give you a perfect model, it can indeed get you a good computable approximation (one of the best computable approximations around, in fact).
(I assume we agree so far?)
The thing is, when we're inside a universe and we can't have that optimality principle, I already know how to build the best universe model that I can: I just do Bayesian updates using all my evidence. I don't need new intractable methods for building good environment models, because I already have one. The problem, of course, is that to be a perfect Bayesian, I need a good prior.
And in fact, Solomonoff induction is just Bayesian updating with a Kolmogorov prior. So of course it will give you good results. As I stated here, I don't view my concerns with Solomonoff induction as an "induction problem" but rather as a "priors problem": Solomonoff induction works very well (and, indeed, is basically just Bayesian updating), but the question is, did it pick the right prior?
Maybe Kolmogorov complexity priors will turn out to be the correct answer, but I'm far from convinced (for a number of reasons that go beyond the scope of this discussion). Regardless, though, Solomonoff induction surely gives you the best model you can get given the prior.
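To make the "Bayesian updating with a complexity prior" structure concrete, here's a toy sketch in Python. It is emphatically not Solomonoff induction (which is uncomputable); the hypothesis class, the description lengths, and all the names are made up for illustration. Each hypothesis gets prior weight 2^-length and is updated by ordinary Bayes on each observed bit.

```python
# Toy illustration only: Bayesian updating over a tiny hypothesis class
# with a 2^-length "complexity" prior, mimicking the *shape* of the
# Kolmogorov prior. Real Solomonoff induction ranges over all programs.

# Each hypothesis is (description_length_in_bits, predict_fn), where
# predict_fn(history) returns P(next bit = 1). All lengths are invented.
hypotheses = [
    (2, lambda h: 1.0),                          # "always 1"  (short program)
    (2, lambda h: 0.0),                          # "always 0"  (short program)
    (5, lambda h: 0.5),                          # "fair coin" (longer program)
    (8, lambda h: float(h[-1]) if h else 0.5),   # "repeat last bit"
]

# Prior weight 2^-length.
weights = [2.0 ** -length for length, _ in hypotheses]

def update(history, bit):
    """Bayes update: scale each weight by P(bit | history) under that hypothesis."""
    global weights
    for i, (_, predict) in enumerate(hypotheses):
        p1 = predict(history)
        weights[i] *= p1 if bit == 1 else (1.0 - p1)
    total = sum(weights)
    weights = [w / total for w in weights]

history = []
for bit in [1, 1, 1, 1]:
    update(history, bit)
    history.append(bit)

print(max(range(len(weights)), key=lambda i: weights[i]))  # 0
```

After four 1-bits, the posterior concentrates on the shortest hypothesis consistent with the data ("always 1"), which is exactly the flavor of behavior the Kolmogorov prior is meant to buy you. Whether that's the *right* prior is the open question.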
(In fact, the argument with AlexMennen is an argument about whether AIXItl's prior stating that it is absolutely impossible that the universe is a Turing machine with length > l is bad enough to hurt AIXItl. I won't hop into that argument today, but instead I will note that this line of argument does not seem like a good way to approach the question of which prior we should use.)
I'm not trying to condemn Solomonoff induction in this post. I'm trying to illustrate the fact that even if you could build an AIXItl, it wouldn't be an ideal agent.
There's one obvious way to embed AIXItl into its environment (hook its output register to its motor output channel) that prevents it from self-modifying, which results in failures. There's another way to embed AIXItl into its environment (hook its program registers to its output channel) that requires you to do a lot more work before the variant becomes useful.
Is it possible to make an AIXItl variant useful in the latter case? Sure, probably, but this seems like a pretty backwards way to go about studying self-modification when we could just use a toy model that was designed to study this problem in the first place.
As an aside, I'm betting that we disagree less than you think. I spent some time carefully laying out my concerns in this post, and alluding to other concerns that I didn't have time to cover (e.g., that the Legg-Hutter intelligence metric fails to capture some aspects of intelligence that I find important), in an attempt to make my position very clear. From your own response, it sounds like you largely agree with my concerns.
And yet, you still put very different words about very different concerns into my mouth when arguing with other people against positions that you believed I held.
I find this somewhat frustrating, and while I'm sure it was an honest mistake, I hope that you will be a bit more careful in the future.
Yup.
When I was talking about positions I believe you have held (and may currently still hold?), I was referring to your words in a previous post:
I appreciate the way you've stated this concern. Comity!
Yeah, I stand by that quote. And yet, when I made my concerns more explicit:
You said you agree, but then still made claims like this:
It sounds like you retained an incorrect interpretation of my words, even after I tried to make them clear in the above post and previous comment. If you still feel that the intended interpretation is unclear, please let me know and I'll clarify further.
The text you've quoted in the parent doesn't seem to have anything to do with my point. I'm talking about plain vanilla AIXI/AIXItl. I've got nothing to say about self-modifying agents.
Let's take a particular example you gave:
Let's consider an AIXI with a Solomonoff induction unit that's already been trained to understand physics to the level that we understand it in an outside-the-universe way. It starts receiving bits and rapidly (or maybe slowly, depends on the reward stream, who cares) learns that its input stream is consistent with EM radiation bouncing off of nearby objects. Conveniently, there is a mirror nearby...
Solomonoff induction will generate confabulations about the Solomonoff induction unit of the agent, but all the other parts of the agent run on computable physics, e.g., the CCD camera that generates the input stream, the actuators that mediate the effect of the output voltage. Time to hack the input registers to max out the reward stream!
Plain vanilla AIXI/AIXItl doesn't have a reward register. It has a reward channel. (It doesn't save its rewards anywhere, it only acts to maximize the amount of reward signal on the input channel.)
I agree that a vanilla AIXI would abuse EM radiation to flip bits on its physical input channel to get higher rewards.
AIXItl might be able to realize that the contents of its RAM correlate with computations done by its Solomonoff inductor, but it won't believe that changing the RAM will change the results of induction, and it wouldn't pay a penny to prevent a cosmic ray from interfering with the inductor's code.
From AIXI's perspective, the code may be following along with the induction, but it isn't actually doing the induction, and (AIXI thinks) wiping the code isn't a big deal, because (AIXI thinks) it is a given that AIXI will act like AIXI in the future.
Now you could protest that AIXI will eventually learn to stop letting cosmic rays flip its bits because (by some miraculous coincidence) all such bit-flips result in lower expected rewards, and so it will learn to prevent them even while believing that the RAM doesn't implement the induction.
And when I point out that this isn't the case in all situations, you can call foul on games where it isn't the case.
But both of these objections are silly; it should be obvious that an AIXI in such a situation is non-optimal, and I'm still having trouble understanding why you think that AIXI is optimal under violations of ergodicity.
And then I quote V_V, which is how you know that this conversation is getting really surreal:
Yeah, I changed that while your reply was in progress.
More to come later...
ETA: Later is now!
I don't think that AIXI is optimal under violations of ergodicity; I'm not talking about the optimality of AIXI at all. I'm talking about whether or not the Solomonoff induction part is capable of prompting AIXI to preserve itself.
I'm going to try to taboo "AIXI believes" and "AIXI thinks". In hypothetical reality, the physically instantiated AIXI agent is a motherboard with sensors and actuators that are connected to the input and output pins, respectively, of a box labelled "Solomonoff Magic". This agent is in a room.

Somewhere in the space of all possible programs there are two programs. The first is just the maximally compressed version of the second, i.e., the first and the second give the same outputs on all possible inputs. The second one is written in Java, with a front-end interpreter that translates the Java program into the native language of the Solomonoff unit. (Java plus a prefix-free coding, blar blar blar.) This program contains a human-readable physics simulation and an observation prediction routine. The initial conditions of the physics simulation match hypothetical reality except that the innards of the CPU are replaced by a computable approximation, including things like waste heat and whatnot.

The simulation uses the input to determine the part of the initial conditions that specifies simulated-AIXI's output voltages... ah! ah! ah! Found the Cartesian boundary! No matter how faithful the physics simulation is, AIXI only ever asks for one time-step at a time, so although the simulation's state propagates to simulated-AIXI's input voltages, it doesn't propagate all the way through to the output voltage.
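The Cartesian boundary I'm pointing at can be caricatured in a few lines of Python. All the names and the trivial "physics" here are my own invention, not anything from the AIXI literature; the point is purely the control flow. The hypothesis program propagates physics up to the agent's input wire, but the output voltage is injected from outside by the action choice, so no simulated event (a laser cutter, a cosmic ray) can ever sever the input-to-output link, because that link was never inside the model.

```python
# Caricature of the Cartesian boundary described above. The "physics" is
# a stand-in: the world just echoes the last output voltage it saw, and
# carries a flag for whether the agent's hardware is still intact.

def apply_physics(state, output_voltage):
    # Toy physics: propagate the agent's previous output into the world.
    return {"last_output": output_voltage,
            "hardware_intact": state["hardware_intact"]}

def read_input_wire(state):
    # Input wire reflects the simulated world.
    return state["last_output"]

def step_environment(state, agent_output_voltage):
    """One time-step of the hypothesis program that AIXI consults."""
    state = apply_physics(state, agent_output_voltage)
    return state, read_input_wire(state)

def run(agent_choose_action, state, horizon):
    percept = read_input_wire(state)
    for _ in range(horizon):
        # The output voltage comes from the expectimax action choice,
        # NOT from simulating the agent's own circuitry. Even when
        # `state` says the hardware was cut in half, this line still runs.
        action = agent_choose_action(percept)
        state, percept = step_environment(state, action)
    return state

state = {"last_output": 0, "hardware_intact": False}  # laser already fired
agent = lambda percept: 1                             # stands in for expectimax
final = run(agent, state, horizon=3)
print(final["last_output"])  # 1: the output wire still moved
```

Even with `hardware_intact` false from the start, the simulated output wire keeps tracking the action choice, because the agent's own computation sits outside the universe-model.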
Thank you for your patience, Nate. The outside view wins again.
Can you please expand?
Actually, I find myself in a state of uncertainty as a result of doing a close reading of section 2.6 of the Gentle Introduction to AIXI in light of your comment here. You quoted Paul Christiano as saying
EY, Nate, Rob, and various commenters here (including myself until recently) all seemed to take this as given. For instance, above I wrote:
On this "program-that-takes-action-choice-as-an-input" view (perhaps inspired by a picture like that on page 7 of the Gentle Introduction and surrounding text), a simulated event like, say, a laser cutter slicing AIXI's (sim-)physical instantiation in half, could sever the (sim-)causal connection from (sim-)AIXI's input wire to its output wire, and this event would not change the fact that the simulation specifies the voltage on the output wire from the expectimax action choice.
Your claim, if I understand you correctly, is that the AIXI formalism does not actually express this kind of back-and-forth state swapping. Rather, for any given universe-modeling program, it simulates forward from the specification of the (sim-)input wire voltage (or does something computationally equivalent), not from a specification of the (sim-)output wire voltage. There is some universe-model which simulates a computable approximation of all of (sim-)AIXI's physical state changes; once the end state has been specified, real-AIXI gives zero weight to all branches of the expectimax tree that do not have an action that matches the state of (sim-)AIXI's output wire.
Do I have that about right?
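If I've got that reading right, the branch-pruning step might be caricatured like so (purely illustrative names and structure, not the Gentle Introduction's notation): the universe-model predicts the state of the (sim-)output wire, and expectimax branches whose action disagrees with that prediction get zero weight.

```python
# Sketch of the reading described above, under my possibly-wrong
# interpretation: branches are candidate (action, value) pairs, and the
# universe-model's predicted output-wire state filters them.

def prune(branches, predicted_output):
    """Keep only branches whose action matches the simulated output wire."""
    return [b for b in branches if b["action"] == predicted_output]

branches = [{"action": 0, "value": 2.0},
            {"action": 1, "value": 5.0}]

# Suppose this universe-model says the (sim-)output wire will read 1:
surviving = prune(branches, predicted_output=1)
print([b["action"] for b in surviving])  # [1]
```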