Nebu comments on Reply to Holden on 'Tool AI' - Less Wrong
I believe AIXI is much more inspectable than you make it out to be. I think it is important to challenge your claim here, because Holden appears to have trusted your expertise and thereby conceded an important part of the argument.
AIXI's utility judgements are based on a Solomonoff prior, which is in turn based on the computer programs that reproduce the input data. Computer programs are not black boxes. A system implementing AIXI can easily also return a sample of typical expected future histories, along with the programs compressing those histories. By examining these programs, we can figure out what implicit model the AIXI system has of its world. The programs are optimized for shortness, so they are likely to be heavily obfuscated, but I don't expect them to be incomprehensible (after all, they're not optimized for incomprehensibility). Even just sampling expected histories, without their compressions, is likely to be very informative.

In the case of AIXItl, the situation is better in one sense: its output at any given time is guaranteed to be generated by a single subprogram of length < l, and this subprogram comes with a proof justifying its utility judgement. It is worse in another sense: there is no way to sample its expected future histories. However, I expect the proof provided would implicitly contain that information. If either the programs or the proofs cannot be understood by humans, the programmers can simply reject them and look at the next-best candidates.
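To make the sampling-and-inspecting idea concrete, here is a toy sketch (mine, not from the post): it replaces the incomputable space of all programs with repeating bit patterns while keeping the 2^-length prior, and shows that every hypothesis consistent with the data is an explicit, printable object whose predicted futures we can sample.

```python
import itertools
import random

# Toy illustration of the inspectability claim (NOT real AIXI): the model
# class is every repeating bit pattern up to MAX_LEN, standing in for "all
# computer programs", weighted by the Solomonoff-style prior 2^-length.
# Each surviving hypothesis is an explicit object we can print and inspect,
# and we can sample typical expected future histories from the mixture.

MAX_LEN = 8

def generates(pattern: str, history: str) -> bool:
    """Does repeating `pattern` reproduce the observed history?"""
    reps = len(history) // len(pattern) + 1
    return (pattern * reps)[:len(history)] == history

def posterior(history: str):
    """All patterns consistent with the history, with normalized 2^-length weights."""
    hyps = [("".join(bits), 2.0 ** -n)
            for n in range(1, MAX_LEN + 1)
            for bits in itertools.product("01", repeat=n)
            if generates("".join(bits), history)]
    total = sum(w for _, w in hyps)
    return [(p, w / total) for p, w in hyps]

def continuation(pattern: str, history: str, steps: int) -> str:
    """The next `steps` symbols this hypothesis predicts after the history."""
    reps = (len(history) + steps) // len(pattern) + 1
    return (pattern * reps)[len(history):len(history) + steps]

def sample_future(history: str, steps: int) -> str:
    """Sample a typical expected future history from the posterior mixture."""
    hyps = posterior(history)
    r, acc = random.random(), 0.0
    for pattern, weight in hyps:
        acc += weight
        if r <= acc:
            return continuation(pattern, history, steps)
    return continuation(hyps[-1][0], history, steps)  # guard against float rounding

history = "010101"
for pattern, weight in posterior(history):
    print(f"hypothesis {pattern!r}: posterior weight {weight:.3f}")
print("sampled future:", sample_future(history, 6))
```

Note that longer patterns agreeing with the data so far (e.g. "01010110") survive alongside "01", but with exponentially less weight; that is the same structure a real Solomonoff mixture would have.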
As for "What will be its effect on _?", this can be answered as well. I already stated that with AIXI you can sample future histories. This is because AIXI has a specific known prior it implements for its future histories, namely Solomonoff induction. This ability may seem limited because it only shows the future sensory data, but sensory data can be whatever you feed AIXI as input. If you want it to a have a realistic model of the world, this includes a lot of relevant information. For example, if you feed it the entire database of Wikipedia, it can give likely future versions of Wikipedia which already provides a lot of details on the effect of its actions.
Can you be a bit more specific in your interpretation of AIXI here?
Here are my assumptions; let me know where yours differ:

- AIXI proceeds in discrete time steps: at each step it receives a sensory input and produces an output, i.e. an action (e.g. a signal to a robot servo arm).
- AIXI has a pre-defined utility function, and at each step it chooses the action that maximizes its expected future utility.
- AIXI models its environment via Solomonoff induction, i.e. as a length-weighted mixture of programs that take the history of (S, A) pairs (sensory inputs and actions) and generate the next sensory input.
How does Tool-AIXI work in contrast to this? Holden seems to want to avoid having any pre-defined utility function at all. However, presumably Tool-AIXI still receives inputs and still produces outputs (Holden probably does not intend to allow Tool-AIXI to control a robot servo arm, but he might allow it to control an LCD monitor, or at the very least to produce some sort of text file as output).
Does Tool-AIXI proceed in discrete time steps, gathering input? Or do we prevent Tool-AIXI from running until a user is ready to submit a curated input? If the latter, how quickly do we expect Tool-AIXI to be able to formulate a reasonable model of our universe?
How does Tool-AIXI choose what output to produce, if there's no utility function?
If we type in "Tool-AIXI, please give me a cure for cancer" onto a keyboard attached to Tool-AIXI and submit that as an input, do we think that a model that encodes ASCII, the English language, bio-organisms, etc. has a lower Kolmogorov complexity than a model that says "we live in a universe where we receive exactly this hardcoded stream of bytes"?
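One crude way to build intuition here (my toy example, using zlib's compressed size as a computable stand-in for Kolmogorov complexity, which is itself incomputable): structured input admits a short description, while a patternless stream can only be "hardcoded" at roughly one byte per byte.

```python
import os
import zlib

# Compressed size is an upper bound on description length. English-like,
# repetitive input compresses dramatically; a patternless stream barely
# compresses, so the "hardcoded bytes" hypothesis pays for every byte.
structured = ("Tool-AIXI, please give me a cure for cancer. " * 70).encode("ascii")
hardcoded = os.urandom(len(structured))  # stands in for a patternless stream

for name, data in [("structured", structured), ("hardcoded", hardcoded)]:
    print(f"{name}: {len(data)} bytes raw -> {len(zlib.compress(data, 9))} bytes compressed")
```

The more input the system sees, the worse the hardcoded hypothesis fares: its description length grows linearly with the data, while a model that captures the regularities amortizes its up-front cost.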
Does Tool-AIXI model the output it produces (whether that be pixels on a screen, or bytes written to a file) as an action, or does it somehow prevent itself from modelling its output as an action that has some effect on the universe it exists in? If the former, then isn't this just an agenty Oracle AI? If the latter, then what kind of programs does it generate for its model (surely not programs that take (S, A) pairs as inputs, or else what would it use for A when evaluating its plans and predicting the future)?
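To pin down the dichotomy, here are the two model signatures at issue, written as hypothetical Python types (mine, just to make the question precise):

```python
from typing import Callable, Sequence

Percept = bytes  # whatever Tool-AIXI reads as input
Action = bytes   # whatever it writes to a screen or file

# Agent-style environment program: the next percept depends on the percept
# history AND the action history, so outputs are modelled as actions with
# effects on the world (the standard AIXI environment type).
EnvProgram = Callable[[Sequence[Percept], Sequence[Action]], Percept]

# Action-free predictor: the next percept depends on past percepts only.
# There is no A slot, so the system cannot even pose the question
# "what happens if I output X?"
SeqProgram = Callable[[Sequence[Percept]], Percept]
```

The first type is an Oracle-style agent; the second has nothing to plug its own candidate outputs into, which is exactly the gap the question points at.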