Tuukka_Virtaperko comments on Welcome to Less Wrong! - Less Wrong
Comments (1953)
I don't really understand this part.
"The scanner does not understand the information but the person does" sounds like some variant of Searle's Chinese Room argument when presented without further qualifiers. People in AI tend to regard Searle as a confused distraction.
The intelligent agent model still deals with deterministic machines that take input and produce output, but it allows the agent's internal state to change. It does this by treating the output function as taking the entire input history X* as its argument when producing the latest output Y, so that a different history of inputs can lead to a different output on the latest input, just as it can with humans and more sophisticated machines.
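To make the "history in, latest output out" idea concrete, here is a minimal sketch. The `greeter` agent and the `run` helper are made up for illustration; the only point is that the same latest input can yield different outputs depending on what came before, without the function itself being anything but deterministic.

```python
from typing import Callable, List, Sequence

# An agent is modeled as a pure function from the whole input history
# (an element of X*) to the latest output (an element of Y).
Agent = Callable[[Sequence[str]], str]


def greeter(history: Sequence[str]) -> str:
    """Hypothetical agent: its reply to 'hello' depends on earlier inputs."""
    latest = history[-1]
    if latest != "hello":
        return "?"
    # Count how many earlier inputs were also "hello".
    earlier_hellos = sum(1 for x in history[:-1] if x == "hello")
    return "hi" if earlier_hellos == 0 else "hi again"


def run(agent: Agent, inputs: Sequence[str]) -> List[str]:
    """Feed inputs one at a time; the output at step t sees inputs[0..t]."""
    return [agent(inputs[: t + 1]) for t in range(len(inputs))]
```

Calling `run(greeter, ["hello", "hello"])` gives a different second reply than the first, even though the latest input is identical in both steps. The "internal state" is just whatever the function computes from the history.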
I suppose the idea here is that it makes some difference whether there is a human being sitting in the scanner or, say, a toy robot with a state of two bits, where one is "I am thinking about cats" and the other is "I am broken and will lie about thinking about cats". With the robot, when it disagrees with the scanner, we could just check the "broken" bit from the scan as well, and if it is set, conclude that the robot is broken.
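The toy robot can be written down in a few lines. This is only a sketch of the thought experiment, with invented names (`RobotState`, `robot_report`, `scan`, `diagnose`), and it assumes the scanner reads both bits perfectly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobotState:
    thinking_about_cats: bool
    broken: bool  # if set, the robot lies about thinking about cats


def robot_report(state: RobotState) -> bool:
    """What the robot says about itself: a broken robot flips the truth."""
    return state.thinking_about_cats != state.broken


def scan(state: RobotState) -> bool:
    """What the scanner reads directly from the 'cats' bit."""
    return state.thinking_about_cats


def diagnose(state: RobotState) -> str:
    """Resolve a disagreement by also reading the 'broken' bit."""
    if scan(state) == robot_report(state):
        return "agree"
    # In this two-bit model a disagreement only happens when 'broken' is
    # set, so the last branch is never reached; it is kept for clarity.
    return "robot is broken" if state.broken else "unexplained disagreement"
```

In this model the scanner never needs the robot's testimony to settle the question. The argument in the comment is that a scanner good enough to read "thinking about a cat" from a brain could, in principle, read the human analogue of the second bit too.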
I'm not seeing how humans must be fundamentally different. The scanner can already do the extremely difficult task of mapping a raw brain state to the act of thinking about a cat; it should also be able to tell from the brain state whether the person has something going on in their brain that will make them deny thinking about a cat. Things being deterministic and predictable from their initial state doesn't mean they can't have complex behavior that reacts to a long history of sensory inputs, accompanied by a large amount of internal processing that might correspond quite well to what we think of as reflection or understanding.
Sorry I keep skipping over your formalism stuff, but I'm still not really grasping the underlying assumptions behind this approach. (The underlying assumptions in the computer science approach are, roughly, "the physical world exists, and is made of lots of interacting, simple, Turing-computable stuff and nothing else", "animals and humans are just clever robots made of the stuff", "magical souls aren't involved, not even if they wear a paper bag that says 'conscious experience' on their head".)
The whole philosophical theory of everything thing does remind me of this strange thing from a year ago, where the building blocks for the theory were made out of nowadays more fashionable category theory rather than set theory though.
I've read some of this Universal Induction article. It seems to operate from flawed premises.
Suppose the brain uses algorithms. An uncontroversial supposition. From a computational point of view, the former citation is like saying: "In order for a computer to not run a program, such as Indiana Jones and the Fate of Atlantis, the computer must be executing some command to the effect of DoNotExecuteProgram('IndianaJonesAndTheFateOfAtlantis')."
That's not how computers operate. They just don't run the program. They don't need a special process for not running the program. Instead, not running the program is "implicitly contained" in the state of affairs that the computer is not running it. But this notion of implicit containment makes no sense for the computer. There are infinitely many programs the computer is not running at a given moment, so it can't process the state of affairs that it is not running any of them.
Likewise, the use of an implicit bias towards simplicity cannot be meaningfully conceptualized by humans. In order to know how this bias simplifies everything, one would have to know what information regarding "everything" is omitted by the bias. But if we knew that, the bias would not exist in the sense the author intends it to exist.
Furthermore:
The author says that there are variations of the no free lunch theorem for particular contexts. But he goes on to generalize that the notion of a no free lunch theorem means something independent of context. What could that possibly be? Also, such notions as "arbitrary complexity" or "randomness" seem intuitively meaningful, but what is their context?
The problem is, if there is no context, the solution cannot be proven to address the problem of induction. But if there is a context, it addresses the problem of induction only within that context. Then philosophers will say that the context was arbitrary, and formulate the problem again in another context where previous results will not apply.
In a way, this makes the problem of induction seem like a waste of time. But the real problem is about formalizing the notion of context in such a way that it becomes possible to identify ambiguous assumptions about context. That would be what separates scientific thought from poetry. In science, ambiguity is not desired and should therefore be identified. But philosophers tend to place little emphasis on this, and rather spend time dwelling on problems they should, in my opinion, recognize as unsolvable due to ambiguity of context.
The omitted information in this approach is information with a high Kolmogorov complexity, which is omitted in favor of information with low Kolmogorov complexity. A very rough analogy would be to describe humans as having a bias towards ideas expressible in few words of English over ideas that need many words of English to express. Using Kolmogorov complexity for sequence prediction, instead of the English language for ideas, gets rid of the very many problems of rigor involved in the latter, but the basic idea is pretty much the same. You look into things that are briefly expressible in favor of things that must be expressed at length. The information isn't permanently omitted, it's just deprioritized. The algorithm doesn't start looking at the stuff you need long sentences to describe before it has convinced itself that there are no short sentences that describe the observations it wants to explain in a satisfactory way.
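Here is a toy sketch of the "short descriptions first" search. It is not Solomonoff induction, which is uncomputable; it assumes a deliberately tiny hypothesis language in which a hypothesis is just a bit pattern repeated forever, and the function names are made up for illustration. Longer patterns are only examined once every shorter one has failed to fit the observations.

```python
from itertools import product


def shortest_repeating_pattern(observed: str) -> str:
    """Return the shortest bit pattern whose repetition reproduces `observed`.

    Hypotheses are enumerated by increasing length, so a long pattern is
    only considered after all shorter ones have been ruled out.
    """
    if not observed:
        raise ValueError("need at least one observation")
    for length in range(1, len(observed) + 1):
        for bits in product("01", repeat=length):
            pattern = "".join(bits)
            if all(observed[i] == pattern[i % length]
                   for i in range(len(observed))):
                return pattern
    return observed  # not reached: `observed` itself always fits


def predict_next(observed: str) -> str:
    """Predict the next bit by continuing the shortest fitting pattern."""
    pattern = shortest_repeating_pattern(observed)
    return pattern[len(observed) % len(pattern)]
```

For the observations `"010101"` the search stops at the two-bit pattern `"01"` and never looks at longer candidates. Nothing is permanently excluded: feed it a sequence that breaks the short pattern and it moves on to longer ones.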
One bit of context that is assumed is that the surrounding universe is somewhat amenable to being Kolmogorov-compressed. That is, there are some recurring regularities that you can begin to discover. The term "lawful universe" sometimes thrown around in LW probably refers to something similar.
Solomonoff's universal induction would not work in a completely chaotic universe, where there are no regularities for Kolmogorov compression to latch onto. You'd also be unlikely to find any sort of native intelligent entities in such universes. I'm not sure if this means that the Solomonoff approach is philosophically untenable, but needing to have some discoverable regularities to begin with before discovering regularities with induction becomes possible doesn't strike me as that great a requirement.
If the problem of context is about exactly where you draw the data for the sequence which you will then try to predict with Solomonoff induction, then in a lawless universe you wouldn't be able to infer things no matter which simple instrumentation you picked. In a lawful universe you could pick all sorts of instruments (tracking the change of light over time, tracking temperature, or tracking the luminosity of the Moon, for simple examples) and you'd start getting Kolmogorov-compressible data in which the induction system could start finding repeating periods.
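A crude way to see the difference between "lawful" and "lawless" instrument readings is to run both through an ordinary compressor. The sketch below uses `zlib` as a rough, computable stand-in for Kolmogorov compression (an assumption of this example, since true Kolmogorov complexity is uncomputable), a made-up 24-step "light level" cycle as the lawful data, and seeded pseudorandom bytes as the lawless data.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (smaller = more regular)."""
    return len(zlib.compress(data, 9)) / len(data)


N = 10_000

# Hypothetical "lawful" instrument: a reading that repeats every 24 steps,
# like an hourly light-level measurement over many days.
lawful = bytes(t % 24 for t in range(N))

# Stand-in for a "lawless" instrument: pseudorandom bytes. These come from
# a seeded generator and are therefore lawful in principle, but zlib cannot
# find the regularity, which is all this illustration needs.
rng = random.Random(0)
chaotic = bytes(rng.getrandbits(8) for _ in range(N))

lawful_ratio = compression_ratio(lawful)
chaotic_ratio = compression_ratio(chaotic)
```

The periodic readings shrink to a small fraction of their size, while the random-looking ones barely shrink at all. That gap is the kind of regularity an induction system needs before it has anything to latch onto.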
The core thing "independent of context" in all this is that all the universal induction systems are reduced to basically taking a series of numbers as input, and trying to develop an efficient predictor for what the next number will be. The argument in the paper is that this construction is basically sufficient for all the interesting things an induction solution could do, and that all the various real-world cases where induction is needed can be basically reduced into such a system by describing the instrumentation which turns real-world input into a time series of numbers.
Okay. In this case, the article does seem to begin to make sense. Its connection to the problem of induction is perhaps rather thin. The idea of using low Kolmogorov complexity as justification for an inductive argument cannot be deduced as a theorem of something that's "surely true", whatever that might mean. And if it were taken as an axiom, philosophers would say: "That's not an axiom. That's the conclusion of an inductive argument you made! You are begging the question!"
However, it seems like advancements in computation theory have made people able to do at least remotely practical stuff in areas that bear resemblance to more inert philosophical ponderings. That's good, and this article might even be used as justification for my theory RP, given that the use of Kolmogorov complexity is accepted. I was not familiar with the concept of Kolmogorov complexity despite having heard of it a few times, but my intuitive goal was to minimize the theory's Kolmogorov complexity by removing arbitrary declarations and favoring symmetry.
I would say that there are many ways of solving the problem of induction. Whether a theory is a solution to the problem of induction depends on whether it covers the entire scope of the problem. I would say this article covers half of the scope. The rest is not covered, to my knowledge, by anyone other than Robert Pirsig and experts of Buddhism, but these writings are very difficult to approach analytically. Regrettably, I am still unable to publish the relativizability article, which is intended to succeed in the analytic approach.
In any case, even though the widely rejected "statistical relevance" and this "Kolmogorov complexity relevance" share the same flaw, if presented as an explanation of inductive justification, the approach is interesting. Perhaps, even, this paper should be titled: "A Formalization of Occam's Razor Principle". Because that's what it surely seems to be. And I think it's actually an achievement to formalize that principle - an achievement more than sufficient to justify the writing of the article.
Commenting on the article:
"When artificial intelligence researchers attempted to capture everyday statements of inference using classical logic they began to realize this was a difficult if not impossible task."
I hope nobody's doing this anymore. It's obviously impossible. "Everyday statements of inference", whatever that might mean, are not exclusively statements of first-order logic, because Russell's paradox is simple enough to be formulated by talking about barbers. The liar paradox is also expressible with simple, practical language.
Wait a second. Wikipedia already knows this stuff is a formalization of Occam's razor. One article seems to attribute the formalization of that principle to Solomonoff, another one to Hutter. In addition, Solomonoff induction, which is essential to both, is not computable. Ugh. So Hutter and Rathmanner actually have the nerve to begin that article by talking about the problem of induction, when the goal is obviously to introduce concepts of computation theory? And they are already familiar with Occam's razor, and aware of it having, at least probably, been formalized?
Okay then, but this doesn't solve the problem of induction. They have not even formalized the problem of induction in a way that accounts for the logical structure of inductive inference, and leaves room for various relevance operators to take place. Nobody else has done that either, though. I should get back to this later.