Summary: The problem of AI has turned out to be a lot harder than was originally thought. One hypothesis is that the obstacle is not a shortcoming of mathematics or theory, but limitations in the philosophy of science. This article is a preview of a series of posts that will describe how, by making a minor revision in our understanding of the scientific method, further progress can be achieved by establishing AI as an empirical science.
The field of artificial intelligence has been around for more than fifty years. If one takes an optimistic view of things, it's possible to believe that a lot of progress has been made. A chess program defeated the top-ranked human grandmaster. Robotic cars drove autonomously across 132 miles of the Mojave Desert. And Google seems to have made great strides in machine translation, apparently by feeding massive quantities of data to a statistical learning algorithm.
But even as the field has advanced, the horizon has seemed to recede. In some sense the field's successes make its failures all the more conspicuous. The best chess programs are better than any human, but Go is still challenging for computers. Robotic cars can drive across the desert, but they're not ready to share the road with human drivers. And Google is pretty good at translating Spanish to English, but still produces howlers when translating Japanese to English. The failures indicate that, instead of being threads in a majestic general theory, the successes were just narrow, isolated solutions to problems that turned out to be easier than they originally appeared.
So what went wrong, and how can the field move forward? Most mainstream AI researchers are reluctant to give clear answers to this question, so instead one must read between the lines in the literature. Every new paper in AI implicitly suggests that the research subfield of which it is a part will, if vigorously pursued, lead to dramatic progress toward intelligence. People who study reinforcement learning think the answer is to develop better versions of algorithms like Q-learning and temporal difference (TD) learning. The researchers behind the IBM Blue Brain project think the answer is to conduct massive neural simulations. For some roboticists, the answer involves the idea of embodiment: since the purpose of the brain is to control the body, to understand intelligence one should build robots, put them in the real world, watch how they behave, notice the problems they encounter, and then try to solve those problems. Practitioners of computer vision believe that since the visual cortex takes up such a huge fraction of total brain volume, the best way to understand general intelligence is to first study vision.
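(For readers who haven't encountered the reinforcement learning algorithms mentioned above, the sketch below shows roughly what tabular Q-learning looks like. The environment interface and the hyperparameter values are illustrative assumptions of mine, not part of any particular library.)

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning sketch. `env` is assumed (hypothetically) to expose
    reset() -> state, step(action) -> (next_state, reward, done), and a list
    of discrete actions in env.actions."""
    Q = defaultdict(float)  # (state, action) -> estimated long-run reward

    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Epsilon-greedy exploration: usually act greedily, occasionally at random.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])

            next_state, reward, done = env.step(action)

            # The Q-learning update: nudge the estimate toward the bootstrapped target.
            best_next = max(Q[(next_state, a)] for a in env.actions)
            target = reward + gamma * best_next * (not done)
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state = next_state
    return Q
```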
Now, I have some sympathy for the views mentioned above. If I had been thinking seriously about AI in the 80s, I would probably have gotten excited about the idea of reinforcement learning. But reinforcement learning is now basically an old idea, as is embodiment (this tradition can be traced back to the seminal papers by Rodney Brooks in the early 90s), and computer vision is almost as old as AI itself. If these avenues really led to some kind of amazing result, it would probably have been found by now.
So, dissatisfied with the ideas of my predecessors, I've taken some trouble to develop my own hypothesis regarding the question of how to move forward. And desperate times call for desperate measures: the long failure of AI to live up to its promises suggests that the obstacle is no small thing that can be solved merely by writing down a new algorithm or theorem. What I propose is nothing less than a complete reexamination of our answers to fundamental philosophical questions. What is a scientific theory? What is the real meaning of the scientific method (and why did it take so long for people to figure out the part about empirical verification)? How do we separate science from pseudoscience? What is Ockham's Razor really telling us? Why does physics work so amazingly, terrifyingly well, while fields like economics and nutrition stumble?
Now, my answers to these fundamental questions aren't going to be radical. It all adds up to normality. No one who is up-to-date on topics like information theory, machine learning, and Bayesian statistics will be shocked by what I have to say here. But my answers are slightly different from the traditional ones. And by starting from a slightly different philosophical origin, and following the logical path as it opened up in front of me, I've reached a clearing in the conceptual woods that is bright, beautiful, and silent.
Without getting too far ahead of myself, let me give you a bit of a preview of the ideas I'm going to discuss. One highly relevant issue is the role that other, more mature fields have had in shaping modern AI. One obvious influence comes from computer science, since presumably AI will eventually be built using software. But that fact seems incidental to me, and so the influence of computer science on AI looks like a disastrous historical accident. To suggest that computer science should be an important influence on AI is a bit like suggesting that woodworking should be an important influence on music, since most musical instruments are made out of wood. Another influence, which should in principle be healthy but in practice isn't, comes from physics. Unfortunately, for the most part, AI researchers have imitated only the superficial appearance of physics - its use of sophisticated mathematics - while ignoring its essential trait, which is its obsession with reality. In my view, AI can and must become a hard, empirical science, in which researchers propose, test, refine, and often discard theories of empirical reality. But theories of AI will not work like theories of physics. We'll see that AI can be considered, in some sense, the epistemological converse of physics. Physics works by using complex deductive reasoning (calculus, differential equations, group theory, etc.) built on top of a minimalist inductive framework (the physical laws). Human intelligence, in contrast, rests on a complex inductive foundation, supplemented by minor deductive operations. In many ways, AI will come to resemble disciplines like botany, zoology, and cartography - fields in which the researchers' core methodological impulse is to go out into the world and write down what they see.
An important aspect of my proposal will be to expand the definitions of the words "scientific theory" and "scientific method". A scientific theory, to me, is a computational tool that can be used to produce reliable predictions, and a scientific method is a process of obtaining good scientific theories. Botany and zoology make reliable predictions, so they must have scientific theories. In contrast to physics, however, they depend far less on the use of controlled experiments. The analogy to human learning is strong: humans achieve the ability to make reliable predictions without conducting controlled experiments. Typically, though, experimental sciences are considered to be far harder, more rigorous, and more quantitative than observational sciences. But I will propose a generalized version of the scientific method, which includes human learning as a special case, and shows how to make observational sciences just as hard, rigorous, and quantitative as physics.
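To make that definition a little more concrete, here is a rough sketch of what I mean by a theory-as-computational-tool: a program that assigns probabilities to observations, scored by how well it predicts data it hasn't seen. The interface and the log-loss score below are just one illustrative choice on my part, not a finished proposal.

```python
import math
from typing import Protocol, Sequence

class Theory(Protocol):
    """A scientific theory viewed as a computational tool: given the observations
    so far, it assigns a probability to a candidate next observation."""
    def predict(self, history: Sequence[str], outcome: str) -> float:
        ...

def average_log_loss(theory: Theory, observations: Sequence[str]) -> float:
    """Score a theory by how reliably it predicts a stream of observations.
    Lower is better; no controlled experiment is required, only data."""
    total = 0.0
    for t, outcome in enumerate(observations):
        p = theory.predict(observations[:t], outcome)
        total += -math.log(max(p, 1e-12))  # confident wrong predictions are penalized heavily
    return total / max(len(observations), 1)
```

On this view, a botanist's field guide and a physicist's equations are judged by the same yardstick: how reliably they predict what will be observed.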
As a result of learning, humans achieve the ability to make fairly good predictions about some types of phenomena. It seems clear that a major component of that predictive power is the ability to transform raw sensory data into abstract perceptions. Photons fall on my eyes in a certain pattern that I recognize as a doorknob, allowing me to predict that if I turn the knob, the door will open. So humans are amazingly talented at perception, and modestly good at prediction. Are there any other ingredients necessary for intelligence? My answer is: not really. In particular, in my view humans are terrible at planning. Our decision-making algorithm is not much more than: invent a plan, try to predict what will happen based on that plan, and if the prediction seems good, implement the plan. All the "magic" really comes from the ability to make accurate predictions. So a major difference between my approach and traditional AI is that the emphasis is on prediction through learning and perception, rather than on planning through logic and deduction.
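To be explicit about how little I think the planning component contributes, here is a sketch of that decision procedure. The names `propose_plan`, `predict_outcome`, and `utility` are placeholders of mine; the hard part lives entirely inside `predict_outcome`.

```python
def decide(world_state, propose_plan, predict_outcome, utility, n_candidates=10):
    """Sketch of the simple decision procedure described above: invent candidate
    plans, predict what each would lead to, and keep the one whose predicted
    outcome looks best. The loop itself is trivial; everything interesting is
    hidden inside the predictive model."""
    best_plan, best_value = None, float("-inf")
    for _ in range(n_candidates):
        plan = propose_plan(world_state)                 # invent a plan
        predicted = predict_outcome(world_state, plan)   # predict what will happen
        value = utility(predicted)                       # does the prediction seem good?
        if value > best_value:
            best_plan, best_value = plan, value
    return best_plan                                     # implement the best plan found
```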
As a final point, I want to note that my proposal is not analogous to or in conflict with theories of brain function like deep belief networks, neural Darwinism, symbol systems, or hierarchical temporal memories. My proposal is like an interface: it specifies the input and the output, but not the implementation. It embodies an immense and multifaceted Question, to which I have no real answer. But, crucially, the Question comes with a rigorous evaluation procedure that allows one to compare candidate answers. Finding those answers will be an awesome challenge, and I hope I can convince some of you to work with me on that challenge.
I am going to post an outline of my proposal over the next couple of weeks. I expect most of you will disagree with most of it, but I hope we can at least identify concretely the points at which our views diverge. I am very interested in feedback and criticism, both on material issues (since we reason to argue) and on issues of style and presentation. The ideas are not fundamentally difficult; if you can't understand what I'm saying, I will accept at least three quarters of the blame.
Maybe you should have mentioned the earlier discussion of your idea on the open thread, in which I believe I spotted some critical problems with where you're going: you seem to be endorsing a sort of "blank slate" model, in which humans have a really good reasoning engine and the stimuli humans receive after birth are sufficient to make all the right inferences.
However, all the experimental evidence (cf. Pinker's The Blank Slate) tells us that humans make a significantly smaller set of inferences from our sense data than are logically possible under the constraint of Occam's razor: there are grammatical errors that children never make in any language; there are expectations that all babies have at the same time, even though none of them has gathered enough postnatal sense data to justify such inferences; and so on.
I conclude that it is fruitless to attempt to find "general intelligence" by looking for the general algorithm that would make the inferences humans make, given only postnatal stimuli. My alternative suggestion is to treat human intelligence as a combination of general reasoning and pre-encoded, environment-specific knowledge that humans do not have to relearn from scratch after birth, because the brain's wiring-up in the womb already filters out inference patterns that don't win.
That knowledge can come from the "accumulated wisdom" of evolutionary history, which means you need to account for how that data was transformed into a human's present internal model.
ETA: Wow, I was sloppy when I wrote this; hope the point was able to shine through. Typos and missing words corrected. Should make more sense now.
The reason I didn't link to that discussion is that it was kind of tangential to what will be my main points. My goal is to understand the natural setting of the learning problem, not the specifics of how humans solve it.