Cross-posting this from my blog, since it seems relevant.
The case for GPT understanding language, by way of understanding the world
There's a debate going on about whether or not language models similar to ChatGPT have the potential to be scaled up to something truly transformative. There's a group of mostly cognitive linguists (e.g. Gary Marcus ) that hold that ChatGPT does not understand language, it merely models what word is likely to follow the preceding- and this is importantly different from true language understanding. They see this as an "original sin" of language models which means there are limits to how good language models can get.
Freddie de Boer says much the same thing:
You could say that ChatGPT has passed Winograd’s test with flying colors. And for many practical purposes you can leave it there. But it’s really important that we all understand that ChatGPT is not basing its coindexing on a theory of the world, on a set of understandings about the understandings and the ability to reason from those principles to a given conclusion. There is no place where a theory of the world “resides” for ChatGPT, the way our brains contain theories of the world. ChatGPT’s output is fundamentally a matter of association - an impossibly complicated matrix of associations, true, but more like Google Translate than like a language-using human. If you don’t trust me on this topic (and why would you), you can hear more about this from an expert on this recent podcast with Ezra Klein.
It is true that GPT works by predicting the next token of language. It is, in some sense, as Gary Marcus put it “a glorified spreadsheet” built for this purpose. However, I do not think that this is contradictory to the notion that it understands, even if imperfectly, both language and world. My view is that in a very large language corpus, there are patterns that correspond to the way things relate to each other in the world. As models of language become more sophisticated and predictively powerful, they necessarily become models not just of language, but of the data-generating process for a corpus of that language. That data-generating process for language includes a theory of the world. Given enough data (billions or even trillions of tokens) these models can be very accurate.
We might say:
If X models Y, and Y models Z,
and if each model is sufficiently faithful,
Then X is a model of Z.
Since GPT is a superb model of language, and language corpora are such a superb model of the world, it seems to me that GPT is probably a model of the world. Moreover, ChatGPT in particular is a model very capable of responding to questions about the reality it models. I think that it’s usually true that if I contain a detailed model of something, and I can use that model to solve difficult problems about the thing it models, then I understand that thing to at least some degree. I want to say then that GPT understands the world- or that if it doesn’t understand the world, the problem isn’t that its understanding of language is ungrounded in extralinguistic factors.
But it seems there's a sort of prejudice against learning by direct exposure to language, as opposed to other kinds of sensory stimulus. Learning language, without being exposed to other sensory modalities of the world, is seen as disconnected from reality, but that seems rather unfair- why think a string of words alone is insufficient to infer an underlying reality, but a field of color patches alone might be sufficient?
Consider a kind of naive empiricist view of learning, in which one starts with patches of color in a field (vision), and slowly infers an underlying universe of objects through their patterns of relations and co-occurrence. Why is this necessarily any different or more grounded than learning by exposure to a vast language corpus, wherein one also learns through gaining insight into the relations of words and their co-occurences?
Suppose that we trained a computer to get very good at predicting what visual experiences will follow after previous visual experiences. Imagine a different Freddie de Boer wrote this:
You could say that VisionPredictor has passed the test of predicting future patches of colour with flying colours. And for many practical purposes you can leave it there. But it’s really important that we all understand that VisionPredictor is not basing its predictions on a theory of the world, on a set of understandings about the understandings and the ability to reason from those principles to a given conclusion. There is no place where a theory of the world “resides” for VisionPredictor, the way our brains contain theories of the world. VisionPredictors output is fundamentally a matter of association - an impossibly complicated matrix of associations between patches of colour.
No one would buy this- they’d come to the more reasonable conclusion that VisionPredictor had constructed implicit theories about ideas like “chairs” and “trees” which are used to predict what the next frame of what it sees will look like. If something that looks like a chair is in one frame, they’ll probably be one in the next frame, looking the appropriate way given the apparent motion of the camera. Certainly, if VisionPredictor is vast- with billions of parameters- and complex in its structure- with dozens of successive layers, it will have enough room to store and implement such a theory. Moreover, a process of training via backpropagation and gradient descent will lead it- through a quasi-evolutionary process- to such a theory, as the only efficient way to predict the next visual frame.
So why do people have more trouble thinking that people could understand the world through pure vision than pure text? I think people's different treatment of these cases- vision and language- may be caused by a poverty of stimulus- overgeneralizing from cases in which we have only a small amount of text. It's true that if I just tell you that all qubos are shrimbos, and all shrimbos are tubis, you'll be left in the dark about all of these terms, but that intuition doesn't necessarily scale up into a situation in which you are learning across billions of instances of words and come to understand their vastly complex patterns of co-occurrence with such precision that you can predict the next word with great accuracy. Based on that vast matrix of words- words from copra being used to describe the world- you’d have enough data to construct a theory of what the underlying reality that generated those corpora looked like.
The fundamental pattern in language learning seems to me to be the same in the naïve empiricist story- there are bits [patches of color or words] that are inherently meaningless, but co-occur in certain ways, gradually from conjunctions we build up a theory of an underlying world behind the [patches of color or words]. Understanding built on a vast corpus of text is no different to understanding built on thousands of hours of visual experience.
True, vision feels like a more immediate connection with reality. But remember that this immediacy- this sense of directly seeing “tables” and “chairs” is mediated through numerous mechanisms- some evolved, some learned and others in between- constructing a theory of the arrangement of things from patches of color, using clues like movement, change, the difference in what the two eyes see, etc.
So there’s little reason to think a theory of the world couldn’t be learned from text in principle. Could GPT-3 in particular learn a model from the world? It seems to me there is no reason to think that GPT-3 couldn't contain, in its billions of parameters and scores of layers, a model of the world. You can model just about anything with neural nets, and certainly, the training data seems extensive enough. Moreover, without going into the technical details, my understanding is that if creating a model of the world to predict the next world is the only viable way of predicting the next word with sufficient accuracy, the process of backpropagation and gradient descent will drive GPT towards it.
So we have three reasons to think ChatGPT understands ‘the world behind the language’:
1. It's superb at behaving as if does.
2. In principle, it has enough free variables to represent a pretty good model of the world.
3. There's no reason a model of the world couldn't be learned from a very large corpus of text that represents the world, and the interrelations between words in that corpus, that correspond to patterns in the world. It’s possible in the same way that viewing nothing but a sequence of films (a sequence of arranged pixels) could be used to construct a theory of the world.
I ask ChatGPT
Me: Okay, I'm going to lay my cards on the table here. I'm a philosopher with an interest in language and AI. A common idea about language is that in order to understand language, an agent must be "grounded" in a certain sense- it must correlate words with experiences. Your existence- ChatGPT- has threatened many people's conceptions- because your training data is exclusively textual, and yet you seem to have a comprehension of the world in the operational sense that you can answer questions about it. There's a huge debate going on as to whether you can understand language as such. I'd like to ask if there was anything you wanted to add to that debate?
ChatGPT: As a language model, my understanding of language is based on the patterns and relationships that I have learned from the large amount of text data that I have been trained on. This text data includes a wide variety of information about the world, including facts, ideas, and experiences. While I do not have my own experiences or consciousness, I can use the patterns and relationships that I have learned to generate responses to questions and prompts that are coherent and informative.
It is true that my understanding of language is different from that of humans, and some people may question whether I can truly understand language in the same way that humans do. However, it is also important to recognize that my ability to generate responses to questions and prompts can be seen as a form of understanding in its own right. Additionally, my ability to generate responses to questions and prompts can be tested, and my performance can be evaluated and compared with that of humans.
It is also worth noting that the debate about AI and language understanding is ongoing and complex, and there are many different perspectives and opinions on this topic. My own capabilities are still limited and there are a lot of open question about the nature of intelligence and understanding, and it is an active area of research.
GPT can not "predict the next word with great accuracy" for arbitrary text, the way that a physics model can predict the path of a falling or orbiting object for arbitrary objects. For example, neither you nor any language model (including future language models, unless they have training data pertaining to this Lesswrong comment) can predict that the next word, or following sequence of words making up the rest of this paragraph, will be:
first, a sentence about what beer I drank yesterday and what I am doing right now - followed by some sentences explicitly making my point. The beer I had was Yuengling and right now I am waiting for my laundry to be done as I write this comment. It was not predictable that those would be the next words because the next sequence of words in any text is inherently highly underdetermined - if the only information you have is the prompt that starts the text. There is no ground truth, independent of what the person writing the text intends to communicate, about what the correct completion of a text prompt is supposed to be.
Well one thing to note is that actual learning (in humans at least) does not only involve getting data from vision, but also interacting with the world and getting information from multiple senses.
But the real reason I think the two are importantly different is that visual data about the world is closely tied to the way the world actually is - in a simple, straightforward way that does not require any prior knowledge about human minds (or any minds or other information processing systems) to interpret. For example, if I see what looks like a rock, and then walk a few steps and look back and see what looks like the other side of the rock, and then walk closer and it still looks like a rock, the most likely explanation for what I am seeing is that there is an actual rock there. And if I still have doubts, I can pick it up and see if it feels like a rock or drop it and see if it makes the sound a rock would make. The totality of the data pushes me towards a coherent "rock" concept and a world model that has rocks in it - as this is the simplest and most natural interpretation of the data.
By contrast, there is no reason to think that humans having the type of minds we have, living in our actual world, and using written language for the range of purposes we use it for is the simplest, or most likely, or most easily-converged-to explanation for why a large corpus of text exists.
From our point of view, we already know that humans exist and use language to communicate and as part of each human's internal thought process, and that large numbers of humans over many years wrote the documents that became GPT's training data.
But suppose you were something that didn't start out knowing (or having any evolved instinctive expectation) that humans exist, or that minds or computer programs or other data-generating processes exist, and you just received GPT's training data as a bunch of meaningless-at-first-glance tokens. There is no reason to think that building a model of humans and the world humans inhabit (as opposed to something like a markov model or a stochastic physical process or some other type of less-complicated-than-humans model) would be the simplest way to make sense of the patterns in that data.