I structure this post as a critique of some recent papers on the philosophy of mind as applied to LLMs: specifically, on whether we can say that LLMs think, reason, understand language, refer to the real world when producing language, have goals and intents, and so on. I also use this discussion as a springboard to express some of my views on the ontology of intelligence, agency, and alignment.
* Mahowald, Ivanova, et al., “Dissociating language and thought in large language models: a cognitive perspective” (Jan 2023). Note that this is a broad review paper, synthesising findings from computational linguistics, cognitive science, and neuroscience, as well as offering an engineering vision (perspective) on building an AGI (primarily in section 5). I don’t argue with these aspects of the paper’s content (although I have some disagreements with their engineering perspective, I think that engaging in that disagreement would be infohazardous). I argue with the philosophical content of the paper, which is revealed in the language the authors use and the conclusions they draw, as well as the ontology of linguistic competencies that the authors propose.
* Shanahan, “Talking About Large Language Models” (Dec 2022).
Dissociating language and thought in large language models: a cognitive perspective
In this section, I briefly summarise the gist of the paper by Mahowald, Ivanova, et al. for the reader’s convenience.
Abstract:
> Today’s large language models (LLMs) routinely generate coherent, grammatical and seemingly meaningful paragraphs of text. This achievement has led to speculation that these networks are—or will soon become—“thinking machines”, capable of performing tasks that require abstract knowledge and reasoning. Here, we review the capabilities of LLMs by considering their performance on two different aspects of language use: ‘formal linguistic competence’, which includes knowledge of rules and patterns of a given language, and ‘functional lingu