LLMs may capture key components of human agency

catubc

TLDR: LLM confabulation may be a significant feature, not a flaw, of how human memory works and how humans experience agency in the world: through narratives that are largely self-consistent but not guaranteed to match facts in the world.

Introduction

The recent explosion of Large Language Models (LLMs) such as GPT2 and GPT3 has shown that language models can achieve state-of-the-art results on a variety of natural language processing tasks, while also providing insights into how humans construct meaning. One of the most interesting results is that LLMs often engage in "confabulation". Confabulation is a well-known phenomenon (going back to Freud) and refers to situations where people create false memories or false explanations without the intent to deceive (also known as "honest lying"; Moscovitch 1989).

There are several lines of criticism arguing that GPT3 models are neither as useful nor as good a model of human language as is claimed (let alone true artificial general intelligence (AGI)). Here I argue that GPT3 confabulation may not be a weakness but a feature. In particular, I argue that confabulation may be a feature of memory and of the way the human mind works, and that even our sense of agency may be founded on this capacity to make up stories that are self-consistent.

Maybe human memory defaults to confabulation when information is missing or a self-consistent story needs to be told

Confabulation is often studied in subjects with memory disorders, dementia or psychiatric disorders, and the "vividness" with which subjects describe confabulated stories can be remarkable. Such stories can be internally consistent with one another and often contain details that seem impossible to know if they came from personal experience.

But confabulation is an extremely broad phenomenon occurring in both pathological and non-pathological conditions. It may not be a flaw of reasoning or a way to escape the social embarrassment of not knowing; it might be a fundamental way that human memory and language work. That is, memory is often imperfect (we forget things, and most of us don't have photographic recollection) and our minds fill in the gaps with information from other sources. There is evidence that healthy adults also engage in confabulation. A study by Moscovitch (1992) showed that when asked questions about their life experiences, subjects would often claim to have had experiences they could not possibly have had. Memory is essentially a "reconstructive" process in which we piece together parts into a coherent, self-consistent whole.

Memory distortion, rather than memory loss, occurs because remembering is often a reconstructive process. To convince oneself of this, one only has to try to remember yesterday's events and the order in which they occurred; or even, as sometimes happens, what day yesterday was. Damage to neural structures involved in the storage, retention, and automatic recovery of encoded information produces memory loss which in its most severe form is amnesia (...). Memory distortion, however, is no more a feature of the memory deficit of these patients than it is of the benign, and all too common, memory failure of normal people. ... Though flagrantly distorted and easily elicited, confabulations nonetheless share many characteristics with the type of memory distortions we all produce. Studying confabulation from a cognitive neuroscience perspective, of interest in its own right, may also contribute to our understanding of how memories are normally distorted. (Moscovitch, "Confabulation")

There is also "motivated" confabulation, e.g. one study showed that high school students who were more emotionally invested in a romantic partner were more likely to confabulate about that relationship; another study of college students found that subjects were more likely to confabulate when they were under pressure to perform well.

In sum, although confabulation is most vivid and apparent in pathological conditions, there is evidence to suggest that memory reconstruction even in healthy individuals requires varying degrees of confabulation, whether to fill in missing information, elevate social status or defend one's position.

Maybe we don't have free will - and confabulation is how we deal with it

Confabulation may play an even more striking role in our very sense of being agents in the world. There are two main psychological theories on why humans feel agency. The first theory is called the "comparator model" and posits that humans "feel a sense of agency" in the world by comparing an "efference copy" of motor commands with the incoming sensory information following an action. If the actual change in the state of the world (contained in the sensory state) matches the change predicted from the motor command, then a sense of agency is experienced. That is, we feel like we were the actors behind the change of state in the world if we could predict the change (at the neural state level) with high accuracy. This system can be gamed, and we can feel an illusory sense of agency by accident; for example, we can feel "agency when externally generated events incidentally matched their predictions" (Sato and Yasuda 2005). In the same way we can even experience vicarious agency as "control over the movements of others" in lab experiments where subjects view movement in the limbs of others carrying out “instructed” actions (Wegner et al 2004).
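To make the comparator idea concrete, here is a minimal toy sketch in Python. It is not taken from any of the cited papers: the linear `forward_model`, the `tolerance` threshold and all the numbers are illustrative assumptions. The point it tries to show is that the check only compares prediction with observation, and never asks who actually caused the event.

```python
def forward_model(motor_command: float) -> float:
    """Hypothetical forward model: maps an efference copy of the motor
    command to the sensory change it is expected to produce."""
    return 2.0 * motor_command  # assumed linear mapping, purely illustrative


def feels_agency(motor_command: float, observed_change: float,
                 tolerance: float = 0.1) -> bool:
    """Toy comparator: agency is 'felt' when the prediction error is small.

    Note that the true cause of observed_change is not an input at all.
    """
    predicted_change = forward_model(motor_command)
    prediction_error = abs(predicted_change - observed_change)
    return prediction_error <= tolerance


# The agent's own action produces the predicted change.
self_caused = feels_agency(motor_command=1.0, observed_change=2.0)

# An external event that does not match the prediction.
external_mismatch = feels_agency(motor_command=1.0, observed_change=5.0)

# An external event that happens to match the prediction: illusory agency.
external_coincidence = feels_agency(motor_command=1.0, observed_change=2.05)

print(self_caused, external_mismatch, external_coincidence)
```

In this sketch the third case is indistinguishable from the first as far as the comparator is concerned, which is the sense in which the system can be gamed.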

The second theory of agency is the theory of "apparent mental causation" (Wegner and Wheatley 1999) and is even more striking. The theory posits that agency is experienced from a "cognitive inference", i.e. an essentially conscious post facto evaluation of whether one’s thoughts might have caused one’s actions. This evaluation requires that thoughts are consistent with the actions (consistency); that thoughts occur before the actions (priority); and that no alternative causes or explanations are available (exclusivity). Basically, if we intended some event, the event happened and there's nobody else who could have caused it, we feel like "our thoughts" caused it. I say "our thoughts" because the theory of "apparent mental causation" basically states that our thoughts never cause anything; it is subconscious states that cause both the action and the thought. We experience the conscious thought and observe the action, but we have no access to or insight into where they come from, as by their nature they are caused by subconscious systems.
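The three conditions can be written down as a small toy check in Python. This is only a sketch under stated assumptions: the `Episode` fields, the exact-match test for consistency and the reading of priority as "the intention precedes the event" are my own illustrative choices, not definitions from Wegner and Wheatley.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """Hypothetical record of one intention/event pair (all fields are assumptions)."""
    intention: str            # what the person consciously thought of doing
    event: str                # what was then observed to happen
    intention_time: float     # when the conscious thought appeared
    event_time: float         # when the event occurred
    other_plausible_causes: int  # number of rival explanations noticed


def experiences_will(ep: Episode) -> bool:
    """Toy post facto inference: all three conditions must hold."""
    consistency = ep.intention == ep.event
    priority = ep.intention_time < ep.event_time
    exclusivity = ep.other_plausible_causes == 0
    return consistency and priority and exclusivity


# Intended it, it happened afterwards, nobody else around: feels willed.
willed = experiences_will(Episode("light on", "light on", 0.0, 1.0, 0))

# Same episode, but someone else was standing at the switch.
rival_cause = experiences_will(Episode("light on", "light on", 0.0, 1.0, 1))

# The thought only showed up after the event.
too_late = experiences_will(Episode("light on", "light on", 2.0, 1.0, 0))

print(willed, rival_cause, too_late)
```

Whether the thought actually caused the event is not an input to `experiences_will`; the inference runs entirely on how the episode appears after the fact.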

In sum, both of these theories require a post facto evaluation in which a narrative unfolds to support some set of facts. In other words, neither theory requires knowing, or matching, the de facto ground truth of what caused the event; each requires only a highly consistent story at either the internal (neuronal) or the cognitive (evaluative) level^[1]. Sound familiar?

Conclusion

Maybe LLM confabulation is just an exaggerated mode of otherwise normal human function: namely, of how human beings tell themselves stories about their memories and their role in the world, stories that only need to be internally consistent, not true reflections of the state of the world.

---

Moscovitch, M. (1989). Confabulation and the frontal systems: Strategic versus associative retrieval in neuropsychological theories of memory. In H. L. Roediger III & F. I. M. Craik (Eds.), Varieties of memory and consciousness: Essays in honour of Endel Tulving (pp. 133–160). Lawrence Erlbaum Associates, Inc.

Sato, A., & Yasuda, A. (2005). Illusion of sense of self-agency: Discrepancy between the predicted and actual sensory consequences of actions modulates the sense of self-agency, but not the sense of self-ownership. Cognition, 94, 241–255.

Wegner, D. M., Sparrow, B., & Winerman, L. (2004). Vicarious agency: Experiencing control over the movements of others. Journal of Personality and Social Psychology, 86, 838–848.

[1] There is also an integrator model of agency, which combines all the factors of the two main theories discussed here with other causes in a Bayesian evaluative process.
