In this post, I make no attempt to give an exhaustive presentation of the countless unintended consequences of widespread LLM use; rather, I concentrate on three potential effects that sit at the edge of current research, are infrequently discussed, and appear to resist any foreseeable solution.

This post argues that while LLMs exhibit impressive capabilities in mimicking human language, their reliance on pattern recognition and replication may, among other societally destructive consequences:

  1. stifle genuine creativity and lead to a homogenization of writing styles, and consequently thinking styles, by inadvertently reinforcing dominant linguistic patterns while neglecting less common or marginalized forms of expression;
  2. eliminate opportunities for serendipitous discovery; and
  3. disrupt the intergenerational transfer of wisdom and knowledge.


As I argue in detail below, there is no reason to believe that these problems are easily mitigated. The sheer scale of LLM-derived content production, which is likely to dwarf human-generated linguistic output in the near future, poses a serious challenge to the preservation of lexical diversity. The rapid proliferation of AI-generated text could create a “linguistic monoculture”, where the nuanced and idiosyncratic expressions that characterize human language are drowned out by the algorithmic efficiency of LLMs.


LLMs Threaten Creativity in Writing (and Thinking)

LLMs are undoubtedly useful for content generation. These models, trained on vast amounts of data, can generate perfectly coherent and contextually relevant text in ways that ostensibly mimic human creativity. It is precisely this efficiency that will tilt the scales in favor of AI-generated content over time. Upon closer examination, there appear to be a number of insidious consequences infrequently discussed in this connection: the potential erosion of genuine creativity, linguistic diversity, and ultimately the richness of human expression, thinking, and experience that depend on our independent linguistic capacities.


LLMs operate on a principle of pattern recognition and replication. They are great at identifying and reproducing stylistic patterns, grammatical structures, and even thematic elements from their training data. This often creates an illusion of creativity, where the generated text appears novel and insightful. However, this "creativity" is merely a recombination of existing elements, utterly devoid of true originality or the spark of human imagination. It is a human, not a machine, who ultimately decides whether an idea counts as truly creative in any meaningful sense.

As writers and readers increasingly rely on LLMs for inspiration and content generation, there is a plausible risk that their own creative processes will become substantially stifled. The convenience and efficiency of these models are likely to discourage people from engaging in the deep thought, exploration, and experimentation that are essential for cultivating genuine creativity. Instead, they may become passive consumers of pre-packaged ideas and expressions, leading to a homogenization of writing styles and a decline in originality.


It’s important to remember that language is inherently dynamic and evolving. It is shaped by cultural influences, historical events, and individual expression grounded in human experience. It is characterized by rich vocabulary, idiomatic expressions, and sentence structures that reflect the diversity of human thought and experience. Precisely because of that, I argue, the increasing reliance on LLMs threatens to erode this linguistic diversity. For example, it is reasonable to expect a narrowing of vocabulary, the loss of unique phrases and idioms, and a decline in the use of certain sentence structures. This would occur partly because those forms are less “algorithmically efficient” or desirable, and partly due to a “bottleneck effect”: less frequent or specialized terms and structures are gradually marginalized and, given enough time, lost from the active vocabulary.

I am reasonably confident that the built-in tendency of LLMs to favor high-frequency lexical items, coupled with the self-reinforcing nature of language use, will further exacerbate this phenomenon. As a result, the written world, including academic papers, news articles, blog posts, and other forms of textual content, may become increasingly standardized and formulaic, notably lacking the vibrancy and nuance it once had.
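To make the claimed mechanism concrete, here is a minimal toy simulation in Python of the “bottleneck effect” described above. All parameters are assumptions chosen for illustration, not measurements of any real model: a long-tailed vocabulary distribution is repeatedly sharpened (mimicking a preference for high-frequency items) and then resampled (mimicking retraining on generated text), and rare word types progressively vanish from active use.

```python
# A minimal sketch of the hypothesized "bottleneck effect". All numbers
# (vocabulary size, corpus size, sharpening exponent) are illustrative
# assumptions, not properties of any actual LLM.
import collections
import random

random.seed(0)

VOCAB_SIZE = 1000      # toy vocabulary of word types
CORPUS_SIZE = 20_000   # tokens "generated" per generation
SHARPEN = 1.2          # assumed exponent > 1: favors already-frequent items
GENERATIONS = 10

# Start from a long-tailed (Zipf-like) "human" frequency distribution.
weights = [1.0 / rank for rank in range(1, VOCAB_SIZE + 1)]

for gen in range(GENERATIONS + 1):
    active = sum(1 for w in weights if w > 0)
    print(f"generation {gen:2d}: {active:4d} word types still in active use")

    # "Generate" a corpus by sampling from a sharpened distribution...
    sharpened = [w ** SHARPEN for w in weights]
    corpus = random.choices(range(VOCAB_SIZE), weights=sharpened, k=CORPUS_SIZE)

    # ...then "retrain": the next generation's frequencies are the counts
    # observed in the generated corpus, so unsampled types drop to zero.
    counts = collections.Counter(corpus)
    weights = [counts.get(i, 0) for i in range(VOCAB_SIZE)]
```

In this toy run, the tail of the vocabulary shrinks with every generation, even though nothing is ever deliberately removed; the loss emerges purely from frequency-weighted sampling feeding back into itself.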


More worryingly, LLMs perpetuate the biases and stereotypes clearly present in their training data, which only makes things worse: it further homogenizes language, limits the range of perspectives expressed in writing, and risks overlooking or misrepresenting unique voices and experiences.


One should remember that writing is not merely a means of communication; it is an act of self-expression and a way for individuals to share their thoughts, emotions, and experiences with the world. Each writer has (and some used to have) an utterly unique voice, shaped by their personal background, cultural identity, and individual perspective. This voice is (and in some cases, was, prior to the widespread use of LLMs) reflected in their choice of words, their use of language, the references they make, the examples they pick, and their overall style.

Unfortunately, however, there’s every reason to believe that the increasing reliance on LLMs may lead to a loss of this individual voice on a massive scale. Unique styles may become diluted or even erased. The generated text, while seemingly “plausible-sounding, syntactically correct and semantically meaningful”, [1] often lacks the personal touch, the emotional resonance, and the idiosyncrasies that make writing truly compelling and engaging.

One could counter that bias mitigation is possible, and that LLMs trained on a wide range of data sources reflecting diverse perspectives and experiences will productively address the problems presented in this article.


The problem with this line of thought, however, is that it doesn’t look far enough. The true issue lies not in mere representation, but in the inherent nature of LLMs as statistical pattern-recognition machines. Even with diverse training data, LLMs tend to favor high-frequency patterns and predictable outputs, which will inexorably lead to a gradual homogenization of language and a neglect of less common or nuanced expressions, regardless of the diversity of the data sources on which the models are trained. The pressure to optimize for fluency and coherence, even in highly bias-mitigated models, could unintentionally reinforce the use of generic and safe language, thus stifling creative exploration and critical thinking.
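One concrete, if simplified, way to see this “safe language” pressure: standard decoding heuristics such as nucleus (top-p) sampling literally assign zero probability to the long tail of the next-token distribution. The sketch below uses an assumed Zipf-shaped toy distribution; the counts are illustrative and say nothing about any particular model.

```python
# How nucleus (top-p) truncation discards the long tail of a toy
# next-token distribution. The Zipf shape and vocabulary size are
# assumptions for demonstration purposes.
import numpy as np

# Toy Zipf-shaped distribution over a 10,000-word vocabulary.
probs = 1.0 / np.arange(1, 10_001)
probs /= probs.sum()

def surviving_types(p: np.ndarray, top_p: float) -> int:
    """Count how many token types survive nucleus (top-p) truncation."""
    sorted_p = np.sort(p)[::-1]                    # most likely first
    cutoff = np.searchsorted(np.cumsum(sorted_p), top_p)
    return int(cutoff) + 1

print("vocabulary size:       ", probs.size)
print("survive top-p = 0.95:  ", surviving_types(probs, 0.95))
print("survive top-p = 0.90:  ", surviving_types(probs, 0.90))
print("survive top-p = 0.50:  ", surviving_types(probs, 0.50))
```

In this toy setting, a commonplace top-p of 0.9 already excludes the majority of the vocabulary from ever being produced; whatever is never produced is never reinforced downstream.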

The dynamic interplay between language and thought suggests that the impact of LLMs on vocabulary is not simply a matter of input and output. The very act of interacting with and heavily relying on LLMs for language processing could subtly shape our cognitive processes and linguistic habits. The ease and efficiency of LLM-generated text might discourage the active engagement and deep processing necessary for maintaining a rich and diverse vocabulary. Further, the allure of effortless communication could easily lead to a gradual atrophy of our lexical retrieval abilities as we outsource the cognitive effort of word choice and meaning-making to AI (i.e., “cognitive offloading”).

The Loss of Serendipity and Intergenerational Wisdom in the Age of LLMs

It is easy to imagine how overreliance on LLMs and their optimization algorithms could stifle serendipity and disrupt the intergenerational transfer of wisdom, both of which are essential elements of human knowledge acquisition and cultural evolution.
Serendipity

From an epistemological viewpoint, serendipity is defined as the accidental discovery of something valuable or insightful while searching for something else. It is a key driver of innovation, as it allows for the unexpected connections and cross-pollination of ideas that often lead to breakthroughs in science, art, and other fields.

It’s plausible to assume that LLMs, with their excessive focus on optimization and pattern recognition, will eliminate many of the conditions that foster natural serendipity. Their algorithms are fundamentally designed to deliver the most relevant and predictable results based on existing data and user preferences. This is likely to create a vast, potentially unrecognized filter bubble, where users are exposed only to information that aligns with their pre-existing beliefs and interests (i.e., massive confirmation bias).

Lack of exposure to diverse and, more specifically, unexpected information can limit the scope for serendipitous discovery. Fewer random encounters with novel ideas and perspectives mean fewer creative leaps and fewer innovative solutions to complex problems. Seeing the same regurgitated content framed in seemingly different, yet fundamentally similar, ways is neither intellectually fruitful nor conducive to productive discussions.

The precise mechanisms by which LLMs may curtail serendipitous discovery can be elucidated through the lens of their algorithmic architecture and training data. For instance, LLMs employed in recommendation systems or personalized search engines often prioritize content that aligns with a user's existing interests and preferences. As noted above, this is bound to create an informational echo chamber and limit exposure to the perspectives and unexpected connections that might otherwise spark novel ideas or insights. The algorithmic focus on relevance and engagement ends up filtering out the very elements of surprise and incongruity that often catalyze serendipitous discoveries. The critique of echo chambers applies to recommendation algorithms generally, but LLMs in particular amplify this effect through their rapid, high-volume content generation. This societal-level echo chamber can arise from biased training data, self-reinforcing feedback loops, and the homogenization of information.

Unlike regular recommendation algorithms, where content curation and filtering occur at a slower pace and at the individual level, the sheer volume of content generated by LLMs, and the speed at which they can produce it, means that the echo chamber effect can develop rapidly and be far more pervasive.
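The feedback loop itself is easy to sketch. The Python toy below uses an entirely assumed update rule and assumed parameters: content is served in proportion to a user's current interest profile, and the profile is then nudged toward what was served. Small random fluctuations get amplified, and the interest distribution tends to narrow over time (the exact trajectory is seed-dependent).

```python
# A toy personalization feedback loop: serve items according to the
# current interest profile, then update the profile toward what was
# served. The update rule and all parameters are assumptions.
import numpy as np

rng = np.random.default_rng(7)

TOPICS = 20
ROUNDS = 200
LEARNING_RATE = 0.3   # assumed: how strongly consumption reshapes interests
BATCH = 20            # items served per round

def entropy_bits(p: np.ndarray) -> float:
    """Shannon entropy of the interest profile, in bits."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Start from a uniform interest profile over all topics.
profile = np.ones(TOPICS) / TOPICS

for step in range(ROUNDS + 1):
    if step % 50 == 0:
        print(f"round {step:3d}: interest entropy = {entropy_bits(profile):.2f} bits")
    # Personalization: serve items in proportion to current interests...
    served = rng.choice(TOPICS, size=BATCH, p=profile)
    consumed = np.bincount(served, minlength=TOPICS) / BATCH
    # ...engagement feedback: nudge the profile toward what was consumed.
    profile = (1 - LEARNING_RATE) * profile + LEARNING_RATE * consumed
    profile /= profile.sum()
```

The post's concern, on this reading, is that LLM pipelines run this same loop faster and at far larger scale than classical recommenders, because the served "items" are themselves generated on demand rather than drawn from a fixed catalogue.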

The training data for LLMs is indeed incredibly vast and wide-ranging. However, it is inherently biased towards existing knowledge and established patterns. The result is a reinforcement of conventional wisdom and a reluctance to explore unconventional or less-trodden paths, in contrast to human creativity. The algorithmic tendency to generate predictable and plausible outputs, while useful in many applications, inadvertently stifles the exploration of the unknown and the generation of truly original ideas.

Importantly, the fluency and apparent comprehensiveness of LLM-generated responses create an illusion of omniscience and discourage users from seeking out alternative sources of information or engaging in independent exploration. The ease with which LLMs provide answers, even to complex or open-ended questions, fosters a passive and uncritical approach to knowledge acquisition and hinders the active pursuit of knowledge and the joy of discovery that often characterize the natural cognitive process.

The emphasis on data-driven and algorithmic approaches in LLM development inadvertently devalues the role of human intuition and tacit knowledge in the creative process. Serendipity often arises from the interplay between conscious reasoning and unconscious associations, a process that LLMs, given their reliance on explicit data and formal inference, may struggle to replicate. The prioritization of quantifiable metrics and algorithmic efficiency will likely lead to a neglect of the more elusive and intuitive aspects of human creativity.

One could further counter that LLMs can actually facilitate serendipity by generating novel combinations of existing ideas and concepts.

But this is easily refuted: while LLMs can generate novel combinations of existing elements, this "combinatorial creativity" is fundamentally different from the true serendipity that arises from unexpected encounters with unrelated or seemingly irrelevant information. The algorithmic nature of LLMs is indeed good at pattern recognition and recombination, but it may struggle to replicate the intuitive leaps and non-linear connections that often characterize genuine serendipitous discoveries in the real world.

Intergenerational Transfer of Wisdom

The intergenerational transfer of wisdom is a fundamental aspect of human civilization and, as will be argued below, could be profoundly shaped by the advent of LLMs and their increasing predominance. It is through the transmission of knowledge, values, and cultural practices from one generation to the next that societies maintain their identity, adapt to change, and ensure their continued survival. LLMs are known for their ability to generate vast amounts of information and automate tasks, but at the same time, they could be highly disruptive to cross-generational wisdom transfer.

The convenience and efficiency of these models are likely to lead to a decreased reliance on traditional forms of knowledge transmission, such as oral storytelling, apprenticeship, mentorship, and, as implied above, creative writing. The algorithms that power LLMs may not be equipped to capture the nuances and subtleties of human wisdom, which often lie in context-rich intuition, tacit knowledge, embodied practices, and emotional intelligence. With the passage of time, we can plausibly expect to lose valuable insights and perspectives that are not easily codified or digitized.

The LessWrong community rightly emphasizes the importance of rationality, which involves making decisions based on evidence and logical reasoning. In the same vein, rationality recognizes the value of uncertainty and the potential for unexpected discoveries. In this respect, LLMs are exceedingly, and dangerously, focused on algorithmic optimization, user-friendliness, and predictability, which I believe will contribute to a vastly narrower and more deterministic view of knowledge acquisition. Tacit knowledge, knowledge derived from anecdotal experience, and cultural practices are extremely instrumental in wisdom-building, yet may not be easily quantifiable or programmable.


[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10539543/


Given the subject matter, I have to ask. To what extent, if any, were LLMs involved in the creation of this text?

Thank you for the article. I think these "small" impacts are important to talk about. If one frames the question as "the impact of machines that think for humans", that impact isn't going to be a binary of just "good stuff" and "takes over and destroys humanity"; there are intermediate situations, like the decay of humans' ability to think critically, that are significant, not just in themselves but for their further impacts. I.e., if everyone is dependent on Google for their opinions, how does this impact people's opinions about AI taking over entirely?