Thursday, July 18, 2024

LLM Echo Chamber

I'm not sure when I started to dislike what I call puffy writing. It was some time after college, probably. I think it rubbed off on me from working at publications and having friends who were editors. I was a typesetter, and then a graphic designer. Often we needed to shorten things to get them to fit.

Puffiness goes beyond the merely unnecessary: it's about pretension.

A recent study by researchers in Germany and the U.S. analyzed vocabulary in 14 million PubMed abstracts published from 2010 to 2024, looking at how word usage changed after "AI" (LLMs) became available as a writing tool.

They found that there has been an obvious increase in the frequency of what the researchers call "style words," such as:

  • delves
  • showcasing
  • underscores
  • crucial
  • intricate / intricacies
  • surpassing
  • comprehensive
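For the curious, here is a rough sketch of what "increase in frequency" can mean in practice. It is my own simplification, not the researchers' method: it measures the share of abstracts containing a word in two periods and takes the difference. The function names and the toy abstracts are made up for illustration.

```python
import re


def doc_frequency(abstracts, word):
    """Return the fraction of abstracts that contain `word` at least once."""
    if not abstracts:
        return 0.0
    hits = 0
    for text in abstracts:
        # Lowercase and split into alphabetic tokens so "Delves," matches "delves".
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        if word in tokens:
            hits += 1
    return hits / len(abstracts)


def excess_frequency(before, after, word):
    """Difference in document frequency between a later and an earlier period."""
    return doc_frequency(after, word) - doc_frequency(before, word)


# Toy data, invented for this example (not real PubMed abstracts).
before = [
    "We study the role of sleep in memory.",
    "This paper examines risk factors for stroke.",
    "We report results of a randomized trial.",
    "The authors describe a new assay.",
]
after = [
    "This review delves into the role of sleep in memory.",
    "This paper examines risk factors for stroke.",
    "This chapter delves into intricate mechanisms.",
    "The authors describe a new assay.",
]

gap = excess_frequency(before, after, "delves")
print(f"'delves' excess frequency: {gap:.2f}")
```

A real analysis would need a baseline that accounts for slow drift in vocabulary over the years, which this two-bucket comparison ignores.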

They write:

The following examples from three real 2023 abstracts illustrate this ChatGPT-style flowery language:
  • By meticulously delving into the intricate web connecting [...] and [...], this comprehensive chapter takes a deep dive into their involvement as significant risk factors for [...].
  • A comprehensive grasp of the intricate interplay between [...] and [...] is pivotal for effective therapeutic strategies.
  • Initially, we delve into the intricacies of [...], accentuating its indispensability in cellular physiology, the enzymatic labyrinth governing its flux, and the pivotal [...] mechanisms.

From their discussion:

We found that the effect was unprecedented in quality and quantity: hundreds of words have abruptly increased their frequency after ChatGPT became available. In contrast to previous shifts in word popularity, the 2023–24 excess words were not content-related nouns, but rather style-affecting verbs and adjectives that ChatGPT-like LLMs prefer.

Our analysis of the excess frequency of such LLM-preferred style words suggests that at least 10% of 2024 PubMed abstracts were processed with LLMs. With ∼1.5 million papers being currently indexed in PubMed per year, this means that LLMs assist in writing at least 150 thousand papers per year.

What are the implications of this ongoing revolution in scientific writing? ...LLMs are infamous for making up references, providing inaccurate summaries, and making false claims that sound authoritative and convincing... [citations omitted from this paragraph]

And what is the title of this paper? "Delving into ChatGPT Usage in Academic Writing Through Excess Vocabulary." Ha, very ha.

