Words That Betray Generative AI Text

So far, even AI companies have struggled to create tools that can reliably detect when a text has been generated using a large language model. Now, a group of researchers has developed a new method for estimating LLM usage across a broad set of scholarly works by measuring which “excess words” began to appear significantly more frequently in the LLM era (i.e., 2023 and 2024). The results “suggest that at least 10 percent of the 2024 abstracts were processed using LLM,” according to the researchers.

In a preprint paper published earlier this month, four researchers from Germany’s University of Tübingen and Northwestern University said they were inspired by studies that measured the impact of the Covid-19 pandemic by looking at excess deaths compared to the recent past. Looking in a similar way at the “excessive use of words” after LLM writing tools became widely available in late 2022, the researchers found that “the emergence of LLM led to a sudden increase in the frequency of certain style words” that was “unprecedented in terms of quality and quantity.”

Delving In

To measure these changes in vocabulary, the researchers analyzed 14 million abstracts of articles published on PubMed between 2010 and 2024, tracking the relative frequency of each word in each year. They then compared the expected frequency of those words (based on the pre-2023 trend line) with the actual frequency of those words in the abstracts from 2023 and 2024, when LLMs were in widespread use.
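The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the straight-line fit through the pre-2023 years, the whitespace-free tokenization, and every function name here are assumptions, and the numbers in the usage lines are made up.

```python
import re


def relative_frequency(abstracts, word):
    """Fraction of abstracts that contain `word` at least once."""
    if not abstracts:
        return 0.0
    hits = sum(1 for text in abstracts if word in re.findall(r"[a-z]+", text.lower()))
    return hits / len(abstracts)


def expected_frequency(history, target_year):
    """Extrapolate a least-squares line through {year: frequency} to target_year.

    Assumption: the "pre-2023 trend line" is modeled as a simple linear fit.
    """
    years = list(history)
    freqs = [history[y] for y in years]
    mean_x = sum(years) / len(years)
    mean_y = sum(freqs) / len(freqs)
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, freqs))
    slope = sxy / sxx if sxx else 0.0
    # A frequency cannot be negative, so clamp the extrapolation at zero.
    return max(mean_y + slope * (target_year - mean_x), 0.0)


def excess_usage(history, observed, target_year):
    """Return (gap, ratio) between the observed and expected frequency."""
    expected = expected_frequency(history, target_year)
    gap = observed - expected
    ratio = observed / expected if expected > 0 else float("inf")
    return gap, ratio


# Hypothetical frequencies for one word, for illustration only.
history = {2019: 0.010, 2020: 0.011, 2021: 0.012, 2022: 0.013}
gap, ratio = excess_usage(history, observed=0.045, target_year=2024)
print(f"gap: {gap * 100:.1f} percentage points, ratio: {ratio:.1f}x")
```

The gap is the natural measure for words that were already common, while the ratio is more telling for words that were rare before 2023.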

The results showed that many words that were extremely infrequent in these scientific abstracts before 2023 suddenly became more common after LLMs were introduced. For example, the word “delves” appears in 25 times more 2024 articles than would be expected under the pre-LLM trend; words such as “showcasing” and “underscores” also increased their usage ninefold. Other previously common words became noticeably more common in post-LLM abstracts: for example, the frequency of “potential” increased by 4.1 percentage points, “findings” by 2.7 percentage points, and “crucial” by 2.6 percentage points.

These kinds of changes in word usage can, of course, occur independently of LLM usage — the natural evolution of language means that words sometimes go in and out of fashion. But the researchers found that in the pre-LLM era, such huge and sudden year-on-year increases were only seen for words associated with major global health events: “ebola” in 2015; “zika” in 2017; and words like “coronavirus,” “lockdown,” and “pandemic” in 2020–2022.

But in the post-LLM period, the researchers found hundreds of words with sudden, pronounced spikes in scientific usage that had nothing to do with world events. In fact, while the Covid pandemic’s excess words were overwhelmingly nouns, the researchers found that the words with a post-LLM spike were overwhelmingly “style words,” such as verbs, adjectives, and adverbs (a small sample: “across, additionally, comprehensive, crucial, intensifying, exhibited, observations, in particular, particularly, within”).

This is not a completely new finding: the growing prevalence of “delve” in scientific papers has been widely reported in the recent past, for example. However, previous studies have relied on comparisons with “ground truth” human writing samples or lists of predefined LLM markers obtained from outside the study. Here, the pre-2023 set of abstracts acts as its own effective control group to show how vocabulary choice has changed overall in the post-LLM era.

Elaborate interaction

With hundreds of so-called “marker words” that have become much more common in the post-LLM era highlighted, the signs of LLM use can sometimes be easy to spot. Take this sample abstract line called out by the researchers, with the marker words highlighted: “An exhaustive understanding of the complicated interaction between […] and […] is key to finding effective therapeutic strategies.”

After doing some statistical measurements of the occurrence of marker words in individual articles, the researchers estimate that at least 10 percent of the post-2022 articles in the PubMed corpus were written with at least some LLM assistance. That number could be even higher, the researchers say, because their analysis may be missing LLM-assisted abstracts that do not contain any of the marker words they identified.
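The article does not spell out the statistics behind that figure, so the following is only one plausible reading of why such an estimate is a lower bound, not the researchers' actual procedure. It assumes that if a share of abstracts containing at least one marker word rises from an expected value to an observed one, at least the difference must have been touched by an LLM, since each affected abstract can add at most one to the count. All names and the tiny sample data are hypothetical.

```python
import re


def contains_any(text, markers):
    """True if the abstract contains at least one marker word."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return not tokens.isdisjoint(markers)


def share_with_markers(abstracts, markers):
    """Fraction of abstracts that contain at least one marker word."""
    if not abstracts:
        return 0.0
    return sum(contains_any(text, markers) for text in abstracts) / len(abstracts)


def lower_bound_llm_share(observed_share, expected_share):
    """Excess share of marker-containing abstracts, floored at zero.

    Assumption: LLM-assisted abstracts without any marker word are invisible
    to this count, which is why the result can only underestimate usage.
    """
    return max(observed_share - expected_share, 0.0)


# Made-up example data, for illustration only.
markers = {"delves", "crucial", "showcasing"}
abstracts_2024 = [
    "This paper delves into a crucial pathway.",
    "We measured enzyme activity in mice.",
    "Results are showcasing a new mechanism.",
    "A randomized trial of two treatments.",
]
observed = share_with_markers(abstracts_2024, markers)
print(lower_bound_llm_share(observed, expected_share=0.4))
```

Abstracts that were LLM-assisted but happen to contain none of the marker words never enter the observed share, which is the gap the researchers point to when they say the true number could be higher.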
