The original version of this story appeared in Quanta Magazine.
Which of the countless abilities that humans possess are uniquely human? Language has been a prime candidate since at least the time of Aristotle, who described the human being as an animal that has language. Even if large language models like ChatGPT can superficially replicate ordinary speech, researchers want to know whether there are specific aspects of human language that simply have no analogue in the communication systems of other animals or in artificial intelligence systems.
In particular, researchers have examined the extent to which language models can reason about language itself. For some in the linguistics community, language models are just that: models. They cannot truly reason about language; they can only parrot it. This view was summarized by the eminent linguist Noam Chomsky and two co-authors in 2023, when they wrote in The New York Times that "correct explanations of language are complex and cannot be learned simply by immersing oneself in large datasets." AI models may be proficient at using language, the authors argued, but they are incapable of analyzing language in a sophisticated way.
This view was recently challenged in a paper by Gašper Beguš, a linguist at the University of California, Berkeley; Maksymilian Dąbkowski, who recently earned a PhD in linguistics from Berkeley; and Ryan Rhodes of Rutgers University. The researchers put a number of large language models, or LLMs, through a battery of linguistic tests, including, in one case, asking an LLM to generalize the rules of an invented language. While most of the LLMs could not analyze linguistic rules the way humans can, one showed impressive abilities that far exceeded expectations. It was able to analyze language in much the same way a linguistics graduate student would: by diagramming sentences, resolving multiple ambiguous meanings, and handling complex linguistic features such as recursion. This finding, Beguš said, "challenges our understanding of what artificial intelligence can do."
This new work is both timely and "very important," said Tom McCoy, a computational linguist at Yale University who was not involved in the research. "As society becomes more and more dependent on this technology, it becomes increasingly important to understand where it can succeed and where it can fail." He added that language analysis is an ideal testing ground for assessing how well these language models can reason like humans.
Infinite complexity
One of the challenges of putting language models through a rigorous linguistic test is making sure they don't already know the answers. These systems are typically trained on immense amounts of text – not only much of the internet, in dozens if not hundreds of languages, but also materials such as linguistics textbooks. The models could, in principle, simply memorize and repeat information they encountered during training.
To avoid this, Beguš and his colleagues created a four-part linguistic test. Three of the four parts involved asking the models to parse specially crafted sentences using tree diagrams, which were introduced in Chomsky's seminal 1957 book, Syntactic Structures. These diagrams divide sentences into noun phrases and verb phrases, which are then divided into nouns, verbs, adjectives, adverbs, prepositions, conjunctions, and so on.
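As an illustration (not taken from the paper), the nested structure that such a tree diagram captures can be sketched in a few lines of Python, with constituent labels following the convention just described: a sentence (S) splits into a noun phrase (NP) and a verb phrase (VP), which split further into parts of speech. The tuple encoding and the `leaves` helper here are hypothetical conveniences for the sketch.

```python
# Hypothetical sketch: a constituency tree for "The sky is blue"
# encoded as nested (label, children...) tuples.
tree = ("S",
        ("NP", ("Det", "The"), ("N", "sky")),
        ("VP", ("V", "is"), ("Adj", "blue")))

def leaves(node):
    """Recover the sentence's words by walking the tree left to right."""
    label, *children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]  # a part-of-speech node holding one word
    return [word for child in children for word in leaves(child)]

print(" ".join(leaves(tree)))  # -> The sky is blue
```

Reading the words back out of the tree is itself a recursive procedure, which hints at why recursion matters so much in what follows.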
One part of the test focused on recursion, the ability to embed phrases within phrases. "The sky is blue" is a simple English sentence. "Jane said the sky was blue" embeds the original sentence in a slightly more complex one. Crucially, this process of recursion can continue indefinitely: "Mary wondered if Sam knew that Omar heard Jane say the sky was blue" is also a grammatically correct, if awkward, recursive sentence.
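The fact that embedding can continue without limit is easy to see programmatically. A minimal sketch (the `embed` helper is a hypothetical name, and the wording it produces is slightly simpler than the article's example):

```python
def embed(sentence, speaker, verb):
    """Wrap an existing sentence inside one more reporting clause."""
    return f"{speaker} {verb} {sentence}"

s = "the sky was blue"
for speaker, verb in [("Jane", "said"),
                      ("Omar", "heard that"),
                      ("Sam", "knew that"),
                      ("Mary", "wondered if")]:
    s = embed(s, speaker, verb)

print(s)
# -> Mary wondered if Sam knew that Omar heard that Jane said the sky was blue
```

Each pass through the loop adds one more layer of embedding, and the loop could run forever; the grammar never runs out of room, even though the sentences quickly become awkward for human readers.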
