Why did language evolve? While the answer may seem obvious – as a way for individuals to exchange information – linguists and other communication researchers have been debating this question for years. Many prominent linguists, including MIT's Noam Chomsky, argue that language is actually poorly designed for communication. In their view, such use is merely a byproduct of a system that probably evolved for other reasons – perhaps to structure our private thoughts.
As evidence, these linguists point to the existence of ambiguity: They argue that in a system optimized for transmitting information between speaker and listener, each word would have only one meaning, eliminating any chance of confusion or misunderstanding. Now a group of cognitive scientists at MIT has turned that idea on its head. In a new theory, they argue that ambiguity actually makes language more efficient by allowing the reuse of short, easy sounds that listeners can readily disambiguate through context.
“Various people have argued that ambiguity is a communication problem,” says Ted Gibson, a professor of cognitive science at MIT and senior author of a paper describing the research, forthcoming in the journal Cognition. “But the fact that context disambiguates has important implications for the reuse of potentially ambiguous forms. Ambiguity is no longer a problem – it can be exploited, because you can easily reuse [words] in different contexts, many times.”
The paper’s lead author is Steven Piantadosi PhD ’11; co-author Harry Tily is a postdoc in the Department of Brain and Cognitive Sciences.
What do you ‘mean’?
As a somewhat ironic example of ambiguity, consider the word “mean.” It can, of course, mean to indicate or signify, but it can also refer to an intention or purpose (“I meant to go to the store”); something offensive or nasty; or the mathematical average of a set of numbers. Adding an “s” introduces even more potential definitions: an instrument or method (“a means to an end”) or financial resources (“to live within one’s means”).
Yet virtually no English speaker is confused when they hear the word “mean.” This is because its different meanings occur in such different contexts that listeners can almost automatically infer the intended sense.
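This kind of context-driven inference can be sketched in a few lines of code. The following toy model is our own illustration, not the researchers' method: each sense of "mean" is paired with a small, invented set of cue words, and the sense whose cues overlap the utterance most is chosen.

```python
# Toy word-sense disambiguation for "mean": pick the sense whose
# cue words overlap the surrounding context the most.
# The cue sets below are invented for illustration.
SENSES = {
    "signify": {"word", "definition", "say", "imply"},
    "intend":  {"meant", "want", "plan", "store"},
    "unkind":  {"cruel", "nasty", "rude", "bully"},
    "average": {"numbers", "sum", "median", "statistics"},
}

def disambiguate(context_words):
    """Return the sense whose cue set overlaps the context most."""
    context = set(context_words)
    return max(SENSES, key=lambda sense: len(SENSES[sense] & context))

print(disambiguate("the mean of these numbers is five".split()))
# with this toy lexicon, the cue "numbers" selects "average"
```

Real listeners, of course, draw on far richer cues than word overlap, which is precisely why human disambiguation is so hard to replicate in software.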
Given the power of context to disambiguate, the researchers hypothesized that languages might exploit ambiguity to reuse the words that are easiest for language-processing systems to handle. Based on observation and previous research, they predicted that words with fewer syllables, higher frequency, and the simplest pronunciations should have the most meanings.
To test this prediction, Piantadosi, Tily, and Gibson conducted corpus studies of English, Dutch, and German. (In linguistics, a corpus is a large collection of samples of language in natural use, which can be mined for word frequencies or patterns.) By comparing certain properties of words with their number of meanings, the researchers confirmed their suspicion that shorter, more frequent words, as well as those that follow the language's typical sound patterns, are the most likely to be ambiguous – and these trends were statistically significant in all three languages.
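The shape of such a corpus analysis can be sketched with a toy lexicon. The numbers below are invented, not the paper's data; the point is only the direction of the correlations the study reports (fewer syllables and higher frequency going with more senses).

```python
# Invented toy lexicon: (word, syllables, log frequency, sense count).
lexicon = [
    ("run",          1, 6.1, 10),
    ("set",          1, 6.0, 12),
    ("mean",         1, 5.8,  6),
    ("table",        2, 5.2,  4),
    ("garden",       2, 4.5,  2),
    ("telephone",    3, 4.0,  2),
    ("hippopotamus", 5, 2.1,  1),
]

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

syllables = [row[1] for row in lexicon]
freqs     = [row[2] for row in lexicon]
senses    = [row[3] for row in lexicon]

print(f"syllables vs. senses: r = {pearson(syllables, senses):.2f}")  # negative
print(f"frequency vs. senses: r = {pearson(freqs, senses):.2f}")      # positive
```

The actual study used much larger corpora and appropriate statistical controls, but the qualitative pattern – short, frequent words carrying more meanings – is what the correlations capture.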
To understand why ambiguity makes language more efficient, rather than less, think about the conflicting desires of a speaker and a listener. The speaker is interested in conveying as much as possible with as few words as possible, while the listener is trying to gain a full and detailed understanding of what the speaker means. But, the researchers write, it is “cognitively cheaper” for the listener to infer things from context than for the speaker to spend time on longer, more complicated utterances. The result is a system that leans toward ambiguity, reusing the “easiest” words. Once context is taken into account, it becomes clear that “ambiguity is something you would want in a communication system,” Piantadosi says.
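The trade-off described above can be made concrete with a back-of-the-envelope cost model. This is our own simplification, not the paper's formal analysis: speaker effort is assumed to grow with word length, and the listener pays a small fixed cost to disambiguate from context.

```python
# Assumed, illustrative cost parameters (not from the paper).
INFER_COST = 0.5   # listener's cost of disambiguating from context
CHAR_COST  = 1.0   # speaker's effort per character uttered

def utterance_cost(word, ambiguous):
    """Combined speaker + listener cost of one utterance."""
    speaker = CHAR_COST * len(word)
    listener = INFER_COST if ambiguous else 0.0
    return speaker + listener

# Four meanings expressed with four distinct, unambiguous words...
unambiguous = sum(utterance_cost(w, ambiguous=False)
                  for w in ("signify", "intend", "unkind", "average"))
# ...versus reusing one short, ambiguous word for all four meanings.
ambiguous = sum(utterance_cost("mean", ambiguous=True) for _ in range(4))

print(unambiguous, ambiguous)  # reuse wins: 26.0 vs. 18.0
```

Under these assumed numbers the ambiguous lexicon is cheaper overall, because the listener's inference cost is smaller than the speaker's savings from shorter words – the same asymmetry the researchers describe.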
Tom Wasow, professor of linguistics and philosophy at Stanford University, calls the article “important and insightful.”
“You would expect that since languages are constantly changing, they would evolve to get rid of ambiguity,” Wasow says. “But if we look at natural languages, they are enormously ambiguous: words have many meanings, there are many ways to parse strings of words. … This article makes a really rigorous argument for why this kind of ambiguity is actually functional for communication purposes rather than dysfunctional.”
Implications for computing
The researchers say the statistical nature of their paper reflects a broader trend in linguistics, which increasingly relies on information theory and quantitative methods.
“The impact of computer science on linguistics is very large right now,” says Gibson, adding that natural language processing (NLP) is the main focus of those working at the intersection of the two fields.
Piantadosi points out that the ambiguity of natural language poses enormous challenges for NLP programmers. “Ambiguity is good for us [as humans] because we have these really sophisticated cognitive mechanisms for disambiguation,” he says. “It’s really hard to figure out the details of what they are, or even some approximation that a computer could use.”
But, Gibson says, computer scientists have long been aware of the problem. The new study provides a better theoretical and evolutionary explanation for why the ambiguity exists, but the message is the same: “Basically, if you have any human language in your input or output, you’re stuck needing context to disambiguate,” he says.