Cognitive scientists have long sought to understand what makes some sentences more difficult to comprehend than others. Any account of language comprehension, researchers believe, would benefit from understanding these difficulties.
In recent years, researchers have succeeded in developing two models explaining two important types of difficulty in comprehending and producing sentences. While these models successfully predict specific patterns of comprehension difficulty, their predictions are limited and do not fully match results from behavioral experiments. Moreover, until recently researchers were unable to integrate these two models into a coherent account.
A new study led by researchers from the Department of Brain and Cognitive Sciences (BCS) at MIT now provides such a unified account of the difficulties in understanding language. Building on recent advances in machine learning, the researchers developed a model that better predicts the ease, or lack thereof, with which individuals produce and comprehend sentences. They recently published their findings in Proceedings of the National Academy of Sciences.
The paper’s senior authors are BCS professors Roger Levy and Edward (Ted) Gibson. The lead author is Michael Hahn, a former visiting student of Levy and Gibson who is now a professor at Saarland University. The second author is Richard Futrell, another former student of Levy and Gibson who is now a professor at the University of California, Irvine.
“This is not just an expanded version of existing accounts of comprehension difficulty,” says Gibson. “We offer a new fundamental theoretical approach that allows for better predictions.”
The researchers built on the two existing models to create a unified theoretical account of comprehension difficulty. Each of these established models identifies a distinct culprit for frustrated comprehension: difficulty with expectation and difficulty with memory retrieval. We experience expectation difficulty when a sentence does not allow us to easily anticipate its upcoming words. We experience retrieval difficulty when we struggle to keep track of a sentence with a complex structure of embedded clauses, such as: “The fact that the doctor whom the lawyer distrusted annoyed the patient was surprising.”
In 2020, Futrell first proposed a theory that unifies these two models. He argued that memory limitations affect not only the recall of sentences with embedded clauses, but language comprehension across the board: our memory limitations prevent us from perfectly representing sentence contexts during language comprehension in general.
Thus, according to this unified model, memory limitations can create a new source of expectation difficulty. We may have a hard time anticipating an upcoming word in a sentence even when the word is easily predictable from context, if the sentence context itself is difficult to hold in memory. Consider, for example, a sentence beginning with the words “Bob threw the trash...” We can easily anticipate the final word: “out.” But when the sentence context preceding that final word is more complex, prediction difficulty arises: “Bob threw the old trash that had been sitting in the kitchen for days out.”
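In the literature, prediction difficulty is quantified as “surprisal”: the negative log-probability a language model assigns to a word given its context. The sketch below is our own illustration, not the authors’ code; it uses the open-source GPT-2 model via the Hugging Face transformers library (the study itself uses GPT-2, as described further below) to compare the surprisal of “out” after the short versus the long context. The stimuli and library choice here are assumptions for the example.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def surprisal(context: str, word: str) -> float:
    """Surprisal (negative log-probability, in nats) of `word` given `context`."""
    ctx_ids = tokenizer.encode(context, return_tensors="pt")
    word_id = tokenizer.encode(word)[0]  # first sub-token of the target word
    with torch.no_grad():
        logits = model(ctx_ids).logits[0, -1]  # scores for the next token
    return -torch.log_softmax(logits, dim=-1)[word_id].item()

# The same word becomes harder to predict after more intervening material:
print(surprisal("Bob threw the trash", " out"))
print(surprisal("Bob threw the old trash that had been sitting"
                " in the kitchen for days", " out"))
```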
Researchers quantify comprehension difficulty by measuring the time it takes readers to respond to various comprehension tasks; the longer the response time, the more difficult the given sentence. Results from previous experiments showed that Futrell’s unified model predicted readers’ comprehension difficulties better than the two older models did. But his model did not identify which parts of a sentence we tend to forget, or how exactly this failure of memory retrieval clouds comprehension.
Hahn’s new study fills in these gaps. In the new paper, the MIT cognitive scientists joined Futrell to propose an augmented model grounded in a new coherent theoretical framework. The new model identifies and corrects elements missing from Futrell’s unified account and provides precise new predictions that better match results from behavioral experiments.
As in Futrell’s original model, the researchers start from the idea that, due to memory limitations, our brain does not perfectly represent the sentences we encounter. But to this they add a theoretical principle of cognitive efficiency: they propose that the brain tends to deploy its limited memory resources in a way that optimizes its ability to accurately predict upcoming words in sentences.
This idea leads to several empirical predictions. According to one key prediction, readers compensate for incomplete memory representations by relying on their knowledge of the statistical co-occurrences of words to implicitly reconstruct, in their minds, the sentences they have read. Sentences containing rarer words and phrases are therefore harder to remember perfectly, which makes it harder to anticipate upcoming words. As a result, such sentences are generally more difficult to comprehend.
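To make the reconstruction idea concrete, here is a toy sketch (our illustration with invented numbers, not the paper’s implementation): a degraded memory trace is combined with prior knowledge of word statistics, and the reconstruction is pulled toward the statistically common phrasing.

```python
# Hypothetical prior over candidate sentences, standing in for a reader's
# knowledge of word statistics (probabilities invented for illustration):
PRIOR = {
    "the fact that the doctor arrived": 0.7,    # common phrasing
    "the report that the doctor arrived": 0.3,  # rarer phrasing
}

def likelihood(memory: set, sentence: str, p_forget: float = 0.3) -> float:
    """P(remembered words | sentence): each word survives with prob 1 - p_forget."""
    words = set(sentence.split())
    if not memory <= words:
        return 0.0
    return (1 - p_forget) ** len(memory) * p_forget ** (len(words) - len(memory))

def reconstruct(memory: set) -> str:
    """Bayesian (MAP) reconstruction: pick the candidate maximizing prior * likelihood."""
    return max(PRIOR, key=lambda s: PRIOR[s] * likelihood(memory, s))

# With the opening noun forgotten, both candidates fit the trace equally
# well, so the prior wins and the common phrasing is "remembered":
print(reconstruct({"that", "the", "doctor", "arrived"}))
```

Because the prior favors the frequent phrasing, a reader whose memory trace is compatible with both candidates will tend to “remember” the common one, which is exactly the kind of distortion the experiments described below probe.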
To assess whether this prediction matches our linguistic behavior, the researchers turned to GPT-2, an AI natural language tool based on neural network modeling. This machine learning tool, first released in 2019, allowed them to test the model on large-scale text data in a way that had not been possible before. But GPT-2’s powerful language modeling capacity also created a problem: unlike humans, GPT-2’s immaculate memory perfectly represents all the words of even the very long, complex texts it processes. To more accurately characterize human language comprehension, the researchers added a component simulating human-like limitations on memory resources (as in Futrell’s original model) and used machine learning techniques to optimize how those resources are used (as in their proposed new model). The resulting model preserves GPT-2’s ability to accurately predict words most of the time, but shows human-like failures on sentences containing rare combinations of words and phrases.
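The sketch below conveys the general flavor of this construction under strong simplifying assumptions: context words are erased uniformly at random before GPT-2 predicts the next word, whereas the authors’ actual model learns an optimized, resource-rational retention policy. It is an illustration, not the published code.

```python
import random
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def lossy_surprisal(context: str, word: str, p_erase: float = 0.3,
                    n_samples: int = 20) -> float:
    """Expected surprisal of `word` when context words are randomly forgotten."""
    ctx_words = context.split()
    word_id = tokenizer.encode(word)[0]
    total = 0.0
    for _ in range(n_samples):
        # Simulate imperfect memory: each context word is lost with prob p_erase.
        kept = [w for w in ctx_words if random.random() > p_erase] or ctx_words[:1]
        ids = tokenizer.encode(" ".join(kept), return_tensors="pt")
        with torch.no_grad():
            logits = model(ids).logits[0, -1]
        total += -torch.log_softmax(logits, dim=-1)[word_id].item()
    return total / n_samples  # average over sampled memory degradations
```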
“This is a wonderful illustration of how modern tools of machine learning can advance cognitive theory and our understanding of how the mind works,” says Gibson. “We couldn’t have conducted this research even a few years ago.”
The researchers fed the machine learning model a set of sentences with complex embedded clauses, such as: “The report that the doctor whom the lawyer distrusted annoyed the patient was surprising.” They then took these sentences and replaced the sentence-opening nouns (“report” in the example above) with other nouns that differ in how commonly they are followed by an embedded clause. Some of these nouns made the sentences easier for the AI program to “understand.” For example, the model predicted the endings of these sentences more accurately when they began with the common phrasing “The fact that” than when they began with the rarer phrasing “The report that.”
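Assuming the `lossy_surprisal` sketch from above, a hypothetical version of this comparison might look as follows; the stimuli here are illustrative, not the study’s actual items.

```python
# Contrast a noun that commonly takes an embedded clause ("fact") with a
# rarer one ("report"), measuring surprisal at the sentence-final word:
tail = "the doctor whom the lawyer distrusted annoyed the patient was"
for opener in ("The fact that", "The report that"):
    print(opener, "->", lossy_surprisal(f"{opener} {tail}", " surprising"))
# Expectation under the model: the rarer opener degrades memory for the
# context more, yielding higher surprisal at the final word, just as
# readers take longer on such sentences.
```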
The researchers then set out to corroborate the AI-based results by running experiments with human participants who read similar sentences. The participants’ response times on comprehension tasks resembled the model’s predictions. “When sentences began with the words ‘The report that,’ people tended to remember the sentence in a distorted way,” says Gibson. The rarer phrasing taxed their memory and, as a result, their comprehension.
These results show that the new model outperforms existing models in predicting how humans process language.
Another strength of the model is its ability to make different predictions from one language to another. “Previous models could explain why certain language structures, such as sentences with embedded clauses, are generally difficult to process given memory constraints, but our new model can explain why the same constraints behave differently in different languages,” says Levy. “Sentences with center-embedded clauses, for example, seem to be easier for native German speakers than for native English speakers, since German speakers are used to reading sentences in which subordinate clauses push the verb to the end of the sentence.”
According to Levy, more research on the model is needed to identify causes of inaccurate sentence representation beyond embedded clauses. “There are other types of ‘confusions’ that we need to test,” he says. At the same time, Hahn adds, “the model may predict other ‘confusions’ that nobody has even considered. We are now trying to find them and see whether they affect human comprehension as predicted.”
Another question for future studies is whether the new model will lead to a rethinking of a long line of research focused on the difficulties of sentence integration: “Many researchers have emphasized difficulties related to the process in which we construct linguistic structures in our minds,” says Levy. “The new model potentially shows that the difficulty is not in the process of mentally constructing these sentences, but in maintaining the mental representation once it has been constructed. The big question is whether these are two separate things or not.”
One way or another, Gibson adds, “this kind of work marks the future of research on these questions.”