February 2001

From Washington University in St. Louis

Parents' instinctive use of isolated words may help babies learn language

Brevity, as the Bard said, may well be the soul of wit. But, according to research by a computer scientist at Washington University in St. Louis, brevity also is the nature of speech to infants, and this may help them to learn their first words.

Michael Brent, Ph.D., associate professor of computer science at Washington University in St. Louis, has found that before the age of 15 months the words infants learn are mainly those that their mothers utter in isolation. Hearing "kitty," "red," or "come" in isolation may make learning those words much easier for young infants than hearing them buried in a longer sentence.

Brent's findings challenge recent language acquisition theory, which suggests that infants rely heavily on segmenting longer utterances into their individual words. The segmentation-based perspective emerged in recent years as a number of studies in other laboratories began to show that infants can segment longer sentences.

But in the analysis of eight mothers' conversations with their infant children, Brent and his collaborator, Jeffrey Siskind, Ph.D., of NEC Institute in Princeton, NJ, found that 9 percent of all utterances the mothers spoke to their children were isolated words. The words were not only nouns and verbs, but some adjectives and adverbs, too. This confirmed that infants have ample opportunity to learn from isolated words, despite suggestions in the scientific literature that they might not.

Moreover, they found that the frequency with which a mother says a word in isolation is a predictor of whether the child will know that word later. In contrast, the overall frequency with which a mother says a word is not a predictor of whether the child will know that word later.

Brent presented his results February 19, 2001, at the annual meeting of the American Association for the Advancement of Science in San Francisco.

There will be a press briefing on this research and others in the language acquisition symposium at 11 a.m., Feb. 19, 2001, at the Nikko Grand Ballroom II. The symposium, "Tools Human Infants Might Use to Learn Language," will be held from 4:30 p.m. to 6 p.m. at the Hilton Continental Ballroom 2.

Brent and his colleagues recorded the speech of mothers to their infant children in 14 visits to the home of each family during the time when the child was between 9 and 15 months old. The spontaneous utterances of the mothers and children were recorded for one to two hours in each instance, in the absence of a researcher, yielding more than 200 hours of recorded speech.

The researchers then took the tapes back, transferred them onto a computer, transcribed them, and analyzed them using computer software they developed for the purpose. The times at which each utterance began and ended were measured to within one twentieth of a second, allowing researchers to define isolated words precisely in terms of their separation from the nearest speech.
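The idea of defining an isolated word by its separation from the nearest speech can be illustrated with a minimal Python sketch. This is not the software the researchers developed; the `Utterance` record, the `isolated_words` function, the sample data, and the 0.3-second `min_gap` threshold are all hypothetical choices made for illustration, since the release does not state the actual criterion.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    start: float  # onset time in seconds
    end: float    # offset time in seconds
    text: str     # transcribed words


def isolated_words(utterances, min_gap=0.3):
    """Return one-word utterances separated from neighboring speech.

    min_gap is an assumed silence threshold in seconds, not a value
    taken from the study.
    """
    ordered = sorted(utterances, key=lambda u: u.start)
    found = []
    for i, utt in enumerate(ordered):
        words = utt.text.split()
        if len(words) != 1:
            continue  # longer utterances are not isolated words
        # Silence before and after; the first and last utterances
        # are treated as having unlimited silence on the open side.
        gap_before = utt.start - ordered[i - 1].end if i > 0 else float("inf")
        gap_after = (
            ordered[i + 1].start - utt.end
            if i + 1 < len(ordered)
            else float("inf")
        )
        if gap_before >= min_gap and gap_after >= min_gap:
            found.append(words[0].lower())
    return found


# Hypothetical transcript fragment, times in seconds.
sample = [
    Utterance(0.0, 0.5, "kitty"),
    Utterance(1.2, 2.4, "look at the kitty"),
    Utterance(2.45, 2.8, "red"),
    Utterance(4.0, 4.4, "Come"),
]
print(isolated_words(sample))
```

In this made-up fragment, "red" is a single word but follows the previous utterance by only a fraction of the assumed threshold, so it would not count as isolated, while "kitty" and "come" would.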

"We have this huge database now of recordings and transcripts, and it's going to be publicly available in a data repository for other researchers to use," Brent said.

To determine which words infants learned, Brent surveyed the mothers about which words their children knew periodically throughout the course of the study. Findings based on these surveys were confirmed using the words each child spoke during the recording sessions.

"What is thought-provoking about our findings is that it had been shown in laboratory situations that infants are able to recognize words by segmenting connected speech," Brent said. "So, the assumption has been that children rely on speech segmentation to build their vocabularies from the very beginning. What we've found is that, while infants can segment speech, it seems to be relatively rare for young infants to learn words that are not spoken in isolation. We're saying, 'Wait a minute, they can segment, but is this ability actually called upon for early word learning?' We think that speech segmentation is important later on, when kids are learning words really fast, but it may not be relied upon at the beginning, when they're learning slowly."

Brent's current study was motivated by a computational model of speech segmentation that he developed in the late 1990s. The model predicts certain patterns of segmentation. For example, hearing "ball" in isolation should make it easier to segment "red" out of the phrase "red ball." Brent and collaborators tested the theory on adults, and the findings supported the model. But when they tested it on 12-month-old infants, they found little evidence for the predicted segmentation patterns. This meant either that 12-month-olds learn language through a different segmentation process than adults do or, Brent wondered, that they may not use segmentation at all.

"I realized then that we knew infants could pull repeated words out of fluent speech, and we knew they could remember the sounds of those words, but we didn't know whether they actually learned the meanings of their first words that way," Brent said.

The current study shows that the first words children tend to learn are words their mothers speak in isolation, suggesting that such isolated words may form a foundation for early vocabulary learning. The notion that mothers' isolated words help young infants learn is part of Brent's larger theory about how our instinctive style of speech to children helps them learn language.

"Short utterances lay bare the structure of language," he said. "For example, in a long sentence like 'I know the famous scientist you're discussing,' it may be difficult for a child to figure out what the phrase 'you're discussing' relates to. Does it modify 'scientist,' or is it what the speaker knows? In a short utterance like 'I know Elmo,' it's obvious that 'Elmo' is the direct object of 'know'." Brent has found that short sentences are useful in computer simulations on learning the grammatical roles of words.

Brent's specialties are artificial intelligence, natural language processing and computational biology. His work in language acquisition is related to a special interest in artificial intelligence whereby a computer someday would be able to infer the grammatical structure of a language from sample sentences of that language. He currently is working on how a computer might figure out the word-formation rules for a language, such as the fact that in English "-ly" is a suffix that goes on adjectives, turning them into adverbs. In French, the suffix "-ment" plays the same role.

He uses Bayesian models to train a computer to recognize patterns. Bayesian models allow the computer to combine prior knowledge about the commonalities of all languages (such as the fact that suffixes are very common across languages) with evidence about a particular language gleaned from example sentences (such as the fact that "-ly" is a suffix in English). Both the prior knowledge and the evidence gleaned from experience are represented as probabilities.
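How a prior probability and evidence combine can be shown with a small Python sketch of Bayes' rule. It is only an illustration of the general principle, not Brent's model: the function names `posterior` and `update_over_evidence`, the hypothesis that "-ly" is a suffix, and every probability below are hypothetical numbers chosen for the example.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for a yes/no hypothesis.

    prior:               probability of the hypothesis before the evidence
    p_evidence_if_true:  probability of the evidence if the hypothesis holds
    p_evidence_if_false: probability of the evidence if it does not hold
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


def update_over_evidence(prior, observations):
    """Apply Bayes' rule once per observation, carrying the belief forward.

    Each observation is a pair (p_evidence_if_true, p_evidence_if_false).
    Treating observations as independent is a simplifying assumption.
    """
    belief = prior
    for p_true, p_false in observations:
        belief = posterior(belief, p_true, p_false)
    return belief


# Hypothetical prior belief that a frequent word ending such as "-ly"
# is a suffix, reflecting that suffixes are common across languages.
prior_ly_is_suffix = 0.5

# Hypothetical evidence: three sample words ("quickly", "slowly", "sadly")
# whose stems also occur alone, each judged more likely under the
# suffix hypothesis than under the alternative.
evidence = [(0.8, 0.2), (0.8, 0.2), (0.8, 0.2)]

belief = update_over_evidence(prior_ly_is_suffix, evidence)
print(f"Belief that '-ly' is a suffix after the evidence: {belief:.3f}")
```

Each supporting example raises the belief, and a stronger prior would need fewer examples to reach the same confidence, which is the sense in which prior knowledge and experience are both represented as probabilities.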

Ultimately, the goal is to develop an artificial intelligence 'linguist' that could be presented with a text in an unknown language and output a grammatical analysis of that language. This could find a number of applications, whether in decoding a specialized language or a lost one, such as the languages found on ancient stone tablets.

Similarly, in computational biology, Brent uses the same methods in a quest to predict the structure of genes and their protein products. While "cat" and "red" seem a long way from RNA and DNA, Brent says there is a strong connection between the two areas. "In both language acquisition and DNA sequencing, we're taking in large sequences of symbols, whether DNA symbols or letters representing the pronunciation of words, and we're trying to learn the structure underlying the patterns in those symbols automatically," Brent said.

Others apparently agree. At a government-sponsored workshop being held in Philadelphia at the end of February, Brent will join top researchers in both computational linguistics and computational biology to help cement the growing ties between the two fields.

Embargoed for 11 A.M., PST, FEB. 19, 2001

This research has received partial funding from NIH.

This article comes from Science Blog. Copyright © 2004
