Sunday 12:30 to 14:20 Thirlmere

Symposium

Learning to recognize words in fluent speech

Chair: Peter W. Jusczyk

Discussant: Dennis Norris

A prerequisite for learning any language is to build a lexicon. Because many of the words that learners hear are presented in the context of longer utterances, a learner has to be able to recognize and segment words from the speech stream. Knowledge of when infants are able to segment certain types of words from fluent speech is critical for understanding how they learn the grammatical organization of their native language.

Until relatively recently, little was known about when infants begin to segment words from fluent speech. Although we now have a clearer understanding of when elementary word segmentation abilities begin during infancy, much remains to be learned about how these abilities develop throughout infancy.

The papers in this symposium explore various facets of word segmentation abilities in infants. The papers document when infants are able to use particular types of word segmentation cues, how the ability to use multiple cues for word segmentation develops, the range of situations in which infants are able to segment words, and how prior experience with particular words and their meanings affects word recognition processes. Information about these issues is not only helpful for understanding the course of language acquisition, but also has some potentially interesting implications for understanding the mechanisms and processes underlying word recognition by adults. Our discussant has worked extensively on adult word recognition processes. He will consider what infant researchers in this area can learn from what is known about word recognition processes in adults, and what researchers of adult word recognition processes can learn from the recent findings with infants.


Details of individual items:


paper

Infants' statistical learning of non-adjacent regularities in fluent speech

Richard N. Aslin, Elissa L. Newport

A number of recent studies (Saffran, Aslin & Newport, 1996; Aslin, Saffran & Newport, 1998; Marcus, Vijayan, Bandi Rao & Vishton, 1999) have demonstrated that 8-month-old infants can utilize the distributional properties of artificial-language corpora to group syllables and to recognize their common sequential patterning. These statistical learning and pattern recognition abilities presumably enable infants in the earliest stages of language acquisition to isolate candidate acoustic units that correspond to words and to form rudimentary word-classes. To learn the more complex structures of language, infants will require more sophisticated statistical tools, which must also be constrained to prevent them from attempting to compute all of the possible statistics that are contained in a corpus of input. At issue, then, are which sequential statistics infants are capable of computing, and what biases or constraints operate to limit the statistics that are computed so that infants learn just the ones that are needed to rapidly acquire their native language.

Following on the work of Saffran and colleagues, we have constructed artificial languages in which the sequential statistics among adjacent units provide no information for word boundaries. Rather, in the present work, we are interested in examining how learners acquire non-adjacent statistics, which characterize some of the higher-level properties of natural languages. We first tested adults on these artificial languages to determine whether they were learnable. The non-adjacent statistics were defined either between the first and third syllables or between the first, third, and fifth segments (consonants) of trisyllabic words. The transitional probabilities between these non-adjacent syllables or segments were high (1.0) within words and low (0.33) between words. By randomizing the middle syllable in each word (e.g., dutiba, dupoba) or the vowels that followed each consonant in a word 'frame' (e.g., dutiba, ditabu), the transitional probabilities between adjacent syllables or adjacent segments were kept low and provided no information for grouping or segmentation.

Adults failed to learn the words in the artificial language where non-adjacent syllables had high transitional probabilities, but they easily learned words from the artificial language where non-adjacent segments (the consonant-frame) had high transitional probabilities.

Extensions of these studies to 8-month-old infants suggested that they were not able to learn non-adjacent statistics defined at either the syllable or the segment level. However, 10-month-olds could learn non-adjacent statistics at the segment level. Infants were tested with the headturn preference procedure after exposure to 2 min of the artificial language with a fixed consonant 'frame'. Their post-familiarization listening times were longer to the familiar test words (9.95 sec) than to the novel test words (8.01 sec).

Follow-up studies of 10-month-olds with the language containing non-adjacent statistics at the syllable level will be conducted to determine whether infants are incapable of extracting these statistical relations. If so, then apparently infants, like adults, can readily acquire only certain types of non-adjacent regularities from fluent speech.
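
The contrast between adjacent and non-adjacent transitional probabilities at the syllable level, as described in the abstract above, can be illustrated with a minimal sketch. It is not the authors' procedure or stimulus set. Only the words dutiba and dupoba come from the abstract; the other frames, the other middle syllables, the repetition count, and the function names are assumptions made for illustration.

```python
import random

# Hypothetical materials: only "dutiba" and "dupoba" appear in the abstract.
# The remaining frames and middle syllables are invented for illustration.
FRAMES = [("du", "ba"), ("ke", "go"), ("pi", "lu")]
MIDDLES = ["ti", "po", "ra"]


def build_stream(repetitions=20, seed=0):
    """Return a flat list of syllables from shuffled trisyllabic words."""
    words = [(first, mid, last) for first, last in FRAMES for mid in MIDDLES]
    tokens = words * repetitions
    random.Random(seed).shuffle(tokens)
    return [syllable for word in tokens for syllable in word]


def transitional_probability(stream, x, y, lag=1):
    """Estimate P(y at position i + lag | x at position i) from counts."""
    n_x = 0
    n_xy = 0
    for i in range(len(stream) - lag):
        if stream[i] == x:
            n_x += 1
            if stream[i + lag] == y:
                n_xy += 1
    return n_xy / n_x if n_x else 0.0


stream = build_stream()
adjacent = transitional_probability(stream, "du", "ti", lag=1)
non_adjacent = transitional_probability(stream, "du", "ba", lag=2)
print(f"adjacent du -> ti: {adjacent:.2f}")
print(f"non-adjacent du -> ba: {non_adjacent:.2f}")
```

In this toy stream the first syllable of a frame always predicts the third (lag 2), while the randomized middle syllable keeps the adjacent probability low. The sketch does not attempt to reproduce the between-word value of 0.33 reported in the abstract, which depends on design details not given there.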


paper

Making inferences about early lexical representations

James L. Morgan

Every theory of language learning implicitly assumes that children can represent the words they hear (or signs they see) as instances of lexical types. Without such consistent representations, there could be nothing with which to associate contextual or distributional information and hence no basis for drawing semantic or grammatical inferences from input.

However, children never directly encounter lexical types. Instead, their experience is with purely episodic lexical tokens. From episode to episode, the tokens that exemplify particular lexical types vary. In some instances, the dimensions along which tokens vary may be perceptually salient and even communicatively relevant (e.g., extremes of speaker affect) but nonetheless irrelevant for lexical identification. In other instances, the dimensions of variation may be more subtle yet nevertheless essential for lexical identification (as in the degree of fronting in the high rounded vowel in /k_/ in French). Variations that are relevant to lexical identification in some languages are irrelevant in others. Thus, acquiring the ability to identify lexical types in the native language presents considerable difficulties for learning.

A variety of studies on infants' word segmentation and word recognition in fluent speech contribute evidence in support of the following general developmental sequence:

- By about 7-8 months, infants have acquired a substantial store of word-sized representations. However, at this point, there is no evidence for anything beyond episodic representations of lexical tokens. Note that such representations suffice for certain types of statistical-distributional analyses, but not others. For example, knowledge of patterns of phonotactic sequences specific to word-level units in the native language begins to be deployed in segmenting speech at this time.

- Over the second 6 months, processes involved in establishing episodic lexical traces become increasingly automatized, and dimensions of variation that are irrelevant to word identification in the native language become more easily ignored. For example, although effects of talker variability can be demonstrated even into adulthood, these effects are much more subtle after 12 months than they are at 7-9 months.

- By the end of the 1st year (or very shortly afterwards), infants have established lexical representations that, if not fully abstract themselves, may be abstracted over. Infants can now apply similar statistical/distributional analyses across something akin to lexical types. Thus, infants beyond this age can identify consistent patterns of variation within lexical types and use these to abstract morphophonological patterns of the native language.

Many key questions concerning lexical identification remain unexplored at this time, including how lexical neighborhood density affects categorization of word tokens and how conflicts between bottom-up segmentation strategies and type-level distributional evidence are resolved.

As adults, we can virtually instantaneously recognize any one of 100,000 or so lexical types from any speaker at any time, with no apparent effort. This ability provides the foundation for all fluent speech comprehension. Understanding how infants acquire this ability is an important step toward linking results of research on infant speech perception with research on early language acquisition.


paper

Changes in word segmentation abilities between 7 and 16 months

Peter W. Jusczyk

Infants first show some ability to segment words from fluent speech shortly after 7 months of age. Using a variant of the Headturn Preference Procedure, Jusczyk and Aslin found that infants familiarized with a pair of target words, either spoken in isolation or in fluent speech passages, were able to recognize these words when spoken in another context (i.e., they listened significantly longer to test materials with these targets than to ones without them). There is evidence that infants' abilities to recognize words in fluent speech develop rapidly over the course of the next few months. They show a growing ability to use a wider range of word segmentation cues, to segment a greater variety of words, and to generalize across a broader range of talkers.

Many recent investigations have focused on the means by which infants are able to segment words from fluent speech contexts. At present, there is evidence to suggest that by 10.5 months of age, infants are able to use a variety of different word segmentation cues including syllable stress (Jusczyk et al., 1999), statistical cues involving transitional probabilities (Saffran et al., 1996), phonotactic cues (Mattys et al., 1999), and allophonic cues (Jusczyk et al., in press). However, as these recent investigations demonstrate, sensitivity to these different sources of information about word boundaries develops at different points during the second half of the first year. Moreover, these different word segmentation cues do not all carry the same importance for language learners.

Several recent investigations in our laboratory show that infants appear to rely more heavily on some types of cues than others. In these studies, infants are tested in situations in which two different types of segmentation cues are pitted against each other. We will review findings from English-learners indicating that, when cues conflict, infants rely more heavily on stress cues than on phonotactic and statistical cues.

In addition to the changes that occur in the kinds of cues that infants use to recognize words in fluent speech, there are indications that their abilities evolve in other ways. Most past studies of infant word segmentation abilities have used target items that begin and end with segments such as stop consonants (e.g., 'dog'). However, words in languages such as English may also begin with vowels, such as 'ice' and 'eat'. In order to master the language, learners must also be able to segment such words from fluent speech. Two recent investigations conducted in our laboratory suggest that infants may not begin to segment such words until some time around 16 months of age. Similarly, the ability of infants to generalize from words produced by one talker to those of another talker also appears to undergo considerable improvement during this period. We will discuss our findings in both these areas and consider their implications for the way in which word segmentation abilities develop.


paper

Phonological representation and word segmentation

Kim Plunkett, Todd Bailey, Peter E. Bryant

We report on a series of three experiments, using the preferential looking task, which attempt to evaluate how children's ability to identify words in speech changes during the second year of life. In the first experiment, we find that 18-month-olds orient towards a target picture when that picture is named by an isolated word, 'dog', or when instructed to look at the target, 'Look at the dog over there'. However, they fail to orient towards the target when presented with the keyword spliced out of the instruction, '..dog..'. In contrast, 24-month-olds orient towards the target picture under all three conditions. Next, we investigate how systematic distortion of a word (changing zero, one or two features), and whether the word has been recently learnt or is well-known to the child, influence her orientation towards a target picture. In the second experiment, word onsets are manipulated by changing no features (identity), one feature (place, manner or voicing) or two features (any two of place, manner or voicing). The probability of orienting towards the target picture in the preferential looking task is compared for all three distortion conditions. In the third experiment, word codas are manipulated in the same way and target orientation is measured. The results of the second experiment indicate that 24-month-olds recognise both undistorted and distorted tokens of words, though undistorted words elicit a stronger looking effect. No difference was found between recently learnt words and well-known words. The behaviour of the 18-month-olds in this study, and that of the participants in the third experiment, have yet to be analysed. We discuss the results of these experiments in the light of the hypothesis that the phonological representations of words become increasingly flexible during the course of children's second year.