Previous studies have revealed that infants aged 6-10 months are able to use the acoustic correlates of major prosodic boundaries, that is, pitch change, preboundary lengthening, and pause, for the segmentation of the continuous speech signal. Moreover, investigations with American-English- and Dutch-learning infants suggest that processing prosodic boundary markings involves a weighting of these cues. This weighting seems to develop with increasing exposure to the native language and to underlie crosslinguistic variation. In the following, we report the results of four experiments using the headturn preference procedure to explore the perception of prosodic boundary cues in German infants. We presented 8-month-old infants with a sequence of names in two different prosodic groupings, with or without boundary markers. Infants discriminated both sequences when the boundary was marked by all three cues (Experiment 1) and when it was marked by a pitch change and preboundary lengthening in combination (Experiment 2). The presence of a pitch change (Experiment 3) or preboundary lengthening (Experiment 4) as single cues did not lead to a successful discrimination. Our results indicate that pause is not a necessary cue for German infants. Pitch change and preboundary lengthening in combination, but not as single cues, are sufficient. Hence, by 8 months infants only rely on a convergence of boundary markers. Comparisons with adults' performance on the same stimulus materials suggest that the pattern observed with the 8-month-olds is already consistent with that of adults. We discuss our findings with respect to crosslinguistic variation and the development of a language-specific prosodic cue weighting.
This study compares the development of prosodic processing in French- and German-learning infants. The emergence of language-specific perception of phrase boundaries was directly tested using the same stimuli across these two languages. French-learning (Experiment 1, 2) and German-learning 6- and 8-month-olds (Experiment 3) listened to the same French noun sequences with or without major prosodic boundaries ([Loulou et Manou] [et Nina]; [Loulou et Manou et Nina], respectively). The boundaries were either naturally cued (Experiment 1), or cued exclusively by pitch and duration (Experiment 2, 3). French-learning 6- and 8-month-olds both perceived the natural boundary, but neither perceived the boundary when only two cues were present. In contrast, German-learning infants develop from not perceiving the two-cue boundary at 6 months to perceiving it at 8 months, just like German-learning 8-month-olds listening to German (Wellmann, Holzgrefe, Truckenbrodt, Wartenburger, & Höhle, 2012). In a control experiment (Experiment 4), we found little difference between German and French adult listeners, suggesting that later, French listeners catch up with German listeners. Taken together, these cross-linguistic differences in the perception of identical stimuli provide direct evidence for language-specific development of prosodic boundary perception.
On the distribution of dorsals in complex and simple onsets in child German, Dutch and English
(2009)
In a preferential looking paradigm, we studied how children's looking behavior and pupillary response were modulated by the degree of phonological mismatch between the correct label of a target referent and its manipulated form. We manipulated degree of mismatch by introducing one or more featural changes to the target label. Both looking behavior and pupillary response were sensitive to degree of mismatch, corroborating previous studies that found differential responses in one or the other measure. Using time-course analyses, we present for the first time results demonstrating full separability among conditions (detecting difference not only between one vs. more, but also between two and three featural changes). Furthermore, the correct labels and small featural changes were associated with stable target preference, while large featural changes were associated with oscillating looking behavior, suggesting significant shifts in looking preference over time. These findings further support and extend the notion that early words are represented in great detail, containing subphonemic information.
Our paper reports an act-out task with German 5- and 6-year-olds and adults involving doubly-quantified sentences with a universal object and an existential subject. We found that 5- and 6-year-olds allow inverse scope in such sentences, while adults do not. Our findings contribute to a growing body of research (e.g. Gualmini et al. 2008; Musolino 2009, etc.) showing that children are more flexible in their scopal considerations than initially proposed by the Isomorphism proposal (Lidz & Musolino 2002; Musolino & Lidz 2006). This result provides support for a theory of German, a “no quantifier raising”-language, in terms of soft violable constraints, or global economy terms (Bobaljik & Wurmbrand 2012), rather than in terms of hard inviolable constraints or rules (Frey 1993). Finally, the results are compatible with Reinhart’s (2004) hypothesis that children do not perform global interface economy considerations due to the increased processing associated with it.
Previous research on young children's knowledge of prosodic focus marking has revealed an apparent paradox, with comprehension appearing to lag behind production. Comprehension of prosodic focus is difficult to study experimentally due to its subtle and ambiguous contribution to pragmatic meaning. We designed a novel comprehension task, which revealed that three- to six-year-old children show adult-like comprehension of the prosodic marking of subject and object focus. Our findings thus support the view that production does not precede comprehension in the acquisition of focus. We tested participants speaking English, German, and French. All three languages allow prosodic subject and object focus marking, but use additional syntactic marking to varying degrees (English: dispreferred; German: possible; French: preferred). French participants produced fewer subject-marked responses than English participants. We found no other cross-linguistic differences. Participants interpreted prosodic focus marking similarly and in an adult-like fashion in all three languages.
Prosody plays an important role in early language acquisition that in most children proceeds rapidly and easily. From birth on infants are able to perceive prosodic information in the speech signal. During the course of the first year of life prosodic perception abilities continue to develop. Cross-linguistic studies have shown that this development is already influenced by the native language. As prosodic and syntactic units occur often in correlation, prosodic cues in the continuous speech signal might help infants to derive information on how to segment their native language into syntactically relevant units. Indeed, infants use their prosodic perception and are able to detect word, phrase and clause boundaries using prosodic cues from the speech signal. Thus, during the first year of life when perceiving speech the processing of prosodic cues is focussed and allows for an efficient access to language acquisition. Future studies need to determine whether early prosodic perception abilities can provide markers for later language development and predict language impairment.
The recognition of the prosodic focus position in German-learning infants from 4 to 14 months
(2006)
The aim of the present study was to elucidate, with 4-, 6-, 8-, and 14-month-old German-learning children, when and how they may acquire the regularities which underlie Focus-to-Stress Alignment (FSA) in the target language, that is, how prosody is associated with specific communicative functions. Our findings suggest that 14-month-olds have already found out that German allows for variable focus positions, after having gone through a development which goes from a predominantly prosodically driven processing of the input to a processing where prosody interacts more and more with the growing lexical and syntactic knowledge of the child.
We report two corpus analyses to examine the impact of animacy, definiteness, givenness and type of referring expression on the ordering of double objects in the spontaneous speech of German-speaking two- to four-year-old children and the child-directed speech of their mothers. The first corpus analysis revealed that definiteness, givenness and type of referring expression influenced word order variation in child language and child-directed speech when the type of referring expression distinguished between pronouns and lexical noun phrases. These results correspond to previous child language studies in English (e.g., de Marneffe et al. 2012). Extending the scope of previous studies, our second corpus analysis examined the role of different pronoun types on word order. It revealed that word order in child language and child-directed speech was predictable from the types of pronouns used. Different types of pronouns were associated with different sentence positions but also showed a strong correlation to givenness and definiteness. Yet, the distinction between pronoun types diminished the effects of givenness so that givenness had an independent impact on word order only in child-directed speech but not in child language. Our results support a multi-factorial approach to word order in German. Moreover, they underline the strong impact of the type of referring expression on word order and suggest that it plays a crucial role in the acquisition of the factors influencing word order variation.
One of the most important social cognitive skills in humans is the ability to “put oneself in someone else’s shoes,” that is, to take another person’s perspective. In socially situated communication, perspective taking enables the listener to arrive at a meaningful interpretation of what is said (sentence meaning) and what is meant (speaker’s meaning) by the speaker. To successfully decode the speaker’s meaning, the listener has to take into account which information he/she and the speaker share in their common ground (CG). We here further investigated competing accounts about when and how CG information affects language comprehension by means of reaction time (RT) measures, accuracy data, event-related potentials (ERPs), and eye-tracking. Early integration accounts would predict that CG information is considered immediately and would hence not expect to find costs of CG integration. Late integration accounts would predict a rather late and effortful integration of CG information during the parsing process that might be reflected in integration or updating costs. Other accounts predict the simultaneous integration of privileged ground (PG) and CG perspectives. We used a computerized version of the referential communication game with object triplets of different sizes presented visually in CG or PG. In critical trials (i.e., conflict trials), CG information had to be integrated while privileged information had to be suppressed. Listeners mastered the integration of CG (response accuracy 99.8%). Yet, slower RTs, and enhanced late positivities in the ERPs showed that CG integration had its costs. Moreover, eye-tracking data indicated an early anticipation of referents in CG but an inability to suppress looks to the privileged competitor, resulting in later and longer looks to targets in those trials, in which CG information had to be considered. 
Our data therefore support accounts that foresee an early anticipation of referents to be in CG but a rather late and effortful integration if conflicting information has to be processed. We show that both perspectives, PG and CG, contribute to socially situated language processing and discuss the data with reference to theoretical accounts and recent findings on the use of CG information for reference resolution.
This study investigates prosodic phrasing of bracketed lists in German. We analyze variation in pauses, phrase-final lengthening and f0 in speech production and how these cues affect boundary perception. In line with the literature, it was found that pauses are often used to signal intonation phrase boundaries, while final lengthening and f0 are employed across different levels of the prosodic hierarchy. Deviations from expectations based on the standard syntax-prosody mapping are interpreted in terms of task-specific effects. That is, we argue that speakers add/delete prosodic boundaries to enhance the phonological contrast between different bracketings in the experimental task. In perception, three experiments were run, in which we tested only single cues (but temporally distributed at different locations of the sentences). Results from identification tasks and reaction time measurements indicate that pauses lead to a more abrupt shift in listeners' prosodic judgments, while f0 and final lengthening are exploited in a more gradient manner. Hence, pauses, final lengthening and f0 have an impact on boundary perception, though listeners show different sensitivity to the three acoustic cues.
Previous research has shown that high phonotactic frequencies facilitate the production of regularly inflected verbs in English-learning children with specific language impairment (SLI) but not with typical development (TD). We asked whether this finding can be replicated for German, a language with a much more complex inflectional verb paradigm than English. Using an elicitation task, the production of inflected nonce verb forms (3rd person singular with -t suffix) with either high- or low-frequency subsyllables was tested in sixteen German-learning children with SLI (ages 4;1-5;1), sixteen TD-children matched for chronological age (CA) and fourteen TD-children matched for verbal age (VA) (ages 3;0-3;11). The findings revealed that children with SLI, but not CA- or VA-children, showed differential performance between the two types of verbs, producing more inflectional errors when the verb forms resulted in low-frequency subsyllables than when they resulted in high-frequency subsyllables, replicating the results from English-learning children.
This paper investigates the predictions of the Derivational Complexity Hypothesis by studying the acquisition of wh-questions in 4- and 5-year-old Akan-speaking children in an experimental approach using an elicited production and an elicited imitation task. Akan has two types of wh-question structures (wh-in-situ and wh-ex-situ questions), which allows an investigation of children’s acquisition of these two question structures and their preferences for one or the other. Our results show that adults prefer to use wh-ex-situ questions over wh-in-situ questions. The results from the children show that both age groups have the two question structures in their linguistic repertoire. However, they differ in their preferences in usage in the elicited production task: while the 5-year-olds preferred the wh-in-situ structure over the wh-ex-situ structure, the 4-year-olds showed a selective preference for the wh-in-situ structure in who-questions. These findings suggest a developmental change in wh-question preferences in Akan-learning children between 4 and 5 years of age with a so far unobserved u-shaped developmental pattern. In the elicited imitation task, all groups showed a strong tendency to maintain the structure of in-situ and ex-situ questions in repeating grammatical questions. When repairing ungrammatical ex-situ questions, participants hardly ever changed the structure to a grammatical in-situ question; instead, they inserted the missing morphemes while keeping the ex-situ structure. Together, our findings provide only partial support for the Derivational Complexity Hypothesis.
Young infants can segment continuous speech with statistical as well as prosodic cues. Understanding how these cues interact can be informative about how infants solve the segmentation problem. Here we investigate how German-speaking adults and 9-month-old German-learning infants weigh statistical and prosodic cues when segmenting continuous speech. We measured participants' pupil size while they were familiarized with a continuous speech stream where prosodic cues were pitted against transitional probabilities. Adult participants' changes in pupil size synchronized with the occurrence of prosodic words during the familiarization and the temporal alignment of these pupillary changes was predictive of adult participants' performance at test. Further, 9-month-olds as a group failed to consistently segment the familiarization stream with prosodic or statistical cues. However, the variability in temporal alignment of the pupillary changes at word frequency showed that prosodic and statistical cues compete for dominance when segmenting continuous speech. A follow-up language development questionnaire at 40 months of age suggested that infants who entrained to prosodic words performed better on a vocabulary task and those infants who relied more on statistical cues performed better on grammatical tasks. Together these results suggest that statistics and prosody may serve different roles in speech segmentation in infancy.
Infants show impressive speech decoding abilities and detect acoustic regularities that highlight the syntactic relations of a language, often coded via non-adjacent dependencies (NADs, e.g., is singing). It has been claimed that infants learn NADs implicitly and associatively through passive listening and that there is a shift from effortless associative learning to a more controlled learning of NADs after the age of 2 years, potentially driven by the maturation of the prefrontal cortex. To investigate if older children are able to learn NADs, Lammertink et al. (2019) recently developed a word-monitoring serial reaction time (SRT) task and could show that 6–11-year-old children learned the NADs, as their reaction times (RTs) increased when they were presented with violated NADs. In the current study we adapted their experimental paradigm and tested NAD learning in a younger group of 52 children between the age of 4–8 years in a remote, web-based, game-like setting (whack-a-mole). Children were exposed to Italian phrases containing NADs and had to monitor the occurrence of a target syllable, which was the second element of the NAD. After exposure, children did a “Stem Completion” task in which they were presented with the first element of the NAD and had to choose the second element of the NAD to complete the stimuli. Our findings show that, despite large variability in the data, children aged 4–8 years are sensitive to NADs; they show the expected differences in RTs in the SRT task and could transfer the NAD-rule in the Stem Completion task. We discuss these results with respect to the development of NAD learning in childhood and the practical impact and limitations of collecting these data in a web-based setting.
The ability to determine how many objects are involved in physical events is fundamental for reasoning about the world that surrounds us. Previous studies suggest that infants can fail to individuate objects in ambiguous occlusion events until their first birthday and that learning words for the objects may play a crucial role in the development of this ability. The present eye-tracking study tested whether the classical object individuation experiments underestimate young infants’ ability to individuate objects and the role word learning plays in this process. Three groups of 6-month-old infants (N = 72) saw two opaque boxes side by side on the eye-tracker screen so that the content of the boxes was not visible. During a familiarization phase, two visually identical objects emerged sequentially from one box and two visually different objects from the other box. For one group of infants the familiarization was silent (Visual Only condition). For a second group of infants the objects were accompanied with nonsense words so that objects’ shape and linguistic labels indicated the same number of objects in the two boxes (Visual & Language condition). For the third group of infants, objects’ shape and linguistic labels were in conflict (Visual vs. Language condition). Following the familiarization, it was revealed that both boxes contained the same number of objects (e.g. one or two). In the Visual Only condition, infants looked longer to the box with incorrect number of objects at test, showing that they could individuate objects using visual cues alone. In the Visual & Language condition infants showed the same looking pattern. However, in the Visual vs Language condition infants looked longer to the box with incorrect number of objects according to linguistic labels. 
The results show that infants can individuate objects in a complex object individuation paradigm considerably earlier than previously thought and that linguistic cues enforce their own preference in object individuation. The results are consistent with the idea that when language and visual information are in conflict, language can exert an influence on how young infants reason about the visual world.
During the first year of life, infants undergo a process known as perceptual narrowing, which reduces their sensitivity to classes of stimuli which the infants do not encounter in their environment. It has been proposed that perceptual narrowing for faces and speech may be driven by shared domain-general processes. To investigate this theory, our study longitudinally tested 50 German Caucasian infants with respect to these domains first at 6 months of age followed by a second testing at 9 months of age. We used an infant-controlled habituation-dishabituation paradigm to test the infants' ability to discriminate among other-race Asian faces and non-native Cantonese speech tones, as well as same-race Caucasian faces as a control. We found that while at 6 months of age infants could discriminate among all stimuli, by 9 months of age they could no longer discriminate among other-race faces or non-native tones. However, infants could discriminate among same-race stimuli both at 6 and at 9 months of age. These results demonstrate that the same infants undergo perceptual narrowing for both other-race faces and non-native speech tones between the ages of 6 and 9 months. This parallel development of perceptual narrowing occurring in both the face and speech perception modalities over the same period of time lends support to the domain-general theory of perceptual narrowing in face and speech perception.
During the first year of life, infants undergo perceptual narrowing in the domains of speech and face perception. This is typically characterized by improvements in infants' abilities in discriminating among stimuli of familiar types, such as native speech tones and same-race faces. Simultaneously, infants begin to decline in their ability to discriminate among stimuli of types with which they have little experience, such as nonnative tones and other-race faces. The similarity in time-frames during which perceptual narrowing seems to occur in the domains of speech and face perception has led some researchers to hypothesize that the perceptual narrowing in these domains could be driven by shared domain-general processes. To explore this hypothesis, we tested 53 Caucasian 9-month-old infants from monolingual German households on their ability to discriminate among non-native Cantonese speech tones, as well as among same-race German faces and other-race Chinese faces. We tested the infants using an infant-controlled habituation-dishabituation paradigm, with infants' preferences for looking at novel stimuli versus the habituated stimuli (dishabituation scores) acting as indicators of discrimination ability. As expected for their age, infants were able to discriminate between same-race faces, but not between other-race faces or non-native speech tones. Most interestingly, we found that infants' dishabituation scores for the non-native speech tones and other-race faces showed significant positive correlations, while the dishabituation scores for non-native speech tones and same-race faces did not. These results therefore support the hypothesis that shared domain-general mechanisms may drive perceptual narrowing in the domains of speech and face perception.
The other-race effect (ORE) can be described as difficulties in discriminating between faces of ethnicities other than one's own, and can already be observed at approximately 9 months of age. Recent studies also showed that infants visually explore same- and other-race faces differently. However, it is still unclear whether infants' looking behavior for same- and other-race faces is related to their face discrimination abilities. To investigate this question we conducted a habituation-dishabituation experiment to examine Caucasian 9-month-old infants' gaze behavior, and their discrimination of same- and other-race faces, using eye-tracking measurements. We found that infants looked longer at the eyes of same-race faces over the course of habituation, as compared to other-race faces. After habituation, infants demonstrated a clear other-race effect by successfully discriminating between same-race faces, but not other-race faces. Importantly, the infants' ability to discriminate between same-race faces significantly correlated with their fixation time towards the eyes of same-race faces during habituation. Thus, our findings suggest that for infants old enough to begin exhibiting the ORE, gaze behavior during habituation is related to their ability to differentiate among same-race faces, compared to other-race faces.
Perceptual narrowing in the domain of face perception typically begins to reduce infants' sensitivity to differences distinguishing other-race faces from approximately 6 months of age. The present study investigated whether it is possible to re-sensitize Caucasian 12-month-old infants to other-race Asian faces through statistical learning by familiarizing them with different statistical distributions of these faces. The familiarization faces were created by generating a morphed continuum from one Asian face identity to another. In the unimodal condition, infants were familiarized with a frequency distribution wherein they saw the midpoint face of the morphed continuum the most frequently. In the bimodal condition, infants were familiarized with a frequency distribution wherein they saw faces closer to the endpoints of the morphed continuum the most frequently. After familiarization, infants were tested on their discrimination of the two original Asian faces. The infants' looking times during the test indicated that infants in the bimodal condition could discriminate between the two faces, while infants in the unimodal condition could not. These findings therefore suggest that 12-month-old Caucasian infants could be re-sensitized to Asian faces by familiarizing them with a bimodal frequency distribution of such faces.
German-learning infants' ability to detect unstressed closed-class elements in continuous speech
(2003)
Phonological specificity of early lexical representations in German 19-month-olds at risk for SLI
(2006)
In this article we report on early rhythmic discrimination performance of children who participated in a longitudinal study following children from birth to their 6th year of life. Thirty-four children including 8 children with a family risk for developmental language impairment were tested on the discrimination of trochaic and iambic disyllabic sequences when they were 4 months old. At 5 years of age, standardized measures on language performance (SETK3-5) and nonverbal intelligence (SON-R) were obtained. Overall, evidence of discrimination of the rhythmic patterns was found only for children without a family risk. The performance in early rhythmic discrimination correlated with the later outcomes in SETK3-5 subtests on sentence comprehension and morphological skills, but not with subtests related to memory performance nor with nonverbal intelligence. Our results suggest that indicators of language development can be discovered as early as 4 months of age, and seem to correlate with later outcomes in rather specific language skills.
How do children determine the syntactic category of novel words? In this article we present the results of 2 experiments that investigated whether German children between 12 and 16 months of age can use distributional knowledge that determiners precede nouns and subject pronouns precede verbs to syntactically categorize adjacent novel words. Evidence from the head-turn preference paradigm shows that, although 12- to 13-month-olds cannot do this, 14- to 16-month-olds are able to use a determiner to categorize a following novel word as a noun. In contrast, no categorization effect was found for a novel word following a subject pronoun. To understand this difference we analyzed adult child-directed speech. This analysis showed that there are in fact stronger co-occurrence relations between determiners and nouns than between subject pronouns and verbs. Thus, in German determiners may be more reliable cues to the syntactic category of an adjacent novel word than are subject pronouns. We propose that the capacity to syntactically categorize novel words, demonstrated here for the first time in children this young, mediates between the recognition of the specific morphosyntactic frame in which a novel word appears and the word-to-world mapping that is needed to build up a semantic representation for the novel word.
Two experiments tested how faithfully German children aged 4;5 to 5;6 reproduce ditransitive sentences that are unmarked or marked with respect to word order and focus (Exp1) or definiteness (Exp2). Adopting an optimality theory (OT) approach, it is assumed that in the German adult grammar word order is ranked lower than focus and definiteness. Faithfulness of children's reproductions decreased as markedness of inputs increased; unmarked structures were reproduced most faithfully and unfaithful outputs had most often an unmarked form. Consistent with the OT proposal, children were more tolerant against inputs marked for word order than for focus; in conflict with the proposal, children were less tolerant against inputs marked for word order than for definiteness. Our results suggest that the linearization of objects in German double object constructions is affected by focus and definiteness, but that prosodic principles may have an impact on the position of a focused constituent.
Children’s interpretations of sentences containing focus particles do not seem adult-like until school age. This study investigates how German 4-year-old children comprehend sentences with the focus particle ‘nur’ (only) by using different tasks and controlling for the impact of general cognitive abilities on performance measures. Two sentence types with ‘only’ in either pre-subject or pre-object position were presented. Eye gaze data and verbal responses were collected via the visual world paradigm combined with a sentence-picture verification task. While the eye tracking data revealed an adult-like pattern of focus particle processing, the sentence-picture verification replicated previous findings of poor comprehension, especially for ‘only’ in pre-subject position. A second study focused on the impact of general cognitive abilities on the outcomes of the verification task. Working memory was related to children’s performance in both sentence types whereas inhibitory control was selectively related to the number of errors for sentences with ‘only’ in pre-subject position. These results suggest that children at the age of 4 years have the linguistic competence to correctly interpret sentences with focus particles, which–depending on specific task demands–may be masked by immature general cognitive abilities.
Only the right noise?
(2020)
Seminal work by Werker and colleagues (Stager & Werker [1997] Nature, 388, 381-382) has found that 14-month-old infants do not show evidence for learning minimal pairs in the habituation-switch paradigm. However, when multiple speakers produce the minimal pair in acoustically variable ways, infants' performance improves in comparison to a single speaker condition (Rost & McMurray [2009] Developmental Science, 12, 339-349). The current study further extends these results and assesses how different kinds of input variability affect 14-month-olds' minimal pair learning in the habituation-switch paradigm testing German-learning infants. The first two experiments investigated word learning when the labels were spoken by a single speaker versus when the labels were spoken by multiple speakers. In the third experiment we studied whether non-acoustic variability, implemented by visual variability of the objects presented together with the labels, would also affect minimal pair learning. We found enhanced learning in the multiple speakers compared to the single speaker condition, confirming previous findings with English-learning infants. In contrast, visual variability of the presented objects did not support learning. These findings both confirm and better delimit the beneficial role of speech-specific variability in minimal pair learning. Finally, we review different proposals on the mechanisms via which variability confers benefits to learning and outline what may be likely principles that underlie this benefit. We highlight among these the multiplicity of acoustic cues signalling phonemic contrasts and the presence of relations among these cues. It is in these relations where we trace part of the source for the apparent paradoxical benefit of variability in learning.