Speech perception requires rapid extraction of the linguistic content from the acoustic signal. The ability to efficiently process rapid changes in auditory information is important for decoding speech and thereby crucial during language acquisition. Investigating functional networks of speech perception in infancy might elucidate neuronal ensembles supporting perceptual abilities that gate language acquisition. Interhemispheric specializations for language have been demonstrated in infants. How these asymmetries are shaped by basic temporal acoustic properties is under debate. We recently provided evidence that newborns process non-linguistic sounds sharing temporal features with language in a differential and lateralized fashion. The present study used the same material while measuring brain responses of 6- and 3-month-old infants using simultaneous recordings of electroencephalography (EEG) and near-infrared spectroscopy (NIRS). NIRS reveals that the lateralization observed in newborns remains constant over the first months of life. While fast acoustic modulations elicit bilateral neuronal activations, slow modulations lead to right-lateralized responses. Additionally, auditory-evoked potentials and oscillatory EEG responses show differential responses for fast and slow modulations, indicating a sensitivity to temporal acoustic variations. Oscillatory responses reveal an effect of development: 6- but not 3-month-old infants show stronger theta-band desynchronization for slowly modulated sounds. Whether this developmental effect is due to increasingly fine-grained perception of spectrotemporal sounds in general remains speculative. Our findings support the notion that a more general specialization for acoustic properties can be considered the basis for the lateralization of speech perception. The results show that the concurrent assessment of vascular-based imaging and electrophysiological responses has great potential in research on language acquisition.
Previous studies have revealed that infants aged 6-10 months are able to use the acoustic correlates of major prosodic boundaries, that is, pitch change, preboundary lengthening, and pause, for the segmentation of the continuous speech signal. Moreover, investigations with American-English- and Dutch-learning infants suggest that processing prosodic boundary markings involves a weighting of these cues. This weighting seems to develop with increasing exposure to the native language and to underlie crosslinguistic variation. In the following, we report the results of four experiments using the headturn preference procedure to explore the perception of prosodic boundary cues in German infants. We presented 8-month-old infants with a sequence of names in two different prosodic groupings, with or without boundary markers. Infants discriminated both sequences when the boundary was marked by all three cues (Experiment 1) and when it was marked by a pitch change and preboundary lengthening in combination (Experiment 2). The presence of a pitch change (Experiment 3) or preboundary lengthening (Experiment 4) as a single cue did not lead to successful discrimination. Our results indicate that pause is not a necessary cue for German infants. Pitch change and preboundary lengthening in combination, but not as single cues, are sufficient. Hence, by 8 months, infants rely only on a convergence of boundary markers. Comparisons with adults' performance on the same stimulus materials suggest that the pattern observed with the 8-month-olds is already consistent with that of adults. We discuss our findings with respect to crosslinguistic variation and the development of a language-specific prosodic cue weighting.
Language and music share many rhythmic properties, such as variations in intensity and duration leading to repeating patterns. Perception of rhythmic properties may rely on cognitive networks that are shared between the two domains. If so, then variability in speech rhythm perception may relate to individual differences in musicality. To examine this possibility, the present study focuses on rhythmic grouping, which is assumed to be guided by a domain-general principle, the Iambic/Trochaic law, stating that sounds alternating in intensity are grouped as strong-weak, and sounds alternating in duration are grouped as weak-strong. German listeners completed a grouping task: They heard streams of syllables alternating in intensity, duration, or neither, and had to indicate whether they perceived a strong-weak or weak-strong pattern. Moreover, their music perception abilities were measured, and they filled out a questionnaire reporting their productive musical experience. Results showed that better musical rhythm perception ability was associated with more consistent rhythmic grouping of speech, while melody perception ability and productive musical experience were not. This suggests shared cognitive procedures in the perception of rhythm in music and speech. Also, the results highlight the relevance of considering individual differences in musicality when aiming to explain variability in prosody perception.
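The Iambic/Trochaic law described above can be stated operationally: whichever acoustic dimension alternates across positions determines the predicted grouping. The sketch below is a minimal illustration of that principle only; the input representation (intensity in dB, duration in ms) and the comparison logic are assumptions made for the example, not the study's analysis method.

```python
# Illustrative sketch of the Iambic/Trochaic law: streams alternating in
# intensity are grouped strong-weak (trochaic); streams alternating in
# duration are grouped weak-strong (iambic). Units and thresholds are
# assumptions for demonstration purposes.

def predicted_grouping(syllables):
    """Predict rhythmic grouping for an alternating syllable stream.

    syllables: list of (intensity_db, duration_ms) tuples.
    Returns 'strong-weak', 'weak-strong', or None if nothing alternates.
    """
    odd = syllables[0::2]
    even = syllables[1::2]

    def mean(values):
        return sum(values) / len(values)

    # How strongly each acoustic dimension differs between alternating positions.
    intensity_diff = abs(mean([i for i, _ in odd]) - mean([i for i, _ in even]))
    duration_diff = abs(mean([d for _, d in odd]) - mean([d for _, d in even]))

    if intensity_diff == 0 and duration_diff == 0:
        return None  # no alternation in either cue
    if intensity_diff >= duration_diff:
        return "strong-weak"  # intensity alternation -> trochaic grouping
    return "weak-strong"      # duration alternation -> iambic grouping

# Loud-soft alternation -> trochaic; long-short alternation -> iambic.
print(predicted_grouping([(70, 200), (60, 200), (70, 200), (60, 200)]))  # strong-weak
print(predicted_grouping([(65, 300), (65, 150), (65, 300), (65, 150)]))  # weak-strong
```

The "neither" condition in the experiment corresponds to the `None` case here, where the law makes no grouping prediction.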
Prosody is a rich source of information that heavily supports spoken language comprehension. In particular, prosodic phrase boundaries divide the continuous speech stream into chunks reflecting the semantic and syntactic structure of an utterance. This chunking or prosodic phrasing plays a critical role in both spoken language processing and language acquisition. Aiming at a better understanding of the underlying processing mechanisms and their acquisition, the present work investigates factors that influence prosodic phrase boundary perception in adults and infants. Using the event-related potential (ERP) technique, three experimental studies examined the role of prosodic context (i.e., phrase length) in German phrase boundary perception and of the main prosodic boundary cues, namely pitch change, final lengthening, and pause. With regard to the boundary cues, the dissertation focused on the questions which cues or cue combination are essential for the perception of a prosodic boundary and on whether and how this cue weighting develops during infancy.
Using ERPs is advantageous because the technique captures the immediate impact of (linguistic) information during on-line processing. Moreover, as it can be applied independently of specific task demands or an overt response performance, it can be used with both infants and adults. ERPs are particularly suitable to study the time course and underlying mechanisms of boundary perception, because a specific ERP component, the Closure Positive Shift (CPS), is well established as a neurophysiological indicator of prosodic boundary perception in adults.
The results of the three experimental studies first confirm that the prosodic context plays an immediate role in the processing of prosodic boundary information. Moreover, the second study reveals that adult listeners perceive a prosodic boundary also on the basis of a subset of the boundary cues available in the speech signal. Both ERP and simultaneously collected behavioral data (i.e., prosodic judgements) suggest that the combination of pitch change and final lengthening triggers boundary perception; however, when presented as a single cue, neither pitch change nor final lengthening was sufficient. Finally, testing six- and eight-month-old infants shows that the early sensitivity for prosodic information is reflected in a brain response resembling the adult CPS. For both age groups, brain responses to prosodic boundaries cued by pitch change and final lengthening revealed a positivity that can be interpreted as a CPS-like infant ERP component. In contrast, but comparable to the adults’ response pattern, pitch change as a single cue does not provoke an infant CPS. These results show that infant phrase boundary perception is not exclusively based on pause detection and hint at an early ability to exploit subtle, relational prosodic cues in speech perception.
Infants as young as six months are sensitive to prosodic phrase boundaries marked by three acoustic cues: pitch change, final lengthening, and pause. Behavioral studies suggest that a language-specific weighting of these cues develops during the first year of life; recent work on German revealed that eight-month-olds, unlike six-month-olds, are capable of perceiving a prosodic boundary on the basis of pitch change and final lengthening only. The present study uses Event-Related Potentials (ERPs) to investigate the neuro-cognitive development of prosodic cue perception in German-learning infants. In adults’ ERPs, prosodic boundary perception is clearly reflected by the so-called Closure Positive Shift (CPS). To date, there is mixed evidence on whether an infant CPS exists that signals early prosodic cue perception, or whether the CPS emerges only later—the latter implying that infantile brain responses to prosodic boundaries reflect acoustic, low-level pause detection.
We presented six- and eight-month-olds with stimuli containing either no boundary cues, only a pitch cue, or a combination of both pitch change and final lengthening. For both age groups, responses to the former two conditions did not differ, while brain responses to prosodic boundaries cued by pitch change and final lengthening showed a positivity that we interpret as a CPS-like infant ERP component. This hints at an early sensitivity to prosodic boundaries that cannot exclusively be based on pause detection. Instead, infants’ brain responses indicate an early ability to exploit subtle, relational prosodic cues in speech perception—presumably even earlier than could be concluded from previous behavioral results.
Speech scientists have long noted that the qualities of naturally produced vowels do not remain constant over their durations, regardless of whether they are nominally "monophthongs" or "diphthongs". Recent acoustic corpora show that there are consistent patterns of first (F1) and second (F2) formant frequency change across different vowel categories. The three Australian English (AusE) close front vowels /iː, ɪ, ɪə/ provide a striking example: while their midpoint or mean F1 and F2 frequencies are virtually identical, their spectral change patterns differ distinctly. The results indicate that, despite the distinct patterns of spectral change of AusE /iː, ɪ, ɪə/ in production, the perceptual relevance of spectral change is not uniform, but rather vowel-category dependent.
Perceptuomotor compatibility between phonemically identical spoken and perceived syllables has been found to speed up response times (RTs) in speech production tasks. However, research on compatibility effects between perceived and produced stimuli at the subphonemic level is limited. Using a cue-distractor task, we investigated the effects of phonemic and subphonemic congruency in pairs of vowels. On each trial, a visual cue prompted individuals to produce a response vowel, and after the visual cue appeared, a distractor vowel was presented auditorily while speakers were planning to produce the response vowel. The results revealed effects on RTs due to phonemic congruency (same vs. different vowels) between the response and distractor vowels, which resemble effects previously seen for consonants. Beyond phonemic congruency, we assessed how RTs are modulated as a function of the degree of subphonemic similarity between the response and distractor vowels. Higher similarity between the response and distractor in terms of phonological distance, defined by the number of mismatching phonological features, resulted in faster RTs. However, the exact patterns of RTs varied across response-distractor vowel pairs. We discuss how different assumptions about phonological feature representations may account for the different patterns observed in RTs across response-distractor pairs. Our findings on the effects of perceived stimuli on produced speech at a more detailed level of representation than phonemic identity necessitate a more direct and specific formulation of the perception-production link. Additionally, these results extend previously reported perceptuomotor interactions, mainly involving consonants, to vowels.
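The phonological distance used above, the number of mismatching phonological features, amounts to a Hamming distance over feature specifications. The sketch below illustrates that metric only; the feature inventory and the binary values assigned to each vowel are illustrative assumptions, not the authors' actual feature set.

```python
# Hedged sketch: phonological distance as the count of mismatching
# distinctive features. The features and vowel specifications below are
# hypothetical examples, not the feature system used in the study.

# Hypothetical binary feature specifications for a few vowels.
FEATURES = {
    "i": {"high": 1, "low": 0, "back": 0, "round": 0},
    "e": {"high": 0, "low": 0, "back": 0, "round": 0},
    "u": {"high": 1, "low": 0, "back": 1, "round": 1},
    "a": {"high": 0, "low": 1, "back": 1, "round": 0},
}

def phonological_distance(v1: str, v2: str) -> int:
    """Count the features on which two vowels mismatch (Hamming distance)."""
    f1, f2 = FEATURES[v1], FEATURES[v2]
    return sum(f1[k] != f2[k] for k in f1)

# /i/ vs /e/ mismatch only in height; /i/ vs /a/ mismatch in three features.
print(phonological_distance("i", "e"))  # 1
print(phonological_distance("i", "a"))  # 3
```

Under this metric, the abstract's finding corresponds to smaller distances (e.g., /i/-/e/) predicting faster RTs than larger ones (e.g., /i/-/a/).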
During the first year of life, infants undergo a process known as perceptual narrowing, which reduces their sensitivity to classes of stimuli which the infants do not encounter in their environment. It has been proposed that perceptual narrowing for faces and speech may be driven by shared domain-general processes. To investigate this theory, our study longitudinally tested 50 German Caucasian infants with respect to these domains first at 6 months of age followed by a second testing at 9 months of age. We used an infant-controlled habituation-dishabituation paradigm to test the infants' ability to discriminate among other-race Asian faces and non-native Cantonese speech tones, as well as same-race Caucasian faces as a control. We found that while at 6 months of age infants could discriminate among all stimuli, by 9 months of age they could no longer discriminate among other-race faces or non-native tones. However, infants could discriminate among same-race stimuli both at 6 and at 9 months of age. These results demonstrate that the same infants undergo perceptual narrowing for both other-race faces and non-native speech tones between the ages of 6 and 9 months. This parallel development of perceptual narrowing occurring in both the face and speech perception modalities over the same period of time lends support to the domain-general theory of perceptual narrowing in face and speech perception.