Spectral change and duration as cues in Australian English listeners' front vowel categorization
(2018)
Australian English /iː/, /ɪ/, and /ɪə/ exhibit almost identical average first (F1) and second (F2) formant frequencies and differ in duration and vowel inherent spectral change (VISC). The cues of duration, F1 × F2 trajectory direction (TD) and trajectory length (TL) were assessed in listeners' categorization of /iː/ and /ɪə/ compared to /ɪ/. Duration was important for distinguishing both /iː/ and /ɪə/ from /ɪ/. TD and TL were important for categorizing /iː/ versus /ɪ/, whereas only TL was important for /ɪə/ versus /ɪ/. Finally, listeners' use of duration and VISC was not mutually affected for either vowel compared to /ɪ/.
Speech scientists have long noted that the qualities of naturally produced vowels do not remain constant over their durations, regardless of whether they are nominally "monophthongs" or "diphthongs". Recent acoustic corpora show that there are consistent patterns of first (F1) and second (F2) formant frequency change across different vowel categories. The three Australian English (AusE) close front vowels /iː, ɪ, ɪə/ provide a striking example: while their midpoint or mean F1 and F2 frequencies are virtually identical, their spectral change patterns differ distinctly. The results indicate that, despite the distinct patterns of spectral change of AusE /iː, ɪ, ɪə/ in production, the perceptual relevance of spectral change is not uniform, but rather vowel-category dependent.
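The two spectral-change cues named above, trajectory length (TL) and trajectory direction (TD) in the F1 × F2 plane, can be illustrated with a minimal sketch. It assumes that each vowel token is summarized by one onset and one offset formant measurement in Hz, that TL is the Euclidean distance between the two points, and that TD is the angle of the displacement vector. The function name `visc_cues` and these exact definitions are illustrative assumptions; the study may use a different frequency scale or measurement points.

```python
import math

def visc_cues(f1_onset, f2_onset, f1_offset, f2_offset):
    """Return (trajectory_length, trajectory_direction_deg) in the F1 x F2 plane.

    Assumption: formants are given in Hz at two time points (onset, offset).
    """
    d_f1 = f1_offset - f1_onset  # change along the F1 axis
    d_f2 = f2_offset - f2_onset  # change along the F2 axis
    # Trajectory length: Euclidean distance between onset and offset points.
    tl = math.hypot(d_f1, d_f2)
    # Trajectory direction: angle of the displacement vector, measured
    # counterclockwise from the positive F1 axis and mapped to [0, 360).
    td = math.degrees(math.atan2(d_f2, d_f1)) % 360.0
    return tl, td

# Hypothetical token: F1 rises by 300 Hz and F2 rises by 400 Hz.
tl, td = visc_cues(300.0, 2000.0, 600.0, 2400.0)
```

Two tokens with the same midpoint formants can differ in TL, in TD, or in both, which is why the cues can be assessed separately.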
During a cue-distractor task, participants repeatedly produce syllables prompted by visual cues. Distractor syllables are presented to participants via headphones 150 ms after the visual cue (before any response). The task has been used to demonstrate perceptuomotor integration effects (perception effects on production): response times (RTs) speed up as the distractor shares more phonetic properties with the response. Here it is demonstrated that perceptuomotor integration is not limited to RTs. Voice Onset Times (VOTs) of the distractor syllables were systematically varied and their impact on responses was measured. Results demonstrate trial-specific convergence of response syllables to VOT values of distractor syllables.
In a preferential looking paradigm, we studied how children's looking behavior and pupillary response were modulated by the degree of phonological mismatch between the correct label of a target referent and its manipulated form. We manipulated degree of mismatch by introducing one or more featural changes to the target label. Both looking behavior and pupillary response were sensitive to degree of mismatch, corroborating previous studies that found differential responses in one or the other measure. Using time-course analyses, we present for the first time results demonstrating full separability among conditions (detecting difference not only between one vs. more, but also between two and three featural changes). Furthermore, the correct labels and small featural changes were associated with stable target preference, while large featural changes were associated with oscillating looking behavior, suggesting significant shifts in looking preference over time. These findings further support and extend the notion that early words are represented in great detail, containing subphonemic information.
This paper addresses the relation between syllable structure and inter-segmental temporal coordination. The data examined are Electromagnetic Articulometry recordings from six speakers of Central Peninsular Spanish (henceforth, Spanish), producing words beginning with the clusters /pl, bl, kl, gl, pɾ, kɾ, tɾ/ as well as corresponding unclustered sonorant-initial words in three vowel contexts /a, e, o/. In our results, we find evidence for a global organization of the segments involved in these combinations. This is reflected in a number of ways: shortening of the prevocalic sonorant in the cluster-initial case compared to the unclustered case, reorganization of the relative timing of the internal CV subsequence (in a CCV) in the obstruent-lateral context, early vowel initiation, and a strong compensatory relation between the duration of the obstruent-to-lateral transition and the duration of the lateral. In other words, we find that the global organization presiding over the segments partaking in these tautosyllabic CCVs is pleiotropic, that is, simultaneously expressed over a set of different phonetic parameters rather than via a privileged metric such as c-center stability or any other such given single measure (employed in prior works).
Using articulatory data from five German speakers, we study how segmental sequences under different syllabic organizations respond to perturbations of phonetic parameters in the segments that compose them. Target words contained stop-lateral sequences /bl, gl, kl, pl/ in word-initial and cross-word contexts and were embedded in carrier phrases with different prosodic boundaries, i.e., no phrase boundary versus an utterance phrase boundary preceded the target word in the case of word-initial clusters, or separated the consonants in the case of cross-word sequences. For word-initial cluster (CCV) onsets, we find that increasing C1 stop duration or the lag between two consonants leads to earlier vowel initiation and reduced local timing stability across CV, CCV. Furthermore, as the inter-consonantal lag increases, C2 duration decreases. In contrast, for cross-word C#CV sequences, increasing inter-consonantal lag does not lead to earlier vowel initiation and robust local timing stability is maintained across CV, C#CV. In other words, in CCV sequences within words, local perturbations to segments have effects that ripple through the rest of the sequence. Instead, in cross-word C#CV sequences, local perturbations stay local. Overall, the findings indicate that the effects of phonetic perturbations on coordination patterns depend on the syllabic organization superimposed on these clusters.
We propose a theory of how the speech gesture determines change in a functionally relevant variable of vocal tract state (e.g., constriction degree). A core postulate of the theory is that the gesture determines how the variable evolves in time independent of any executive timekeeper. That is, the theory involves intrinsic timing of speech gestures. We compare the theory against others in which an executive timekeeper determines change in vocal tract state. Theories that employ an executive timekeeper have been proposed to correct for disparities between theoretically predicted and experimentally observed velocity profiles. Such theories of extrinsic timing make the gesture a nonautonomous dynamical system. For a nonautonomous dynamical system, the change in state depends not just on the state but also on time. We show that this nonautonomous extension makes surprisingly weak kinematic predictions both qualitatively and quantitatively. We propose instead that the gesture is a theoretically simpler nonlinear autonomous dynamical system. For the proposed nonlinear autonomous dynamical system, the change in state depends nonlinearly on the state and does not depend on time. This new theory provides formal expression to the notion of intrinsic timing. Furthermore, it predicts experimentally observed relations among kinematic variables.
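The contrast between a linear and a nonlinear autonomous system can be made concrete with a small numerical sketch. The abstract does not state the equation of motion, so everything specific here is an assumption: a critically damped second-order system whose state `x` is the distance from the gesture's target, plus an illustrative cubic term `d * x**3` that makes the restoring force depend nonlinearly on the state. Neither system contains an explicit time term. The parameter values and the name `time_to_peak_speed` are hypothetical.

```python
import math

def time_to_peak_speed(k, d, x0=1.0, dt=1e-3, t_end=20.0):
    """Integrate x'' = -k*x - b*x' + d*x**3 and return (time of peak speed, final x).

    Assumptions: critical damping (b = 2*sqrt(k)), start at rest at distance
    x0 from the target (x = 0), and d < k so the target remains attracting.
    The right-hand side depends only on the state (x, v), never on t.
    """
    b = 2.0 * math.sqrt(k)
    x, v, t = x0, 0.0, 0.0
    peak_speed, peak_time = 0.0, 0.0
    while t < t_end:
        a = -k * x - b * v + d * x ** 3  # state-dependent acceleration
        v += a * dt                      # semi-implicit Euler step
        x += v * dt
        t += dt
        if abs(v) > peak_speed:
            peak_speed, peak_time = abs(v), t
    return peak_time, x

# Linear case (d = 0) versus an illustrative nonlinear case (d = 0.5).
t_lin, x_lin = time_to_peak_speed(k=1.0, d=0.0)
t_nl, x_nl = time_to_peak_speed(k=1.0, d=0.5)
```

Under these assumptions both systems settle at the target, but the nonlinear one reaches peak speed later, so the shape of the velocity profile changes without any time-dependent term. This is only a sketch of the general idea, not a reproduction of the theory's predictions.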
We asked whether invariant phonetic indices for syllable structure can be identified in a language where word-initial consonant clusters, regardless of their sonority profile, are claimed to be parsed heterosyllabically. Four speakers of Moroccan Arabic were recorded, using Electromagnetic Articulography. Pursuing previous work, we employed temporal diagnostics for syllable structure, consisting of static correspondences between any given phonological organisation and its presumed phonetic indices. We show that such correspondences offer only a partial understanding of the relation between syllabic organisation and continuous indices of that organisation. We analyse the failure of the diagnostics and put forth a new approach in which different phonological organisations prescribe different ways in which phonetic indices change as phonetic parameters are scaled. The main finding is that invariance is found in these patterns of change, rather than in static correspondences between phonological constructs and fixed values for their phonetic indices.
Drawing on phonology research within the generative linguistics tradition, stochastic methods, and notions from complex systems, we develop a modelling paradigm linking phonological structure, expressed in terms of syllables, to speech movement data acquired with 3D electromagnetic articulography and X-ray microbeam methods. The essential variable in the models is syllable structure. When mapped to discrete coordination topologies, syllabic organization imposes systematic patterns of variability on the temporal dynamics of speech articulation. We simulated these dynamics under different syllabic parses and evaluated simulations against experimental data from Arabic and English, two languages claimed to parse similar strings of segments into different syllabic structures. Model simulations replicated several key experimental results, including the fallibility of past phonetic heuristics for syllable structure, and exposed the range of conditions under which such heuristics remain valid. More importantly, the modelling approach consistently diagnosed syllable structure, proving resilient to multiple sources of variability in experimental data, including measurement variability, speaker variability, and contextual variability. Prospects for extensions of our modelling paradigm to acoustic data are also discussed.