Previous studies suggest that there are special timing relations in syllable onsets. The consonants are assumed to be timed, on the one hand, with the vocalic nucleus and, on the other hand, with each other. These competing timing relations result in the C-center effect. However, the C-center effect has not consistently been found in languages with complex onsets. Moreover, it has occasionally been found in languages disallowing complex onsets. The present study investigates onset timing in German while discussing alternative explanations (not related to bonding) for the timing patterns observed. Six German speakers were recorded via Electromagnetic Articulography. The corpus contained items with four clusters (/sk/, /kv/, /gl/, and /pl/). The clusters occur in word-initial position, word-medial position, and across a word boundary preceding different vowels. The results suggest that segmental properties (i.e., oral-laryngeal coordination, coarticulatory resistance) determine the observed timing patterns, and specifically the absence or presence of the C-center effect.
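The C-center effect mentioned above is commonly assessed by checking which consonantal landmark keeps a stable interval to a fixed anchor when a singleton onset is compared with a cluster. The following is a minimal sketch of that comparison, not the study's analysis pipeline: the landmark times (in ms), the anchor, and the function names `ccenter_lag` and `right_edge_lag` are all hypothetical and introduced only for illustration.

```python
from statistics import mean

def midpoint(onset, offset):
    """Temporal midpoint of one consonant's constriction plateau."""
    return (onset + offset) / 2.0

def ccenter_lag(plateaus, anchor):
    """Interval from the c-center (mean of the consonant midpoints) to the anchor."""
    c_center = mean([midpoint(on, off) for on, off in plateaus])
    return anchor - c_center

def right_edge_lag(plateaus, anchor):
    """Interval from the plateau offset of the prevocalic consonant to the anchor."""
    return anchor - plateaus[-1][1]

# Hypothetical plateau (onset, offset) times in ms; not measured data.
singleton = [(100.0, 140.0)]                 # e.g. a single /l/
cluster = [(60.0, 100.0), (140.0, 180.0)]    # e.g. /p/ followed by /l/
anchor = 300.0                               # assumed landmark late in the vowel

for label, item in (("singleton", singleton), ("cluster", cluster)):
    print(label, ccenter_lag(item, anchor), right_edge_lag(item, anchor))
```

In this made-up example the c-center interval is the same for both items while the right-edge interval shrinks in the cluster, which is the pattern usually labelled a C-center effect. The abstract argues that segmental properties can produce or remove this pattern, so the interval comparison alone should be read as a heuristic.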
The speed-curvature power law is a celebrated law of motor control expressing a relation between the kinematic property of speed and the geometric property of curvature. We aimed to assess whether speech movements obey this law just as movements from other domains do. We describe a metronome-driven speech elicitation paradigm designed to cover a wide range of speeds. Using electromagnetic articulometry, we recorded speech movements in sequences of the form /CV…/ from nine speakers (five German, four English) speaking at eight distinct rates. First, we demonstrate that the paradigm of metronome-driven manipulations results in speech movement data consistent with earlier reports on the kinematics of speech production. Second, analysis of our data in all three dimensions, using advanced numerical differentiation methods, offers stronger evidence for the law than that reported in previous studies devoted to its assessment. Finally, we demonstrate the presence of a clear rate dependency of the power law’s parameters. The robustness of the speed-curvature relation in our datasets lends further support to the hypothesis that the power law is a general feature of human movement. We place our results in the context of other work in movement control and consider implications for models of speech production.
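For planar movements the law is usually written as v = k · κ^(−β), with β close to 1/3, so the exponent can be estimated as the slope of a log-log regression of speed on curvature. The sketch below illustrates only that estimation step under stated assumptions: it uses a synthetic planar ellipse rather than three-dimensional articulatory data, and plain central differences rather than the advanced differentiation methods the abstract refers to. The function names are hypothetical.

```python
import math

def speed_and_curvature(xs, ys, dt):
    """Central-difference estimates of speed and curvature along a planar path."""
    speeds, curvatures = [], []
    for i in range(1, len(xs) - 1):
        dx = (xs[i + 1] - xs[i - 1]) / (2 * dt)
        dy = (ys[i + 1] - ys[i - 1]) / (2 * dt)
        ddx = (xs[i + 1] - 2 * xs[i] + xs[i - 1]) / dt ** 2
        ddy = (ys[i + 1] - 2 * ys[i] + ys[i - 1]) / dt ** 2
        v = math.hypot(dx, dy)
        speeds.append(v)
        curvatures.append(abs(dx * ddy - dy * ddx) / v ** 3)
    return speeds, curvatures

def loglog_slope(x, y):
    """Ordinary least-squares slope of log(y) regressed on log(x)."""
    lx = [math.log(a) for a in x]
    ly = [math.log(b) for b in y]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Synthetic path (an assumption, not recorded data): an ellipse traced with
# harmonic motion, which satisfies the one-third power law exactly.
dt = 0.001
ts = [i * dt for i in range(int(2 * math.pi / dt))]
xs = [3.0 * math.cos(t) for t in ts]
ys = [1.0 * math.sin(t) for t in ts]

v, kappa = speed_and_curvature(xs, ys, dt)
beta = -loglog_slope(kappa, v)
print(round(beta, 4))
```

On real movement data the differentiation step dominates the quality of the estimate, since curvature depends on second derivatives and amplifies measurement noise; the simple finite differences here would need to be replaced by a smoothing-aware method.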
We propose a theory of how the speech gesture determines change in a functionally relevant variable of vocal tract state (e.g., constriction degree). A core postulate of the theory is that the gesture determines how the variable evolves in time independent of any executive timekeeper. That is, the theory involves intrinsic timing of speech gestures. We compare the theory against others in which an executive timekeeper determines change in vocal tract state. Theories that employ an executive timekeeper have been proposed to correct for disparities between theoretically predicted and experimentally observed velocity profiles. Such theories of extrinsic timing make the gesture a nonautonomous dynamical system. For a nonautonomous dynamical system, the change in state depends not just on the state but also on time. We show that this nonautonomous extension makes surprisingly weak kinematic predictions both qualitatively and quantitatively. We propose instead that the gesture is a theoretically simpler nonlinear autonomous dynamical system. For the proposed nonlinear autonomous dynamical system, the change in state depends nonlinearly on the state and does not depend on time. This new theory provides formal expression to the notion of intrinsic timing. Furthermore, it predicts experimentally observed relations among kinematic variables.
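The distinction between the two classes of theory can be stated compactly. The following is a generic textbook formulation, not the specific equations of the proposed model: the state vector $\mathbf{s}$ (for example, the value of the vocal tract variable together with its rate of change) and the functions $f$ and $g$ are placeholders introduced here for illustration.

```latex
\begin{align}
  \text{autonomous (intrinsic timing):} \quad
    & \dot{\mathbf{s}} = f(\mathbf{s}) \\
  \text{nonautonomous (extrinsic timing):} \quad
    & \dot{\mathbf{s}} = g(\mathbf{s}, t)
\end{align}
```

In the first form the rate of change is fixed by the current state alone, so no external clock is needed. In the second, the explicit dependence on $t$ is what an executive timekeeper would supply. The proposal in the abstract corresponds to the first form with $f$ nonlinear in the state.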
We examined gestural coordination in C1C2 (C1 stop, C2 lateral or tap) word-initial clusters using articulatory (electromagnetic articulometry) and acoustic data from six speakers of Standard Peninsular Spanish. We report on patterns of voice onset time (VOT), gestural plateau duration of C1 and C2, and their overlap. For VOT, as expected, place of articulation is a major factor, with velars exhibiting longer VOTs than labials. Regarding C1 plateau duration, voice and place effects were found such that voiced consonants are significantly shorter than voiceless consonants, and velars show longer duration than labials. For C2 plateau duration, lateral duration was found to vary as a function of onset complexity (C vs. CC). As for overlap, unlike in French, where articulatory data for clusters have also been examined, clusters where both C1 and C2 are voiced show more overlap than clusters where voicing differs. Further, overlap was affected by the C2 such that clusters where C2 is a tap show less overlap than clusters where C2 is a lateral. We discuss these results in the context of work aiming to uncover phonetic (e.g., articulatory or perceptual) and phonological forces (e.g., syllabic organization) on timing.
Drawing on phonology research within the generative linguistics tradition, stochastic methods, and notions from complex systems, we develop a modelling paradigm linking phonological structure, expressed in terms of syllables, to speech movement data acquired with 3D electromagnetic articulography and X-ray microbeam methods. The essential variable in the models is syllable structure. When mapped to discrete coordination topologies, syllabic organization imposes systematic patterns of variability on the temporal dynamics of speech articulation. We simulated these dynamics under different syllabic parses and evaluated simulations against experimental data from Arabic and English, two languages claimed to parse similar strings of segments into different syllabic structures. Model simulations replicated several key experimental results, including the fallibility of past phonetic heuristics for syllable structure, and exposed the range of conditions under which such heuristics remain valid. More importantly, the modelling approach consistently diagnosed syllable structure, proving resilient to multiple sources of variability in experimental data, including measurement variability, speaker variability, and contextual variability. Prospects for extensions of our modelling paradigm to acoustic data are also discussed.
We pursue an analysis of the relation between qualitative syllable parses and their quantitative phonetic consequences. To do this, we express the statistics of a symbolic organization corresponding to a syllable parse in terms of continuous phonetic parameters which quantify the timing of the consonants and vowels that make up syllables: consonantal plateau durations, vowel durations, and their variances. These parameters can be estimated from continuous phonetic data. This enables analysis of the link between symbolic phonological form and the continuous phonetics in which this form is manifest. Pursuing such an analysis, we illustrate the predictions of the syllabic organization corresponding to simplex onsets and derive a number of results previously observed in experiments and simulations. Specifically, we derive not only the canonical phonetic manifestations of simplex onsets but also the result that, under certain conditions we make precise, the phonetic indices of the simplex onset organization change to a range of values characteristic of the complex onset organization. Finally, we explore the behavior of phonetic indices for syllabic organization over progressively increasing sizes of lexical samples, thereby concomitantly diversifying the phonetic context over which these indices are taken.
Spectral change and duration as cues in Australian English listeners' front vowel categorization
(2018)
Australian English /iː/, /ɪ/, and /ɪə/ exhibit almost identical average first (F1) and second (F2) formant frequencies and differ in duration and vowel inherent spectral change (VISC). The cues of duration, F1 × F2 trajectory direction (TD) and trajectory length (TL) were assessed in listeners' categorization of /iː/ and /ɪə/ compared to /ɪ/. Duration was important for distinguishing both /iː/ and /ɪə/ from /ɪ/. TD and TL were important for categorizing /iː/ versus /ɪ/, whereas only TL was important for /ɪə/ versus /ɪ/. Finally, listeners' use of duration and VISC was not mutually affected for either vowel compared to /ɪ/.
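The abstract does not spell out how trajectory length (TL) and trajectory direction (TD) were computed. As a rough illustration, one simple two-point formulation treats the vowel as a vector from an onset measurement to an offset measurement in the F1 × F2 plane. The sketch below assumes that formulation, measurements in Hz, and F2 on the horizontal axis; the function names and the formant values are hypothetical and are not taken from the study.

```python
import math

def trajectory_length(f1_on, f2_on, f1_off, f2_off):
    """Euclidean distance between onset and offset points in the F1 x F2 plane."""
    return math.hypot(f1_off - f1_on, f2_off - f2_on)

def trajectory_direction(f1_on, f2_on, f1_off, f2_off):
    """Angle in degrees of the onset-to-offset vector, with F2 as the horizontal axis."""
    return math.degrees(math.atan2(f1_off - f1_on, f2_off - f2_on))

# Made-up formant values in Hz, chosen only to give round numbers.
f1_on, f2_on = 300.0, 2300.0
f1_off, f2_off = 340.0, 2270.0

tl = trajectory_length(f1_on, f2_on, f1_off, f2_off)
td = trajectory_direction(f1_on, f2_on, f1_off, f2_off)
print(tl, round(td, 2))
```

Two vowels with nearly identical average F1 and F2 can still differ on these two measures, which is why they are candidate cues alongside duration. Studies often compute them on a perceptual scale such as Bark rather than in Hz, which would change the numbers but not the logic.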
Voice onset time (VOT), a primary cue for voicing in many languages including English and German, is known to vary greatly between speakers, but also displays robust within-speaker consistencies, at least in English. The current analysis extends these findings to German. VOT measures were investigated from voiceless alveolar and velar stops in CV syllables cued by a visual prompt in a cue-distractor task. As in English, a considerable portion of German VOT variability can be attributed to the syllable’s vowel length and the stop’s place of articulation. Individual differences in VOT remain irrespective of speech rate. However, significant correlations across places of articulation and between speaker-specific mean VOTs and standard deviations indicate that talkers employ a relatively unified VOT profile across places of articulation. This could allow listeners to adapt more efficiently to speaker-specific realisations.
Spatiotemporal coordination in word-medial stop-lateral and s-stop clusters of American English
(2021)
This paper is concerned with the relation between syllabic organization and intersegmental spatiotemporal coordination using Electromagnetic Articulometry recordings from seven speakers of American English (henceforth, English). Whereas previous work on English has focused on word-initial clusters (preceding a vowel whose identity was not systematically varied), the present work examined word-medial clusters /pl, kl, sp, sk/ in the context of three different vowel heights (high, mid, low). Our results provide evidence for a global organization for the segments involved in these cluster-vowel combinations. This is reflected in a number of ways: compression of the prevocalic consonant and reduction of CV timing in the word-medial cluster case compared to its singleton paired word in both stop-lateral and s-stop clusters, early vowel initiation (as permitted by the clusters' phonetic properties), and presence of compensatory relations between phonetic properties of different segments or intersegmental transitions within each cluster. In other words, we find that the global organization presiding over the segments partaking in these word-medial tautosyllabic CCVs is pleiotropic, that is, simultaneously expressed in multiple phonetic exponents rather than via a privileged metric such as c-center stability or any other such given single measure employed in previous works.