TY  - JOUR
A1  - Hecker, Pascal
A1  - Steckhan, Nico
A1  - Eyben, Florian
A1  - Schuller, Björn Wolfgang
A1  - Arnrich, Bert
T1  - Voice Analysis for Neurological Disorder Recognition – A Systematic Review and Perspective on Emerging Trends
JF  - Frontiers in Digital Health
N2  - Quantifying neurological disorders from voice is a rapidly growing field of research and holds promise for unobtrusive and large-scale disorder monitoring. The data recording setup and data analysis pipelines are both crucial aspects to effectively obtain relevant information from participants. Therefore, we performed a systematic review to provide a high-level overview of practices across various neurological disorders and highlight emerging trends. PRISMA-based literature searches were conducted through PubMed, Web of Science, and IEEE Xplore to identify publications in which original (i.e., newly recorded) datasets were collected. Disorders of interest were psychiatric as well as neurodegenerative disorders, such as bipolar disorder, depression, and stress, as well as amyotrophic lateral sclerosis, Alzheimer's, and Parkinson's disease, and speech impairments (aphasia, dysarthria, and dysphonia). Of the 43 retrieved studies, Parkinson's disease is represented most prominently with 19 discovered datasets. Free speech and read speech tasks are most commonly used across disorders. Besides popular feature extraction toolkits, many studies utilise custom-built feature sets. Correlations of acoustic features with psychiatric and neurodegenerative disorders are presented. In terms of analysis, statistical analysis for significance of individual features is commonly used, as well as predictive modeling approaches, especially with support vector machines and a small number of artificial neural networks. An emerging trend and recommendation for future studies is to collect data in everyday life to facilitate longitudinal data collection and to capture the behavior of participants more naturally. Another emerging trend is to record additional modalities to voice, which can potentially increase analytical performance.
KW  - neurological disorders
KW  - voice
KW  - speech
KW  - everyday life
KW  - multiple modalities
KW  - machine learning
KW  - disorder recognition
Y1  - 2022
U6  - https://doi.org/10.3389/fdgth.2022.842301
SN  - 2673-253X
PB  - Frontiers Media SA
CY  - Lausanne, Switzerland
ER  - 
TY  - GEN
A1  - Hecker, Pascal
A1  - Steckhan, Nico
A1  - Eyben, Florian
A1  - Schuller, Björn Wolfgang
A1  - Arnrich, Bert
T1  - Voice Analysis for Neurological Disorder Recognition – A Systematic Review and Perspective on Emerging Trends
T2  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät
N2  - Quantifying neurological disorders from voice is a rapidly growing field of research and holds promise for unobtrusive and large-scale disorder monitoring. The data recording setup and data analysis pipelines are both crucial aspects to effectively obtain relevant information from participants. Therefore, we performed a systematic review to provide a high-level overview of practices across various neurological disorders and highlight emerging trends. PRISMA-based literature searches were conducted through PubMed, Web of Science, and IEEE Xplore to identify publications in which original (i.e., newly recorded) datasets were collected. Disorders of interest were psychiatric as well as neurodegenerative disorders, such as bipolar disorder, depression, and stress, as well as amyotrophic lateral sclerosis, Alzheimer's, and Parkinson's disease, and speech impairments (aphasia, dysarthria, and dysphonia). Of the 43 retrieved studies, Parkinson's disease is represented most prominently with 19 discovered datasets. Free speech and read speech tasks are most commonly used across disorders. Besides popular feature extraction toolkits, many studies utilise custom-built feature sets. Correlations of acoustic features with psychiatric and neurodegenerative disorders are presented. In terms of analysis, statistical analysis for significance of individual features is commonly used, as well as predictive modeling approaches, especially with support vector machines and a small number of artificial neural networks. An emerging trend and recommendation for future studies is to collect data in everyday life to facilitate longitudinal data collection and to capture the behavior of participants more naturally. Another emerging trend is to record additional modalities to voice, which can potentially increase analytical performance.
T3  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät - 13 
KW  - neurological disorders
KW  - voice
KW  - speech
KW  - everyday life
KW  - multiple modalities
KW  - machine learning
KW  - disorder recognition
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-581019
IS  - 13
ER  - 
TY  - GEN
A1  - Ott, Susan
A1  - Höhle, Barbara
T1  - Verb inflection in German-learning children with typical and atypical language acquisition
BT  - the impact of subsyllabic frequencies
T2  - Journal of Child Language
N2  - Previous research has shown that high phonotactic frequencies facilitate the production of regularly inflected verbs in English-learning children with specific language impairment (SLI) but not with typical development (TD). We asked whether this finding can be replicated for German, a language with a much more complex inflectional verb paradigm than English. Using an elicitation task, the production of inflected nonce verb forms (3rd person singular with -t suffix) with either high- or low-frequency subsyllables was tested in sixteen German-learning children with SLI (ages 4;1–5;1), sixteen TD-children matched for chronological age (CA) and fourteen TD-children matched for verbal age (VA) (ages 3;0–3;11). The findings revealed that children with SLI, but not CA- or VA-children, showed differential performance between the two types of verbs, producing more inflectional errors when the verb forms resulted in low-frequency subsyllables than when they resulted in high-frequency subsyllables, replicating the results from English-learning children.
T3  - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 530 
KW  - english past tense
KW  - phonotactic probability
KW  - sentence repetition
KW  - nonword repetition
KW  - speaking children
KW  - impairment
KW  - morphology
KW  - infants
KW  - speech
KW  - words
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-416475
SN  - 1866-8364
IS  - 530
ER  - 
TY  - JOUR
A1  - Ott, Susan
A1  - Höhle, Barbara
T1  - Verb inflection in German-learning children with typical and atypical language acquisition
BT  - the impact of subsyllabic frequencies
JF  - Journal of Child Language
N2  - Previous research has shown that high phonotactic frequencies facilitate the production of regularly inflected verbs in English-learning children with specific language impairment (SLI) but not with typical development (TD). We asked whether this finding can be replicated for German, a language with a much more complex inflectional verb paradigm than English. Using an elicitation task, the production of inflected nonce verb forms (3rd person singular with -t suffix) with either high- or low-frequency subsyllables was tested in sixteen German-learning children with SLI (ages 4;1–5;1), sixteen TD-children matched for chronological age (CA) and fourteen TD-children matched for verbal age (VA) (ages 3;0–3;11). The findings revealed that children with SLI, but not CA- or VA-children, showed differential performance between the two types of verbs, producing more inflectional errors when the verb forms resulted in low-frequency subsyllables than when they resulted in high-frequency subsyllables, replicating the results from English-learning children.
KW  - english past tense
KW  - sentence repetition
KW  - nonword repetition
KW  - speaking children
KW  - impairment
KW  - morphology
KW  - infants
KW  - speech
KW  - words
Y1  - 2012
U6  - https://doi.org/10.1017/S030500091200027X
SN  - 0305-0009
VL  - 40
IS  - 1
SP  - 169
EP  - 192
PB  - Cambridge University Press
CY  - New York
ER  - 
TY  - THES
A1  - López Gambino, Maria Soledad
T1  - Time Buying in Task-Oriented Spoken Dialogue Systems
N2  - This dissertation focuses on the handling of time in dialogue. Specifically, it investigates how humans bridge time, or “buy time”, when they are expected to convey information that is not yet available to them (e.g. a travel agent searching for a flight in a long list while the customer is on the line, waiting). It also explores the feasibility of modeling such time-bridging behavior in spoken dialogue systems, and it examines how endowing such systems with more human-like time-bridging capabilities may affect humans’ perception of them.

The relevance of time-bridging in human-human dialogue seems to stem largely from a need to avoid lengthy pauses, as these may cause both confusion and discomfort among the participants of a conversation (Levinson, 1983; Lundholm Fors, 2015). However, this avoidance of prolonged silence is at odds with the incremental nature of speech production in dialogue (Schlangen and Skantze, 2011): Speakers often start to verbalize their contribution before it is fully formulated, and sometimes even before they possess the information they need to provide, which may result in them running out of content mid-turn.

In this work, we elicit conversational data from humans, to learn how they avoid being silent while they search for information to convey to their interlocutor. We identify commonalities in the types of resources employed by different speakers, and we propose a classification scheme. We explore ways of modeling human time-buying behavior computationally, and we evaluate the effect on human listeners of embedding this behavior in a spoken dialogue system.

Our results suggest that a system using conversational speech to bridge time while searching for information to convey (as humans do) can provide a better experience in several respects than one which remains silent for a long period of time. However, not all speech serves this purpose equally: Our experiments also show that a system whose time-buying behavior is more varied (i.e. which exploits several categories from the classification scheme we developed and samples them based on information from human data) can prevent overestimation of waiting time when compared, for example, with a system that repeatedly asks the interlocutor to wait (even if these requests for waiting are phrased differently each time). Finally, this research shows that it is possible to model human time-buying behavior on a relatively small corpus, and that a system using such a model can be preferred by participants over one employing a simpler strategy, such as randomly choosing utterances to produce during the wait, even when the utterances used by both strategies are the same.
N2  - The central themes of this work are time and dialogue. In particular, it investigates how people gain time, or “buy time”, when they have to convey information that is not yet available to them (e.g. a travel agent searching for a flight in a long list while the customer waits on the phone). It also investigates whether such time-bridging behavior can be modeled in spoken dialogue systems and how such capabilities affect the user experience.

We collect conversational data and determine how speakers keep the dialogue going while they search for information for their interlocutor. We identify commonalities in the resources used by different speakers and propose a classification scheme. We explore strategies for modeling human time-bridging behavior, and we evaluate the effects on human listeners of embedding this behavior in a spoken dialogue system.
T2  - Zeitgewinn in aufgabenorientierten Sprachdialogsystemen
KW  - dialogue system
KW  - Dialogsystem
KW  - linguistics
KW  - Linguistik
KW  - speech
KW  - Sprache
KW  - dialogue
KW  - Dialog
KW  - time-buying
KW  - Zeitgewinn
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-592806
ER  - 
TY  - GEN
A1  - Offrede, Tom F.
A1  - Jacobi, Jidde
A1  - Rebernik, Teja
A1  - de Jong, Lisanne
A1  - Keulen, Stefanie
A1  - Veenstra, Pauline
A1  - Noiray, Aude
A1  - Wieling, Martijn
T1  - The impact of alcohol on L1 versus L2
T2  - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe
N2  - Alcohol intoxication is known to affect many aspects of human behavior and cognition; one of such affected systems is articulation during speech production. Although much research has revealed that alcohol negatively impacts pronunciation in a first language (L1), there is only initial evidence suggesting a potential beneficial effect of inebriation on articulation in a non-native language (L2). The aim of this study was thus to compare the effect of alcohol consumption on pronunciation in an L1 and an L2. Participants who had ingested different amounts of alcohol provided speech samples in their L1 (Dutch) and L2 (English), and native speakers of each language subsequently rated the pronunciation of these samples on their intelligibility (for the L1) and accent nativelikeness (for the L2). These data were analyzed with generalized additive mixed modeling. Participants' blood alcohol concentration indeed negatively affected pronunciation in L1, but it produced no significant effect on the L2 accent ratings. The expected negative impact of alcohol on L1 articulation can be explained by reduction in fine motor control. We present two hypotheses to account for the absence of any effects of intoxication on L2 pronunciation: (1) there may be a reduction in L1 interference on L2 speech due to decreased motor control or (2) alcohol may produce a differential effect on each of the two linguistic subsystems.
T3  - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 848 
KW  - acute alcohol consumption
KW  - articulation
KW  - speech
KW  - bilingualism
Y1  - 2020
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-540955
SN  - 1866-8364
IS  - 848
ER  - 
TY  - JOUR
A1  - Offrede, Tom F.
A1  - Jacobi, Jidde
A1  - Rebernik, Teja
A1  - de Jong, Lisanne
A1  - Keulen, Stefanie
A1  - Veenstra, Pauline
A1  - Noiray, Aude
A1  - Wieling, Martijn
T1  - The impact of alcohol on L1 versus L2
JF  - Language and Speech
N2  - Alcohol intoxication is known to affect many aspects of human behavior and cognition; one of such affected systems is articulation during speech production. Although much research has revealed that alcohol negatively impacts pronunciation in a first language (L1), there is only initial evidence suggesting a potential beneficial effect of inebriation on articulation in a non-native language (L2). The aim of this study was thus to compare the effect of alcohol consumption on pronunciation in an L1 and an L2. Participants who had ingested different amounts of alcohol provided speech samples in their L1 (Dutch) and L2 (English), and native speakers of each language subsequently rated the pronunciation of these samples on their intelligibility (for the L1) and accent nativelikeness (for the L2). These data were analyzed with generalized additive mixed modeling. Participants' blood alcohol concentration indeed negatively affected pronunciation in L1, but it produced no significant effect on the L2 accent ratings. The expected negative impact of alcohol on L1 articulation can be explained by reduction in fine motor control. We present two hypotheses to account for the absence of any effects of intoxication on L2 pronunciation: (1) there may be a reduction in L1 interference on L2 speech due to decreased motor control or (2) alcohol may produce a differential effect on each of the two linguistic subsystems.
KW  - acute alcohol consumption
KW  - articulation
KW  - speech
KW  - bilingualism
Y1  - 2020
U6  - https://doi.org/10.1177/0023830920953169
SN  - 1756-6053
SN  - 0023-8309
VL  - 64
IS  - 3
SP  - 681
EP  - 692
PB  - SAGE Publications
CY  - Thousand Oaks
ER  - 
TY  - GEN
A1  - Veríssimo, Joao Marques
A1  - Heyer, Vera
A1  - Jacob, Gunnar
A1  - Clahsen, Harald
T1  - Selective effects of age of acquisition on morphological priming
BT  - evidence for a sensitive period
T2  - Postprints der Universität Potsdam : Humanwissenschaftliche Reihe
N2  - Is there an ideal time window for language acquisition after which nativelike representation and processing are unattainable? Although this question has been heavily debated, no consensus has been reached. Here, we present evidence for a sensitive period in language development and show that it is specific to grammar. We conducted a masked priming task with a group of Turkish-German bilinguals and examined age of acquisition (AoA) effects on the processing of complex words. We compared a subtle but meaningful linguistic contrast, that between grammatical inflection and lexical-based derivation. The results showed a highly selective AoA effect on inflectional (but not derivational) priming. In addition, the effect displayed a discontinuity indicative of a sensitive period: Priming from inflected forms was nativelike when acquisition started before the age of 5 but declined with increasing AoA. We conclude that the acquisition of morphological rules expressing morphosyntactic properties is constrained by maturational factors.
T3  - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 486 
KW  - visual word recognition
KW  - 2nd-language acquisition
KW  - maturational constraints
KW  - language-acquisition
KW  - 2nd language
KW  - speech
KW  - experience
KW  - perception
KW  - english
Y1  - 2018
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-412611
SN  - 1866-8364
IS  - 486
ER  - 
TY  - GEN
A1  - Bhatara, Anjali
A1  - Laukka, Petri
A1  - Boll-Avetisyan, Natalie
A1  - Granjon, Lionel
A1  - Elfenbein, Hillary Anger
A1  - Bänziger, Tanja
T1  - Second language ability and emotional prosody perception
T2  - Postprints der Universität Potsdam : Humanwissenschaftliche Reihe
N2  - The present study examines the effect of language experience on vocal emotion perception in a second language. Native speakers of French with varying levels of self-reported English ability were asked to identify emotions from vocal expressions produced by American actors in a forced-choice task, and to rate their pleasantness, power, alertness and intensity on continuous scales. Stimuli included emotionally expressive English speech (emotional prosody) and non-linguistic vocalizations (affect bursts), and a baseline condition with Swiss-French pseudo-speech. Results revealed effects of English ability on the recognition of emotions in English speech but not in non-linguistic vocalizations. Specifically, higher English ability was associated with less accurate identification of positive emotions, but not with the interpretation of negative emotions. Moreover, higher English ability was associated with lower ratings of pleasantness and power, again only for emotional prosody. This suggests that second language skills may sometimes interfere with emotion recognition from speech prosody, particularly for positive emotions.
T3  - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 503 
KW  - recognizing emotions
KW  - basic emotions
KW  - recognition
KW  - language
KW  - vocalizations
KW  - speech
KW  - models
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-411860
SN  - 1866-8364
IS  - 503
ER  - 
TY  - JOUR
A1  - Mokari, Payam Ghaffarvand
A1  - Gafos, Adamantios I.
A1  - Williams, Daniel
T1  - Perceptuomotor compatibility effects in vowels
BT  - effects of consonantal context and acoustic proximity of response and distractor
JF  - JASA Express Letters
N2  - In a cue-distractor task, speakers' response times (RTs) were found to speed up when they perceived a distractor syllable whose vowel was identical to the vowel in the syllable they were preparing to utter. At a more fine-grained level, subphonemic congruency between response and distractor, defined by a higher number of shared phonological features or higher acoustic proximity, was also found to be predictive of RT modulations. Furthermore, the findings indicate that perception of vowel stimuli embedded in syllables gives rise to robust and more consistent perceptuomotor compatibility effects (compared to isolated vowels) across different response-distractor vowel pairs.
KW  - speech
Y1  - 2021
U6  - https://doi.org/10.1121/10.0003039
SN  - 2691-1191
VL  - 1
IS  - 1
PB  - American Institute of Physics
CY  - Melville
ER  -