TY - JOUR A1 - Hecker, Pascal A1 - Steckhan, Nico A1 - Eyben, Florian A1 - Schuller, Björn Wolfgang A1 - Arnrich, Bert T1 - Voice Analysis for Neurological Disorder Recognition – A Systematic Review and Perspective on Emerging Trends JF - Frontiers in Digital Health N2 - Quantifying neurological disorders from voice is a rapidly growing field of research and holds promise for unobtrusive and large-scale disorder monitoring. The data recording setup and data analysis pipelines are both crucial aspects to effectively obtain relevant information from participants. Therefore, we performed a systematic review to provide a high-level overview of practices across various neurological disorders and highlight emerging trends. PRISMA-based literature searches were conducted through PubMed, Web of Science, and IEEE Xplore to identify publications in which original (i.e., newly recorded) datasets were collected. Disorders of interest were psychiatric as well as neurodegenerative disorders, such as bipolar disorder, depression, and stress, as well as amyotrophic lateral sclerosis, Alzheimer's, and Parkinson's disease, and speech impairments (aphasia, dysarthria, and dysphonia). Of the 43 retrieved studies, Parkinson's disease is represented most prominently with 19 discovered datasets. Free speech and read speech tasks are most commonly used across disorders. Besides popular feature extraction toolkits, many studies utilise custom-built feature sets. Correlations of acoustic features with psychiatric and neurodegenerative disorders are presented. In terms of analysis, statistical analysis for significance of individual features is commonly used, as well as predictive modeling approaches, especially with support vector machines and a small number of artificial neural networks. 
An emerging trend and recommendation for future studies is to collect data in everyday life to facilitate longitudinal data collection and to capture the behavior of participants more naturally. Another emerging trend is to record additional modalities to voice, which can potentially increase analytical performance. KW - neurological disorders KW - voice KW - speech KW - everyday life KW - multiple modalities KW - machine learning KW - disorder recognition Y1 - 2022 U6 - https://doi.org/10.3389/fdgth.2022.842301 SN - 2673-253X PB - Frontiers Media SA CY - Lausanne, Schweiz ER - TY - GEN A1 - Hecker, Pascal A1 - Steckhan, Nico A1 - Eyben, Florian A1 - Schuller, Björn Wolfgang A1 - Arnrich, Bert T1 - Voice Analysis for Neurological Disorder Recognition – A Systematic Review and Perspective on Emerging Trends T2 - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät N2 - Quantifying neurological disorders from voice is a rapidly growing field of research and holds promise for unobtrusive and large-scale disorder monitoring. The data recording setup and data analysis pipelines are both crucial aspects to effectively obtain relevant information from participants. Therefore, we performed a systematic review to provide a high-level overview of practices across various neurological disorders and highlight emerging trends. PRISMA-based literature searches were conducted through PubMed, Web of Science, and IEEE Xplore to identify publications in which original (i.e., newly recorded) datasets were collected. Disorders of interest were psychiatric as well as neurodegenerative disorders, such as bipolar disorder, depression, and stress, as well as amyotrophic lateral sclerosis, Alzheimer's, and Parkinson's disease, and speech impairments (aphasia, dysarthria, and dysphonia). Of the 43 retrieved studies, Parkinson's disease is represented most prominently with 19 discovered datasets. 
Free speech and read speech tasks are most commonly used across disorders. Besides popular feature extraction toolkits, many studies utilise custom-built feature sets. Correlations of acoustic features with psychiatric and neurodegenerative disorders are presented. In terms of analysis, statistical analysis for significance of individual features is commonly used, as well as predictive modeling approaches, especially with support vector machines and a small number of artificial neural networks. An emerging trend and recommendation for future studies is to collect data in everyday life to facilitate longitudinal data collection and to capture the behavior of participants more naturally. Another emerging trend is to record additional modalities to voice, which can potentially increase analytical performance. T3 - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät - 13 KW - neurological disorders KW - voice KW - speech KW - everyday life KW - multiple modalities KW - machine learning KW - disorder recognition Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-581019 IS - 13 ER - TY - GEN A1 - Ott, Susan A1 - Höhle, Barbara T1 - Verb inflection in German-learning children with typical and atypical language acquisition BT - the impact of subsyllabic frequencies T2 - Journal of Child Language N2 - Previous research has shown that high phonotactic frequencies facilitate the production of regularly inflected verbs in English-learning children with specific language impairment (SLI) but not with typical development (TD). We asked whether this finding can be replicated for German, a language with a much more complex inflectional verb paradigm than English. 
Using an elicitation task, the production of inflected nonce verb forms (3rd person singular with -t suffix) with either high- or low-frequency subsyllables was tested in sixteen German-learning children with SLI (ages 4;1–5;1), sixteen TD-children matched for chronological age (CA) and fourteen TD-children matched for verbal age (VA) (ages 3;0–3;11). The findings revealed that children with SLI, but not CA- or VA-children, showed differential performance between the two types of verbs, producing more inflectional errors when the verb forms resulted in low-frequency subsyllables than when they resulted in high-frequency subsyllables, replicating the results from English-learning children. T3 - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 530 KW - english past tense KW - phonotactic probability KW - sentence repetition KW - nonword repetition KW - speaking children KW - impairment KW - morphology KW - infants KW - speech KW - words Y1 - 2019 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-416475 SN - 1866-8364 IS - 530 ER - TY - JOUR A1 - Ott, Susan A1 - Höhle, Barbara T1 - Verb inflection in German-learning children with typical and atypical language acquisition BT - the impact of subsyllabic frequencies JF - Journal of child language N2 - Previous research has shown that high phonotactic frequencies facilitate the production of regularly inflected verbs in English-learning children with specific language impairment (SLI) but not with typical development (TD). We asked whether this finding can be replicated for German, a language with a much more complex inflectional verb paradigm than English. 
Using an elicitation task, the production of inflected nonce verb forms (3rd person singular with -t suffix) with either high- or low-frequency subsyllables was tested in sixteen German-learning children with SLI (ages 4;1-5;1), sixteen TD-children matched for chronological age (CA) and fourteen TD-children matched for verbal age (VA) (ages 3;0-3;11). The findings revealed that children with SLI, but not CA- or VA-children, showed differential performance between the two types of verbs, producing more inflectional errors when the verb forms resulted in low-frequency subsyllables than when they resulted in high-frequency subsyllables, replicating the results from English-learning children. KW - english past tense KW - sentence repetition KW - nonword repetition KW - speaking children KW - impairment KW - morphology KW - infants KW - speech KW - words Y1 - 2012 U6 - https://doi.org/10.1017/S030500091200027X SN - 0305-0009 VL - 40 IS - 1 SP - 169 EP - 192 PB - Cambridge University Press CY - New York ER - TY - THES A1 - López Gambino, Maria Soledad T1 - Time Buying in Task-Oriented Spoken Dialogue Systems N2 - This dissertation focuses on the handling of time in dialogue. Specifically, it investigates how humans bridge time, or “buy time”, when they are expected to convey information that is not yet available to them (e.g. a travel agent searching for a flight in a long list while the customer is on the line, waiting). It also explores the feasibility of modeling such time-bridging behavior in spoken dialogue systems, and it examines how endowing such systems with more human-like time-bridging capabilities may affect humans’ perception of them. The relevance of time-bridging in human-human dialogue seems to stem largely from a need to avoid lengthy pauses, as these may cause both confusion and discomfort among the participants of a conversation (Levinson, 1983; Lundholm Fors, 2015). 
However, this avoidance of prolonged silence is at odds with the incremental nature of speech production in dialogue (Schlangen and Skantze, 2011): Speakers often start to verbalize their contribution before it is fully formulated, and sometimes even before they possess the information they need to provide, which may result in them running out of content mid-turn. In this work, we elicit conversational data from humans, to learn how they avoid being silent while they search for information to convey to their interlocutor. We identify commonalities in the types of resources employed by different speakers, and we propose a classification scheme. We explore ways of modeling human time-buying behavior computationally, and we evaluate the effect on human listeners of embedding this behavior in a spoken dialogue system. Our results suggest that a system using conversational speech to bridge time while searching for information to convey (as humans do) can provide a better experience in several respects than one which remains silent for a long period of time. However, not all speech serves this purpose equally: Our experiments also show that a system whose time-buying behavior is more varied (i.e. which exploits several categories from the classification scheme we developed and samples them based on information from human data) can prevent overestimation of waiting time when compared, for example, with a system that repeatedly asks the interlocutor to wait (even if these requests for waiting are phrased differently each time). Finally, this research shows that it is possible to model human time-buying behavior on a relatively small corpus, and that a system using such a model can be preferred by participants over one employing a simpler strategy, such as randomly choosing utterances to produce during the wait, even when the utterances used by both strategies are the same. N2 - Die zentralen Themen dieser Arbeit sind Zeit und Dialog. 
Insbesondere wird untersucht, wie Menschen Zeit gewinnen oder „Zeit kaufen“, wenn sie Informationen übermitteln müssen, die ihnen noch nicht zur Verfügung stehen (z. B. ein Reisebüroangestellter, der in einer langen Liste nach einem Flug sucht, während der Kunde am Telefon wartet). Außerdem wird untersucht, ob die Modellierung eines solchen Zeitüberbrückungsverhaltens in gesprochenen Dialogsystemen möglich ist und wie solche Fähigkeiten die Benutzererfahrung beeinflussen. Wir erheben Gesprächsdaten und ermitteln, wie die Sprecher den Dialog am Laufen halten, während sie nach Informationen für ihre(n) Gesprächspartner(in) suchen. Wir identifizieren Gemeinsamkeiten in den Ressourcen, die von verschiedenen Sprechern verwendet werden und schlagen ein Klassifizierungsschema vor. Wir erforschen Strategien, menschliches „Zeitüberbrückung“ zu modellieren, und wir bewerten die Auswirkungen dieses Verhaltens in ein gesprochenes Dialogsystem auf menschliche Zuhörer. T2 - Zeitgewinn in aufgabenorientierten Sprachdialogsystemen KW - dialogue system KW - Dialogsystem KW - linguistics KW - Linguistik KW - speech KW - Sprache KW - dialogue KW - Dialog KW - time-buying KW - Zeitgewinn Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-592806 ER - TY - GEN A1 - Offrede, Tom F. A1 - Jacobi, Jidde A1 - Rebernik, Teja A1 - de Jong, Lisanne A1 - Keulen, Stefanie A1 - Veenstra, Pauline A1 - Noiray, Aude A1 - Wieling, Martijn T1 - The impact of alcohol on L1 versus L2 T2 - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe N2 - Alcohol intoxication is known to affect many aspects of human behavior and cognition; one of such affected systems is articulation during speech production. Although much research has revealed that alcohol negatively impacts pronunciation in a first language (L1), there is only initial evidence suggesting a potential beneficial effect of inebriation on articulation in a non-native language (L2). 
The aim of this study was thus to compare the effect of alcohol consumption on pronunciation in an L1 and an L2. Participants who had ingested different amounts of alcohol provided speech samples in their L1 (Dutch) and L2 (English), and native speakers of each language subsequently rated the pronunciation of these samples on their intelligibility (for the L1) and accent nativelikeness (for the L2). These data were analyzed with generalized additive mixed modeling. Participants' blood alcohol concentration indeed negatively affected pronunciation in L1, but it produced no significant effect on the L2 accent ratings. The expected negative impact of alcohol on L1 articulation can be explained by reduction in fine motor control. We present two hypotheses to account for the absence of any effects of intoxication on L2 pronunciation: (1) there may be a reduction in L1 interference on L2 speech due to decreased motor control or (2) alcohol may produce a differential effect on each of the two linguistic subsystems. T3 - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 848 KW - acute alcohol consumption KW - articulation KW - speech KW - bilingualism Y1 - 2020 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-540955 SN - 1866-8364 IS - 3 ER - TY - JOUR A1 - Offrede, Tom F. A1 - Jacobi, Jidde A1 - Rebernik, Teja A1 - de Jong, Lisanne A1 - Keulen, Stefanie A1 - Veenstra, Pauline A1 - Noiray, Aude A1 - Wieling, Martijn T1 - The impact of alcohol on L1 versus L2 JF - Language and Speech N2 - Alcohol intoxication is known to affect many aspects of human behavior and cognition; one of such affected systems is articulation during speech production. Although much research has revealed that alcohol negatively impacts pronunciation in a first language (L1), there is only initial evidence suggesting a potential beneficial effect of inebriation on articulation in a non-native language (L2). 
The aim of this study was thus to compare the effect of alcohol consumption on pronunciation in an L1 and an L2. Participants who had ingested different amounts of alcohol provided speech samples in their L1 (Dutch) and L2 (English), and native speakers of each language subsequently rated the pronunciation of these samples on their intelligibility (for the L1) and accent nativelikeness (for the L2). These data were analyzed with generalized additive mixed modeling. Participants' blood alcohol concentration indeed negatively affected pronunciation in L1, but it produced no significant effect on the L2 accent ratings. The expected negative impact of alcohol on L1 articulation can be explained by reduction in fine motor control. We present two hypotheses to account for the absence of any effects of intoxication on L2 pronunciation: (1) there may be a reduction in L1 interference on L2 speech due to decreased motor control or (2) alcohol may produce a differential effect on each of the two linguistic subsystems. KW - acute alcohol consumption KW - articulation KW - speech KW - bilingualism Y1 - 2020 U6 - https://doi.org/10.1177/0023830920953169 SN - 1756-6053 SN - 0023-8309 VL - 64 IS - 3 SP - 681 EP - 692 PB - SAGE Publications CY - Thousand Oaks ER - TY - GEN A1 - Veríssimo, Joao Marques A1 - Heyer, Vera A1 - Jacob, Gunnar A1 - Clahsen, Harald T1 - Selective effects of age of acquisition on morphological priming BT - evidence for a sensitive period T2 - Postprints der Universität Potsdam : Humanwissenschaftliche Reihe N2 - Is there an ideal time window for language acquisition after which nativelike representation and processing are unattainable? Although this question has been heavily debated, no consensus has been reached. Here, we present evidence for a sensitive period in language development and show that it is specific to grammar. 
We conducted a masked priming task with a group of Turkish-German bilinguals and examined age of acquisition (AoA) effects on the processing of complex words. We compared a subtle but meaningful linguistic contrast, that between grammatical inflection and lexical-based derivation. The results showed a highly selective AoA effect on inflectional (but not derivational) priming. In addition, the effect displayed a discontinuity indicative of a sensitive period: Priming from inflected forms was nativelike when acquisition started before the age of 5 but declined with increasing AoA. We conclude that the acquisition of morphological rules expressing morphosyntactic properties is constrained by maturational factors. T3 - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 486 KW - visual word recognition KW - 2nd-language acquisition KW - maturational constraints KW - language-acquisition KW - 2nd language KW - speech KW - experience KW - perception KW - english Y1 - 2018 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-412611 SN - 1866-8364 IS - 486 ER - TY - GEN A1 - Bhatara, Anjali A1 - Laukka, Petri A1 - Boll-Avetisyan, Natalie A1 - Granjon, Lionel A1 - Elfenbein, Hillary Anger A1 - Bänziger, Tanja T1 - Second language ability and emotional prosody perception T2 - Postprints der Universität Potsdam : Humanwissenschaftliche Reihe N2 - The present study examines the effect of language experience on vocal emotion perception in a second language. Native speakers of French with varying levels of self-reported English ability were asked to identify emotions from vocal expressions produced by American actors in a forced-choice task, and to rate their pleasantness, power, alertness and intensity on continuous scales. Stimuli included emotionally expressive English speech (emotional prosody) and non-linguistic vocalizations (affect bursts), and a baseline condition with Swiss-French pseudo-speech. 
Results revealed effects of English ability on the recognition of emotions in English speech but not in non-linguistic vocalizations. Specifically, higher English ability was associated with less accurate identification of positive emotions, but not with the interpretation of negative emotions. Moreover, higher English ability was associated with lower ratings of pleasantness and power, again only for emotional prosody. This suggests that second language skills may sometimes interfere with emotion recognition from speech prosody, particularly for positive emotions. T3 - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 503 KW - recognizing emotions KW - basic emotions KW - recognition KW - language KW - vocalizations KW - speech KW - models Y1 - 2019 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-411860 SN - 1866-8364 IS - 503 ER - TY - JOUR A1 - Mokari, Payam Ghaffarvand A1 - Gafos, Adamantios I. A1 - Williams, Daniel T1 - Perceptuomotor compatibility effects in vowels BT - effects of consonantal context and acoustic proximity of response and distractor JF - JASA Express Letters N2 - In a cue-distractor task, speakers' response times (RTs) were found to speed up when they perceived a distractor syllable whose vowel was identical to the vowel in the syllable they were preparing to utter. At a more fine-grained level, subphonemic congruency between response and distractor (defined by higher number of shared phonological features or higher acoustic proximity) was also found to be predictive of RT modulations. Furthermore, the findings indicate that perception of vowel stimuli embedded in syllables gives rise to robust and more consistent perceptuomotor compatibility effects (compared to isolated vowels) across different response-distractor vowel pairs. KW - speech Y1 - 2021 U6 - https://doi.org/10.1121/10.0003039 SN - 2691-1191 VL - 1 IS - 1 PB - American Institute of Physics CY - Melville ER -