This thesis investigates visualization approaches for annotated discourse relations and compares different programming tools to find the solution that best meets the requirements. The subject of the research is coherence relations, which have several properties that are challenging for many visualization methods. The thesis presents five visualization options from both the application and the development perspective. The simple HTML approaches tested initially, as well as the software package displaCy, proved insufficient for the visualization purposes of this work. An alternative implementation with D3 would meet the requirements optimally but goes beyond the scope of the project. The main method chosen in this thesis is implemented as a single web application and uses the brat annotation tool, which fulfills most of the defined requirements for representing coherence relations. The application graphically displays the coherence relations annotated in the text and offers a filter function for different relation types.
We present novel experimental evidence on the availability and the status of exhaustivity inferences with focus partitioning in German, English, and Hungarian. Results suggest that German and English focus-background clefts and Hungarian focus share important properties (É. Kiss 1998, 1999; Szabolcsi 1994; Percus 1997; Onea & Beaver 2009). Those constructions are anaphoric devices triggering an existence presupposition. EXH-inferences are not obligatory in such constructions in English, German, or Hungarian, contrary to some previous literature (Percus 1997; Büring & Križ 2013; É. Kiss 1998), but in line with pragmatic analyses of EXH-inferences in clefts (Horn 1981, 2016; Pollard & Yasavul 2016). The cross-linguistic differences in the distribution of EXH-inferences are attributed to properties of the Hungarian number marking system.
Reflecting in written form on one's teaching enactments has been considered a facilitator for teachers' professional growth in university-based preservice teacher education. Writing a structured reflection can be facilitated through external feedback. However, researchers noted that feedback in preservice teacher education often relies on holistic, rather than more content-based, analytic feedback because educators oftentimes lack resources (e.g., time) to provide more analytic feedback. To overcome this impediment to feedback for written reflection, advances in computer technology can be of use. Hence, this study sought to utilize techniques of natural language processing and machine learning to train a computer-based classifier that classifies preservice physics teachers' written reflections on their teaching enactments in a German university teacher education program. To do so, a reflection model was adapted to physics education. It was then tested to what extent the computer-based classifier could accurately classify the elements of the reflection model in segments of preservice physics teachers' written reflections. Multinomial logistic regression using word count as a predictor was found to yield acceptable average human-computer agreement (F1-score on held-out test dataset of 0.56) so that it might fuel further development towards an automated feedback tool that supplements existing holistic feedback for written reflections with data-based, analytic feedback.
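The agreement measure mentioned above can be illustrated with a minimal sketch. It assumes macro-averaging over classes, which the abstract does not specify, and uses hypothetical labels ("a", "b") rather than the elements of the study's reflection model.

```python
def macro_f1(gold, predicted):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores."""
    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Hypothetical human codes (gold) and classifier output (predicted).
human_codes = ["a", "a", "b", "b"]
classifier_codes = ["a", "b", "b", "b"]
print(round(macro_f1(human_codes, classifier_codes), 3))
```

Here the human codes serve as the reference, so the score can be read as human-computer agreement on the coded segments.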
In this paper, we address some controversially debated empirical questions concerning object fronting in German by a series of acceptability rating studies. We investigated three kinds of factors: (i) properties of the subject (given/new, pronoun/full DP), (ii) emphasis, (iii) register. The first factor is predicted to play a crucial role by models in which object fronting possibilities are limited by prosodic properties. Two experiments provide converging evidence for a systematic effect of this factor: we find that the relative acceptability of object fronting across subjects that require an accent (new DPs) is lower than across deaccentable subjects (pronouns and given DPs). Other models predict object fronting across full phrases (but not across pronouns) to be limited to an emphatic interpretation. This prediction is also borne out, suggesting that both types of models capture an empirically valid generalization and can be seen as complementing each other rather than competing with each other. Finally, we find support for the view that informal register facilitates object fronting. In sum, our experiments contribute to clarifying the empirical basis concerning a phenomenon influenced by a range of interacting factors. This, in turn, informs theoretical approaches to the prefield position and helps to identify factors that need to be carefully controlled in this field of research.
Gender stereotypes influence subjective beliefs about the world, and this is reflected in our use of language. But do gender biases in language transparently reflect subjective beliefs? Or is the process of translating thought to language itself biased? During the 2016 United States (N = 24,863) and 2017 United Kingdom (N = 2,609) electoral campaigns, we compared participants' beliefs about the gender of the next head of government with their use and interpretation of pronouns referring to the next head of government. In the United States, even when the female candidate was expected to win, she pronouns were rarely produced and induced substantial comprehension disruption. In the United Kingdom, where the incumbent female candidate was heavily favored, she pronouns were preferred in production but yielded no comprehension advantage. These and other findings suggest that the language system itself is a source of implicit biases above and beyond previously known biases, such as those measured by the Implicit Association Test.
A commonly used approach to parameter estimation in computational models is the so-called grid search procedure: the entire parameter space is searched in small steps to determine the parameter value that provides the best fit to the observed data. This approach has several disadvantages: first, it can be computationally very expensive; second, only a single optimal point value of the parameter is reported as the best-fit value, so we cannot quantify our uncertainty about the parameter estimate. In the main journal article that this methods article accompanies (Jager et al., 2020, Interference patterns in subject-verb agreement and reflexives revisited: A large-sample study, Journal of Memory and Language), we carried out parameter estimation using Approximate Bayesian Computation (ABC), which is a Bayesian approach that allows us to quantify our uncertainty about the parameter's values given data. This approach has the further advantage that it allows us to generate both prior and posterior predictive distributions of reading times from the cue-based retrieval model of Lewis and Vasishth (2005).
• Instead of the conventional method of using grid search, we use Approximate Bayesian Computation (ABC) for parameter estimation in the Lewis and Vasishth (2005) model.
• The ABC method of parameter estimation has the advantage that the uncertainty of the parameter can be quantified.
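The contrast with grid search can be illustrated with a minimal ABC rejection sampler. This is a sketch under loud assumptions, not the procedure of the article: the simulator is a toy model (normally distributed observations with unknown mean `theta`), the prior is Uniform(0, 5), and the function names, tolerance, and sample sizes are all hypothetical.

```python
import random


def simulate_mean(theta, n_obs, rng):
    """Toy simulator (assumption): mean of n_obs draws from Normal(theta, 1)."""
    draws = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
    return sum(draws) / len(draws)


def abc_rejection(observed_mean, n_obs, n_proposals, tolerance, rng):
    """Keep prior draws whose simulated summary lies close to the observed one."""
    accepted = []
    for _ in range(n_proposals):
        theta = rng.uniform(0.0, 5.0)  # draw a candidate from the prior
        simulated = simulate_mean(theta, n_obs, rng)
        if abs(simulated - observed_mean) < tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(1)  # fixed seed so the sketch is reproducible
posterior = sorted(abc_rejection(2.0, 25, 10000, 0.1, rng))
estimate = sum(posterior) / len(posterior)
lower = posterior[int(0.025 * len(posterior))]
upper = posterior[int(0.975 * len(posterior))]
print(f"accepted={len(posterior)} mean={estimate:.2f} "
      f"95% interval=[{lower:.2f}, {upper:.2f}]")
```

Unlike a grid search, which would return one best-fitting value, the accepted draws approximate a posterior distribution, so an interval can be reported alongside the point estimate.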
This study compares the development of prosodic processing in French- and German-learning infants. The emergence of language-specific perception of phrase boundaries was directly tested using the same stimuli across these two languages. French-learning (Experiment 1, 2) and German-learning 6- and 8-month-olds (Experiment 3) listened to the same French noun sequences with or without major prosodic boundaries ([Loulou et Manou] [et Nina]; [Loulou et Manou et Nina], respectively). The boundaries were either naturally cued (Experiment 1), or cued exclusively by pitch and duration (Experiment 2, 3). French-learning 6- and 8-month-olds both perceived the natural boundary, but neither perceived the boundary when only two cues were present. In contrast, German-learning infants develop from not perceiving the two-cue boundary at 6 months to perceiving it at 8 months, just like German-learning 8-month-olds listening to German (Wellmann, Holzgrefe, Truckenbrodt, Wartenburger, & Hohle, 2012). In a control experiment (Experiment 4), we found little difference between German and French adult listeners, suggesting that later, French listeners catch up with German listeners. Taken together, these cross-linguistic differences in the perception of identical stimuli provide direct evidence for language-specific development of prosodic boundary perception.
Child characteristics, family factors, and preschool factors are all found to affect the rate of bilingual children's vocabulary development in heritage language (HL). However, what remains unknown is the relative importance of these three sets of factors in HL vocabulary growth. The current study explored the complex issue with 457 Singaporean preschool children who are speaking either Mandarin, Malay, or Tamil as their HL. A series of internal factors (e.g., non-verbal intelligence) and external factors (e.g., maternal educational level) were used to predict children's HL vocabulary growth over a year at preschool with linear mixed effects models. The results demonstrated that external factors (i.e., family and preschool factors) are relatively more important than child characteristics in enhancing bilingual children's HL vocabulary growth. Specifically, children's language input quantity (i.e., home language dominance), input quality (e.g., number of books in HL), and HL input quantity at school (i.e., the time between two waves of tests at preschool) predict the participants' HL vocabulary growth, with initial vocabulary controlled. The relative importance of external factors in bilingual children's HL vocabulary development is attributed to the general bilingual setting in Singapore, where HL is taken as a subject to learn at preschool and children have fairly limited exposure to HL in general. The limited amount of input might not suffice to trigger the full expression of internal resources. Our findings suggest the crucial roles that caregivers and preschools play in early HL education, and the necessity of more parental involvement in early HL learning in particular.
The effect of decay and lexical uncertainty on processing long-distance dependencies in reading
(2020)
To make sense of a sentence, a reader must keep track of dependent relationships between words, such as between a verb and its particle (e.g. turn the music down). In languages such as German, verb-particle dependencies often span long distances, with the particle only appearing at the end of the clause. This means that it may be necessary to process a large amount of intervening sentence material before the full verb of the sentence is known. To facilitate processing, previous studies have shown that readers can preactivate the lexical information of neighbouring upcoming words, but less is known about whether such preactivation can be sustained over longer distances. We asked the question, do readers preactivate lexical information about long-distance verb particles? In one self-paced reading and one eye tracking experiment, we delayed the appearance of an obligatory verb particle that varied only in the predictability of its lexical identity. We additionally manipulated the length of the delay in order to test two contrasting accounts of dependency processing: that increased distance between dependent elements may sharpen expectation of the distant word and facilitate its processing (an antilocality effect), or that it may slow processing via temporal activation decay (a locality effect). We isolated decay by delaying the particle with a neutral noun modifier containing no information about the identity of the upcoming particle, and no known sources of interference or working memory load. Under the assumption that readers would preactivate the lexical representations of plausible verb particles, we hypothesised that a smaller number of plausible particles would lead to stronger preactivation of each particle, and thus higher predictability of the target. This in turn should have made predictable target particles more resistant to the effects of decay than less predictable target particles. 
The eye tracking experiment provided evidence that higher predictability did facilitate reading times, but found evidence against any effect of decay or its interaction with predictability. The self-paced reading study provided evidence against any effect of predictability or temporal decay, or their interaction. In sum, we provide evidence from eye movements that readers preactivate long-distance lexical content and that adding neutral sentence information does not induce detectable decay of this activation. The findings are consistent with accounts suggesting that delaying dependency resolution may only affect processing if the intervening information either confirms expectations or adds to working memory load, and that temporal activation decay alone may not be a major predictor of processing time.
Argumentation mining is a subfield of Computational Linguistics that aims (primarily) at automatically finding arguments and their structural components in natural language text. We provide a short introduction to this field, intended for an audience with a limited computational background. After explaining the subtasks involved in this problem of deriving the structure of arguments, we describe two other applications that are popular in computational linguistics: sentiment analysis and stance detection. From the linguistic viewpoint, they concern the semantics of evaluation in language. In the final part of the paper, we briefly examine the roles that these two tasks play in argumentation mining, both in current practice, and in possible future systems.
The notion of coherence relations is quite widely accepted in general, but concrete proposals differ considerably on the questions of how they should be motivated, which relations are to be assumed, and how they should be defined. This paper takes a "bottom-up" perspective by assessing the contribution made by linguistic signals (connectives), using insights from the relevant literature as well as verification by practical text annotation. We work primarily with the German language here and focus on the realm of contrast. Thus, we suggest a new inventory of contrastive connective functions and discuss their relationship to contrastive coherence relations that have been proposed in earlier work.
This paper addresses the relation between syllable structure and inter-segmental temporal coordination. The data examined are Electromagnetic Articulometry recordings from six speakers of Central Peninsular Spanish (henceforth, Spanish), producing words beginning with the clusters /pl, bl, kl, gl, pɾ, kɾ, tɾ/ as well as corresponding unclustered sonorant-initial words in three vowel contexts /a, e, o/. In our results, we find evidence for a global organization of the segments involved in these combinations. This is reflected in a number of ways: shortening of the prevocalic sonorant in the cluster-initial case compared to the unclustered case, reorganization of the relative timing of the internal CV subsequence (in a CCV) in the obstruent-lateral context, early vowel initiation, and a strong compensatory relation between the duration of the obstruent-to-lateral transition and the duration of the lateral. In other words, we find that the global organization presiding over the segments partaking in these tautosyllabic CCVs is pleiotropic, that is, simultaneously expressed over a set of different phonetic parameters rather than via a privileged metric such as c-center stability or any other such given single measure (employed in prior works).
Among theories of human language comprehension, cue-based memory retrieval has proven to be a useful framework for understanding when and how processing difficulty arises in the resolution of long-distance dependencies. Most previous work in this area has assumed that very general retrieval cues like [+subject] or [+singular] do the work of identifying (and sometimes misidentifying) a retrieval target in order to establish a dependency between words. However, recent work suggests that general, handpicked retrieval cues like these may not be enough to explain illusions of plausibility (Cunnings & Sturt, 2018), which can arise in sentences like The letter next to the porcelain plate shattered. Capturing such retrieval interference effects requires lexically specific features and retrieval cues, but handpicking the features is hard to do in a principled way and greatly increases modeler degrees of freedom. To remedy this, we use well-established word embedding methods for creating distributed lexical feature representations that encode information relevant for retrieval using distributed retrieval cue vectors. We show that the similarity between the feature and cue vectors (a measure of plausibility) predicts total reading times in Cunnings and Sturt's eye-tracking data. The features can easily be plugged into existing parsing models (including cue-based retrieval and self-organized parsing), putting very different models on more equal footing and facilitating future quantitative comparisons.
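The idea of scoring a retrieval candidate by the similarity between its feature vector and a cue vector can be sketched as follows. Cosine similarity is an assumption here (the abstract does not name the measure), and the three-dimensional vectors are invented toy values, not real word embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical feature vectors for the two nouns in
# "The letter next to the porcelain plate shattered."
features = {
    "letter": [0.9, 0.1, 0.0],
    "plate": [0.2, 0.9, 0.3],
}
# Hypothetical cue vector set at the verb "shattered".
shatter_cue = [0.1, 0.8, 0.4]

for noun, vector in features.items():
    print(noun, round(cosine_similarity(vector, shatter_cue), 2))
```

In this toy setup the distractor "plate" matches the cue better than the grammatical subject "letter", which is the kind of configuration in which an illusion of plausibility would be expected.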
Gender-inclusive language has evolved into a much-debated topic in recent years, discussed across disciplines, from theoretical linguistics and psycholinguistics to sociology and economics, and by anyone who uses language.
Studies on German that primarily relied on questionnaires (reviewed in Braun et al. 2005), cloze tests (Klein 1988), and categorisation tasks with picture matching (Irmen & Köhncke 1996) disqualify the generically used masculine forms as pseudo-generic: they fail their grammatically prescribed function of including referents of any Gender. Gender-balanced expressions (pair and split forms like Lehrer und Lehrerinnen) make explicit reference to female presence and participation, and thus promote a more equitable interpretation.
Online methods to investigate the processing of Gender-sensitive language are surprisingly rare among research on the phenomenon, except for reaction time measures (Irmen & Köhncke 1996, Irmen & Kaczmarek 2000) and eye-tracking in reading (Irmen & Schumann 2011).
In addition, Gender-neutral language (GNL) has not been focused on in the majority of experiments, and when it was among the stimuli, results were inconclusive (De Backer & De Cuypere 2012) or found such alternatives to be ineffective (resembling masculine generics, Braun et al. 2005), despite the fact that guidelines on non-discriminatory language use commonly recommend these.
Gender-neutral (GN) expressions for personal reference in German include
• nominalised participles; nominalisations in general: Interessierte, Lehrende
• collective singulars: Publikum, Kollegium
• compounds (e.g., with a notion of “-person”): Ansprechpersonen, Lehrkräfte
• paraphrases that background a (gendered) subject: e.g., passives, relatives
In a visual world eye-tracking study, the comprehension of plural generics using masculine nouns and GN forms was tested for roles and occupations.
In complex stimulus scenarios, reference had to be established to referent images presented on a screen. At the end of each item, a question was asked in order to (re)identify the image that best matched the referents of the respective setting. Images depicted 1) a single person (protagonist), 2) an all-female group, 3) an all-male group, 4) a mixed Gender group of female and male members. The group referents were introduced with either a) masculine nouns (die Lehrer), b) female-specific feminine nouns (die Lehrerinnen), or c) one of the first three nominal GN variants listed above (die Lehrkräfte).
Results confirm the frequent male bias in masculine forms that are used as generics, that is, their male-specific interpretation. Furthermore, stereotypicality of nouns had an impact on responses. The GN alternatives, which are generally known to aim for indefinite reference (“marked” for Gender-fair language), were found to be most qualified to elicit mixed Gender group interpretations. When reference was established with GN terms, an inclusive response was consistently elicited. This was indicated both by eye movements and by response proportions, but to a different extent depending on the particular GN noun type. Concepts that abstract from Gender in their linguistic forms (“neutralising” it) appear to be more inclusive, and thus better candidates for generic reference than masculines.
The paper investigates Turkish texts from heritage speakers of Turkish in Germany in a pseudo-longitudinal setting, looking at pupils' texts from the 5th, 7th, 10th and 12th grades. Two types of dynamics are identified in the advanced acquisition of Turkish orthography in the heritage context. One is the dynamic of language contact: in certain areas of the orthography, we find a re-interpretation of Turkish principles according to the German model. However, this changes as the pupils grow up. The second dynamic is the heritage situation. On the one hand, it leads to the establishment of new practices; on the other, it leads to a higher degree of variability of spelling solutions in those areas where the orthographic system of Turkish poses challenges to every writer, whether a monolingual growing up in Turkey or a heritage speaker.
Factorial experiments in research on memory, language, and in other areas are often analyzed using analysis of variance (ANOVA). However, for effects with more than one numerator degree of freedom, e.g., for experimental factors with more than two levels, the ANOVA omnibus F-test is not informative about the source of a main effect or interaction. Because researchers typically have specific hypotheses about which condition means differ from each other, a priori contrasts (i.e., comparisons planned before the sample means are known) between specific conditions or combinations of conditions are the appropriate way to represent such hypotheses in the statistical model. Many researchers have pointed out that contrasts should be "tested instead of, rather than as a supplement to, the ordinary 'omnibus' F test" (Hays, 1973, p. 601). In this tutorial, we explain the mathematics underlying different kinds of contrasts (i.e., treatment, sum, repeated, polynomial, custom, nested, interaction contrasts), discuss their properties, and demonstrate how they are applied in the R System for Statistical Computing (R Core Team, 2018). In this context, we explain the generalized inverse, which is needed to compute the coefficients for contrasts that test hypotheses that are not covered by the default set of contrasts. A detailed understanding of contrast coding is crucial for successful and correct specification in linear models (including linear mixed models). Contrasts defined a priori yield far more useful confirmatory tests of experimental hypotheses than standard omnibus F-tests. Reproducible code is available from https://osf.io/7ukf6/.
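The step from hypotheses to contrast coefficients can be sketched for repeated contrasts on a three-level factor. The sketch is in Python rather than R, it is not the tutorial's code, and the condition means are invented. It also relies on one simplification: because the hypothesis matrix below is square and of full rank, its generalized inverse coincides with the ordinary inverse, which is computed here by exact Gauss-Jordan elimination.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square, full-rank matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity, using exact rational arithmetic.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Move a row with a non-zero entry in this column into pivot position.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Each row states one hypothesis as weights on the three condition means.
hypothesis_matrix = [
    [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)],  # intercept: grand mean
    [-1, 1, 0],                                        # level 2 minus level 1
    [0, -1, 1],                                        # level 3 minus level 2
]
contrast_matrix = invert(hypothesis_matrix)

# Invented condition means; the coefficients are the weighted sums above.
condition_means = [500, 520, 510]
coefficients = [sum(w * m for w, m in zip(row, condition_means))
                for row in hypothesis_matrix]

for row in contrast_matrix:
    print([str(x) for x in row])
print([str(c) for c in coefficients])
```

The second and third columns of `contrast_matrix` are the codes that would be assigned to the factor; multiplying the contrast matrix by the coefficients reproduces the condition means, which is why the coefficients estimate exactly the comparisons written into the hypothesis matrix.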
While much attention has been devoted to the cognition of aging multilingual individuals, little is known about how age affects their grammatical processing. We assessed subject-verb number-agreement processing in sixty native (L1) and sixty non-native (L2) speakers of German (age: 18-84) using a binary-choice sentence-completion task, along with various individual-differences tests. Our results revealed differential effects of age on L1 and L2 speakers' accuracy and reaction times (RTs). L1 speakers' RTs increased with age, and they became more susceptible to attraction errors. In contrast, L2 speakers' RTs decreased, once age-related slowing was controlled for, and their overall accuracy increased. We interpret this as resulting from increased L2 exposure. Moreover, L2 speakers' accuracy/RT patterns were more strongly affected by cognitive variables (working memory, interference control) than L1 speakers'. Our findings show that as regards bilinguals' grammatical processing ability, aging is associated with both gains (in experience) and losses (in cognitive abilities).