"Wortabruf im Handumdrehen"?
(2017)
Successful sentence comprehension requires the comprehender to work out who did what to whom. For example, in the sentence "John kicked the ball", the comprehender has to determine who did the kicking and what was kicked. This process of identifying and connecting the syntactically related words in a sentence is called dependency completion. What are the cognitive constraints that determine dependency completion? A widely accepted theory is cue-based retrieval, which maintains that dependency completion is driven by a content-addressable search for the co-dependents in memory. Cue-based retrieval explains a wide range of empirical data from several constructions, including subject-verb agreement, subject-verb non-agreement, plausibility mismatch configurations, and negative polarity items.
However, there are two major empirical challenges to the theory: (i) data from grammatical sentences with subject-verb number agreement dependencies, where the theory predicts a slowdown at the verb in sentences like "the key to the cabinet was rusty" compared to "the key to the cabinets was rusty", but the data are inconsistent with this prediction; and (ii) data from antecedent-reflexive dependencies, where a facilitation in reading times is predicted at the reflexive in "the bodybuilder who worked with the trainers injured themselves" vs. "the bodybuilder who worked with the trainer injured themselves", but the data do not show a facilitatory effect.
The work presented in this dissertation is dedicated to building a more general theory of dependency completion that can account for the above two datasets without losing the original empirical coverage of the cue-based retrieval assumption. In two journal articles, I present computational modeling work that addresses the above two empirical challenges.
To explain the data from grammatical sentences with subject-verb number agreement dependencies, I propose a new model that assumes that cue-based retrieval operates on a probabilistically distorted representation of nouns in memory (Article I). This hybrid distortion-plus-retrieval model was compared against the existing candidate models using data from 17 studies on subject-verb number agreement in 4 languages. I find that the hybrid model outperforms the existing models of number agreement processing, suggesting that the cue-based retrieval theory must incorporate a feature distortion assumption.
To account for the absence of a facilitatory effect in antecedent-reflexive dependencies, I propose an individual-difference model built within the cue-based retrieval framework (Article II). The model assumes that individuals may differ in how strongly they weight a syntactic cue relative to a number cue. The model was fitted to data from two studies on antecedent-reflexive dependencies, and the participant-level cue weighting was estimated. I find that one-fourth of the participants, in both studies, weight the syntactic cue more heavily than the number cue in processing reflexive dependencies, and the remaining participants weight the two cues equally. The result indicates that the absence of the predicted facilitatory effect at the level of grouped data is driven by some, not all, participants, namely those who weight the syntactic cue more heavily than the number cue. More generally, the result demonstrates that the assumption of differential cue weighting is important for a theory of dependency completion processes. This differential cue weighting idea was independently supported by a modeling study on subject-verb non-agreement dependencies (Article III).
Overall, cue-based retrieval, a general theory of dependency completion, needs to incorporate two new assumptions: (i) the nouns stored in memory can undergo probabilistic feature distortion, and (ii) the linguistic cues used for retrieval can be weighted differentially. This is the cumulative result of the modeling work presented in this dissertation.
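The two assumptions can be illustrated with a deliberately small sketch. It is not the model from Articles I to III: the feature names, the weights, and the scoring rule (a weighted count of matching cues, with no activation noise or latency component) are simplifying assumptions chosen only to show how distortion and cue weighting enter a content-addressable search.

```python
import random

def distort(features, p_distort, rng):
    """Assumption (i): with probability p_distort, the stored number
    feature of a noun flips (sg <-> pl). Returns a new feature dict."""
    out = dict(features)
    if rng.random() < p_distort:
        out["number"] = "pl" if out["number"] == "sg" else "sg"
    return out

def match_score(item, cues, weights):
    """Assumption (ii): each matching retrieval cue contributes its own
    weight, so cues need not count equally."""
    return sum(weights[c] for c, v in cues.items() if item.get(c) == v)

def retrieve(items, cues, weights):
    """Content-addressable search: return the name of the best-matching
    item. Ties go to the item listed first (an arbitrary toy choice)."""
    return max(items, key=lambda name: match_score(items[name], cues, weights))

# Hypothetical encoding of "the bodybuilder who worked with the trainers
# injured themselves": the reflexive seeks a plural subject.
items = {
    "bodybuilder": {"subject": True, "number": "sg"},
    "trainers": {"subject": False, "number": "pl"},
}
cues = {"subject": True, "number": "pl"}

equal = {"subject": 1.0, "number": 1.0}       # both cues weighted equally
syntax_heavy = {"subject": 2.0, "number": 1.0}  # syntactic cue weighted higher
```

With `equal` weights the target and the distractor each match one cue, so they compete; with `syntax_heavy` weights the structurally correct noun wins outright, which is the intuition behind the individual-difference account.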
The dissertation makes an important theoretical contribution: sentence comprehension in humans is driven by a mechanism that assumes cue-based retrieval, probabilistic feature distortion, and differential cue weighting. This insight is theoretically important because there is some independent support for these three assumptions in sentence processing and the broader memory literature. The modeling work presented here is also methodologically important because, for the first time, it demonstrates (i) how complex models of sentence processing can be evaluated using data from multiple studies simultaneously, without oversimplifying the models, and (ii) how inferences drawn from individual-level behavior can be used in theory development.
The aim of this dissertation was to conduct a larger-scale cross-linguistic empirical investigation of similarity-based interference effects in sentence comprehension.
Interference studies can offer valuable insights into the mechanisms that are involved in long-distance dependency completion.
Many studies have investigated similarity-based interference effects, showing that syntactic and semantic information are employed during long-distance dependency formation (e.g., Arnett & Wagers, 2017; Cunnings & Sturt, 2018; Van Dyke, 2007; Van Dyke & Lewis, 2003; Van Dyke & McElree, 2011). Nevertheless, there are some important open questions in the interference literature that are critical to our understanding of the constraints involved in dependency resolution.
The first research question concerns the relative timing of syntactic and semantic interference in online sentence comprehension. Only a few interference studies have investigated this question, and, to date, there are not enough data to draw conclusions about their time course (Van Dyke, 2007; Van Dyke & McElree, 2011).
Our first cross-linguistic study explores the relative timing of syntactic and semantic interference in two eye-tracking reading experiments that implement the study design used in Van Dyke (2007). The first experiment tests English sentences. The second, larger-sample experiment investigates the two interference types in German.
Overall, the data suggest that syntactic and semantic interference can arise simultaneously during retrieval.
The second research question concerns a special case of semantic interference: We investigate whether cue-based retrieval interference can be caused by semantically similar items which are not embedded in a syntactic structure.
This second interference study builds on a landmark study by Van Dyke & McElree (2006). The study design used in their study is unique in that it is able to pin down the source of interference as a consequence of cue overload during retrieval, when semantic retrieval cues do not uniquely match the retrieval target. Unlike most other interference studies, this design is able to rule out encoding interference as an alternative explanation. Encoding accounts postulate that it is not cue overload at the retrieval site but the erroneous encoding of similar linguistic items in memory that leads to interference (Lewandowsky et al., 2008; Oberauer & Kliegl, 2006). While Van Dyke & McElree (2006) reported cue-based retrieval interference from sentence-external distractors, the evidence for this effect was weak. A subsequent study did not show interference of this type (Van Dyke et al., 2014). Given these inconclusive findings, further research is necessary to investigate semantic cue-based retrieval interference.
The second study in this dissertation provides a larger-scale cross-linguistic investigation of cue-based retrieval interference from sentence-external items. Three larger-sample eye-tracking studies in English, German, and Russian tested cue-based interference in the online processing of filler-gap dependencies. This study further extends the previous research by investigating interference in each language under varying task demands (Logačev & Vasishth, 2016; Swets et al., 2008).
Overall, we see some very modest support for proactive cue-based retrieval interference in English. Unexpectedly, this was observed only under a low task demand. In German and Russian, there is some evidence against the interference effect. It is possible that interference is attenuated in languages with richer case marking.
In sum, the cross-linguistic experiments on the time course of syntactic and semantic interference from sentence-internal distractors support existing evidence of syntactic and semantic interference during sentence comprehension. Our data further show that both types of interference effects can arise simultaneously. Our cross-linguistic experiments investigating semantic cue-based retrieval interference from sentence-external distractors suggest that this type of interference may arise only in specific linguistic contexts.
Adverb positioning is guided by syntactic, semantic, and pragmatic considerations and is subject to cross-linguistic as well as language-specific variation. The goal of the thesis is to identify the factors that determine adverb placement in general (Part I) as well as in constructions in which the adverb's sister constituent is deprived of its phonetic material by movement or ellipsis (gap constructions, Part II), and to provide an Optimality Theoretic approach to the contrasts in the effects of these factors on the distribution of adverbs in English, French, and German.
In Optimality Theory (Prince & Smolensky 1993), grammaticality is defined as optimal satisfaction of a hierarchy of violable constraints: for a given input, a set of output candidates is produced, and the candidate that optimally satisfies the constraint hierarchy is selected as the grammatical output. Since grammaticality crucially relies on the hierarchical relations among the constraints, cross-linguistic variation can be traced back to differences in the language-specific constraint rankings.
Part I shows how diverse phenomena of adverb placement can be captured by corresponding constraints and their relative rankings:
- contrasts in the linearization of adverbs and verbs/auxiliaries in English and French
- verb placement in German and the filling of the prefield position
- placement of focus-sensitive adverbs
- fronting of topical arguments and adverbs
Part II extends the analysis to a particular phenomenon of adverb positioning: the avoidance of adverb attachment to a phonetically empty constituent (gap). English and French are similar in that the acceptability of pre-gap adverb placement depends on the type of adverb, its scope, and the syntactic construction (English: wh-movement vs. topicalization / VP Fronting / VP Ellipsis, inverted vs. non-inverted clauses; French: CLLD vs. Cleft, simple vs. periphrastic tense).
Yet, the two languages differ in which strategies a specific type of adverb may pursue to escape placement in front of a certain type of gap. In contrast to English and French, placement of an adverb in front of a gap never gives rise to ungrammaticality in German. Rather, word ordering has to obey the syntactic, semantic, and pragmatic principles discussed in Part I; whether or not it results in adverb attachment to a phonetically empty constituent seems to be irrelevant: though constraints are active in every language, the emergence of a visible effect of their requirements in a given language depends on their relative ranking. The complex interaction of the diverse factors as well as their divergent effects on adverb placement in the various languages are accounted for by the universal constraints and their language-specific hierarchic relations in the OT framework.
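The selection mechanism of Optimality Theory described in this abstract can be sketched in a few lines. The two constraints below (`stay` and `raise_verb`) and the two-symbol candidates are hypothetical placeholders invented for illustration; they are not the constraints proposed in the thesis. The sketch only shows how re-ranking the same violable constraints selects a different winner.

```python
def evaluate(candidates, ranking):
    """Return the optimal candidate under a constraint ranking.

    Each constraint maps a candidate to its number of violations.
    Violation profiles are compared lexicographically, so one violation
    of a higher-ranked constraint outweighs any number of violations of
    lower-ranked ones."""
    def profile(candidate):
        return tuple(constraint(candidate) for constraint in ranking)
    return min(candidates, key=profile)

# Hypothetical constraints over the relative order of verb and adverb.
def stay(candidate):
    """Violated when the verb precedes the adverb (verb has moved)."""
    return 1 if candidate.index("V") < candidate.index("Adv") else 0

def raise_verb(candidate):
    """Violated when the verb follows the adverb (verb has not moved)."""
    return 1 if candidate.index("V") > candidate.index("Adv") else 0

candidates = [("Adv", "V"), ("V", "Adv")]

winner_a = evaluate(candidates, [stay, raise_verb])  # stay outranks raise_verb
winner_b = evaluate(candidates, [raise_verb, stay])  # raise_verb outranks stay
```

Under the first ranking the adverb-verb order wins, and under the second the verb-adverb order wins. Both constraints are present in both grammars; only their relative ranking differs, which is how the framework derives cross-linguistic variation.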
Age of acquisition (AOA) is a psycholinguistic variable that significantly influences behavioural measures (response times and accuracy rates) in tasks that require lexical and semantic processing. Unlike the origin of semantic typicality (TYP), which is assumed to lie at the semantic level, the origin of AOA effects is debated. Different theories propose that AOA effects originate either at the semantic level or at the link between semantics and phonology (lemma level).
The dissertation investigates the influence of AOA and its interdependence with the semantic variable TYP on semantic processing in particular, in order to pinpoint the origin of AOA effects. To this end, three studies were conducted that examined AOA and TYP in semantic processing tasks (category verification and animacy decision) using behavioural and, in part, electrophysiological (ERP) data in different populations (healthy young and elderly participants and semantically impaired individuals with aphasia (IWA)).
The behavioural and electrophysiological data of the three studies provide evidence for distinct processing levels of the variables AOA and TYP. The data further support previous assumptions on a semantic origin for TYP but question the same for AOA. The findings, however, support an origin of AOA effects at the transition between the word form (phonology) and the semantic level that can be captured at the behavioural but not at the electrophysiological level.
This dissertation explores whether the processing of ellipsis is affected by changes in the complexity of the antecedent, either due to added linguistic material or to the presence of a temporary ambiguity. Murphy (1985) hypothesized that ellipsis is resolved via a string copying procedure when the antecedent is within the same sentence, and that copying longer strings takes more time. Such an account also implies that the antecedent is copied without its structure, which in turn implies that recomputing its syntax and semantics may be necessary at the ellipsis gap. Alternatively, several accounts predict null effects of antecedent complexity, as well as no reparsing. These either involve a structure copying mechanism that is cost-free and whose finishing time is thus independent of the form of the antecedent (Frazier & Clifton, 2001), treat ellipsis as a pointer into content-addressable memory with direct access (Martin & McElree, 2008, 2009), or assume that one structure is ‘shared’ between antecedent and gap (Frazier & Clifton, 2005).
In a self-paced reading study on German sluicing, temporarily ambiguous garden-path clauses were used as antecedents, but no evidence of reparsing in the form of a slowdown at the ellipsis site was found. Instead, results suggest that antecedents which had been reanalyzed from an initially incorrect structure were easier to retrieve at the gap. This finding can be explained within the framework of cue-based retrieval parsing (Lewis & Vasishth, 2005), where additional syntactic operations on a structure yield memory reactivation effects.
Two further self-paced reading studies on German bare argument ellipsis and English verb phrase ellipsis investigated whether adding linguistic content to the antecedent would increase processing times for the ellipsis, and whether insufficiently demanding comprehension tasks may have been responsible for earlier null results (Frazier & Clifton, 2000; Martin & McElree, 2008). It has also been suggested that increased antecedent complexity should shorten rather than lengthen retrieval times by providing more unique memory features (Hofmeister, 2011). Both experiments failed to yield reliable evidence that antecedent complexity affects ellipsis processing times in either direction, irrespective of task demands.
Finally, two eye-tracking studies probed more deeply into the proposed reactivation-induced speedup found in the first experiment. The first study used three different kinds of French garden-path sentences as antecedents, with two of them failing to yield evidence for reactivation. Moreover, the third sentence type showed evidence suggesting that having failed to assign a structure to the antecedent leads to a slowdown at the ellipsis site, as well as regressions towards the ambiguous part of the sentence. The second eye-tracking study used the same materials as the initial self-paced reading study on German, with results showing a pattern similar to the one originally observed, with some notable differences.
Overall, the experimental results are compatible with the view that adding linguistic material to the antecedent has no or very little effect on the ease with which ellipsis is resolved, which is consistent with the predictions of cost-free copying, pointer-based approaches and structure sharing. Additionally, effects of the antecedent’s parsing history on ellipsis processing may be due to reactivation, the availability of multiple representations in memory, or complete failure to retrieve a matching target.
This project describes the nominal, verbal and ‘truncation’ systems of Awing and explains the syntactic and semantic functions of the multifunctional lə́ (LE) morpheme in copular and wh-focused constructions. Awing is a Grassfields Bantu language spoken in the North West region of Cameroon. The work begins with morphological processes viz. deverbals, compounding, reduplication, borrowing and a thorough presentation of the pronominal system and takes on verbal categories viz. tense, aspect, mood, verbal extensions, negation, adverbs and triggers of a homorganic N(asal)-prefix that attaches to the verb and other verbal categories. Awing grammar also has a very unusual phenomenon whereby nouns and verbs take long and short forms. A chapter entitled truncation is dedicated to the phenomenon. It is observed that the truncation process does not apply to bare singular NPs, proper names and nouns derived via morphological processes. On the other hand, with the exception of the 1st person non-emphatic possessive determiner and the class 7 noun prefix, nouns generally take the truncated form with modifiers (i.e., articles, demonstratives and other possessives). It is concluded that nominal truncation depicts movement within the DP system (Abney 1987). Truncation of the verb occurs in three contexts: a mass/plurality conspiracy (or lattice structuring in terms of Link 1983) between the verb and its internal argument (i.e., direct object); a means to align (exhaustive) focus (in terms of Féry 2013); and a means to form polar questions.
The second part of the work focuses on the role of the LE morpheme in copular and wh-focused clauses. Firstly, the syntax of the Awing copular clause is presented and it is shown that copular clauses in Awing have ‘subject-focus’ vs ‘topic-focus’ partitions and that the LE morpheme indirectly relates such functions. Semantically, it is shown that LE does not express contrast or exhaustivity in copular clauses. Turning to wh-constructions, the work adheres to Hamblin’s (1973) idea that the meaning of a question is the set of its possible answers and, based on Rooth’s (1985) underspecified semantic notion of alternative focus, concludes that the LE morpheme is not a Focus Marker (FM) in Awing: LE does not generate or indicate the presence of alternatives (Krifka 2007); the LE morpheme can associate with wh-elements as a focus-sensitive operator with semantic import that operates on the focus alternatives by presupposing an exhaustive answer, among other notions. With focalized categories, the project further substantiates, via a number of diagnostics, the claim in Fominyam & Šimík (2017) that exhaustivity is part of the semantics of the LE morpheme and not derived via contextual implicature. Hence, unlike in copular clauses, the LE morpheme with wh-focused categories is analysed as a morphological exponent of a functional head Exh corresponding to Horvath's (2010) EI (Exhaustive Identification). The work ends with the syntax of verb focus and negation and modifies the idea in Fominyam & Šimík (2017), namely that the focalized verb that associates with the exhaustive (LE) particle is a lower copy of the finite verb that has been moved to Agr. It is argued that the LE-focused verb ‘cluster’ is an instantiation of adjunction. The conclusion is that verb doubling with verb focus in Awing is neither a realization of two copies of one and the same verb (Fominyam & Šimík 2017), nor a result of a copy triggered by a focus marker (Aboh & Dyakonova 2009). Rather, the focalized copy is said to be merged directly as the complement of LE, forming a type of adjoining cluster.
The main goal of this thesis is to explore the feasibility of using cross-lingual annotation projection as a method of alleviating the task of manual coreference annotation.
To reach our goal, we build a first trilingual parallel coreference corpus that encompasses multiple genres. For the annotation of the corpus, we develop common coreference annotation guidelines that are applicable to three languages (English, German, Russian) and include a novel domain-independent typology of bridging relations as well as state-of-the-art near-identity categories.
Thereafter, we design and perform several annotation projection experiments. In the first experiment, we implement a direct projection method with only one source language. Our results indicate that, even in a knowledge-lean scenario, our projection approach is superior to the most closely related work of Postolache et al. (2006). Since the quality of the resulting annotations depends to a high degree on the word alignment, we demonstrate how using limited syntactic information helps to further improve mention extraction on the target side. As a next step, in our second experiment, we show that concatenating annotations projected from two source languages improves the quality of target annotations for both language pairs. Finally, we assess the projection quality in a fully automatic scenario (using automatically produced source annotations), and propose a pilot experiment on manual projection of bridging pairs.
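The basic idea of direct projection through word alignments can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: mentions are taken to be inclusive token-index spans, the alignment is a set of (source index, target index) pairs, and a projected mention is simply the smallest target span covering all aligned tokens. The function names and the toy alignment are hypothetical.

```python
def project_mention(span, alignment):
    """Project one source mention to the target side.

    span: (start, end) source token indices, inclusive.
    alignment: set of (source_index, target_index) pairs.
    Returns the smallest target span covering every target token aligned
    to the mention, or None if no token of the mention is aligned."""
    start, end = span
    targets = [t for s, t in alignment if start <= s <= end]
    if not targets:
        return None
    return (min(targets), max(targets))

def project_chains(chains, alignment):
    """Project coreference chains (lists of mention spans), dropping
    mentions that cannot be projected and chains that end up empty."""
    projected = []
    for chain in chains:
        spans = [project_mention(mention, alignment) for mention in chain]
        spans = [s for s in spans if s is not None]
        if spans:
            projected.append(spans)
    return projected

# Toy alignment: source tokens 1 and 2 are reordered on the target side,
# and source token 3 has no alignment link at all.
alignment = {(0, 0), (1, 2), (2, 1), (4, 4)}
chains = [[(0, 1), (4, 4)], [(3, 3)]]
projected = project_chains(chains, alignment)
```

In this toy case the mention covering source tokens 0 to 1 grows to target tokens 0 to 2 because of the reordering, and the unaligned mention is lost. This is the kind of sensitivity to alignment quality that motivates the use of additional syntactic information on the target side.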
For each of the experiments, we carry out an in-depth error analysis, and we conclude that noisy word alignments, translation divergences and morphological and syntactic differences between languages are responsible for projection errors. We systematically compare and evaluate our projection methods, and we investigate the errors both qualitatively and quantitatively in order to identify problematic cases. Finally, we discuss the applicability of our method to coreference annotations and propose several avenues of future research.
The aim of this thesis is to develop approaches to automatically recognise the structure of argumentation in short monological texts. This amounts to identifying the central claim of the text, supporting premises, possible objections, and counter-objections to these objections, and connecting them correspondingly to a structure that adequately describes the argumentation presented in the text.
The first step towards such an automatic analysis of the structure of argumentation is to know how to represent it. We systematically review the literature on theories of discourse, as well as on theories of the structure of argumentation against a set of requirements and desiderata, and identify the theory of J. B. Freeman (1991, 2011) as a suitable candidate to represent argumentation structure. Based on this, a scheme is derived that is able to represent complex argumentative structures and can cope with various segmentation issues typically occurring in authentic text.
In order to empirically test our scheme for reliability of annotation, we conduct several annotation experiments, the most important of which assesses the agreement in reconstructing argumentation structure. The results show that expert annotators produce very reliable annotations, while the results of non-expert annotators highly depend on their training in and commitment to the task.
We then introduce the 'microtext' corpus, a collection of short argumentative texts. We report on its creation, translation, and annotation and provide a variety of statistics. It is the first parallel corpus (with a German and an English version) annotated with argumentation structure and, thanks to the work of our colleagues, also the first annotated according to multiple theories of (global) discourse structure.
The corpus is then used to develop and evaluate approaches to automatically predict argumentation structures in a series of six studies: The first two of them focus on learning local models for different aspects of argumentation structure. In the third study, we develop the main approach proposed in this thesis for predicting globally optimal argumentation structures: the 'evidence graph' model. This model is then systematically compared to other approaches in the fourth study, and achieves state-of-the-art results on the microtext corpus. The remaining two studies aim to demonstrate the versatility and elegance of the proposed approach by predicting argumentation structures of different granularity from text, and finally by using it to translate rhetorical structure representations into argumentation structures.