The evaluation of process-oriented cognitive theories through time-ordered observations is crucial for the advancement of cognitive science. The findings presented herein integrate insights from research on eye-movement control and sentence comprehension during reading, addressing challenges in modeling time-ordered data, statistical inference, and interindividual variability. Using kernel density estimation and a pseudo-marginal likelihood for fixation durations and locations, a likelihood implementation of the SWIFT model of eye-movement control during reading (Engbert et al., Psychological Review, 112, 2005, pp. 777–813) is proposed. Within the broader framework of data assimilation, Bayesian parameter inference with adaptive Markov Chain Monte Carlo techniques is facilitated for reliable model fitting. Across the different studies, this framework has been shown to enable reliable parameter recovery from simulated data and prediction of experimental summary statistics. Despite its complexity, SWIFT can be fitted within a principled Bayesian workflow, capturing interindividual differences and modeling experimental effects on reading across different geometrical alterations of text. Based on these advancements, the integrated dynamical model SEAM is proposed, which combines eye-movement control, a traditionally psychological research area, with post-lexical language processing in the form of cue-based memory retrieval (Lewis & Vasishth, Cognitive Science, 29, 2005, pp. 375–419), typically the purview of psycholinguistics. This proof-of-concept integration marks a significant step forward in modeling natural language comprehension during reading and suggests that the presented methodology can be useful for developing complex cognitive dynamical models that integrate processes at the levels of perception, higher cognition, and (oculo-)motor control.
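The simulation-based likelihood approach described above can be illustrated with a minimal, self-contained sketch. All distributions, parameter names, and numbers below are illustrative stand-ins, not the actual SWIFT likelihood: fixation durations are simulated from a toy generative model at proposed parameter values, a kernel density estimate over the simulations serves as a noisy likelihood estimate, and that estimate is plugged into a Metropolis-Hastings sampler (the pseudo-marginal idea).

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_durations(mu, n=1000):
    """Stand-in generative model: lognormal fixation durations (ms).
    The real SWIFT model is far more complex; mu is a toy parameter."""
    return rng.lognormal(mean=mu, sigma=0.4, size=n)

def kde_loglik(observed, mu):
    """Pseudo-marginal log-likelihood: Gaussian KDE over freshly
    simulated durations, evaluated at the observed durations."""
    sims = simulate_durations(mu)
    bw = sims.std() * len(sims) ** (-1 / 5)  # Scott's rule bandwidth
    z = (observed[:, None] - sims[None, :]) / bw
    dens = np.exp(-0.5 * z ** 2).mean(axis=1) / (bw * np.sqrt(2 * np.pi))
    return float(np.sum(np.log(dens + 1e-300)))

# Synthetic "observed" data with true mu = 5.5 (durations around 245 ms)
observed = rng.lognormal(mean=5.5, sigma=0.4, size=200)

# Metropolis-Hastings with the noisy KDE likelihood (flat prior on mu)
mu, ll = 5.0, kde_loglik(observed, 5.0)
samples = []
for _ in range(400):
    prop = mu + rng.normal(0.0, 0.05)
    ll_prop = kde_loglik(observed, prop)
    if np.log(rng.uniform()) < ll_prop - ll:
        mu, ll = prop, ll_prop
    samples.append(mu)

posterior_mean = float(np.mean(samples[200:]))  # discard burn-in
print(round(posterior_mean, 2))  # should land near the true value 5.5
```

Because the KDE is rebuilt from fresh simulations at every proposal, the acceptance step uses a noisy likelihood estimate rather than an exact density, which is the defining feature of pseudo-marginal MCMC.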
These findings collectively advance process-oriented cognitive modeling and highlight the importance of Bayesian inference, individual differences, and interdisciplinary integration for a holistic understanding of reading processes. Implications for theory and methodology, including proposals for model comparison and hierarchical parameter inference, are briefly discussed.
This thesis investigates the processing of non-canonical word orders and whether non-canonical orders involving object topicalizations, midfield scrambling and particle verbs are treated the same by native (L1) and non-native (L2) speakers. The two languages investigated are Norwegian and German.
32 L1 Norwegian speakers and 32 L1 German advanced learners of Norwegian were tested in two experiments on object topicalization in Norwegian. The results from the online self-paced reading task and the offline agent identification task show that both groups are able to identify the non-canonical word order and show a facilitatory effect of animate subjects in their reanalysis. Similarly high error rates in the agent identification task suggest that globally unambiguous object topicalizations are a challenging structure for L1 and L2 speakers alike.
The same participants were also tested in two experiments on particle placement in Norwegian, again using a self-paced reading task, this time combined with an acceptability rating task. In the acceptability rating task, L1 and L2 speakers show the same preference for the verb-adjacent placement of the particle over the non-adjacent placement after the direct object. However, this preference for adjacency is found only in the L1 group during online processing, whereas the L2 group shows no preference for either order.
Another set of experiments tested 33 L1 German and 39 L1 Slavic advanced learners of German on object scrambling in ditransitive clauses in German. Non-native speakers accept both object orders and show neither a preference for either order nor a processing advantage for the canonical order. The L1 group, in contrast, shows a small but significant preference for the canonical dative-first order in both the judgment and the reading task.
The same participants were also tested in two experiments on the application of the split rule in German particle verbs. Advanced L2 speakers of German are able to identify particle verbs and can apply the split rule in V2 contexts in an acceptability judgment task in the same way as L1 speakers. However, unlike the L1 group, the L2 group is not sensitive to the grammaticality manipulation during online processing. They seem to be sensitive to the additional lexical information provided by the particle, but are unable to relate the split particle to the preceding verb and recognize the ungrammaticality in non-V2 contexts.
Taken together, my findings suggest that non-canonical word orders are not per se more difficult to identify for L2 speakers than for L1 speakers and can trigger the same reanalysis processes as in L1 speakers. I argue that L2 speakers’ ability to identify a non-canonical word order depends on how the non-canonicity is signaled (case marking vs. surface word order), on the constituents involved (identical vs. different word types), and on the impact of the word order change on sentence meaning. Non-canonical word orders that are signaled by morphological case marking and cause no change to the sentence’s content are hard for L2 speakers to detect.
A large body of research now supports the presence of both syntactic and lexical predictions in sentence processing. Lexical predictions, in particular, are considered to indicate a deep level of predictive processing that extends past the structural features of a necessary word (e.g. noun), right down to the phonological features of the lexical identity of a specific word (e.g. /kite/; DeLong et al., 2005). However, evidence for lexical predictions typically focuses on predictions in very local environments, such as the adjacent word or words (DeLong et al., 2005; Van Berkum et al., 2005; Wicha et al., 2004). Predictions in such local environments may be indistinguishable from lexical priming, which is transient and uncontrolled, and as such may prime lexical items that are not compatible with the context (e.g. Kukona et al., 2014). Predictive processing has been argued to be a controlled process, with top-down information guiding preactivation of plausible upcoming lexical items (Kuperberg & Jaeger, 2016). One way to distinguish lexical priming from prediction is to demonstrate that preactivated lexical content can be maintained over longer distances.
In this dissertation, separable German particle verbs are used to demonstrate that preactivation of lexical items can be maintained over multi-word distances. A self-paced reading experiment and an eye-tracking experiment provide some support for the idea that particle preactivation triggered by a verb and its context can be observed by holding the sentence context constant and manipulating the predictability of the particle. Although evidence of an effect of particle predictability was seen only in eye tracking, this is consistent with previous evidence suggesting that predictive processing facilitates only some eye-tracking measures, to which the self-paced reading modality may not be sensitive (Staub, 2015; Rayner, 1998). Interestingly, manipulating the distance between the verb and the particle did not affect reading times, suggesting that the faster reading times at long distance predicted by surprisal may only occur when the additional distance is created by material that provides information about the lexical identity of the distant element (Levy, 2008; Grodner & Gibson, 2005). Furthermore, the results provide support for models proposing that temporal decay is not a major influence on word processing (Lewandowsky et al., 2009; Vasishth et al., 2019).
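Surprisal (Levy, 2008) quantifies the processing cost of a word as its negative log probability given the preceding context. A minimal illustration, using made-up probabilities rather than values from the experiments: if added material before the particle raises the conditional probability of that particle, its surprisal, and hence its predicted reading time, drops.

```python
import math

def surprisal(p_word_given_context):
    """Surprisal in bits: -log2 P(word | context) (Levy, 2008)."""
    return -math.log2(p_word_given_context)

# Hypothetical probabilities of a particle such as "auf" at the verb:
short_context = surprisal(0.10)  # little constraining information
long_context = surprisal(0.40)   # added material constrains the particle
print(round(short_context, 2), round(long_context, 2))  # 3.32 1.32
```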
In the third and fourth experiments, event-related potentials (ERPs) were used as a method for detecting specific lexical predictions. In the initial ERP experiment, we found some support for the presence of lexical predictions when the sentence context constrained the number of plausible particles to a single particle. This was suggested by a frontal post-N400 positivity (PNP) that was elicited when a lexical prediction had been violated, but not when more than one particle had been plausible. The results of this study were highly consistent with previous research suggesting that the PNP might be a much sought-after ERP marker of prediction failure (DeLong et al., 2011; DeLong et al., 2014; Van Petten & Luka, 2012; Thornhill & Van Petten, 2012; Kuperberg et al., 2019). However, a second experiment with a larger sample failed to replicate the effect, but suggested that the relationship of the PNP to predictive processing may not yet be fully understood. Evidence for long-distance lexical predictions was inconclusive.
The conclusion drawn from the four experiments is that preactivation of the lexical entries of plausible upcoming particles did occur and was maintained over long distances. The facilitatory effect of this preactivation at the particle site therefore did not appear to be the result of transient lexical priming. However, the question of whether this preactivation can also lead to lexical predictions of a specific particle remains unanswered. Of particular interest to future research on predictive processing is further characterisation of the PNP. Implications for models of sentence processing may include the addition of long-distance lexical predictions, or the possibility that preactivation of lexical material can facilitate reading times and ERP amplitudes without commitment to a specific lexical item.
Successful sentence comprehension requires the comprehender to correctly figure out who did what to whom. For example, in the sentence John kicked the ball, the comprehender has to figure out who did the action of kicking and what was being kicked. This process of identifying and connecting the syntactically related words in a sentence is called dependency completion. What are the cognitive constraints that determine dependency completion? A widely accepted theory is cue-based retrieval. The theory maintains that dependency completion is driven by a content-addressable search for the co-dependents in memory. Cue-based retrieval explains a wide range of empirical data from several constructions, including subject-verb agreement, subject-verb non-agreement, plausibility mismatch configurations, and negative polarity items.
However, there are two major empirical challenges to the theory: (i) data from grammatical sentences with subject-verb number agreement dependencies, where the theory predicts a slowdown at the verb in sentences like the key to the cabinet was rusty compared to the key to the cabinets was rusty, but the data are inconsistent with this prediction; and (ii) data from antecedent-reflexive dependencies, where a facilitation in reading times is predicted at the reflexive in the bodybuilder who worked with the trainers injured themselves vs. the bodybuilder who worked with the trainer injured themselves, but the data do not show a facilitatory effect.
The work presented in this dissertation is dedicated to building a more general theory of dependency completion that can account for the above two datasets without losing the original empirical coverage of the cue-based retrieval assumption. In two journal articles, I present computational modeling work that addresses the above two empirical challenges.
To explain the data from grammatical sentences with subject-verb number agreement dependencies, I propose a new model that assumes that cue-based retrieval operates on a probabilistically distorted representation of nouns in memory (Article I). This hybrid distortion-plus-retrieval model was compared against the existing candidate models using data from 17 studies on subject-verb number agreement in 4 languages. I find that the hybrid model outperforms the existing models of number agreement processing, suggesting that the cue-based retrieval theory must incorporate a feature distortion assumption.
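The distortion idea can be conveyed with a minimal sketch (feature names and the distortion probability are illustrative, not the fitted model from Article I): before retrieval, a noun's number feature occasionally flips, so even a grammatical singular subject sometimes fails to fully match the verb's retrieval cues, predicting occasional slowdowns in grammatical sentences.

```python
import random

random.seed(1)

DISTORT_P = 0.15  # illustrative distortion probability

def distort(noun):
    """With probability DISTORT_P, flip the noun's number feature."""
    if random.random() < DISTORT_P:
        return dict(noun, number='pl' if noun['number'] == 'sg' else 'sg')
    return noun

def match_score(cues, noun):
    """Number of retrieval cues matched by the (possibly distorted) noun."""
    return sum(noun.get(k) == v for k, v in cues.items())

# Verb cues in "the key ... was": retrieve a singular subject.
subject = {'word': 'key', 'subject': True, 'number': 'sg'}
cues = {'subject': True, 'number': 'sg'}

trials = 10_000
full_matches = sum(match_score(cues, distort(subject)) == 2
                   for _ in range(trials))
print(full_matches / trials)  # ≈ 1 - DISTORT_P, i.e. about 0.85
```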
To account for the absence of a facilitatory effect in antecedent-reflexive dependencies, I propose an individual-differences model built within the cue-based retrieval framework (Article II). The model assumes that individuals may differ in how strongly they weigh a syntactic cue relative to a number cue. The model was fitted to data from two studies on antecedent-reflexive dependencies, and participant-level cue weightings were estimated. We find that, in both studies, one-fourth of the participants weigh the syntactic cue higher than the number cue in processing reflexive dependencies, while the remaining participants weigh the two cues equally. This result indicates that the absence of the predicted facilitatory effect at the level of grouped data is driven by some, not all, participants, namely those who weigh the syntactic cue higher than the number cue. More generally, the result demonstrates that the assumption of differential cue weighting is important for a theory of dependency completion. This differential cue weighting idea was independently supported by a modeling study on subject-verb non-agreement dependencies (Article III).
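The cue-weighting idea can be sketched as follows (feature names and weight values are illustrative, not the estimates from Article II): with equal weights, the structurally inaccessible plural noun matches the plural reflexive's cues as strongly as the true antecedent, predicting interference, whereas a higher syntactic weight lets the antecedent win outright.

```python
def weighted_match(cues, item, weights):
    """Sum of weights for each retrieval cue the item matches."""
    return sum(weights[k] for k, v in cues.items() if item.get(k) == v)

# "the bodybuilder who worked with the trainers injured themselves":
antecedent = {'word': 'bodybuilder', 'c_command': True, 'number': 'sg'}
distractor = {'word': 'trainers', 'c_command': False, 'number': 'pl'}
cues = {'c_command': True, 'number': 'pl'}  # cues set by the reflexive

equal_w = {'c_command': 1.0, 'number': 1.0}   # equal-weighting individual
syntax_w = {'c_command': 2.0, 'number': 0.5}  # syntax-weighted individual

for label, w in [('equal', equal_w), ('syntax-weighted', syntax_w)]:
    print(label,
          weighted_match(cues, antecedent, w),
          weighted_match(cues, distractor, w))
# equal weights: tie (2 vs 2 cues split 1.0/1.0) -> distractor competes
# syntactic weighting: antecedent dominates -> no facilitatory interference
```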
Overall, cue-based retrieval, as a general theory of dependency completion, needs to incorporate two new assumptions: (i) nouns stored in memory can undergo probabilistic feature distortion, and (ii) the linguistic cues used for retrieval can be weighted differentially. This is the cumulative result of the modeling work presented in this dissertation.
The dissertation makes an important theoretical contribution: sentence comprehension in humans is driven by a mechanism that combines cue-based retrieval, probabilistic feature distortion, and differential cue weighting. This insight is theoretically important because there is some independent support for these three assumptions in the sentence processing and broader memory literature. The modeling work presented here is also methodologically important because, for the first time, it demonstrates (i) how complex models of sentence processing can be evaluated using data from multiple studies simultaneously, without oversimplifying the models, and (ii) how inferences drawn from individual-level behavior can be used in theory development.