Refine
Year of publication
Document Type
- Article (60) (remove)
Is part of the Bibliography
- yes (60) (remove)
Keywords
- German (6)
- Eye movements (5)
- Reading (5)
- Sentence processing (5)
- interference (5)
- Bayesian data analysis (4)
- Cue-based retrieval (4)
- eye-tracking (4)
- Aphasia (3)
- ERP (3)
Institute
- Department Linguistik (60) (remove)
In eye-movement control during reading, advanced process-oriented models have been developed to reproduce behavioral data. So far, model complexity and large numbers of model parameters prevented rigorous statistical inference and modeling of interindividual differences. Here we propose a Bayesian approach to both problems for one representative computational model of sentence reading (SWIFT; Engbert et al., Psychological Review, 112, 2005, pp. 777-813). We used experimental data from 36 subjects who read the text in a normal and one of four manipulated text layouts (e.g., mirrored and scrambled letters). The SWIFT model was fitted to subjects and experimental conditions individually to investigate between- subject variability. Based on posterior distributions of model parameters, fixation probabilities and durations are reliably recovered from simulated data and reproduced for withheld empirical data, at both the experimental condition and subject levels. A subsequent statistical analysis of model parameters across reading conditions generates model-driven explanations for observable effects between conditions.
We present a computational evaluation of three hypotheses about sources of deficit in sentence comprehension in aphasia: slowed processing, intermittent deficiency, and resource reduction. The ACT-R based Lewis and Vasishth (2005) model is used to implement these three proposals. Slowed processing is implemented as slowed execution time of parse steps; intermittent deficiency as increased random noise in activation of elements in memory; and resource reduction as reduced spreading activation. As data, we considered subject vs. object relative sentences, presented in a self-paced listening modality to 56 individuals with aphasia (IWA) and 46 matched controls. The participants heard the sentences and carried out a picture verification task to decide on an interpretation of the sentence. These response accuracies are used to identify the best parameters (for each participant) that correspond to the three hypotheses mentioned above. We show that controls have more tightly clustered (less variable) parameter values than IWA; specifically, compared to controls, among IWA there are more individuals with slow parsing times, high noise, and low spreading activation. We find that (a) individual IWA show differential amounts of deficit along the three dimensions of slowed processing, intermittent deficiency, and resource reduction, (b) overall, there is evidence for all three sources of deficit playing a role, and (c) IWA have a more variable range of parameter values than controls. An important implication is that it may be meaningless to talk about sources of deficit with respect to an abstract verage IWA; the focus should be on the individual's differential degrees of deficit along different dimensions, and on understanding the causes of variability in deficit between participants.
Traxler, Pickering, and Clifton (1998) found that ambiguous sentences are read faster than their unambiguous counterparts. This so-called ambiguity advantage has presented a major challenge to classical theories of human sentence comprehension (parsing) because its most prominent explanation, in the form of the unrestricted race model (URM), assumes that parsing is non-deterministic. Recently, Swets, Desmet, Clifton, and Ferreira (2008) have challenged the URM. They argue that readers strategically underspecify the representation of ambiguous sentences to save time, unless disambiguation is required by task demands. When disambiguation is required, however, readers assign sentences full structure—and Swets et al. provide experimental evidence to this end. On the basis of their findings, they argue against the URM and in favor of a model of task-dependent sentence comprehension. We show through simulations that the Swets et al. data do not constitute evidence for task-dependent parsing because they can be explained by the URM. However, we provide decisive evidence from a German self-paced reading study consistent with Swets et al.'s general claim about task-dependent parsing. Specifically, we show that under certain conditions, ambiguous sentences can be read more slowly than their unambiguous counterparts, suggesting that the parser may create several parses, when required. Finally, we present the first quantitative model of task-driven disambiguation that subsumes the URM, and we show that it can explain both Swets et al.'s results and our findings.
Among theories of human language comprehension, cue-based memory retrieval has proven to be a useful framework for understanding when and how processing difficulty arises in the resolution of long-distance dependencies. Most previous work in this area has assumed that very general retrieval cues like [+subject] or [+singular] do the work of identifying (and sometimes misidentifying) a retrieval target in order to establish a dependency between words. However, recent work suggests that general, handpicked retrieval cues like these may not be enough to explain illusions of plausibility (Cunnings & Sturt, 2018), which can arise in sentences like The letter next to the porcelain plate shattered. Capturing such retrieval interference effects requires lexically specific features and retrieval cues, but handpicking the features is hard to do in a principled way and greatly increases modeler degrees of freedom. To remedy this, we use well-established word embedding methods for creating distributed lexical feature representations that encode information relevant for retrieval using distributed retrieval cue vectors. We show that the similarity between the feature and cue vectors (a measure of plausibility) predicts total reading times in Cunnings and Sturt's eye-tracking data. The features can easily be plugged into existing parsing models (including cue-based retrieval and self-organized parsing), putting very different models on more equal footing and facilitating future quantitative comparisons.
Several studies (e.g., Wicha et al., 2003b; DeLong et al., 2005) have shown that readers use information from the sentential context to predict nouns (or some of their features), and that predictability effects can be inferred from the EEG signal in determiners or adjectives appearing before the predicted noun. While these findings provide evidence for the pre-activation proposal, recent replication attempts together with inconsistencies in the results from the literature cast doubt on the robustness of this phenomenon. Our study presents the first attempt to use the effect of gender on predictability in German to study the pre-activation hypothesis, capitalizing on the fact that all German nouns have a gender and that their preceding determiners can show an unambiguous gender marking when the noun phrase has accusative case. Despite having a relatively large sample size (of 120 subjects), both our preregistered and exploratory analyses failed to yield conclusive evidence for or against an effect of pre-activation. The sign of the effect is, however, in the expected direction: the more unexpected the gender of the determiner, the larger the negativity. The recent, inconclusive replication attempts by Nieuwland et al. (2018) and others also show effects with signs in the expected direction. We conducted a Bayesian random-ef-fects meta-analysis using our data and the publicly available data from these recent replication attempts. Our meta-analysis shows a relatively clear but very small effect that is consistent with the pre-activation account and demonstrates a very important advantage of the Bayesian data analysis methodology: we can incrementally accumulate evidence to obtain increasingly precise estimates of the effect of interest.
Argument-head distance and processing complexity: Explaining both locality and antilocality effects
(2006)
Although proximity between arguments and verbs (locality) is a relatively robust determinant of sentence-processing difficulty (Hawkins 1998, 2001, Gibson 2000), increasing argument-verb distance can also facilitate processing (Konieczny 2000). We present two self-paced reading (SPR) experiments involving Hindi that provide further evidence of antilocality, and a third SPR experiment which suggests that similarity-based interference can attenuate this distance-based facilitation. A unified explanation of interference, locality, and antilocality effects is proposed via an independently motivated theory of activation decay and retrieval interference (Anderson et al. 2004).*
Linear mixed-effects models have increasingly replaced mixed-model analyses of variance for statistical inference in factorial psycholinguistic experiments. Although LMMs have many advantages over ANOVA, like ANOVAs, setting them up for data analysis also requires some care. One simple option, when numerically possible, is to fit the full variance covariance structure of random effects (the maximal model; Barr, Levy, Scheepers & Tily, 2013), presumably to keep Type I error down to the nominal a in the presence of random effects. Although it is true that fitting a model with only random intercepts may lead to higher Type I error, fitting a maximal model also has a cost: it can lead to a significant loss of power. We demonstrate this with simulations and suggest that for typical psychological and psycholinguistic data, higher power is achieved without inflating Type I error rate if a model selection criterion is used to select a random effect structure that is supported by the data. (C) 2017 The Authors. Published by Elsevier Inc.
This tutorial analyzes voice onset time (VOT) data from Dongbei (Northeastern) Mandarin Chinese and North American English to demonstrate how Bayesian linear mixed models can be fit using the programming language Stan via the R package brms. Through this case study, we demonstrate some of the advantages of the Bayesian framework: researchers can (i) flexibly define the underlying process that they believe to have generated the data; (ii) obtain direct information regarding the uncertainty about the parameter that relates the data to the theoretical question being studied; and (iii) incorporate prior knowledge into the analysis. Getting started with Bayesian modeling can be challenging, especially when one is trying to model one’s own (often unique) data. It is difficult to see how one can apply general principles described in textbooks to one’s own specific research problem. We address this barrier to using Bayesian methods by providing three detailed examples, with source code to allow easy reproducibility. The examples presented are intended to give the reader a flavor of the process of model-fitting; suggestions for further study are also provided. All data and code are available from: https://osf.io/g4zpv.
Background: In addition to the canonical subject-verb-object (SVO) word order, German also allows for non-canonical order (OVS), and the case-marking system supports thematic role interpretation. Previous eye-tracking studies (Kamide et al., 2003; Knoeferle, 2007) have shown that unambiguous case information in non-canonical sentences is processed incrementally. For individuals with agrammatic aphasia, comprehension of non-canonical sentences is at chance level (Burchert et al., 2003). The trace deletion hypothesis (Grodzinsky 1995, 2000) claims that this is due to structural impairments in syntactic representations, which force the individual with aphasia (IWA) to apply a guessing strategy. However, recent studies investigating online sentence processing in aphasia (Caplan et al., 2007; Dickey et al., 2007) found that divergences exist in IWAs' sentence-processing routines depending on whether they comprehended non-canonical sentences correctly or not, pointing rather to a processing deficit explanation. Aims: The aim of the current study was to investigate agrammatic IWAs' online and offline sentence comprehension simultaneously in order to reveal what online sentence-processing strategies they rely on and how these differ from controls' processing routines. We further asked whether IWAs' offline chance performance for non-canonical sentences does indeed result from guessing. Methods Procedures: We used the visual-world paradigm and measured eye movements (as an index of online sentence processing) of controls (N = 8) and individuals with aphasia (N = 7) during a sentence-picture matching task. Additional offline measures were accuracy and reaction times. Outcomes Results: While the offline accuracy results corresponded to the pattern predicted by the TDH, IWAs' eye movements revealed systematic differences depending on the response accuracy. Conclusions: These findings constitute evidence against attributing IWAs' chance performance for non-canonical structures to mere guessing. Instead, our results support processing deficit explanations and characterise the agrammatic parser as deterministic and inefficient: it is slowed down, affected by intermittent deficiencies in performing syntactic operations, and fails to compute reanalysis even when one is detected.
Dynamical models make specific assumptions about cognitive processes that generate human behavior. In data assimilation, these models are tested against timeordered data. Recent progress on Bayesian data assimilation demonstrates that this approach combines the strengths of statistical modeling of individual differences with the those of dynamical cognitive models.
Scanpaths have played an important role in classic research on reading behavior. Nevertheless, they have largely been neglected in later research perhaps due to a lack of suitable analytical tools. Recently, von der Malsburg and Vasishth (2011) proposed a new measure for quantifying differences between scanpaths and demonstrated that this measure can recover effects that were missed with the traditional eyetracking measures. However, the sentences used in that study were difficult to process and scanpath effects accordingly strong. The purpose of the present study was to test the validity, sensitivity, and scope of applicability of the scanpath measure, using simple sentences that are typically read from left to right. We derived predictions for the regularity of scanpaths from the literature on oculomotor control, sentence processing, and cognitive aging and tested these predictions using the scanpath measure and a large database of eye movements. All predictions were confirmed: Sentences with short words and syntactically more difficult sentences elicited more irregular scanpaths. Also, older readers produced more irregular scanpaths than younger readers. In addition, we found an effect that was not reported earlier: Syntax had a smaller influence on the eye movements of older readers than on those of young readers. We discuss this interaction of syntactic parsing cost with age in terms of shifts in processing strategies and a decline of executive control as readers age. Overall, our results demonstrate the validity and sensitivity of the scanpath measure and thus establish it as a productive and versatile tool for reading research.
In two self-paced reading experiments, we investigated the effect of changes in antecedent complexity on processing times for ellipsis. Pointer- or “sharing”-based approaches to ellipsis processing (Frazier & Clifton 2001, 2005; Martin & McElree 2008) predict no effect of antecedent complexity on reading times at the ellipsis site while other accounts predict increased antecedent complexity to either slow down processing (Murphy 1985) or to speed it up (Hofmeister 2011). Experiment 1 manipulated antecedent complexity and elision, yielding evidence against a speedup at the ellipsis site and in favor of a null effect. In order to investigate possible superficial processing on part of participants, Experiment 2 manipulated the amount of attention required to correctly respond to end-of-sentence comprehension probes, yielding evidence against a complexity-induced slowdown at the ellipsis site. Overall, our results are compatible with pointer-based approaches while casting doubt on the notion that changes antecedent complexity lead to measurable differences in ellipsis processing speed.
Previous studies have suggested that distinctive case marking on noun phrases reduces attraction effects in production, i.e., the tendency to produce a verb that agrees with a nonsubject noun. An important open question is whether attraction effects are modulated by case information in sentence comprehension. To address this question, we conducted three attraction experiments in Armenian, a language with a rich and productive case system. The experiments showed clear attraction effects, and they also revealed an overall role of case marking such that participants showed faster response and reading times when the nouns in the sentence had different case. However, we found little indication that distinctive case marking modulated attraction effects. We present a theoretical proposal of how case and number information may be used differentially during agreement licensing in comprehension. More generally, this work sheds light on the nature of the retrieval cues deployed when completing morphosyntactic dependencies.
In this paper we examine the effect of uncertainty on readers’ predictions about meaning. In particular, we were interested in how uncertainty might influence the likelihood of committing to a specific sentence meaning. We conducted two event-related potential (ERP) experiments using particle verbs such as turn down and manipulated uncertainty by constraining the context such that readers could be either highly certain about the identity of a distant verb particle, such as turn the bed […] down, or less certain due to competing particles, such as turn the music […] up/down. The study was conducted in German, where verb particles appear clause-finally and may be separated from the verb by a large amount of material. We hypothesised that this separation would encourage readers to predict the particle, and that high certainty would make prediction of a specific particle more likely than lower certainty. If a specific particle was predicted, this would reflect a strong commitment to sentence meaning that should incur a higher processing cost if the prediction is wrong. If a specific particle was less likely to be predicted, commitment should be weaker and the processing cost of a wrong prediction lower. If true, this could suggest that uncertainty discourages predictions via an unacceptable cost-benefit ratio. However, given the clear predictions made by the literature, it was surprisingly unclear whether the uncertainty manipulation affected the two ERP components studied, the N400 and the PNP. Bayes factor analyses showed that evidence for our a priori hypothesised effect sizes was inconclusive, although there was decisive evidence against a priori hypothesised effect sizes larger than 1μV for the N400 and larger than 3μV for the PNP. We attribute the inconclusive finding to the properties of verb-particle dependencies that differ from the verb-noun dependencies in which the N400 and PNP are often studied.
In this paper we examine the effect of uncertainty on readers' predictions about meaning. In particular, we were interested in how uncertainty might influence the likelihood of committing to a specific sentence meaning. We conducted two event-related potential (ERP) experiments using particle verbs such as turn down and manipulated uncertainty by constraining the context such that readers could be either highly certain about the identity of a distant verb particle, such as turn the bed [...] down, or less certain due to competing particles, such as turn the music [...] up/down. The study was conducted in German, where verb particles appear clause-finally and may be separated from the verb by a large amount of material. We hypothesised that this separation would encourage readers to predict the particle, and that high certainty would make prediction of a specific particle more likely than lower certainty. If a specific particle was predicted, this would reflect a strong commitment to sentence meaning that should incur a higher processing cost if the prediction is wrong. If a specific particle was less likely to be predicted, commitment should be weaker and the processing cost of a wrong prediction lower. If true, this could suggest that uncertainty discourages predictions via an unacceptable cost-benefit ratio. However, given the clear predictions made by the literature, it was surprisingly unclear whether the uncertainty manipulation affected the two ERP components studied, the N400 and the PNP. Bayes factor analyses showed that evidence for our a priori hypothesised effect sizes was inconclusive, although there was decisive evidence against a priori hypothesised effect sizes larger than 1 mu Vfor the N400 and larger than 3 mu V for the PNP. We attribute the inconclusive finding to the properties of verb-particle dependencies that differ from the verb-noun dependencies in which the N400 and PNP are often studied.
We used Chinese prenominal relative clauses (RCs) to test the predictions of two competing accounts of sentence comprehension difficulty: the experience-based account of Levy () and the Dependency Locality Theory (DLT; Gibson, ). Given that in Chinese RCs, a classifier and/or a passive marker BEI can be added to the sentence-initial position, we manipulated the presence/absence of classifiers and the presence/absence of BEI, such that BEI sentences were passivized subject-extracted RCs, and no-BEI sentences were standard object-extracted RCs. We conducted two self-paced reading experiments, using the same critical stimuli but somewhat different filler items. Reading time patterns from both experiments showed facilitative effects of BEI within and beyond RC regions, and delayed facilitative effects of classifiers, suggesting that cues that occur before a clear signal of an upcoming RC can help Chinese comprehenders to anticipate RC structures. The data patterns are not predicted by the DLT, but they are consistent with the predictions of experience-based theories.
What is the processing cost of being garden-pathed by a temporary syntactic ambiguity? We argue that comparing average reading times in garden-path versus non-garden-path sentences is not enough to answer this question. Trial-level contaminants such as inattention, the fact that garden pathing may occur non-deterministically in the ambiguous condition, and "triage" (rejecting the sentence without reanalysis; Fodor & Inoue, 2000) lead to systematic underestimates of the true cost of garden pathing. Furthermore, the "pure" garden-path effect due to encountering an unexpected word needs to be separated from the additional cost of syntactic reanalysis. To get more realistic estimates for the individual processing costs of garden pathing and syntactic reanalysis, we implement a novel computational model that includes trial-level contaminants as probabilistically occurring latent cognitive processes. The model shows a good predictive fit to existing reading time and judgment data. Furthermore, the latent-process approach captures differences between noun phrase/zero complement (NP/Z) garden-path sentences and semantically biased reduced relative clause (RRC) garden-path sentences: The NP/Z garden path occurs nearly deterministically but can be mostly eliminated by adding a comma. By contrast, the RRC garden path occurs with a lower probability, but disambiguation via semantic plausibility is not always effective.
While it is widely acknowledged in the formal semantic literature that both the truth-functional focus particle only and it-clefts convey exhaustiveness, the nature and source of exhaustiveness effects with it-clefts remain contested. We describe a questionnaire study (n = 80) and an event-related brain potentials (ERP) study (n = 16) that investigated the violation of exhaustiveness in German only-foci versus it-clefts. The offline study showed that a violation of exhaustivity with only is less acceptable than the violation with it-clefts, suggesting a difference in the nature of exhaustivity interpretation in the two environments. The ERP-results confirm that this difference can be seen in online processing as well: a violation of exhaustiveness in only-foci elicited a centro-posterior positivity (600-800ms), whereas a violation in it-clefts induced a globally distributed N400 pattern (400-600ms). The positivity can be interpreted as a reanalysis process and more generally as a process of context updating. The N400 effect in it-clefts is interpreted as indexing a cancelation process that is functionally distinct from the only case. The ERP study is, to our knowledge, the first evidence from an online experimental paradigm which shows that the violation of exhaustiveness involves different underlying processes in the two structural environments.
Given the replication crisis in cognitive science, it is important to consider what researchers need to do in order to report results that are reliable. We consider three changes in current practice that have the potential to deliver more realistic and robust claims. First, the planned experiment should be divided into two stages, an exploratory stage and a confirmatory stage. This clear separation allows the researcher to check whether any results found in the exploratory stage are robust. The second change is to carry out adequately powered studies. We show that this is imperative if we want to obtain realistic estimates of effects in psycholinguistics. The third change is to use Bayesian data-analytic methods rather than frequentist ones; the Bayesian framework allows us to focus on the best estimates we can obtain of the effect, rather than rejecting a strawman null. As a case study, we investigate number interference effects in German. Number feature interference is predicted by cue-based retrieval models of sentence processing (Van Dyke & Lewis, 2003; Vasishth & Lewis, 2006), but it has shown inconsistent results. We show that by implementing the three changes mentioned, suggestive evidence emerges that is consistent with the predicted number interference effects.
Factorial experiments in research on memory, language, and in other areas are often analyzed using analysis of variance (ANOVA). However, for effects with more than one numerator degrees of freedom, e.g., for experimental factors with more than two levels, the ANOVA omnibus F-test is not informative about the source of a main effect or interaction. Because researchers typically have specific hypotheses about which condition means differ from each other, a priori contrasts (i.e., comparisons planned before the sample means are known) between specific conditions or combinations of conditions are the appropriate way to represent such hypotheses in the statistical model. Many researchers have pointed out that contrasts should be "tested instead of, rather than as a supplement to, the ordinary 'omnibus' F test" (Hays, 1973, p. 601). In this tutorial, we explain the mathematics underlying different kinds of contrasts (i.e., treatment, sum, repeated, polynomial, custom, nested, interaction contrasts), discuss their properties, and demonstrate how they are applied in the R System for Statistical Computing (R Core Team, 2018). In this context, we explain the generalized inverse which is needed to compute the coefficients for contrasts that test hypotheses that are not covered by the default set of contrasts. A detailed understanding of contrast coding is crucial for successful and correct specification in linear models (including linear mixed models). Contrasts defined a priori yield far more useful confirmatory tests of experimental hypotheses than standard omnibus F-tests. Reproducible code is available from https://osf.io/7ukf6/.