Linear mixed-effects models (LMMs) have increasingly replaced mixed-model analyses of variance (ANOVAs) for statistical inference in factorial psycholinguistic experiments. Although LMMs have many advantages over ANOVAs, setting them up for data analysis, as with ANOVAs, requires some care. One simple option, when numerically possible, is to fit the full variance-covariance structure of random effects (the maximal model; Barr, Levy, Scheepers & Tily, 2013), presumably to keep Type I error down to the nominal α in the presence of random effects. Although it is true that fitting a model with only random intercepts may lead to higher Type I error, fitting a maximal model also has a cost: it can lead to a significant loss of power. We demonstrate this with simulations and suggest that for typical psychological and psycholinguistic data, higher power is achieved without inflating Type I error rate if a model selection criterion is used to select a random effect structure that is supported by the data. (C) 2017 The Authors. Published by Elsevier Inc.
We present the fundamental ideas underlying statistical hypothesis testing using the frequentist framework. We start with a simple example that builds up the one-sample t-test from the beginning, explaining important concepts such as the sampling distribution of the sample mean, and the iid assumption. Then, we examine the meaning of the p-value in detail and discuss several important misconceptions about what a p-value does and does not tell us. This leads to a discussion of Type I, II error and power, and Type S and M error. An important conclusion from this discussion is that one should aim to carry out appropriately powered studies. Next, we discuss two common issues that we have encountered in psycholinguistics and linguistics: running experiments until significance is reached and the ‘garden-of-forking-paths’ problem discussed by Gelman and others. The best way to use frequentist methods is to run appropriately powered studies, check model assumptions, clearly separate exploratory data analysis from planned comparisons decided upon before the study was run, and always attempt to replicate results.
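The construction of the one-sample t-test described above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the paper: the function name `one_sample_t` and the reading-time values are hypothetical, and the p-value lookup is omitted because the Python standard library provides no t-distribution CDF.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the pieces of a one-sample t-test of H0: population mean == mu0.

    Assumes the observations are independent and identically distributed.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)      # sample SD, uses the n - 1 denominator
    se = sd / math.sqrt(n)             # estimated SD of the sampling distribution of the mean
    t = (mean - mu0) / se              # distance of the sample mean from mu0 in SE units
    return {"n": n, "mean": mean, "se": se, "t": t, "df": n - 1}


# Hypothetical difference scores in milliseconds (illustration only).
differences_ms = [12.0, -5.0, 30.0, 8.0, 17.0, -2.0, 25.0, 9.0]
result = one_sample_t(differences_ms, mu0=0.0)
print(result)
```

The observed t value would then be compared against a t-distribution with `df` degrees of freedom. The standard error in the denominator is why sample size matters so much for power.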
Argument-head distance and processing complexity: Explaining both locality and antilocality effects
(2006)
Although proximity between arguments and verbs (locality) is a relatively robust determinant of sentence-processing difficulty (Hawkins 1998, 2001, Gibson 2000), increasing argument-verb distance can also facilitate processing (Konieczny 2000). We present two self-paced reading (SPR) experiments involving Hindi that provide further evidence of antilocality, and a third SPR experiment which suggests that similarity-based interference can attenuate this distance-based facilitation. A unified explanation of interference, locality, and antilocality effects is proposed via an independently motivated theory of activation decay and retrieval interference (Anderson et al. 2004).
With the arrival of the R packages nlme and lme4, linear mixed models (LMMs) have come to be widely used in experimentally-driven areas like psychology, linguistics, and cognitive science. This tutorial provides a practical introduction to fitting LMMs in a Bayesian framework using the probabilistic programming language Stan. We choose Stan (rather than WinBUGS or JAGS) because it provides an elegant and scalable framework for fitting models in most of the standard applications of LMMs. We ease the reader into fitting increasingly complex LMMs, using a two-condition repeated measures self-paced reading study.
Sentence comprehension requires that the comprehender work out who did what to whom. This process has been characterized as retrieval from memory. This review summarizes the quantitative predictions and empirical coverage of the two existing computational models of retrieval and shows how the predictive performance of these two competing models can be tested against a benchmark data-set. We also show how computational modeling can help us better understand sources of variability in both unimpaired and impaired sentence comprehension.
This tutorial analyzes voice onset time (VOT) data from Dongbei (Northeastern) Mandarin Chinese and North American English to demonstrate how Bayesian linear mixed models can be fit using the programming language Stan via the R package brms. Through this case study, we demonstrate some of the advantages of the Bayesian framework: researchers can (i) flexibly define the underlying process that they believe to have generated the data; (ii) obtain direct information regarding the uncertainty about the parameter that relates the data to the theoretical question being studied; and (iii) incorporate prior knowledge into the analysis. Getting started with Bayesian modeling can be challenging, especially when one is trying to model one’s own (often unique) data. It is difficult to see how one can apply general principles described in textbooks to one’s own specific research problem. We address this barrier to using Bayesian methods by providing three detailed examples, with source code to allow easy reproducibility. The examples presented are intended to give the reader a flavor of the process of model-fitting; suggestions for further study are also provided. All data and code are available from: https://osf.io/g4zpv.
Within quantitative phonetics, it is common practice to draw conclusions based on statistical significance alone. Using incomplete neutralization of final devoicing in German as a case study, we illustrate the problems with this approach. If researchers find a significant acoustic difference between voiceless and devoiced obstruents, they conclude that neutralization is incomplete, and if they find no significant difference, they conclude that neutralization is complete. However, such strong claims regarding the existence or absence of an effect based on significant results alone can be misleading. Instead, the totality of available evidence should be brought to bear on the question. Towards this end, we synthesize the evidence from 14 studies on incomplete neutralization in German using a Bayesian random-effects meta-analysis. Our meta-analysis provides evidence in favor of incomplete neutralization. We conclude with some suggestions for improving the quality of future research on phonetic phenomena: ensure that sample sizes allow for high-precision estimates of the effect; avoid the temptation to deploy researcher degrees of freedom when analyzing data; focus on estimates of the parameter of interest and the uncertainty about that parameter; attempt to replicate effects found; and, whenever possible, make both the data and analysis available publicly. (c) 2018 Elsevier Ltd. All rights reserved.
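To make the idea of a Bayesian random-effects meta-analysis concrete, the following sketch approximates the posterior on a grid. It is an illustration under stated assumptions, not the authors' analysis: each study estimate is modeled as Normal(theta, se_i^2 + tau^2), the priors on theta and tau are taken to be uniform over the grids, and the study estimates and standard errors are hypothetical numbers, not the 14 studies from the paper.

```python
import math


def meta_posterior(estimates, std_errors, theta_grid, tau_grid):
    """Grid approximation for a random-effects meta-analysis.

    Assumed model: y_i ~ Normal(theta, se_i^2 + tau^2), with uniform priors
    over the supplied grids. Returns the posterior mean of theta and the
    posterior probability that theta is positive.
    """
    log_post = {}
    for theta in theta_grid:
        for tau in tau_grid:
            ll = 0.0
            for y, se in zip(estimates, std_errors):
                var = se ** 2 + tau ** 2   # within-study plus between-study variance
                ll += -0.5 * math.log(2 * math.pi * var) - 0.5 * (y - theta) ** 2 / var
            log_post[(theta, tau)] = ll
    top = max(log_post.values())           # subtract the maximum for numerical stability
    weights = {key: math.exp(value - top) for key, value in log_post.items()}
    total = sum(weights.values())
    post_mean = sum(theta * w for (theta, _), w in weights.items()) / total
    prob_positive = sum(w for (theta, _), w in weights.items() if theta > 0) / total
    return post_mean, prob_positive


# Hypothetical study estimates (ms) and standard errors, for illustration only.
estimates = [10.0, 5.0, 12.0, -2.0, 8.0]
std_errors = [4.0, 5.0, 6.0, 5.0, 3.0]
theta_grid = [i * 0.5 for i in range(-40, 81)]   # -20 to 40 ms
tau_grid = [i * 0.5 for i in range(0, 41)]       # 0 to 20 ms
post_mean, prob_positive = meta_posterior(estimates, std_errors, theta_grid, tau_grid)
print(post_mean, prob_positive)
```

The point of the sketch is that individual studies, significant or not, each contribute according to their precision, and the conclusion rests on the pooled posterior rather than on any single p-value. A real analysis would normally use informative priors and a sampler such as Stan rather than a grid.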
It is well-known in statistics (e.g., Gelman & Carlin, 2014) that treating a result as publishable just because the p-value is less than 0.05 leads to overoptimistic expectations of replicability. These effects get published, leading to an overconfident belief in replicability. We demonstrate the adverse consequences of this statistical significance filter by conducting seven direct replication attempts (268 participants in total) of a recent paper (Levy & Keller, 2013). We show that the published claims are so noisy that even non-significant results are fully compatible with them. We also demonstrate the contrast between such small-sample studies and a larger-sample study; the latter generally yields a less noisy estimate but also a smaller effect magnitude, which looks less compelling but is more realistic. We reiterate several suggestions from the methodology literature for improving current practices.
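The statistical significance filter can be illustrated with a short simulation. This is a simplified sketch, not the analysis from the paper: it assumes a known standard deviation (so a z-test with the 1.96 cutoff is used instead of a t-test), and the true effect, standard deviation, and sample size are hypothetical values chosen to mimic a low-power study.

```python
import math
import random
import statistics


def significance_filter(true_effect, sd, n, n_sims, seed=1):
    """Simulate repeated studies and keep only the 'significant' estimates.

    Returns power, the Type M error (average exaggeration ratio among
    significant results), and the Type S error (share of significant
    results with the wrong sign).
    """
    rng = random.Random(seed)
    se = sd / math.sqrt(n)
    critical = 1.96 * se                         # two-sided 5% cutoff, known-sd assumption
    significant = []
    for _ in range(n_sims):
        estimate = rng.gauss(true_effect, se)    # draw from the sampling distribution
        if abs(estimate) > critical:
            significant.append(estimate)
    if not significant:
        raise ValueError("no significant results; increase n_sims")
    power = len(significant) / n_sims
    type_m = statistics.fmean([abs(e) for e in significant]) / abs(true_effect)
    type_s = sum(1 for e in significant if e * true_effect < 0) / len(significant)
    return {"power": power, "type_m": type_m, "type_s": type_s}


# Hypothetical low-power setting: 15 ms true effect, sd 150 ms, 40 participants.
print(significance_filter(true_effect=15.0, sd=150.0, n=40, n_sims=20000))
```

Whenever the cutoff for significance is larger than the true effect, every estimate that passes the filter must overestimate the effect, which is the exaggeration that Gelman and Carlin (2014) call Type M error.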
Research on similarity-based interference has provided extensive evidence that the formation of dependencies between non-adjacent words relies on a cue-based retrieval mechanism. There are two different models that can account for one of the main predictions of interference, i.e., a slowdown at a retrieval site, when several items share a feature associated with a retrieval cue: Lewis and Vasishth’s (2005) activation-based model and McElree’s (2000) direct-access model. Even though these two models have been used almost interchangeably, they are based on different assumptions and predict differences in the relationship between reading times and response accuracy. The activation-based model follows the assumptions of the ACT-R framework, and its retrieval process behaves as a lognormal race between accumulators of evidence with a single variance. Under this model, accuracy of the retrieval is determined by the winner of the race and retrieval time by its rate of accumulation. In contrast, the direct-access model assumes a model of memory where only the probability of retrieval can be affected, while the retrieval time is drawn from the same distribution; in this model, differences in latencies are a by-product of the possibility of backtracking and repairing incorrect retrievals. We implemented both models in a Bayesian hierarchical framework in order to evaluate them and compare them. The data show that correct retrievals take longer than incorrect ones, and this pattern is better fit under the direct-access model than under the activation-based model. This finding does not rule out the possibility that retrieval may be behaving as a race model with assumptions that follow less closely the ones from the ACT-R framework. 
By introducing a modification of the activation model, i.e., by assuming that the accumulation of evidence for retrieval of incorrect items is not only slower but noisier (i.e., different variances for the correct and incorrect items), the model can provide a fit as good as the one of the direct-access model. This first ever computational evaluation of alternative accounts of retrieval processes in sentence processing opens the way for a broader investigation of theories of dependency completion.
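The lognormal race idea can be sketched as a small simulation. This is a toy illustration under stated assumptions, not the hierarchical Bayesian implementation from the paper: two accumulators (one for the correct item, one for an incorrect item) draw finishing times from lognormal distributions, the faster one determines both the response and the latency, and all parameter values are hypothetical.

```python
import random
import statistics


def simulate_race(mu_correct, mu_incorrect, sigma_correct, sigma_incorrect,
                  n_trials, seed=1):
    """Simulate a two-accumulator lognormal race.

    On each trial the accumulator with the shorter finishing time wins;
    the winner fixes the response (correct or incorrect) and the latency.
    """
    rng = random.Random(seed)
    correct_times, incorrect_times = [], []
    for _ in range(n_trials):
        t_correct = rng.lognormvariate(mu_correct, sigma_correct)
        t_incorrect = rng.lognormvariate(mu_incorrect, sigma_incorrect)
        if t_correct <= t_incorrect:
            correct_times.append(t_correct)
        else:
            incorrect_times.append(t_incorrect)
    return {
        "accuracy": len(correct_times) / n_trials,
        "mean_correct": statistics.fmean(correct_times),
        "mean_incorrect": statistics.fmean(incorrect_times),
    }


# Hypothetical parameters on the log-millisecond scale.
equal_variance = simulate_race(6.0, 6.3, 0.4, 0.4, n_trials=20000)
# Modified model: the incorrect accumulator is slower and noisier.
unequal_variance = simulate_race(6.0, 6.3, 0.4, 0.8, n_trials=20000)
print(equal_variance)
print(unequal_variance)
```

Comparing the mean latencies of correct and incorrect retrievals across the two calls shows how the variance assumption changes the predicted relationship between accuracy and retrieval time, which is the relationship the model comparison above relies on.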
We present a computational evaluation of three hypotheses about sources of deficit in sentence comprehension in aphasia: slowed processing, intermittent deficiency, and resource reduction. The ACT-R based Lewis and Vasishth (2005) model is used to implement these three proposals. Slowed processing is implemented as slowed execution time of parse steps; intermittent deficiency as increased random noise in activation of elements in memory; and resource reduction as reduced spreading activation. As data, we considered subject vs. object relative sentences, presented in a self-paced listening modality to 56 individuals with aphasia (IWA) and 46 matched controls. The participants heard the sentences and carried out a picture verification task to decide on an interpretation of the sentence. These response accuracies are used to identify the best parameters (for each participant) that correspond to the three hypotheses mentioned above. We show that controls have more tightly clustered (less variable) parameter values than IWA; specifically, compared to controls, among IWA there are more individuals with slow parsing times, high noise, and low spreading activation. We find that (a) individual IWA show differential amounts of deficit along the three dimensions of slowed processing, intermittent deficiency, and resource reduction, (b) overall, there is evidence for all three sources of deficit playing a role, and (c) IWA have a more variable range of parameter values than controls. An important implication is that it may be meaningless to talk about sources of deficit with respect to an abstract average IWA; the focus should be on the individual's differential degrees of deficit along different dimensions, and on understanding the causes of variability in deficit between participants.
Given the replication crisis in cognitive science, it is important to consider what researchers need to do in order to report results that are reliable. We consider three changes in current practice that have the potential to deliver more realistic and robust claims. First, the planned experiment should be divided into two stages, an exploratory stage and a confirmatory stage. This clear separation allows the researcher to check whether any results found in the exploratory stage are robust. The second change is to carry out adequately powered studies. We show that this is imperative if we want to obtain realistic estimates of effects in psycholinguistics. The third change is to use Bayesian data-analytic methods rather than frequentist ones; the Bayesian framework allows us to focus on the best estimates we can obtain of the effect, rather than rejecting a strawman null. As a case study, we investigate number interference effects in German. Number feature interference is predicted by cue-based retrieval models of sentence processing (Van Dyke & Lewis, 2003; Vasishth & Lewis, 2006), but it has shown inconsistent results. We show that by implementing the three changes mentioned, suggestive evidence emerges that is consistent with the predicted number interference effects.
In this paper we examine the effect of uncertainty on readers’ predictions about meaning. In particular, we were interested in how uncertainty might influence the likelihood of committing to a specific sentence meaning. We conducted two event-related potential (ERP) experiments using particle verbs such as turn down and manipulated uncertainty by constraining the context such that readers could be either highly certain about the identity of a distant verb particle, such as turn the bed […] down, or less certain due to competing particles, such as turn the music […] up/down. The study was conducted in German, where verb particles appear clause-finally and may be separated from the verb by a large amount of material. We hypothesised that this separation would encourage readers to predict the particle, and that high certainty would make prediction of a specific particle more likely than lower certainty. If a specific particle was predicted, this would reflect a strong commitment to sentence meaning that should incur a higher processing cost if the prediction is wrong. If a specific particle was less likely to be predicted, commitment should be weaker and the processing cost of a wrong prediction lower. If true, this could suggest that uncertainty discourages predictions via an unacceptable cost-benefit ratio. However, given the clear predictions made by the literature, it was surprisingly unclear whether the uncertainty manipulation affected the two ERP components studied, the N400 and the PNP. Bayes factor analyses showed that evidence for our a priori hypothesised effect sizes was inconclusive, although there was decisive evidence against a priori hypothesised effect sizes larger than 1μV for the N400 and larger than 3μV for the PNP. We attribute the inconclusive finding to the properties of verb-particle dependencies that differ from the verb-noun dependencies in which the N400 and PNP are often studied.
We used Chinese prenominal relative clauses (RCs) to test the predictions of two competing accounts of sentence comprehension difficulty: the experience-based account of Levy () and the Dependency Locality Theory (DLT; Gibson, ). Given that in Chinese RCs, a classifier and/or a passive marker BEI can be added to the sentence-initial position, we manipulated the presence/absence of classifiers and the presence/absence of BEI, such that BEI sentences were passivized subject-extracted RCs, and no-BEI sentences were standard object-extracted RCs. We conducted two self-paced reading experiments, using the same critical stimuli but somewhat different filler items. Reading time patterns from both experiments showed facilitative effects of BEI within and beyond RC regions, and delayed facilitative effects of classifiers, suggesting that cues that occur before a clear signal of an upcoming RC can help Chinese comprehenders to anticipate RC structures. The data patterns are not predicted by the DLT, but they are consistent with the predictions of experience-based theories.
In 2019 the Journal of Memory and Language instituted an open data and code policy; this policy requires that, as a rule, code and data be released at the latest upon publication. How effective is this policy? We compared 59 papers published before, and 59 papers published after, the policy took effect. After the policy was in place, the rate of data sharing increased by more than 50%. We further looked at whether papers published under the open data policy were reproducible, in the sense that the published results should be possible to regenerate given the data, and given the code, when code was provided. For 8 out of the 59 papers, data sets were inaccessible. The reproducibility rate ranged from 34% to 56%, depending on the reproducibility criteria. The strongest predictor of whether an attempt to reproduce would be successful is the presence of the analysis code: it increases the probability of reproducing reported results by almost 40%. We propose two simple steps that can increase the reproducibility of published papers: share the analysis code, and attempt to reproduce one's own analysis using only the shared materials.
Intuitively, strongly constraining contexts should lead to stronger probabilistic representations of sentences in memory. Encountering unexpected words could therefore be expected to trigger costlier shifts in these representations than expected words. However, psycholinguistic measures commonly used to study probabilistic processing, such as the N400 event-related potential (ERP) component, are sensitive to word predictability but not to contextual constraint. Some research suggests that constraint-related processing cost may be measurable via an ERP positivity following the N400, known as the anterior post-N400 positivity (PNP). The PNP is argued to reflect update of a sentence representation and to be distinct from the posterior P600, which reflects conflict detection and reanalysis. However, constraint-related PNP findings are inconsistent. We sought to conceptually replicate Federmeier et al. (2007) and Kuperberg et al. (2020), who observed that the PNP, but not the N400 or the P600, was affected by constraint at unexpected but plausible words. Using a pre-registered design and statistical approach maximising power, we demonstrated a dissociated effect of predictability and constraint: strong evidence for predictability but not constraint in the N400 window, and strong evidence for constraint but not predictability in the later window. However, the constraint effect was consistent with a P600 and not a PNP, suggesting increased conflict between a strong representation and unexpected input rather than greater update of the representation. We conclude that either a simple strong/weak constraint design is not always sufficient to elicit the PNP, or that previous PNP constraint findings could be an artifact of smaller sample size.
What is the processing cost of being garden-pathed by a temporary syntactic ambiguity? We argue that comparing average reading times in garden-path versus non-garden-path sentences is not enough to answer this question. Trial-level contaminants such as inattention, the fact that garden pathing may occur non-deterministically in the ambiguous condition, and "triage" (rejecting the sentence without reanalysis; Fodor & Inoue, 2000) lead to systematic underestimates of the true cost of garden pathing. Furthermore, the "pure" garden-path effect due to encountering an unexpected word needs to be separated from the additional cost of syntactic reanalysis. To get more realistic estimates for the individual processing costs of garden pathing and syntactic reanalysis, we implement a novel computational model that includes trial-level contaminants as probabilistically occurring latent cognitive processes. The model shows a good predictive fit to existing reading time and judgment data. Furthermore, the latent-process approach captures differences between noun phrase/zero complement (NP/Z) garden-path sentences and semantically biased reduced relative clause (RRC) garden-path sentences: The NP/Z garden path occurs nearly deterministically but can be mostly eliminated by adding a comma. By contrast, the RRC garden path occurs with a lower probability, but disambiguation via semantic plausibility is not always effective.
Cue-based retrieval theories in sentence processing predict two classes of interference effect: (i) Inhibitory interference is predicted when multiple items match a retrieval cue: cue-overloading leads to an overall slowdown in reading time; and (ii) Facilitatory interference arises when a retrieval target as well as a distractor only partially match the retrieval cues; this partial matching leads to an overall speedup in retrieval time. Inhibitory interference effects are widely observed, but facilitatory interference apparently has an exception: reflexives have been claimed to show no facilitatory interference effects. Because the claim is based on underpowered studies, we conducted a large-sample experiment that investigated both facilitatory and inhibitory interference. In contrast to previous studies, we find facilitatory interference effects in reflexives. We also present a quantitative evaluation of the cue-based retrieval model of Engelmann, Jager, and Vasishth (2019).
We explore the interaction between oculomotor control and language comprehension on the sentence level using two well-tested computational accounts of parsing difficulty. Previous work (Boston, Hale, Vasishth, & Kliegl, 2011) has shown that surprisal (Hale, 2001; Levy, 2008) and cue-based memory retrieval (Lewis & Vasishth, 2005) are significant and complementary predictors of reading time in an eyetracking corpus. It remains an open question how the sentence processor interacts with oculomotor control. Using a simple linking hypothesis proposed in Reichle, Warren, and McConnell (2009), we integrated both measures with the eye movement model EMMA (Salvucci, 2001) inside the cognitive architecture ACT-R (Anderson et al., 2004). We built a reading model that could initiate short Time Out regressions (Mitchell, Shen, Green, & Hodgson, 2008) that compensate for slow postlexical processing. This simple interaction enabled the model to predict the re-reading of words based on parsing difficulty. The model was evaluated in different configurations on the prediction of frequency effects on the Potsdam Sentence Corpus. The extension of EMMA with postlexical processing improved its predictions and reproduced re-reading rates and durations with a reasonable fit to the data. This demonstration, based on simple and independently motivated assumptions, serves as a foundational step toward a precise investigation of the interaction between high-level language processing and eye movement control.
Eye movement data have proven to be very useful for investigating human sentence processing. Eyetracking research has addressed a wide range of questions, such as recovery mechanisms following garden-pathing, the timing of processes driving comprehension, the role of anticipation and expectation in parsing, the role of semantic, pragmatic, and prosodic information, and so on. However, there are some limitations regarding the inferences that can be made on the basis of eye movements. One relates to the nontrivial interaction between parsing and the eye movement control system which complicates the interpretation of eye movement data. Detailed computational models that integrate parsing with eye movement control theories have the potential to unpack the complexity of eye movement data and can therefore aid in the interpretation of eye movements. Another limitation is the difficulty of capturing spatiotemporal patterns in eye movements using the traditional word-based eyetracking measures. Recent research has demonstrated the relevance of these patterns and has shown how they can be analyzed. In this review, we focus on reading, and present examples demonstrating how eye movement data reveal what events unfold when the parser runs into difficulty, and how the parsing system interacts with eye movement control. WIREs Cogn Sci 2013, 4:125–134. doi: 10.1002/wcs.1209
What theories best characterise the parsing processes triggered upon encountering ambiguity, and what effects do these processes have on eye movement patterns in reading? The present eye-tracking study, which investigated processing of attachment ambiguities of an adjunct in Spanish, suggests that readers sometimes underspecify attachment to save memory resources, consistent with the good-enough account of parsing. Our results confirm a surprising prediction of the good-enough account: high-capacity readers commit to an attachment decision more often than low-capacity participants, leading to more errors and a greater need to reanalyse in garden-path sentences. These results emerged only when we separated functionally different types of regressive eye movements using a scanpath analysis; conventional eye-tracking measures alone would have led to different conclusions. The scanpath analysis also showed that rereading was the dominant strategy for recovering from garden-pathing. Our results may also have broader implications for models of reading processes: reanalysis effects in eye movements occurred late, which suggests that the coupling of oculo-motor control and the parser may not be as tight as assumed in current computational models of eye movement control in reading.
A general fact about language is that subject relative clauses are easier to process than object relative clauses. Recently, several self-paced reading studies have presented surprising evidence that object relatives in Chinese are easier to process than subject relatives. We carried out three self-paced reading experiments that attempted to replicate these results. Two of our three studies found a subject-relative preference, and the third study found an object-relative advantage. Using a random-effects Bayesian meta-analysis of fifteen studies (including our own), we show that the overall current evidence for the subject-relative advantage is quite strong (approximate posterior probability of a subject-relative advantage given the data: 78-80%). We argue that retrieval/integration based accounts would have difficulty explaining all three experimental results. These findings are important because they narrow the theoretical space by limiting the role of an important class of explanation (retrieval/integration cost), at least for relative clause processing in Chinese.
Eye fixation durations during normal reading correlate with processing difficulty, but the specific cognitive mechanisms reflected in these measures are not well understood. This study finds support in German readers' eye fixations for two distinct difficulty metrics: surprisal, which reflects the change in probabilities across syntactic analyses as new words are integrated; and retrieval, which quantifies comprehension difficulty in terms of working memory constraints. We examine the predictions of both metrics using a family of dependency parsers indexed by an upper limit on the number of candidate syntactic analyses they retain at successive words. Surprisal models all fixation measures and regression probability. By contrast, retrieval does not model any measure in serial processing. As more candidate analyses are considered in parallel at each word, retrieval can account for the same measures as surprisal. This pattern suggests an important role for ranked parallelism in theories of sentence comprehension.
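As a small illustration of the surprisal metric mentioned above, the sketch below follows the standard definition from Hale (2001): the surprisal of a word is the log ratio of the prefix probability before the word to the prefix probability after it. The function name and the probabilities are hypothetical. In the study, such probabilities would come from the dependency parsers, which are not reproduced here.

```python
import math


def surprisal_bits(prefix_prob_before, prefix_prob_after):
    """Surprisal of a word in bits.

    prefix_prob_before: total probability of all analyses of the sentence
        up to, but not including, the word.
    prefix_prob_after: total probability of the analyses that remain
        compatible once the word has been integrated.
    """
    if not 0.0 < prefix_prob_after <= prefix_prob_before:
        raise ValueError("require 0 < prefix_prob_after <= prefix_prob_before")
    return math.log2(prefix_prob_before / prefix_prob_after)


# Hypothetical prefix probabilities: the new word rules out three quarters
# of the probability mass carried by the analyses considered so far.
print(surprisal_bits(0.5, 0.125))
```

A word that eliminates much of the probability mass of the analyses under consideration receives high surprisal, which is the quantity that is then related to fixation measures.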
Multiple focus
(2009)
This paper presents the results of an experimental study on multiple focus configurations, that is, structures containing two nested focus-sensitive operators plus two foci supposed to associate with those operators. There has been controversial discussion in the semantic literature regarding whether or not an interpretation is acceptable that corresponds to this association. While the data are unclear, the issue is of considerable theoretical significance, as it distinguishes between the available theories of focus interpretation. Some theories (e.g., Rooth's 1992) predict such a pattern of association with focus to be impossible, while others (such as Wold's 1996) predict it to be acceptable. The results of our study show the data to be unacceptable rather than acceptable, favouring important aspects of the theory of focus interpretation developed by Rooth.
Parsing costs as predictors of reading difficulty: an evaluation using the Potsdam Sentence Corpus
(2008)
Background: In addition to the canonical subject-verb-object (SVO) word order, German also allows for non-canonical order (OVS), and the case-marking system supports thematic role interpretation. Previous eye-tracking studies (Kamide et al., 2003; Knoeferle, 2007) have shown that unambiguous case information in non-canonical sentences is processed incrementally. For individuals with agrammatic aphasia, comprehension of non-canonical sentences is at chance level (Burchert et al., 2003). The trace deletion hypothesis (TDH; Grodzinsky 1995, 2000) claims that this is due to structural impairments in syntactic representations, which force the individual with aphasia (IWA) to apply a guessing strategy. However, recent studies investigating online sentence processing in aphasia (Caplan et al., 2007; Dickey et al., 2007) found that divergences exist in IWAs' sentence-processing routines depending on whether they comprehended non-canonical sentences correctly or not, pointing rather to a processing deficit explanation. Aims: The aim of the current study was to investigate agrammatic IWAs' online and offline sentence comprehension simultaneously in order to reveal what online sentence-processing strategies they rely on and how these differ from controls' processing routines. We further asked whether IWAs' offline chance performance for non-canonical sentences does indeed result from guessing. Methods & Procedures: We used the visual-world paradigm and measured eye movements (as an index of online sentence processing) of controls (N = 8) and individuals with aphasia (N = 7) during a sentence-picture matching task. Additional offline measures were accuracy and reaction times. Outcomes & Results: While the offline accuracy results corresponded to the pattern predicted by the TDH, IWAs' eye movements revealed systematic differences depending on the response accuracy.
Conclusions: These findings constitute evidence against attributing IWAs' chance performance for non-canonical structures to mere guessing. Instead, our results support processing deficit explanations and characterise the agrammatic parser as deterministic and inefficient: it is slowed down, affected by intermittent deficiencies in performing syntactic operations, and fails to compute reanalysis even when one is detected.
Background: In behavioural tests of sentence comprehension in aphasia, correct and incorrect responses are often randomly distributed. Such a pattern of chance performance is a typical trait of Broca's aphasia, but can be found in other aphasic syndromes as well. Many researchers have argued that chance behaviour is the result of a guessing strategy, which is adopted in the face of a syntactic breakdown in sentence processing. Aims: Capitalising on new evidence from recent studies investigating online sentence comprehension in aphasia using the visual world paradigm, the aim of this paper is to review the concept of chance performance as a reflection of a syntactic impairment in sentence processing and to re-examine the conventional interpretation of chance performance as a guessing behaviour. Main Contribution: Based on a review of recent evidence from visual world paradigm studies, we argue that the assumption of chance performance equalling guessing is not necessarily compatible with actual real-time parsing procedures in people with aphasia. We propose a reinterpretation of the concept of chance performance by assuming that there are two distinct processing mechanisms underlying sentence comprehension in aphasia. Correct responses are always the result of normal-like parsing mechanisms, even in those cases where the overall performance pattern is at chance. Incorrect responses, on the other hand, are the result of intermittent deficiencies of the parser. Hence the random guessing behaviour that persons with aphasia often display does not necessarily reflect a syntactic breakdown in sentence comprehension and a random selection between alternatives. Instead it should be regarded as a result of temporally deficient parsing procedures in otherwise normal-like comprehension routines.
Conclusion: Our conclusion is that the consideration of behavioural offline data alone may not be sufficient to interpret performance in language tests and subsequently draw theoretical conclusions about language impairments. Rather, it is important to call on additional data from online studies that look at language processing in real time in order to gain a comprehensive picture of the syntactic comprehension abilities of people with aphasia and possible underlying deficits.
Eye-movement research on implicit prosody has found effects of lexical stress on syntactic ambiguity resolution, suggesting that metrical well-formedness constraints interact with syntactic category assignment. Building on these findings, the present eyetracking study investigates whether contextual bias can modulate the effects of metrical structure on syntactic ambiguity resolution in silent reading. Contextual bias and potential stress-clash in the ambiguous region were crossed in a 2 x 2 design. Participants read biased context sentences followed by temporarily ambiguous test sentences. In the three-word ambiguous region, main effects of lexical stress were dominant, while early effects of context were absent. Potential stress clash yielded a significant increase in first-pass regressions and re-reading probability across the three words. In the disambiguating region, the disambiguating word itself showed increased processing difficulty (lower skipping and increased re-reading probability) when the disambiguation engendered a stress clash configuration, while the word immediately following showed main effects of context in those same measures. Taken together, effects of lexical stress upon eye movements were swift and pervasive across first-pass and second-pass measures, while effects of context were relatively delayed. These results indicate a strong role for implicit meter in guiding parsing, one that appears insensitive to higher-level constraints. Our findings are problematic for two classes of models, the two-stage garden-path model and the constraint-based competition-integration model, but can be explained by a variation on the two-stage model, the unrestricted race model.
Many comprehension theories assert that increasing the distance between elements participating in a linguistic relation (e.g., a verb and a noun phrase argument) increases the difficulty of establishing that relation during on-line comprehension. Such locality effects are expected to increase reading times and are thought to reveal properties and limitations of the short-term memory system that supports comprehension. Despite their theoretical importance and putative ubiquity, however, evidence for on-line locality effects is quite narrow linguistically and methodologically: It is restricted almost exclusively to self-paced reading of complex structures involving a particular class of syntactic relation. We present 4 experiments (2 self-paced reading and 2 eyetracking experiments) that demonstrate locality effects in the course of establishing subject-verb dependencies; locality effects are seen even in materials that can be read quickly and easily. These locality effects are observable in the earliest possible eye-movement measures and are of much shorter duration than previously reported effects. To account for the observed empirical patterns, we outline a processing model of the adaptive control of button pressing and eye movements. This model makes progress toward the goal of eliminating linking assumptions between memory constructs and empirical measures in favor of explicit theories of the coordinated control of motor responses and parsing.
While it is widely acknowledged in the formal semantic literature that both the truth-functional focus particle only and it-clefts convey exhaustiveness, the nature and source of exhaustiveness effects with it-clefts remain contested. We describe a questionnaire study (n = 80) and an event-related brain potentials (ERP) study (n = 16) that investigated the violation of exhaustiveness in German only-foci versus it-clefts. The offline study showed that a violation of exhaustivity with only is less acceptable than the violation with it-clefts, suggesting a difference in the nature of exhaustivity interpretation in the two environments. The ERP results confirm that this difference can be seen in online processing as well: a violation of exhaustiveness in only-foci elicited a centro-posterior positivity (600-800 ms), whereas a violation in it-clefts induced a globally distributed N400 pattern (400-600 ms). The positivity can be interpreted as a reanalysis process and more generally as a process of context updating. The N400 effect in it-clefts is interpreted as indexing a cancelation process that is functionally distinct from the only case. The ERP study is, to our knowledge, the first evidence from an online experimental paradigm which shows that the violation of exhaustiveness involves different underlying processes in the two structural environments.
Which repair strategy does the language system deploy when it gets garden-pathed, and what can regressive eye movements in reading tell us about reanalysis strategies? Several influential eye-tracking studies on syntactic reanalysis (Frazier & Rayner, 1982; Meseguer, Carreiras, & Clifton, 2002; Mitchell, Shen, Green, & Hodgson, 2008) have addressed this question by examining scanpaths, i.e., sequential patterns of eye fixations. However, in the absence of a suitable method for analyzing scanpaths, these studies relied on simplified dependent measures that are arguably ambiguous and hard to interpret. We address the theoretical question of repair strategy by developing a new method that quantifies scanpath similarity. Our method reveals several distinct fixation strategies associated with reanalysis that went undetected in a previously published data set (Meseguer et al., 2002). One prevalent pattern suggests re-parsing of the sentence, a strategy that has been discussed in the literature (Frazier & Rayner, 1982); however, readers differed tremendously in how they orchestrated the various fixation strategies. Our results suggest that the human parsing system non-deterministically adopts different strategies when confronted with the disambiguating material in garden-path sentences.
Expectation-driven facilitation (Hale, 2001; Levy, 2008) and locality-driven retrieval difficulty (Gibson, 1998, 2000; Lewis & Vasishth, 2005) are widely recognized to be two critical factors in incremental sentence processing; there is accumulating evidence that both can influence processing difficulty. However, it is unclear whether and how expectations and memory interact. We first confirm a key prediction of the expectation account: a Hindi self-paced reading study shows that when an expectation for an upcoming part of speech is dashed, building a rarer structure consumes more processing time than building a less rare structure. This is a strong validation of the expectation-based account. In a second study, we show that when expectation is strong, i.e., when a particular verb is predicted, strong facilitation effects are seen when the appearance of the verb is delayed; however, when expectation is weak, i.e., when only the part of speech "verb" is predicted but a particular verb is not predicted, the facilitation disappears and a tendency towards a locality effect is seen. The interaction seen between expectation strength and distance shows that strong expectations cancel locality effects, and that weak expectations allow locality effects to emerge.
In explicit memory recall and recognition tasks, elaboration and contextual isolation both facilitate memory performance. Here, we investigate these effects in the context of sentence processing: targets for retrieval during online sentence processing of English object relative clause constructions differ in the amount of elaboration associated with the target noun phrase, or the homogeneity of superficial features (text color). Experiment 1 shows that greater elaboration for targets during the encoding phase reduces reading times at retrieval sites, but elaboration of non-targets has considerably weaker effects. Experiment 2 illustrates that processing isolated superficial features of target noun phrases (here, a green word in a sentence with words colored white) does not lead to enhanced memory performance, despite triggering longer encoding times. These results are interpreted in the light of the memory models of Nairne (1990, 2001, 2006), which state that encoding remnants contribute to the set of retrieval cues that provide the basis for similarity-based interference effects.
Eye fixation durations during normal reading correlate with processing difficulty, but the specific cognitive mechanisms reflected in these measures are not well understood. This study finds support in German readers' eye fixations for two distinct difficulty metrics: surprisal, which reflects the change in probabilities across syntactic analyses as new words are integrated; and retrieval, which quantifies comprehension difficulty in terms of working memory constraints. We examine the predictions of both metrics using a family of dependency parsers indexed by an upper limit on the number of candidate syntactic analyses they retain at successive words. Surprisal models all fixation measures and regression probability. By contrast, retrieval does not model any measure in serial processing. As more candidate analyses are considered in parallel at each word, retrieval can account for the same measures as surprisal. This pattern suggests an important role for ranked parallelism in theories of sentence comprehension.
In two self-paced reading experiments, we investigated the effect of changes in antecedent complexity on processing times for ellipsis. Pointer- or “sharing”-based approaches to ellipsis processing (Frazier & Clifton 2001, 2005; Martin & McElree 2008) predict no effect of antecedent complexity on reading times at the ellipsis site while other accounts predict increased antecedent complexity to either slow down processing (Murphy 1985) or to speed it up (Hofmeister 2011). Experiment 1 manipulated antecedent complexity and elision, yielding evidence against a speedup at the ellipsis site and in favor of a null effect. In order to investigate possible superficial processing on the part of participants, Experiment 2 manipulated the amount of attention required to correctly respond to end-of-sentence comprehension probes, yielding evidence against a complexity-induced slowdown at the ellipsis site. Overall, our results are compatible with pointer-based approaches while casting doubt on the notion that changes in antecedent complexity lead to measurable differences in ellipsis processing speed.
Comprehension of non-canonical sentences can be difficult for individuals with aphasia (IWA). It is still unclear to what extent morphological cues like case marking or verb inflection may influence IWA's performance or even help to override deficits in sentence comprehension. Until now, studies have mainly used offline methods to draw inferences about syntactic deficits and, so far, only a few studies have looked at online syntactic processing in aphasia. We investigated sentence processing in German-speaking IWA by combining an offline (sentence-picture matching) and an online (eye-tracking in the visual-world paradigm) method. Our goal was to determine whether IWA are capable of using inflectional morphology (number-agreement markers on verbs and case markers in noun phrases) as a cue to sentence interpretation. We report results of two visual-world experiments using German reversible SVO and OVS sentences. In each study, there were eight IWA and 20 age-matched controls. Experiment 1 targeted the role of unambiguous case morphology, while Experiment 2 looked at processing of number-agreement cues at the verb in case-ambiguous sentences. IWA showed deficits in using both types of morphological markers as a cue to non-canonical sentence interpretation and the results indicate that in aphasia, processing of case-marking cues is more vulnerable as compared to verb-agreement morphology. We ascribe this finding to the higher cue reliability of agreement cues, which renders them more resistant against impairments in aphasia. However, the online data revealed that IWA are in principle capable of successfully computing morphological cues, but the integration of morphological information is delayed as compared to age-matched controls. Furthermore, we found striking differences between controls and IWA regarding subject-before-object parsing predictions.
While in case-unambiguous sentences IWA showed evidence for early subject-before-object parsing commitments, they exhibited no straightforward subject-first prediction in case-ambiguous sentences, although controls did so for ambiguous structures. IWA delayed their parsing decisions in case-ambiguous sentences until unambiguous morphological information, such as a subject-verb number-agreement cue, was available. We attribute the results for IWA to deficits in predictive processes based on morphosyntactic cues during sentence comprehension. The results indicate that IWA adopt a wait-and-see strategy and initiate prediction of upcoming syntactic structure only when unambiguous case or agreement cues are available. (C) 2015 Elsevier Ltd. All rights reserved.
There is a wealth of evidence showing that increasing the distance between an argument and its head leads to more processing effort, namely, locality effects; these are usually associated with constraints in working memory (DLT: Gibson, 2000; activation-based model: Lewis and Vasishth, 2005). In SOV languages, however, the opposite effect has been found: antilocality (see discussion in Levy et al., 2013). Antilocality effects can be explained by the expectation-based approach as proposed by Levy (2008) or by the activation-based model of sentence processing as proposed by Lewis and Vasishth (2005). We report an eye-tracking and a self-paced reading study with sentences in Spanish together with measures of individual differences to examine the distinction between expectation- and memory-based accounts, and within memory-based accounts the further distinction between DLT and the activation-based model. The experiments show that (i) antilocality effects as predicted by the expectation account appear only for high-capacity readers; (ii) increasing dependency length by interposing material that modifies the head of the dependency (the verb) produces stronger facilitation than increasing dependency length with material that does not modify the head; this is in agreement with the activation-based model but not with the expectation account; and (iii) a possible outcome of memory load on low-capacity readers is the increase in regressive saccades (locality effects as predicted by memory-based accounts) or, surprisingly, a speedup in the self-paced reading task; the latter is consistent with good-enough parsing (Ferreira et al., 2002). In sum, the study suggests that individual differences in working memory capacity play a role in dependency resolution, and that some of the aspects of dependency resolution can be best explained with the activation-based model together with a prediction component.
We conducted two eye-tracking experiments investigating the processing of the Mandarin reflexive ziji in order to tease apart structurally constrained accounts from standard cue-based accounts of memory retrieval. In both experiments, we tested whether structurally inaccessible distractors that fulfill the animacy requirement of ziji influence processing times at the reflexive. In Experiment 1, we manipulated animacy of the antecedent and a structurally inaccessible distractor intervening between the antecedent and the reflexive. In conditions where the accessible antecedent mismatched the animacy cue, we found inhibitory interference whereas in antecedent-match conditions, no effect of the distractor was observed. In Experiment 2, we tested only antecedent-match configurations and manipulated locality of the reflexive-antecedent binding (Mandarin allows non-local binding). Participants were asked to hold three distractors (animate vs. inanimate nouns) in memory while reading the target sentence. We found slower reading times when animate distractors were held in memory (inhibitory interference). Moreover, we replicated the locality effect reported in previous studies. These results are incompatible with structure-based accounts. However, the cue-based ACT-R model of Lewis and Vasishth (2005) cannot explain the observed pattern either. We therefore extend the original ACT-R model and show how this model not only explains the data presented in this article, but is also able to account for previously unexplained patterns in the literature on reflexive processing.
This is the first attempt at characterizing reading difficulty in Hindi using naturally occurring sentences. We created the Potsdam-Allahabad Hindi Eyetracking Corpus by recording eye-movement data from 30 participants at the University of Allahabad, India. The target stimuli were 153 sentences selected from the beta version of the Hindi-Urdu treebank. We find that word- or low-level predictors (syllable length, unigram and bigram frequency) affect first-pass reading times, regression path duration, total reading time, and outgoing saccade length. An increase in syllable length results in longer fixations, and an increase in word unigram and bigram frequency leads to shorter fixations. Longer syllable length and higher frequency lead to longer outgoing saccades. We also find that two predictors of sentence comprehension difficulty, integration and storage cost, have an effect on reading difficulty. Integration cost (Gibson, 2000) was approximated by calculating the distance (in words) between a dependent and head; and storage cost (Gibson, 2000), which measures difficulty of maintaining predictions, was estimated by counting the number of predicted heads at each point in the sentence. We find that integration cost mainly affects outgoing saccade length, and storage cost affects total reading times and outgoing saccade length. Thus, word-level predictors have an effect in both early and late measures of reading time, while predictors of sentence comprehension difficulty tend to affect later measures. This is, to our knowledge, the first demonstration using eye-tracking that both integration and storage cost influence reading difficulty.
Chinese relative clauses are an important test case for pitting the predictions of expectation-based accounts against those of memory-based theories. The memory-based accounts predict that object relatives are easier to process than subject relatives because, in object relatives, the distance between the relative clause verb and the head noun is shorter. By contrast, expectation-based accounts such as surprisal predict that the less frequent object relative should be harder to process. In previous studies on Chinese relative clause comprehension, local ambiguities may have rendered a comparison between relative clause types uninterpretable. We designed experimental materials in which no local ambiguities confound the comparison. We ran two experiments (self-paced reading and eye-tracking) to compare reading difficulty in subject and object relatives which were placed either in subject or object modifying position. The evidence from our studies is consistent with the predictions of expectation-based accounts but not with those of memory-based theories. (C) 2014 Elsevier Inc. All rights reserved.
In a self-paced reading study on German sluicing, Paape (2016) found that reading times were shorter at the ellipsis site when the antecedent was a temporarily ambiguous garden-path structure. As a post-hoc explanation of this finding, Paape assumed that the antecedent’s memory representation was reactivated during syntactic reanalysis, making it easier to retrieve. In two eye-tracking experiments, we subjected the reactivation hypothesis to further empirical scrutiny. Experiment 1, carried out in French, showed no evidence in favor of the reactivation hypothesis. Instead, results for one out of the three types of garden-path sentences that were tested suggest that subjects sometimes failed to resolve the temporary ambiguity in the antecedent clause, and subsequently failed to resolve the ellipsis. The results of Experiment 2, a conceptual replication of Paape’s (2016) original study carried out in German, are compatible with the reactivation hypothesis, but leave open the possibility that the observed speedup for ambiguous antecedents may be due to occasional retrievals of an incorrect structure.
Understanding a sentence and integrating it into the discourse depends upon the identification of its focus, which, in spoken German, is marked by accentuation. In the case of written language, which lacks explicit cues to accent, readers have to draw on other kinds of information to determine the focus. We study the joint or interactive effects of two kinds of information that have no direct representation in print but have each been shown to be influential in the reader's text comprehension: (i) the (low-level) rhythmic-prosodic structure that is based on the distribution of lexically stressed syllables, and (ii) the (high-level) discourse context that is grounded in the memory of previous linguistic content. Systematically manipulating these factors, we examine the way readers resolve a syntactic ambiguity involving the scopally ambiguous focus operator auch (engl. "too") in both oral (Experiment 1) and silent reading (Experiment 2). The results of both experiments attest that discourse context and local linguistic rhythm conspire to guide the syntactic and, concomitantly, the focus-structural analysis of ambiguous sentences. We argue that reading comprehension requires the (implicit) assignment of accents according to the focus structure and that, by establishing a prominence profile, the implicit prosodic rhythm directly affects accent assignment.
It has been proposed that in online sentence comprehension the dependency between a reflexive pronoun such as himself/herself and its antecedent is resolved using exclusively syntactic constraints. Under this strictly syntactic search account, Principle A of the binding theory, which requires that the antecedent c-command the reflexive within the same clause that the reflexive occurs in, constrains the parser's search for an antecedent. The parser thus ignores candidate antecedents that might match agreement features of the reflexive (e.g., gender) but are ineligible as potential antecedents because they are in structurally illicit positions. An alternative possibility accords no special status to structural constraints: in addition to using Principle A, the parser also uses non-structural cues such as gender to access the antecedent. According to cue-based retrieval theories of memory (e.g., Lewis and Vasishth, 2005), the use of non-structural cues should result in increased retrieval times and occasional errors when candidates partially match the cues, even if the candidates are in structurally illicit positions. In this paper, we first show how the retrieval processes that underlie the reflexive binding are naturally realized in the Lewis and Vasishth (2005) model. We present the predictions of the model under the assumption that both structural and non-structural cues are used during retrieval, and provide a critical analysis of previous empirical studies that failed to find evidence for the use of non-structural cues, suggesting that these failures may be Type II errors. We use this analysis and the results of further modeling to motivate a new empirical design that we use in an eye-tracking study. The results of this study confirm the key predictions of the model concerning the use of non-structural cues, and are inconsistent with the strictly syntactic search account.
These results present a challenge for theories advocating the infallibility of the human parser in the case of reflexive resolution, and provide support for the inclusion of agreement features such as gender in the set of retrieval cues.
SOPARSE predicts so-called local coherence effects: locally plausible but globally impossible parses of substrings can exert a distracting influence during sentence processing. Additionally, it predicts digging-in effects: the longer the parser stays committed to a particular analysis, the harder it becomes to inhibit that analysis. We investigated the interaction of these two predictions using German sentences. Results from a self-paced reading study show that the processing difficulty caused by a local coherence can be reduced by first allowing the globally correct parse to become entrenched, which supports SOPARSE’s assumptions.