Parsing costs as predictors of reading difficulty: An evaluation using the Potsdam Sentence Corpus

The surprisal of a word on a probabilistic grammar constitutes a promising complexity metric for human sentence comprehension difficulty. Using two different grammar types, surprisal is shown to have an effect on fixation durations and regression probabilities in a sample of German readers’ eye movements, the Potsdam Sentence Corpus. A linear mixed-effects model was used to quantify the effect of surprisal while taking into account unigram and bigram frequency, word length, and empirically derived word predictability; the so-called “early” and “late” measures of processing difficulty both showed an effect of surprisal. Surprisal is also shown to have a small but statistically non-significant effect on empirically derived predictability itself. This work thus demonstrates the importance of including parsing costs as a predictor of comprehension difficulty in models of reading, and suggests that a simple identification of syntactic parsing costs with early measures and late measures with durations of post-syntactic events may be difficult to uphold.
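As a rough illustration of the metric named in the abstract, surprisal is commonly defined as the negative log conditional probability of a word given its preceding context, which equals the log ratio of successive prefix probabilities. The sketch below assumes the prefix probabilities have already been obtained from some probabilistic grammar; the function name `surprisals` and the example numbers are hypothetical and are not taken from the paper or the corpus.

```python
import math


def surprisals(prefix_probs):
    """Return per-word surprisal in bits.

    prefix_probs[k] is the probability of the first k words of a sentence
    (hypothetical input; prefix_probs[0] is 1.0 for the empty prefix).
    Surprisal of word k is log2(prefix_probs[k-1] / prefix_probs[k]).
    """
    result = []
    for previous, current in zip(prefix_probs, prefix_probs[1:]):
        # A prefix can never be more probable than the shorter prefix it extends.
        if not 0.0 < current <= previous:
            raise ValueError("prefix probabilities must be positive and non-increasing")
        result.append(math.log2(previous / current))
    return result


# Made-up prefix probabilities for a three-word sentence.
example = [1.0, 0.5, 0.125, 0.0625]
print(surprisals(example))  # [1.0, 2.0, 1.0]
```

In this toy input the second word is the least expected given its context, so it receives the highest surprisal.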

Metadata
Author: Marisa Ferrara Boston, John Hale, Reinhold Kliegl, Umesh Patil, Shravan Vasishth
URN: urn:nbn:de:kobv:517-opus-57139
Series (Serial Number): Postprints der Universität Potsdam : Humanwissenschaftliche Reihe, ISSN 1866-8364 (paper 253)
Document Type: Postprint
Language: English
Date of Publication (online): 2011/12/13
Year of Completion: 2008
Publishing Institution: Universität Potsdam
Release Date: 2011/12/13
Source: Journal of Eye Movement Research. - ISSN 1995-8692. - 2 (2008), 1, pp. 1-12
Organizational units: Humanwissenschaftliche Fakultät / Institut für Psychologie
External / External
Dewey Decimal Classification: 4 Language / 40 Language / 400 Language
Licence (German): No usage licence granted; German copyright law applies
External notes: first published in: Journal of Eye Movement Research. 2 (2008), 1, pp. 1-12