TY - CHAP A1 - Demske, Ulrike A1 - Logacev, Pavel A1 - Goldschmidt, Katrin T1 - POS-Tagging Historical Corpora: The Case of Early New High German T2 - Proceedings of the thirteenth workshop on treebanks and linguistic theories (TLT 13) N2 - A key problem in automatic annotation of historical corpora is inconsistent spelling. Because the spelling of some word forms can differ between texts, a language model trained on already annotated treebanks may fail to recognize known word forms due to differences in spelling. In the present work, we explore the feasibility of an unsupervised method for spelling-adjustment for the purpose of improved part of speech (POS) tagging. To this end, we present a method for spelling normalization based on weighted edit distances, which exploits within-text spelling variation. We then evaluate the improvement in tagging accuracy resulting from between-texts spelling normalization in two tagging experiments on several Early New High German (ENHG) texts. Y1 - 2014 VL - 2014 SP - 103 EP - 112 PB - TALAR - Tübingen Archive of Language Resources CY - Tübingen ER - TY - JOUR A1 - Logacev, Pavel A1 - Vasishth, Shravan T1 - Understanding underspecification: A comparison of two computational implementations JF - The quarterly journal of experimental psychology N2 - Swets et al. (2008. Underspecification of syntactic ambiguities: Evidence from self-paced reading. Memory and Cognition, 36(1), 201–216) presented evidence that the so-called ambiguity advantage [Traxler et al. (1998). Adjunct attachment is not a form of lexical ambiguity resolution. 
Journal of Memory and Language, 39(4), 558–592], which has been explained in terms of the Unrestricted Race Model, can equally well be explained by assuming underspecification in ambiguous conditions driven by task-demands. Specifically, if comprehension questions require that ambiguities be resolved, the parser tends to make an attachment; when questions are about superficial aspects of the target sentence, readers tend to pursue an underspecification strategy. It is reasonable to assume that individual differences in strategy will play a significant role in the application of such strategies, so that studying average behaviour may not be informative. In order to study the predictions of the good-enough processing theory, we implemented two versions of underspecification: the partial specification model (PSM), which is an implementation of the Swets et al. proposal, and a more parsimonious version, the non-specification model (NSM). We evaluate the relative fit of these two kinds of underspecification to Swets et al.’s data; as a baseline, we also fitted three models that assume no underspecification. We find that a model without underspecification provides a somewhat better fit than both underspecification models, while the NSM model provides a better fit than the PSM. We interpret the results as a lack of unambiguous evidence in favour of underspecification; however, given that there is considerable existing evidence for good-enough processing in the literature, it is reasonable to assume that some underspecification might occur. Under this assumption, the results can be interpreted as tentative evidence for NSM over PSM. More generally, our work provides a method for choosing between models of real-time processes in sentence comprehension that make qualitative predictions about the relationship between several dependent variables. We believe that sentence processing research will greatly benefit from a wider use of such methods. 
KW - Computational modelling KW - Underspecification KW - Shallow processing Y1 - 2016 U6 - https://doi.org/10.1080/17470218.2015.1134602 SN - 1747-0218 SN - 1747-0226 VL - 69 SP - 996 EP - 1012 PB - BioMed Central CY - Abingdon ER - TY - JOUR A1 - Logacev, Pavel A1 - Vasishth, Shravan T1 - A Multiple-Channel Model of Task-Dependent Ambiguity Resolution in Sentence Comprehension JF - Cognitive science : a multidisciplinary journal of anthropology, artificial intelligence, education, linguistics, neuroscience, philosophy, psychology ; journal of the Cognitive Science Society N2 - Traxler, Pickering, and Clifton (1998) found that ambiguous sentences are read faster than their unambiguous counterparts. This so-called ambiguity advantage has presented a major challenge to classical theories of human sentence comprehension (parsing) because its most prominent explanation, in the form of the unrestricted race model (URM), assumes that parsing is non-deterministic. Recently, Swets, Desmet, Clifton, and Ferreira (2008) have challenged the URM. They argue that readers strategically underspecify the representation of ambiguous sentences to save time, unless disambiguation is required by task demands. When disambiguation is required, however, readers assign sentences full structure—and Swets et al. provide experimental evidence to this end. On the basis of their findings, they argue against the URM and in favor of a model of task-dependent sentence comprehension. We show through simulations that the Swets et al. data do not constitute evidence for task-dependent parsing because they can be explained by the URM. However, we provide decisive evidence from a German self-paced reading study consistent with Swets et al.'s general claim about task-dependent parsing. Specifically, we show that under certain conditions, ambiguous sentences can be read more slowly than their unambiguous counterparts, suggesting that the parser may create several parses, when required. 
Finally, we present the first quantitative model of task-driven disambiguation that subsumes the URM, and we show that it can explain both Swets et al.'s results and our findings. KW - Sentence processing KW - Ambiguity KW - Parallel processing KW - Cognitive modeling KW - Unrestricted race model KW - URM KW - Underspecification KW - Good-enough processing Y1 - 2016 U6 - https://doi.org/10.1111/cogs.12228 SN - 0364-0213 SN - 1551-6709 VL - 40 SP - 266 EP - 298 PB - Wiley-Blackwell CY - Hoboken ER - TY - THES A1 - Logačev, Pavel T1 - Underspecification and parallel processing in sentence comprehension T1 - Unterspezifikation und parallele Verarbeitung im Satzverständnis N2 - The aim of the present thesis is to answer the question to what degree the processes involved in sentence comprehension are sensitive to task demands. A central phenomenon in this regard is the so-called ambiguity advantage, which is the finding that ambiguous sentences can be easier to process than unambiguous sentences. This finding may appear counterintuitive, because more meanings should be associated with a higher computational effort. Currently, two theories exist that can explain this finding. The Unrestricted Race Model (URM) by van Gompel et al. (2001) assumes that several sentence interpretations are computed in parallel, whenever possible, and that the first interpretation to be computed is assigned to the sentence. Because the duration of each structure-building process varies from trial to trial, the parallelism in structure-building predicts that ambiguous sentences should be processed faster. This is because when two structures are permissible, the chances that some interpretation will be computed quickly are higher than when only one specific structure is permissible. Importantly, the URM is not sensitive to task demands such as the type of comprehension questions being asked. A radically different proposal is the strategic underspecification model by Swets et al. (2008). 
It assumes that readers do not attempt to resolve ambiguities unless it is absolutely necessary. In other words, they underspecify. According to the strategic underspecification hypothesis, all attested replications of the ambiguity advantage are due to the fact that in those experiments, readers were not required to fully understand the sentence. In this thesis, these two models of the parser’s actions at choice-points in the sentence are presented and evaluated. First, it is argued that Swets et al.’s (2008) evidence against the URM and in favor of underspecification is inconclusive. Next, the precise predictions of the URM as well as the underspecification model are refined. Subsequently, a self-paced reading experiment involving the attachment of pre-nominal relative clauses in Turkish is presented, which provides evidence against strategic underspecification. A further experiment is presented which investigated relative clause attachment in German using the speed-accuracy tradeoff (SAT) paradigm. The experiment provides evidence against strategic underspecification and in favor of the URM. Furthermore, the results of the experiment are used to argue that human sentence comprehension is fallible, and that theories of parsing should be able to account for that fact. Finally, a third experiment is presented, which provides evidence for the sensitivity to task demands in the treatment of ambiguities. Because this finding is incompatible with the URM, and because the strategic underspecification model has been ruled out, a new model of ambiguity resolution is proposed: the stochastic multiple-channel model of ambiguity resolution (SMCM). It is further shown that the quantitative predictions of the SMCM are in agreement with experimental data. In conclusion, it is argued that the human sentence comprehension system is parallel and fallible, and that it is sensitive to task-demands. 
N2 - The aim of the present thesis is to investigate to what degree sentence comprehension is context-dependent. In other words, are the mental processes that contribute to sentence comprehension influenced by the goal with which a sentence is read? A central phenomenon in this regard is the so-called ambiguity advantage, according to which ambiguous sentences are read faster than unambiguous ones. This appears counterintuitive at first, because constructing several meanings should involve a higher processing effort. Currently, two theories exist that can explain this effect. The Unrestricted Race Model (URM; van Gompel, Pickering, and Traxler, 2000) is based on the assumption that readers attempt, whenever possible, to construct several interpretations of a sentence simultaneously. As soon as the first interpretation has been constructed, it is accepted as the final interpretation of the current input, and the construction of further interpretations is terminated. Because the duration of each structure-building process varies, this interpretation mechanism causes sentences with several meanings to be processed faster. When two sentence structures are permissible, the probability that at least one of them is computed relatively quickly is higher than when only one structure is permissible. This model assumes no influence of comprehension tasks on the processing strategy. The strategic underspecification model by Swets et al. (2008) pursues an entirely different explanation. It assumes that readers resolve ambiguities only when absolutely necessary. When it is not necessary, they underspecify instead. According to the underspecification model, all previous replications of the ambiguity advantage are due to the fact that only superficial questions, which did not require ambiguity resolution, were asked in those experiments. 
Had disambiguation been required, the processing of ambiguous sentences would have been slower. In the present thesis, these two models of ambiguity resolution are discussed and empirically evaluated. First, it is discussed why the data from the Swets et al. (2008) experiment do not constitute evidence for underspecification. Next, the precise quantitative predictions of the URM and the underspecification model are discussed. The results of a self-paced reading experiment with pre-nominal relative clauses in Turkish are presented, which are not compatible with the underspecification model. Then the results of a further experiment are presented, which investigates the process of relative clause attachment in German using the speed-accuracy tradeoff (SAT) paradigm. The results are compatible with the URM, but not with strategic underspecification. Furthermore, a third experiment is presented, which shows that parsing strategies depend on the perspective from which readers read a sentence. To explain all experimental results in this thesis, a new model of disambiguation is presented: the Stochastic Multiple-Channel Model (SMCM). It is further shown that the quantitative predictions of the SMCM agree with the experimental data. KW - psycholinguistics KW - sentence processing KW - Psycholinguistik KW - Satzverarbeitung Y1 - 2014 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-82047 ER - TY - GEN A1 - Logačev, Pavel A1 - Vasishth, Shravan T1 - Understanding underspecification BT - A comparison of two computational implementations N2 - Swets et al. (2008. Underspecification of syntactic ambiguities: Evidence from self-paced reading. Memory and Cognition, 36(1), 201–216) presented evidence that the so-called ambiguity advantage [Traxler et al. (1998). Adjunct attachment is not a form of lexical ambiguity resolution. 
Journal of Memory and Language, 39(4), 558–592], which has been explained in terms of the Unrestricted Race Model, can equally well be explained by assuming underspecification in ambiguous conditions driven by task-demands. Specifically, if comprehension questions require that ambiguities be resolved, the parser tends to make an attachment; when questions are about superficial aspects of the target sentence, readers tend to pursue an underspecification strategy. It is reasonable to assume that individual differences in strategy will play a significant role in the application of such strategies, so that studying average behaviour may not be informative. In order to study the predictions of the good-enough processing theory, we implemented two versions of underspecification: the partial specification model (PSM), which is an implementation of the Swets et al. proposal, and a more parsimonious version, the non-specification model (NSM). We evaluate the relative fit of these two kinds of underspecification to Swets et al.’s data; as a baseline, we also fitted three models that assume no underspecification. We find that a model without underspecification provides a somewhat better fit than both underspecification models, while the NSM model provides a better fit than the PSM. We interpret the results as a lack of unambiguous evidence in favour of underspecification; however, given that there is considerable existing evidence for good-enough processing in the literature, it is reasonable to assume that some underspecification might occur. Under this assumption, the results can be interpreted as tentative evidence for NSM over PSM. More generally, our work provides a method for choosing between models of real-time processes in sentence comprehension that make qualitative predictions about the relationship between several dependent variables. We believe that sentence processing research will greatly benefit from a wider use of such methods. 
T3 - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 295 KW - Computational modelling KW - Underspecification KW - Shallow processing Y1 - 2016 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-93441 SP - 996 EP - 1012 ER - TY - GEN A1 - Nicenboim, Bruno A1 - Logacev, Pavel A1 - Gattei, Carolina A1 - Vasishth, Shravan T1 - When High-Capacity Readers Slow Down and Low-Capacity Readers Speed Up BT - Working Memory and Locality Effects N2 - We examined the effects of argument-head distance in SVO and SOV languages (Spanish and German), while taking into account readers' working memory capacity and controlling for expectation (Levy, 2008) and other factors. We predicted only locality effects, that is, a slowdown produced by increased dependency distance (Gibson, 2000; Lewis and Vasishth, 2005). Furthermore, we expected stronger locality effects for readers with low working memory capacity. Contrary to our predictions, low-capacity readers showed faster reading with increased distance, while high-capacity readers showed locality effects. We suggest that while the locality effects are compatible with memory-based explanations, the speedup of low-capacity readers can be explained by an increased probability of retrieval failure. We present a computational model based on ACT-R built under the previous assumptions, which is able to give a qualitative account for the present data and can be tested in future research. Our results suggest that in some cases, interpreting longer RTs as indexing increased processing difficulty and shorter RTs as facilitation may be too simplistic: The same increase in processing difficulty may lead to slowdowns in high-capacity readers and speedups in low-capacity ones. Ignoring individual level capacity differences when investigating locality effects may lead to misleading conclusions. 
T3 - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 288 KW - locality KW - working memory capacity KW - individual differences KW - Spanish KW - German KW - ACT-R Y1 - 2016 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-90663 SP - 1 EP - 24 ER - TY - JOUR A1 - Nicenboim, Bruno A1 - Logacev, Pavel A1 - Gattei, Carolina A1 - Vasishth, Shravan T1 - When High-Capacity Readers Slow Down and Low-Capacity Readers Speed Up BT - Working Memory and Locality Effects JF - Frontiers in psychology N2 - We examined the effects of argument-head distance in SVO and SOV languages (Spanish and German), while taking into account readers' working memory capacity and controlling for expectation (Levy, 2008) and other factors. We predicted only locality effects, that is, a slowdown produced by increased dependency distance (Gibson, 2000; Lewis and Vasishth, 2005). Furthermore, we expected stronger locality effects for readers with low working memory capacity. Contrary to our predictions, low-capacity readers showed faster reading with increased distance, while high-capacity readers showed locality effects. We suggest that while the locality effects are compatible with memory-based explanations, the speedup of low-capacity readers can be explained by an increased probability of retrieval failure. We present a computational model based on ACT-R built under the previous assumptions, which is able to give a qualitative account for the present data and can be tested in future research. Our results suggest that in some cases, interpreting longer RTs as indexing increased processing difficulty and shorter RTs as facilitation may be too simplistic: The same increase in processing difficulty may lead to slowdowns in high-capacity readers and speedups in low-capacity ones. Ignoring individual level capacity differences when investigating locality effects may lead to misleading conclusions. 
KW - locality KW - working memory capacity KW - individual differences KW - Spanish KW - German KW - ACT-R Y1 - 2016 U6 - https://doi.org/10.3389/fpsyg.2016.00280 SN - 1664-1078 VL - 7 SP - 1 EP - 24 PB - Frontiers Research Foundation CY - Lausanne ER -