TY - JOUR A1 - Vasishth, Shravan A1 - Nicenboim, Bruno T1 - Statistical Methods for Linguistic Research: Foundational Ideas - Part I JF - Language and linguistics compass N2 - We present the fundamental ideas underlying statistical hypothesis testing using the frequentist framework. We start with a simple example that builds up the one-sample t-test from the beginning, explaining important concepts such as the sampling distribution of the sample mean, and the iid assumption. Then, we examine the meaning of the p-value in detail and discuss several important misconceptions about what a p-value does and does not tell us. This leads to a discussion of Type I, II error and power, and Type S and M error. An important conclusion from this discussion is that one should aim to carry out appropriately powered studies. Next, we discuss two common issues that we have encountered in psycholinguistics and linguistics: running experiments until significance is reached and the ‘garden-of-forking-paths’ problem discussed by Gelman and others. The best way to use frequentist methods is to run appropriately powered studies, check model assumptions, clearly separate exploratory data analysis from planned comparisons decided upon before the study was run, and always attempt to replicate results. Y1 - 2016 U6 - https://doi.org/10.1111/lnc3.12201 SN - 1749-818X VL - 10 SP - 349 EP - 369 PB - Wiley-Blackwell CY - Hoboken ER - TY - JOUR A1 - Nicenboim, Bruno A1 - Vasishth, Shravan T1 - Statistical methods for linguistic research: Foundational Ideas-Part II JF - Language and linguistics compass N2 - We provide an introductory review of Bayesian data analytical methods, with a focus on applications for linguistics, psychology, psycholinguistics, and cognitive science. 
The empirically oriented researcher will benefit from making Bayesian methods part of their statistical toolkit due to the many advantages of this framework, among them easier interpretation of results relative to research hypotheses and flexible model specification. We present an informal introduction to the foundational ideas behind Bayesian data analysis, using, as an example, a linear mixed models analysis of data from a typical psycholinguistics experiment. We discuss hypothesis testing using the Bayes factor and model selection using cross-validation. We close with some examples illustrating the flexibility of model specification in the Bayesian framework. Suggestions for further reading are also provided. Y1 - 2016 U6 - https://doi.org/10.1111/lnc3.12207 SN - 1749-818X VL - 10 SP - 591 EP - 613 PB - Wiley-Blackwell CY - Hoboken ER - TY - GEN A1 - Nicenboim, Bruno A1 - Logacev, Pavel A1 - Gattei, Carolina A1 - Vasishth, Shravan T1 - When High-Capacity Readers Slow Down and Low-Capacity Readers Speed Up BT - Working Memory and Locality Effects N2 - We examined the effects of argument-head distance in SVO and SOV languages (Spanish and German), while taking into account readers' working memory capacity and controlling for expectation (Levy, 2008) and other factors. We predicted only locality effects, that is, a slowdown produced by increased dependency distance (Gibson, 2000; Lewis and Vasishth, 2005). Furthermore, we expected stronger locality effects for readers with low working memory capacity. Contrary to our predictions, low-capacity readers showed faster reading with increased distance, while high-capacity readers showed locality effects. We suggest that while the locality effects are compatible with memory-based explanations, the speedup of low-capacity readers can be explained by an increased probability of retrieval failure. 
We present a computational model based on ACT-R built under the previous assumptions, which is able to give a qualitative account of the present data and can be tested in future research. Our results suggest that in some cases, interpreting longer RTs as indexing increased processing difficulty and shorter RTs as facilitation may be too simplistic: The same increase in processing difficulty may lead to slowdowns in high-capacity readers and speedups in low-capacity ones. Ignoring individual-level capacity differences when investigating locality effects may lead to misleading conclusions. T3 - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 288 KW - locality KW - working memory capacity KW - individual differences KW - Spanish KW - German KW - ACT-R Y1 - 2016 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-90663 SP - 1 EP - 24 ER - TY - JOUR A1 - Nicenboim, Bruno A1 - Logacev, Pavel A1 - Gattei, Carolina A1 - Vasishth, Shravan T1 - When High-Capacity Readers Slow Down and Low-Capacity Readers Speed Up BT - Working Memory and Locality Effects JF - Frontiers in psychology N2 - We examined the effects of argument-head distance in SVO and SOV languages (Spanish and German), while taking into account readers' working memory capacity and controlling for expectation (Levy, 2008) and other factors. We predicted only locality effects, that is, a slowdown produced by increased dependency distance (Gibson, 2000; Lewis and Vasishth, 2005). Furthermore, we expected stronger locality effects for readers with low working memory capacity. Contrary to our predictions, low-capacity readers showed faster reading with increased distance, while high-capacity readers showed locality effects. We suggest that while the locality effects are compatible with memory-based explanations, the speedup of low-capacity readers can be explained by an increased probability of retrieval failure. 
We present a computational model based on ACT-R built under the previous assumptions, which is able to give a qualitative account of the present data and can be tested in future research. Our results suggest that in some cases, interpreting longer RTs as indexing increased processing difficulty and shorter RTs as facilitation may be too simplistic: The same increase in processing difficulty may lead to slowdowns in high-capacity readers and speedups in low-capacity ones. Ignoring individual-level capacity differences when investigating locality effects may lead to misleading conclusions. KW - locality KW - working memory capacity KW - individual differences KW - Spanish KW - German KW - ACT-R Y1 - 2016 U6 - https://doi.org/10.3389/fpsyg.2016.00280 SN - 1664-1078 VL - 7 SP - 1 EP - 24 PB - Frontiers Research Foundation CY - Lausanne ER - TY - THES A1 - Nicenboim, Bruno T1 - Dependency resolution as a retrieval process T1 - Dependenzauflösung als ein Gedächtnisabrufsprozess BT - experimental evidence and computational modeling BT - experimentelle Evidenz und komputationelle Modellierung N2 - My thesis focused on the predictions of the activation-based model of Lewis and Vasishth (2005) to investigate the evidence for the use of the memory system in the formation of non-local dependencies in sentence comprehension. The activation-based model, which follows the Adaptive Control of Thought-Rational framework (ACT-R; Anderson et al., 2004), has been used to explain locality effects and similarity-based interference by assuming that dependencies are resolved by a cue-based retrieval mechanism, and that the retrieval mechanism is affected by decay and interference. 
Both locality effects and (inhibitory) similarity-based interference cause increased difficulty (e.g., longer reading times) at the site of the dependency completion where a retrieval is assumed: (I) Locality effects are attributed to the increased difficulty in the retrieval of a dependent when the distance from its retrieval site is increased. (II) Similarity-based interference is attributed to the retrieval being affected by the presence of items that have features similar to those of the dependent that needs to be retrieved. In this dissertation, I investigated some findings that are problematic for the activation-based model, namely, facilitation where locality effects are expected (e.g., Levy, 2008), and the lack of similarity-based interference from the number feature in grammatical sentences (e.g., Wagers et al., 2009). In addition, I used individual differences in working memory capacity and reading fluency as a way to validate the theories investigated (Underwood, 1975), and computational modeling to achieve a more precise account of the phenomena. Regarding locality effects, by using self-paced reading and eye-tracking-while-reading methods with Spanish and German data, this dissertation yielded two main findings: (I) Locality effects seem to be modulated by working memory capacity, with high-capacity participants showing expectation-driven facilitation. (II) Once expectations and other potential confounds are controlled using baselines, with increased distance, high-capacity readers can show a slowdown (i.e., locality effects) and low-capacity readers can show a speedup. While the locality effects are compatible with the activation-based model, simulations show that the speedup of low-capacity readers can only be accounted for by changing some of the assumptions of the activation-based model. 
Regarding similarity-based interference, two relatively high-powered self-paced reading experiments in German using grammatical sentences yielded a slowdown at the verb as predicted by the activation-based model. This provides evidence in favor of dependency creation via cue-based retrieval, and in contrast with the view that cue-based retrieval is a reanalysis mechanism (Wagers et al., 2009). Finally, the same experimental results that showed inhibitory interference from the number feature are used for a finer-grained evaluation of the retrieval process. Besides Lewis and Vasishth’s (2005) activation-based model, McElree’s (2000) direct-access model can also account for inhibitory interference. These two models assume a cue-based retrieval mechanism to build dependencies, but they are based on different assumptions. I present a computational evaluation of the predictions of these two theories of retrieval. The models were compared by implementing them in a Bayesian hierarchical framework. The evaluation of the models reveals that some aspects of the data fit better under the direct-access model than under the activation-based model. However, a simple extension of the activation-based model provides a comparable fit to the direct-access model. This serves as a proof of concept showing potential ways to improve the original activation-based model. In conclusion, this thesis adds to the body of evidence that argues for the use of the general memory system in dependency resolution, and in particular for a cue-based retrieval mechanism. However, it also shows that some of the default assumptions inherited from ACT-R in the activation-based model need to be revised. N2 - Die vorliegende Dissertation befasst sich mit dem Aktivierungsmodell von Lewis und Vasishth (2005), um die Evidenz für die Verwendung des Arbeitsgedächtnisses bei der Bildung nicht-lokaler Dependenzen in der menschlichen Satzverarbeitung zu untersuchen. 
Das Aktivierungsmodell, welches auf der ‘Adaptive Control of Thought-Rational’ (ACT-R; Anderson et al., 2004) aufbaut, wird in der Literatur herangezogen, um Lokalitätseffekte und Interferenz durch Ähnlichkeit mit einem von Interferenz und Gedächtnisverfall betroffenen merkmalsbasierten Gedächtnisabrufmechanismus zu erklären. Sowohl Lokalitätseffekte als auch (inhibitorische) Interferenz durch Ähnlichkeit führen zu einer erhöhten Verarbeitungsschwierigkeit (z.B. längere Lesezeiten) an der Stelle, wo die Dependenz gebildet wird und daher ein Gedächtnisabruf anzunehmen ist: (I) Lokalitätseffekte werden durch die erhöhte Schwierigkeit erklärt, die mit dem Abruf des ersten Teils einer Dependenz einhergeht, wenn dessen Distanz zu der Stelle, die den Gedächtnisabruf auslöst (d.h. der zweite Teil der Dependenz), vergrößert wird. (II) Interferenz durch Ähnlichkeit wird dadurch erklärt, dass der Gedächtnisabruf von der Anwesenheit von Elementen mit denselben Merkmalen wie die des abzurufenden Teils der Dependenz beeinträchtigt wird. In dieser Dissertation untersuche ich einige Erkenntnisse, die das Aktivierungsmodell herausfordern, namentlich fazilitatorische Effekte an Stellen, wo Lokalitätseffekte zu erwarten wären (z.B. Levy, 2008), sowie die Abwesenheit von Interferenz durch Ähnlichkeit in Experimenten, die den Numerus manipulieren (z.B. Wagers et al., 2009). Des Weiteren verwende ich Messwerte der individuellen Unterschiede in der Arbeitsgedächtnisleistung und in der Leseflüssigkeit, um die untersuchten Theorien zu validieren, und komputationale Modellierung, um ein genaueres Bild der untersuchten Phänomene zeichnen zu können. Was die Lokalitätseffekte angeht, so werden in dieser Dissertation hauptsächlich zwei Erkenntnisse vorgestellt, die auf mit Selbst-gesteuertem-Lesen und Eyetracking erhobenen Daten zum Spanischen und Deutschen basieren. 
(I) Lokalitätseffekte scheinen von der Arbeitsgedächtniskapazität moduliert zu werden: Probanden mit hoher Arbeitsgedächtniskapazität zeigen erwartungsgesteuerte fazilitatorische Effekte. (II) Wenn Erwartungen und andere potentielle Störvariablen durch geeignete Baselines kontrolliert werden, können bei Probanden mit starkem Arbeitsgedächtnis verlangsamte Lesezeiten (d. h., Lokalitätseffekte) und bei Probanden mit schwachem Arbeitsgedächtnis verkürzte Lesezeiten beobachtet werden. Während Lokalitätseffekte mit dem Aktivierungsmodell vereinbar sind, zeigen Simulationen, dass die fazilitatorischen Effekte der Probanden mit schwächerem Arbeitsgedächtnis nur dann von dem Aktivierungsmodell erklärt werden können, wenn einige der Modellannahmen geändert werden. Was Interferenz durch Ähnlichkeit angeht, so werden in dieser Dissertation zwei Experimente mit Selbst-gesteuertem-Lesen zum Deutschen vorgestellt, die eine relativ hohe statistische Teststärke haben. Grammatische Sätze führen hier zu verlangsamten Lesezeiten am Verb, wie es das Aktivierungsmodell vorhersagt. Diese Ergebnisse sind Evidenz für die Bildung von Dependenzen mittels merkmalsbasiertem Gedächtnisabruf und können nicht durch einen wie von Wagers et al. (2009) vorgeschlagenen Reanalysemechanismus erklärt werden. Letztendlich werden dieselben empirischen Daten, die durch den Numerus ausgelöste inhibitorische Interferenz zeigen, für eine detailliertere, simulationsbasierte Betrachtung des Gedächtnisabrufprozesses verwendet. Neben dem Aktivierungsmodell von Lewis und Vasishth (2005) kann auch das Modell eines direkten Gedächtniszugriffs von McElree (2000) die inhibitorische Interferenz erklären. Beide Modelle nehmen für die Bildung von Dependenzen einen merkmalsbasierten Gedächtniszugriffsmechanismus an, aber sie fußen auf unterschiedlichen Annahmen. Ich stelle eine komputationale Evaluation der Vorhersagen dieser beiden Gedächtniszugriffsmodelle vor. 
Um die beiden Modelle zu vergleichen, werden sie als Bayessche hierarchische Modelle implementiert. Die Evaluation der Modelle zeigt, dass einige Aspekte der empirischen Daten besser von McElrees Modell als von Lewis’ und Vasishths Modell erklärt werden. Eine einfache Erweiterung des Aktivierungsmodells erklärt die Daten jedoch ähnlich gut wie McElrees Modell. Kurz, diese Dissertation liefert weitere Evidenz für die These, dass das allgemeine Gedächtnissystem, und ein merkmalsbasierter Abrufmechanismus im Besonderen, beim Bilden linguistischer Dependenzen Anwendung findet. Es wird jedoch auch gezeigt, dass einige der Standardannahmen, die das Aktivierungsmodell von der ACT-R-Architektur geerbt hat, überdacht und angepasst werden müssen. KW - linguistics KW - working memory KW - computational modeling KW - Sprachwissenschaft KW - Arbeitsgedächtnis KW - komputationale Modellierung Y1 - 2016 ER -