A central insight from psychological studies on human eye movements is that eye movement patterns are highly individually characteristic. They can, therefore, be used as a biometric feature, that is, subjects can be identified based on their eye movements. This thesis introduces new machine learning methods to identify subjects based on their eye movements while viewing arbitrary content. The thesis focuses on probabilistic modeling of the problem, which has yielded the best results in the most recent literature. The thesis studies the problem in three phases by proposing a purely probabilistic, probabilistic deep learning, and probabilistic deep metric learning approach. In the first phase, the thesis studies models that rely on psychological concepts about eye movements. Recent literature illustrates that individual-specific distributions of gaze patterns can be used to accurately identify individuals. In these studies, models were based on a simple parametric family of distributions. Such simple parametric models can be robustly estimated from sparse data, but have limited flexibility to capture the differences between individuals. Therefore, this thesis proposes a semiparametric model of gaze patterns that is flexible yet robust for individual identification. These patterns can be understood as domain knowledge derived from psychological literature. Fixations and saccades are examples of simple gaze patterns. The proposed semiparametric densities are drawn under a Gaussian process prior centered at a simple parametric distribution. Thus, the model will stay close to the parametric class of densities if little data is available, but it can also deviate from this class if enough data is available, increasing the flexibility of the model. The proposed method is evaluated on a large-scale dataset, showing significant improvements over the state-of-the-art. 
Later, the thesis replaces the model based on gaze patterns derived from psychological concepts with a deep neural network that can learn more informative and complex patterns from raw eye movement data. As previous work has shown that the distribution of these patterns across a sequence is informative, a novel statistical aggregation layer called the quantile layer is introduced. It explicitly fits the distribution of deep patterns learned directly from the raw eye movement data. The proposed deep learning approach is end-to-end learnable, such that the deep model learns to extract informative, short local patterns while the quantile layer learns to approximate the distributions of these patterns. Quantile layers are a generic approach that can converge to standard pooling layers or have a more detailed description of the features being pooled, depending on the problem. The proposed model is evaluated in a large-scale study using the eye movements of subjects viewing arbitrary visual input. The model improves upon the standard pooling layers and other statistical aggregation layers proposed in the literature. It also improves upon the state-of-the-art eye movement biometrics by a wide margin. Finally, for the model to identify any subject — not just the set of subjects it is trained on — a metric learning approach is developed. Metric learning learns a distance function over instances. The metric learning model maps the instances into a metric space, where sequences of the same individual are close, and sequences of different individuals are further apart. This thesis introduces a deep metric learning approach with distributional embeddings. The approach represents sequences as a set of continuous distributions in a metric space; to achieve this, a new loss function based on Wasserstein distances is introduced. The proposed method is evaluated on multiple domains besides eye movement biometrics. 
This approach outperforms the state of the art in deep metric learning in several domains while also outperforming the state of the art in eye movement biometrics.
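As a rough illustration of two ideas in the abstract above, the following sketch shows quantile-based aggregation of per-time-step features and a distance between one-dimensional distributions computed from their quantile functions. It is not the thesis's implementation: the function names `quantile_pool` and `wasserstein_1d`, the fixed and evenly spaced quantile levels, and the use of NumPy instead of a trainable network layer are all assumptions made for this example.

```python
import numpy as np


def quantile_pool(features, n_quantiles=5):
    """Summarise a (time_steps, channels) array by per-channel quantiles.

    Hypothetical helper: the quantile levels are fixed and evenly spaced,
    whereas a trainable layer could learn them.
    """
    levels = np.linspace(0.0, 1.0, n_quantiles)
    # np.quantile with axis=0 returns shape (n_quantiles, channels).
    per_channel = np.quantile(features, levels, axis=0).T
    # Flatten to one vector: all quantiles of channel 0, then channel 1, ...
    return per_channel.reshape(-1)


def wasserstein_1d(a, b, n_grid=101):
    """Approximate the Wasserstein-1 distance between two 1-D samples.

    For 1-D distributions this distance equals the integral of the absolute
    difference of the quantile functions, approximated here on a grid.
    """
    levels = np.linspace(0.0, 1.0, n_grid)
    return float(np.mean(np.abs(np.quantile(a, levels) - np.quantile(b, levels))))
```

With `n_quantiles=2` the pooled vector holds only the per-channel minimum and maximum, so max pooling appears as a special case, while more levels give a more detailed description of the feature distribution.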
Eye movements serve as a window into ongoing visual-cognitive processes and can thus be used to investigate how people perceive real-world scenes. A key issue for understanding eye-movement control during scene viewing is the roles of central and peripheral vision, which process information differently and are therefore specialized for different tasks (object identification and peripheral target selection respectively). Yet, rather little is known about the contributions of central and peripheral processing to gaze control and how they are coordinated within a fixation during scene viewing. Additionally, the factors determining fixation durations have long been neglected, as scene perception research has mainly been focused on the factors determining fixation locations. The present thesis aimed at increasing the knowledge on how central and peripheral vision contribute to spatial and, in particular, to temporal aspects of eye-movement control during scene viewing. In a series of five experiments, we varied processing difficulty in the central or the peripheral visual field by attenuating selective parts of the spatial-frequency spectrum within these regions. Furthermore, we developed a computational model on how foveal and peripheral processing might be coordinated for the control of fixation duration. The thesis provides three main findings. First, the experiments indicate that increasing processing demands in central or peripheral vision do not necessarily prolong fixation durations; instead, stimulus-independent timing is adapted when processing becomes too difficult. Second, peripheral vision seems to play a prominent role in the control of fixation durations, a notion also implemented in the computational model. The model assumes that foveal and peripheral processing proceed largely in parallel and independently during fixation, but can interact to modulate fixation duration. 
Thus, we propose that the variation in fixation durations can in part be accounted for by the interaction between central and peripheral processing. Third, the experiments indicate that saccadic behavior largely adapts to processing demands, with a bias of avoiding spatial-frequency filtered scene regions as saccade targets. We demonstrate that the observed saccade amplitude patterns reflect corresponding modulations of visual attention. The present work highlights the individual contributions and the interplay of central and peripheral vision for gaze control during scene viewing, particularly for the control of fixation duration. Our results entail new implications for computational models and for experimental research on scene perception.
A number of recent studies have investigated how syntactic and non-syntactic constraints combine to cue memory retrieval during anaphora resolution. In this paper we investigate how syntactic constraints and gender congruence interact to guide memory retrieval during the resolution of subject pronouns. Subject pronouns are always technically ambiguous, and the application of syntactic constraints on their interpretation depends on properties of the antecedent that is to be retrieved. While pronouns can freely corefer with non-quantified referential antecedents, linking a pronoun to a quantified antecedent is only possible in certain syntactic configurations via variable binding. We report the results from a judgment task and three online reading comprehension experiments investigating pronoun resolution with quantified and non-quantified antecedents. Results from both the judgment task and participants' eye movements during reading indicate that comprehenders freely allow pronouns to corefer with non-quantified antecedents, but that retrieval of quantified antecedents is restricted to specific syntactic environments. We interpret our findings as indicating that syntactic constraints constitute highly weighted cues to memory retrieval during anaphora resolution.
During reading, oculomotor processes guide the eyes over the text. The visual information recorded is accessed, evaluated, and processed. Only by retrieving the meaning of a word from long-term memory, and by connecting and storing the information about each individual word, is it possible to access the meaning of a sentence. Memory, and working memory in particular, therefore plays a pivotal role in the basic processes of reading. This dissertation investigates to what extent different demands on memory and memory capacity affect eye movement behavior during reading. The research questions were examined experimentally with the frequently used reading span task, in which participants read and evaluate individual sentences. The results indicate that working memory processes have a direct effect on various eye movement measures. For example, a high working memory load reduced the perceptual span during reading. The lower the reader's individual working memory capacity, the stronger the influence of working memory load on sentence processing.
While the influence of spatial-numerical associations in number categorization tasks has been well established, their role in mental arithmetic is less clear. It has been hypothesized that mental addition leads to rightward and upward shifts of spatial attention (along the "mental number line"), whereas subtraction leads to leftward and downward shifts. We addressed this hypothesis by analyzing spontaneous eye movements during mental arithmetic. Participants solved verbally presented arithmetic problems (e.g., 2 + 7, 8-3) aloud while looking at a blank screen. We found that eye movements reflected spatial biases in the ongoing mental operation: Gaze position shifted more upward when participants solved addition compared to subtraction problems, and the horizontal gaze position was partly determined by the magnitude of the operands. Interestingly, the difference between addition and subtraction trials was driven by the operator (plus vs. minus) but was not influenced by the computational process. Thus, our results do not support the idea of a mental movement toward the solution during arithmetic but indicate a semantic association between operation and space.
When we read a text, we obtain information at different levels of representation from abstract symbols. A reader’s ultimate aim is the extraction of the meaning of the words and the text. Research on eye movements in reading covers a broad range of psychological systems, ranging from low-level perceptual and motor processes to high-level cognition. Reading in skilled readers proceeds highly automatically, yet it is a complex phenomenon of interacting subprocesses. The study of eye movements during reading offers the possibility to investigate cognition via behavioral measures during the exercise of an everyday task. The process of reading is not limited to the directly fixated (or foveal) word but also extends to surrounding (or parafoveal) words, particularly the word to the right of the gaze position. This process may be unconscious, but parafoveal information is necessary for efficient reading. There is an ongoing debate on whether processing of the upcoming word encompasses word meaning (or semantics) or only superficial features. To increase the knowledge about how the meaning of one word helps processing another word, seven experiments were conducted. In these studies, words were exchanged during reading. The degree of relatedness between the word to the right of the currently fixated one and the word subsequently fixated was experimentally manipulated. Furthermore, the time course of the parafoveal extraction of meaning was investigated with two different approaches, an experimental one and a statistical one. As a major finding, fixation times were consistently lower if a semantically related word was presented compared to the presence of an unrelated word. Introducing an experimental technique that allows controlling the duration for which words are available, the time course of processing and integrating meaning was evaluated. Results indicated both facilitation and inhibition due to relatedness between the meanings of words.
In a more natural reading situation, the effectiveness of the processing of parafoveal words was sometimes time-dependent and substantially increased with shorter distances between the gaze position and the word. Findings are discussed with respect to theories of eye-movement control. In summary, the results are more compatible with models of distributed word processing. The discussions moreover extend to language differences and technical issues of reading research.
Eye movements in reading are sensitive to foveal and parafoveal word features. Whereas the influence of orthographic or phonological parafoveal information on gaze control is undisputed, there has been no reliable evidence for early parafoveal extraction of semantic information in alphabetic script. Using a novel combination of the gaze-contingent fast-priming and boundary paradigms, we demonstrate semantic preview benefit when a semantically related parafoveal word was available during the initial 125 ms of a fixation on the pre-target word (Experiments 1 and 2). When the target location was made more salient, significant parafoveal semantic priming occurred only at 80 ms (Experiment 3). Finally, with short primes only (20, 40, 60 ms) effects were not significant but numerically in the expected direction for 40 and 60 ms (Experiment 4). In all experiments, fixation durations on the target word increased with prime durations under all conditions. The evidence for extraction of semantic information from the parafoveal word favors an explanation in terms of parallel word processing in reading.
Linked linear mixed models
(2016)
The complexity of eye-movement control during reading allows measurement of many dependent variables, the most prominent ones being fixation durations and their locations in words. In current practice, either variable may serve as dependent variable or covariate for the other in linear mixed models (LMMs) featuring also psycholinguistic covariates of word recognition and sentence comprehension. Rather than analyzing fixation location and duration with separate LMMs, we propose linking the two according to their sequential dependency. Specifically, we include predicted fixation location (estimated in the first LMM from psycholinguistic covariates) and its associated residual fixation location as covariates in the second, fixation-duration LMM. This linked LMM affords a distinction between direct and indirect effects (mediated through fixation location) of psycholinguistic covariates on fixation durations. Results confirm the robustness of distributed processing in the perceptual span. They also offer a resolution of the paradox of the inverted optimal viewing position (IOVP) effect (i.e., longer fixation durations in the center than at the beginning and end of words) although the opposite (i.e., an OVP effect) is predicted from default assumptions of psycholinguistic processing efficiency: The IOVP effect in fixation durations is due to the residual fixation-location covariate, presumably driven primarily by saccadic error, and the OVP effect (at least the left part of it) is uncovered with the predicted fixation-location covariate, capturing the indirect effects of psycholinguistic covariates. We expect that linked LMMs will be useful for the analysis of other dynamically related multiple outcomes, a conundrum of most psychonomic research.
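The linking step described above can be sketched in a heavily simplified form. The example below uses ordinary least squares on synthetic data rather than linear mixed models, so it has no random effects for subjects or items and does not reproduce the authors' analysis. All names (`ols`, `linked_fit`), the simulated covariates, and the coefficient values are assumptions for illustration. The second stage uses only the predicted and residual fixation location, because in this fixed-effects setting the predicted location is a linear combination of the first-stage covariates.

```python
import numpy as np


def ols(X, y):
    """Least-squares fit with an intercept; returns coefficients and fitted values."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta, X1 @ beta


def linked_fit(covariates, location, duration):
    """Two linked regressions (hypothetical simplification of a linked LMM).

    Stage 1: fixation location is predicted from the covariates.
    Stage 2: fixation duration is regressed on the predicted location and
    on the residual location from stage 1.
    """
    _, predicted = ols(covariates, location)
    residual = location - predicted
    beta, _ = ols(np.column_stack([predicted, residual]), duration)
    return predicted, residual, beta


# Synthetic data, purely illustrative.
rng = np.random.default_rng(0)
n = 500
covariates = rng.normal(size=(n, 2))  # stand-ins for psycholinguistic covariates
location = 0.5 + covariates @ np.array([0.8, -0.3]) + rng.normal(scale=0.5, size=n)
duration = 200.0 + 10.0 * location + rng.normal(scale=5.0, size=n)

predicted, residual, beta = linked_fit(covariates, location, duration)
# beta[1] weights the covariate-driven part of fixation location,
# beta[2] weights the part not explained by the covariates.
```

Splitting fixation location into a predicted and a residual part is what allows the second model to assign separate coefficients to covariate-driven and unexplained variation in location.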
Reading requires the orchestration of visual, attentional, language-related, and oculomotor processing constraints. This study replicates previous effects of frequency, predictability, and length of fixated words on fixation durations in natural reading and demonstrates new effects of these variables related to previous and next words. Results are based on fixation durations recorded from 222 persons, each reading 144 sentences. Such evidence for distributed processing of words across fixation durations challenges psycholinguistic immediacy-of-processing and eye-mind assumptions. Most of the time the mind processes several words in parallel at different perceptual and cognitive levels. Eye movements can help to unravel these processes.