Species respond to environmental change by dynamically adjusting their geographical ranges. Robust predictions of these changes are prerequisites to inform dynamic and sustainable conservation strategies. Correlative species distribution models (SDMs) relate species’ occurrence records to prevailing environmental factors to describe the environmental niche. They have been widely applied in a global change context as they have comparably low data requirements and allow for rapid assessments of potential future species’ distributions. However, due to their static nature, transient responses to environmental change are essentially ignored in SDMs. Furthermore, neither dispersal nor demographic processes and biotic interactions are explicitly incorporated. Therefore, it has often been suggested to link statistical and mechanistic modelling approaches in order to make more realistic predictions of species’ distributions for scenarios of environmental change. In this thesis, I present two different ways of such linkage. (i) Mechanistic modelling can act as a virtual playground for testing statistical models and allows extensive exploration of specific questions. I promote this ‘virtual ecologist’ approach as a powerful evaluation framework for testing sampling protocols, analyses and modelling tools. Also, I employ such an approach to systematically assess the effects of transient dynamics and ecological properties and processes on the prediction accuracy of SDMs for climate change projections. That way, relevant mechanisms are identified that shape the species’ response to altered environmental conditions and which should hence be considered when trying to project species’ distribution through time. (ii) I supplement SDM projections of potential future habitat for black grouse in Switzerland with an individual-based population model. By explicitly considering complex interactions between habitat availability and demographic processes, this allows for a more direct assessment of expected population response to environmental change and associated extinction risks. However, predictions were highly variable across simulations, emphasising the need for principal evaluation tools like sensitivity analysis to assess uncertainty and robustness in dynamic range predictions. Furthermore, I identify data coverage of the environmental niche as a likely cause for contrasting range predictions between SDM algorithms. SDMs may fail to make reliable predictions for truncated and edge niches, meaning that portions of the niche are not represented in the data or niche edges coincide with data limits. Overall, my thesis contributes to an improved understanding of uncertainty factors in predictions of range dynamics and presents ways to deal with them. Finally, I provide preliminary guidelines for predictive modelling of dynamic species’ response to environmental change, identify key challenges for future research and discuss emerging developments.
A large body of research now supports the presence of both syntactic and lexical predictions in sentence processing. Lexical predictions, in particular, are considered to indicate a deep level of predictive processing that extends past the structural features of a necessary word (e.g. noun), right down to the phonological features of the lexical identity of a specific word (e.g. /kite/; DeLong et al., 2005). However, evidence for lexical predictions typically focuses on predictions in very local environments, such as the adjacent word or words (DeLong et al., 2005; Van Berkum et al., 2005; Wicha et al., 2004). Predictions in such local environments may be indistinguishable from lexical priming, which is transient and uncontrolled, and as such may prime lexical items that are not compatible with the context (e.g. Kukona et al., 2014). Predictive processing has been argued to be a controlled process, with top-down information guiding preactivation of plausible upcoming lexical items (Kuperberg & Jaeger, 2016). One way to distinguish lexical priming from prediction is to demonstrate that preactivated lexical content can be maintained over longer distances.
In this dissertation, separable German particle verbs are used to demonstrate that preactivation of lexical items can be maintained over multi-word distances. A self-paced reading experiment and an eye-tracking experiment provide some support for the idea that particle preactivation triggered by a verb and its context can be observed by holding the sentence context constant and manipulating the predictability of the particle. Although evidence of an effect of particle predictability was only seen in eye tracking, this is consistent with previous evidence suggesting that predictive processing facilitates only some eye-tracking measures to which the self-paced reading modality may not be sensitive (Staub, 2015; Rayner, 1998). Interestingly, manipulating the distance between the verb and the particle did not affect reading times, suggesting that the surprisal-predicted faster reading times at long distance may only occur when the additional distance is created by material that adds information about the lexical identity of a distant element (Levy, 2008; Grodner & Gibson, 2005). Furthermore, the results provide support for models proposing that temporal decay is not a major influence on word processing (Lewandowsky et al., 2009; Vasishth et al., 2019).
In the third and fourth experiments, event-related potentials were used as a method for detecting specific lexical predictions. In the initial ERP experiment, we found some support for the presence of lexical predictions when the sentence context constrained the number of plausible particles to a single particle. This was suggested by a frontal post-N400 positivity (PNP) that was elicited when a lexical prediction had been violated, but not by violations when more than one particle had been plausible. The results of this study were highly consistent with previous research suggesting that the PNP might be a much sought-after ERP marker of prediction failure (DeLong et al., 2011; DeLong et al., 2014; Van Petten & Luka, 2012; Thornhill & Van Petten, 2012; Kuperberg et al., 2019). However, a second experiment with a larger sample failed to replicate the effect, but did suggest that the relationship of the PNP to predictive processing may not yet be fully understood. Evidence for long-distance lexical predictions was inconclusive.
The conclusion drawn from the four experiments is that preactivation of the lexical entries of plausible upcoming particles did occur and was maintained over long distances. The facilitatory effect of this preactivation at the particle site therefore did not appear to be the result of transient lexical priming. However, the question of whether this preactivation can also lead to lexical predictions of a specific particle remains unanswered. Of particular interest to future research on predictive processing is further characterisation of the PNP. Implications for models of sentence processing may be the inclusion of long-distance lexical predictions, or the possibility that preactivation of lexical material can facilitate reading times and ERP amplitude without commitment to a specific lexical item.
The Limpopo Basin in southern Africa is prone to droughts which affect the livelihood of millions of people in South Africa, Botswana, Zimbabwe and Mozambique. Seasonal drought early warning is thus vital for the whole region. In this study, the predictability of hydrological droughts during the main runoff period from December to May is assessed using statistical approaches. Three methods (multiple linear models, artificial neural networks, random forest regression trees) are compared in terms of their ability to forecast streamflow with up to 12 months of lead time. The following four main findings result from the study.
1. There are stations in the basin at which standardised streamflow is predictable with lead times up to 12 months. The results show high inter-station differences of forecast skill but reach a coefficient of determination as high as 0.73 (cross validated).
2. A large range of potential predictors is considered in this study, comprising well-established climate indices, customised teleconnection indices derived from sea surface temperatures and antecedent streamflow as a proxy of catchment conditions. El Niño and customised indices, representing sea surface temperature in the Atlantic and Indian oceans, prove to be important teleconnection predictors for the region. Antecedent streamflow is a strong predictor in small catchments (with median 42% explained variance), whereas teleconnections exert a stronger influence in large catchments.
3. Multiple linear models show the best forecast skill in this study and the greatest robustness compared to artificial neural networks and random forest regression trees, despite the capability of the latter two methods to represent nonlinear relationships.
4. Employed in early warning, the models can be used to forecast a specific drought level. Even if the coefficient of determination is low, the forecast models have a skill better than a climatological forecast, which is shown by analysis of receiver operating characteristics (ROCs). Seasonal statistical forecasts in the Limpopo show promising results, and thus it is recommended to employ them as complementary to existing forecasts in order to strengthen preparedness for droughts.
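The comparison against a climatological forecast in finding 4 can be illustrated with a small sketch. It computes the area under the ROC curve (AUC) from the Mann-Whitney rank statistic for a drought-event forecast and for a constant, climatology-like forecast, which by construction scores 0.5. The hindcast values, the drought threshold of -0.5 and the function name `roc_auc` are illustrative assumptions, not data or code from the study.

```python
def roc_auc(scores, events):
    """Area under the ROC curve via the Mann-Whitney rank statistic.

    scores: forecast values where higher means "drought more likely".
    events: booleans, True where a drought was observed.
    """
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # event ranked above non-event
            elif p == q:
                wins += 0.5   # ties count half
    return wins / (len(pos) * len(neg))


# Hypothetical hindcast of standardised streamflow (invented numbers).
forecast = [-1.2, 0.3, -0.4, 1.1, -0.9, 0.6, -0.1, 0.8]
observed = [-1.0, 0.5, -0.9, 0.9, -0.2, 0.2, 0.4, 1.3]

threshold = -0.5  # assumed drought level on the standardised scale
events = [o < threshold for o in observed]

# Lower forecast flow means higher drought likelihood, so negate the forecast.
scores = [-f for f in forecast]

auc_model = roc_auc(scores, events)
# A climatological forecast issues the same value every year: no discrimination.
auc_clim = roc_auc([0.0] * len(events), events)

print(f"model AUC = {auc_model:.3f}, climatology AUC = {auc_clim:.3f}")
```

An AUC above 0.5 indicates that the forecast discriminates drought years from non-drought years better than climatology, even when the coefficient of determination of the underlying regression is modest.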
The current thesis examined how second language (L2) speakers of German predict upcoming input during language processing. Early research has shown that the predictive abilities of L2 speakers relative to L1 speakers are limited, resulting in the proposal of the Reduced Ability to Generate Expectations (RAGE) hypothesis. Considering that prediction is assumed to facilitate language processing in L1 speakers and probably plays a role in language learning, the assumption that L1/L2 differences can be explained in terms of different processing mechanisms is a particularly interesting approach. However, results from more recent studies on the predictive processing abilities of L2 speakers have indicated that the claim of the RAGE hypothesis is too broad and that prediction in L2 speakers could be selectively limited. In the current thesis, the RAGE hypothesis was systematically put to the test.
In this thesis, L1 speakers of German and highly proficient late L2 learners of German with Russian as their L1 were tested on their predictive use of one or more information sources that exist as cues to sentence interpretation in both languages, to test for selective limits. The results showed that, in line with previous findings, L2 speakers can use the lexical semantics of verbs to predict the upcoming noun. Here the level of prediction was more systematically controlled for than in previous studies by using verbs that restrict the selection of upcoming nouns to the semantic category animate or inanimate. Hence, prediction in L2 processing is possible. At the same time, this experiment showed that the L2 group was slower/less certain than the L1 group. Unlike previous studies, the experiment on case marking demonstrated that L2 speakers can use this morphosyntactic cue for prediction. Here, the use of case marking was tested by manipulating the word order (Dat > Acc vs. Acc > Dat) in double object constructions after a ditransitive verb. Both the L1 and the L2 group showed a difference between the two word order conditions that emerged within the critical time window for an anticipatory effect, indicating their sensitivity towards case. However, the results for the post-critical time window pointed to a higher uncertainty in the L2 group, who needed more time to integrate incoming information and were more affected by the word order variation than the L1 group, indicating that they relied more on surface-level information. A different cue weighting was also found in the experiment testing whether participants predict upcoming reference based on implicit causality information. Here, an additional child L1 group was tested, who had a lower memory capacity than the adult L2 group, as confirmed by a digit span task conducted with both learner groups. 
Whereas the children were only slightly delayed compared to the adult L1 group and showed the same effect of condition, the L2 speakers showed an over-reliance on surface-level information (first-mention/subjecthood). Hence, the pattern observed resulted more likely from L1/L2 differences than from resource deficits.
The reviewed studies and the experiments conducted show that L2 prediction is affected by a range of factors. While some of the factors can be attributed to more individual differences (e.g., language similarity, slower processing) and can be interpreted by L2 processing accounts assuming that L1 and L2 processing are basically the same, certain limits are better explained by accounts that assume more substantial L1/L2 differences. Crucially, the experimental results demonstrate that the RAGE hypothesis should be refined: Although prediction as a fast-operating mechanism is likely to be affected in L2 speakers, there is no indication that prediction is the dominant source of L1/L2 differences. The results rather demonstrate that L2 speakers show a different weighting of cues and rely more on semantic and surface-level information to predict as well as to integrate incoming information.
Organizations try to gain competitive advantages, and to increase customer satisfaction. To ensure the quality and efficiency of their business processes, they perform business process management. An important part of process management that happens on the daily operational level is process controlling. A prerequisite of controlling is process monitoring, i.e., keeping track of the performed activities in running process instances. Only by process monitoring can business analysts detect delays and react to deviations from the expected or guaranteed performance of a process instance. To enable monitoring, process events need to be collected from the process environment. When a business process is orchestrated by a process execution engine, monitoring is available for all orchestrated process activities. Many business processes, however, do not lend themselves to automatic orchestration, e.g., because of required freedom of action. This situation is often encountered in hospitals, where most business processes are manually enacted. Hence, in practice it is often inefficient or infeasible to document and monitor every process activity. Additionally, manual process execution and documentation is prone to errors, e.g., documentation of activities can be forgotten. Thus, organizations face the challenge of process events that occur, but are not observed by the monitoring environment. These unobserved process events can serve as basis for operational process decisions, even without exact knowledge of when they happened or when they will happen. An exemplary decision is whether to invest more resources to manage timely completion of a case, anticipating that the process end event will occur too late. This thesis offers means to reason about unobserved process events in a probabilistic way. We address decisive questions of process managers (e.g., "when will the case be finished?", or "when did we perform the activity that we forgot to document?") in this thesis. 
As the main contribution, we introduce an advanced probabilistic model to business process management that is based on a stochastic variant of Petri nets. We present a holistic approach to use the model effectively along the business process lifecycle. To this end, we provide techniques to discover such models from historical observations, to predict the termination time of processes, and to ensure quality through missing-data management. We propose mechanisms to optimize configuration for monitoring and prediction, i.e., to offer guidance in selecting important activities to monitor. An implementation is provided as a proof of concept. For evaluation, we compare the accuracy of the approach with that of state-of-the-art approaches using real process data of a hospital. Additionally, we show its more general applicability in other domains by applying the approach to process data from logistics and finance.
Background:
Endomyocardial biopsy is considered the gold standard in patients with suspected myocarditis. We aimed to evaluate the impact of bioptic findings on prediction of successful return to work.
Methods:
In 1153 patients (48.9 ± 12.4 years, 66.2% male), who were hospitalized due to symptoms of left heart failure between 2005 and 2012, an endomyocardial biopsy was performed. Routine clinical and laboratory data, sociodemographic parameters, and noninvasive and invasive cardiac variables including endomyocardial biopsy were registered. Data were linked with return to work data from the German statutory pension insurance program and analyzed by Cox regression.
Results:
A total of 220 patients had a complete data set of hospital and insurance information. Three quarters of patients were virus-positive (54.2% parvovirus B19, other or mixed infection 16.7%). Mean invasive left ventricular ejection fraction was 47.1% ± 18.6% (left ventricular ejection fraction <45% in 46.3%). Return to work was achieved after a mean interval of 168.8 ± 347.7 days in 220 patients (after 6, 12, and 24 months in 61.3%, 72.2%, and 76.4%). In multivariate regression analysis, only age (per 10 years, hazard ratio, 1.27; 95% confidence interval, 1.10–1.46; p = 0.001) and left ventricular ejection fraction (per 5% increase, hazard ratio, 1.07; 95% confidence interval, 1.03–1.12; p = 0.002) were associated with an increased probability of return to work, and elevated work intensity (heavy vs light, hazard ratio, 0.58; 95% confidence interval, 0.34–0.99; p < 0.049) with a decreased probability. None of the endomyocardial biopsy–derived parameters was significantly associated with return to work in the total group as well as in the subgroup of patients with biopsy-proven myocarditis.
Conclusion:
Added to established predictors, bioptic data demonstrated no additional impact on return to work probability. Thus, socio-medical evaluation of patients with suspected myocarditis remains an individually oriented process based primarily on clinical and functional parameters.
Increased N400 amplitudes on indefinite articles (a/an) incompatible with expected nouns were initially taken as strong evidence for probabilistic pre-activation of phonological word forms, and have recently been intensely debated because they have been difficult to replicate. Here, these effects are simulated using a neural network model of sentence comprehension that we previously used to simulate a broad range of empirical N400 effects. The model produces the effects when the cue validity of the articles concerning upcoming noun meaning in the learning environment is high, but fails to produce the effects when the cue validity of the articles is low due to adjectives presented between articles and nouns during training. These simulations provide insight into one of the factors potentially contributing to the small size of the effects in empirical studies and generate predictions for cross-linguistic differences in article-induced N400 effects based on articles’ cue validity. The model accounts for article-induced N400 effects without assuming pre-activation of word forms, and instead simulates these effects as the stimulus-induced change in a probabilistic representation of meaning corresponding to an implicit semantic prediction error.
In the present work, we use symbolic regression for automated modeling of dynamical systems. Symbolic regression is a powerful and general method suitable for data-driven identification of mathematical expressions. In particular, the structure and parameters of those expressions are identified simultaneously.
We consider two main variants of symbolic regression: sparse regression-based and genetic programming-based symbolic regression. Both are applied to identification, prediction and control of dynamical systems.
We introduce a new methodology for the data-driven identification of nonlinear dynamics for systems undergoing abrupt changes. Building on a sparse regression algorithm derived earlier, the model after the change is defined as a minimum update with respect to a reference model of the system identified prior to the change. The technique is successfully exemplified on the chaotic Lorenz system and the van der Pol oscillator. Issues such as computational complexity, robustness against noise and requirements with respect to data volume are investigated.
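To make the idea of sparse regression-based identification concrete, the following is a minimal, generic sketch of sequentially thresholded least squares. It is not the algorithm of the thesis and does not implement the minimum-update scheme for abrupt changes. The toy system dx/dt = x - x^2, the candidate library [1, x, x^2, x^3], the threshold of 0.1 and all function names are assumptions chosen for illustration; derivatives are taken as exact and noise-free.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares(theta, dx, active):
    """Fit dx on the active library columns via the normal equations."""
    gram = [[sum(row[i] * row[j] for row in theta) for j in active]
            for i in active]
    rhs = [sum(row[i] * d for row, d in zip(theta, dx)) for i in active]
    return solve_linear(gram, rhs)


def sparse_fit(theta, dx, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, drop small terms, refit."""
    n_terms = len(theta[0])
    active = list(range(n_terms))
    coef = [0.0] * n_terms
    for _ in range(n_iter):
        fitted = least_squares(theta, dx, active)
        coef = [0.0] * n_terms
        for i, c in zip(active, fitted):
            coef[i] = c
        kept = [i for i in active if abs(coef[i]) >= threshold]
        if not kept:
            return [0.0] * n_terms  # every candidate term was pruned
        if kept == active:
            break                   # support no longer changes
        active = kept
    return coef


# Synthetic state samples and exact derivatives of dx/dt = x - x^2.
states = [0.1 + 0.04 * k for k in range(21)]
derivs = [x - x * x for x in states]

# Candidate library of monomials: 1, x, x^2, x^3.
library = [[1.0, x, x * x, x ** 3] for x in states]

coef = sparse_fit(library, derivs)
print("coefficients for [1, x, x^2, x^3]:", [round(c, 6) for c in coef])
```

The structure of the model (which library terms survive) and its parameters (their coefficients) are obtained in the same loop, which is the property of symbolic regression the abstract emphasises. The normal equations are used here only to keep the sketch dependency-free; a QR-based solver would be the more robust choice on noisy data.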
We show how symbolic regression can be used for time series prediction. Again, issues such as robustness against noise and convergence rate are investigated using the harmonic oscillator as a toy problem. In combination with embedding, we demonstrate the prediction of a propagating front in coupled FitzHugh-Nagumo oscillators. Additionally, we show how we can enhance numerical weather predictions to commercially forecast power production of green energy power plants.
We employ symbolic regression for synchronization control in coupled van der Pol oscillators. Different coupling topologies are investigated. We address issues such as plausibility and stability of the control laws found. The toolkit has been made open source and is used in turbulence control applications.
Genetic programming-based symbolic regression is very versatile and can be adapted to many optimization problems. The heuristic-based algorithm allows for cost-efficient optimization of complex tasks.
We emphasize the ability of symbolic regression to yield white-box models. In contrast to black-box models, such models are accessible and interpretable, which allows the usage of established tool chains.
Despite recent growth of research on the effects of prosocial media, processes underlying these effects are not well understood. Two studies explored theoretically relevant mediators and moderators of the effects of prosocial media on helping. Study 1 examined associations among prosocial- and violent-media use, empathy, and helping in samples from seven countries. Prosocial-media use was positively associated with helping. This effect was mediated by empathy and was similar across cultures. Study 2 explored longitudinal relations among prosocial-video-game use, violent-video-game use, empathy, and helping in a large sample of Singaporean children and adolescents measured three times across 2 years. Path analyses showed significant longitudinal effects of prosocial- and violent-video-game use on prosocial behavior through empathy. Latent-growth-curve modeling for the 2-year period revealed that change in video-game use significantly affected change in helping, and that this relationship was mediated by change in empathy.
A large number and wide variety of lake ecosystem models have been developed and published during the past four decades. We identify two challenges for making further progress in this field. One such challenge is to avoid developing more models largely following the concept of others ('reinventing the wheel'). The other challenge is to avoid focusing on only one type of model, while ignoring new and diverse approaches that have become available ('having tunnel vision'). In this paper, we aim at improving the awareness of existing models and knowledge of concurrent approaches in lake ecosystem modelling, without covering all possible model tools and avenues. First, we present a broad variety of modelling approaches. To illustrate these approaches, we give brief descriptions of rather arbitrarily selected sets of specific models. We deal with static models (steady state and regression models), complex dynamic models (CAEDYM, CE-QUAL-W2, Delft 3D-ECO, LakeMab, LakeWeb, MyLake, PCLake, PROTECH, SALMO), structurally dynamic models and minimal dynamic models. We also discuss a group of approaches that could all be classified as individual based: super-individual models (Piscator, Charisma), physiologically structured models, stage-structured models and trait-based models. We briefly mention genetic algorithms, neural networks, Kalman filters and fuzzy logic. Thereafter, we zoom in, as an in-depth example, on the multi-decadal development and application of the lake ecosystem model PCLake and related models (PCLake Metamodel, Lake Shira Model, IPH-TRIM3D-PCLake). In the discussion, we argue that while the historical development of each approach and model is understandable given its 'leading principle', there are many opportunities for combining approaches. We take the point of view that a single 'right' approach does not exist and should not be strived for. 
Instead, multiple modelling approaches, applied concurrently to a given problem, can help develop an integrative view on the functioning of lake ecosystems. We end with a set of specific recommendations that may be of help in the further development of lake ecosystem models.