TY  - GEN
A1  - Schmidt, Lennart
A1  - Heße, Falk
A1  - Attinger, Sabine
A1  - Kumar, Rohini
T1  - Challenges in applying machine learning models for hydrological inference: a case study for flooding events across Germany
T2  - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2  - Machine learning (ML) algorithms are being increasingly used in Earth and Environmental modeling studies owing to the ever-increasing availability of diverse data sets and computational resources as well as advancement in ML algorithms. Despite advances in their predictive accuracy, the usefulness of ML algorithms for inference remains elusive. In this study, we employ two popular ML algorithms, artificial neural networks and random forest, to analyze a large data set of flood events across Germany with the goals to analyze their predictive accuracy and their usability to provide insights to hydrologic system functioning. The results of the ML algorithms are contrasted against a parametric approach based on multiple linear regression. For analysis, we employ a model-agnostic framework named Permuted Feature Importance to derive the influence of models' predictors. This allows us to compare the results of different algorithms for the first time in the context of hydrology. Our main findings are that (1) the ML models achieve higher prediction accuracy than linear regression, (2) the results reflect basic hydrological principles, but (3) further inference is hindered by the heterogeneity of results across algorithms. Thus, we conclude that the problem of equifinality as known from classical hydrological modeling also exists for ML and severely hampers its potential for inference. To account for the observed problems, we propose that when employing ML for inference, this should be made by using multiple algorithms and multiple methods, of which the latter should be embedded in a cross-validation routine.
T3  - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1193 
KW  - machine learning
KW  - inference
KW  - floods
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-523843
SN  - 1866-8372
IS  - 5
ER  - 
TY  - GEN
A1  - Hempel, Sabrina
A1  - Koseska, Aneta
A1  - Nikoloski, Zoran
A1  - Kurths, Jürgen
T1  - Unraveling gene regulatory networks from time-resolved gene expression data
BT  - a measures comparison study
N2  - Background: Inferring regulatory interactions between genes from transcriptomics time-resolved data, yielding reverse engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide a deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes coupled with short and often noisy time-resolved read-outs of the system renders the reverse engineering a challenging task. Therefore, the development and assessment of methods which are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications.

Results: Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach which are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes which are particularly suitable for short transcriptomics time series. We also compare the considered 21 measures and 6 scoring schemes according to their ability to correctly reconstruct such networks from short time series data by calculating summary statistics based on the corresponding specificity and sensitivity. Our results demonstrate that rank and symbol based measures have the highest performance in inferring regulatory interactions. In addition, the proposed scoring scheme by asymmetric weighting has shown to be valuable in reducing the number of false positive interactions. On the other hand, Granger causality as well as information-theoretic measures, frequently used in inference of regulatory networks, show low performance on the short time series analyzed in this study.

Conclusions: Our study is intended to serve as a guide for choosing a particular combination of similarity measures and scoring schemes suitable for reconstruction of gene regulatory networks from short time series data. We show that further improvement of algorithms for reverse engineering can be obtained if one considers measures that are rooted in the study of symbolic dynamics or ranks, in contrast to the application of common similarity measures which do not consider the temporal character of the employed data. Moreover, we establish that the asymmetric weighting scoring scheme together with symbol based measures (for low noise level) and rank based measures (for high noise level) are the most suitable choices.
T3  - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 371 
KW  - inferring cellular networks
KW  - mutual information
KW  - Escherichia-coli
KW  - cluster-analysis
KW  - series
KW  - algorithms
KW  - inference
KW  - models
KW  - recognition
KW  - variables
Y1  - 2017
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-400924
ER  - 
TY  - GEN
A1  - Arslan, Seçkin
A1  - Bastiaanse, Roelien
A1  - Felser, Claudia
T1  - Looking at the evidence in visual world
BT  - eye-movements reveal how bilingual and monolingual Turkish speakers process grammatical evidentiality
T2  - Postprints der Universität Potsdam : Humanwissenschaftliche Reihe
N2  - This study presents pioneering data on how adult early bilinguals (heritage speakers) and late bilingual speakers of Turkish and German process grammatical evidentiality in a visual world setting in comparison to monolingual speakers of Turkish. Turkish marks evidentiality, the linguistic reference to information source, through inflectional affixes signaling either direct (-DI) or indirect (-mIş) evidentiality. We conducted an eyetracking-during-listening experiment where participants were given access to visual 'evidence' supporting the use of either a direct or indirect evidential form. The behavioral results indicate that the monolingual Turkish speakers comprehended direct and indirect evidential scenarios equally well. In contrast, both late and early bilinguals were less accurate and slower to respond to direct than to indirect evidentials. The behavioral results were also reflected in the proportions of looks data. That is, both late and early bilinguals fixated less frequently on the target picture in the direct than in the indirect evidential condition while the monolinguals showed no difference between these conditions. Taken together, our results indicate reduced sensitivity to the semantic and pragmatic function of direct evidential forms in both late and early bilingual speakers, suggesting a simplification of the Turkish evidentiality system in Turkish heritage grammars. We discuss our findings with regard to theories of incomplete acquisition and first language attrition.
T3  - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 408 
KW  - evidentiality
KW  - information source
KW  - inference
KW  - witnessing
KW  - visual world paradigm
KW  - eye-movements
KW  - Turkish-German bilingualism
Y1  - 2018
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-406307
IS  - 408
ER  -