TY - GEN
A1 - Schmidt, Lennart
A1 - Heße, Falk
A1 - Attinger, Sabine
A1 - Kumar, Rohini
T1 - Challenges in applying machine learning models for hydrological inference: a case study for flooding events across Germany
T2 - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2 - Machine learning (ML) algorithms are being increasingly used in Earth and Environmental modeling studies owing to the ever-increasing availability of diverse data sets and computational resources as well as advancement in ML algorithms. Despite advances in their predictive accuracy, the usefulness of ML algorithms for inference remains elusive. In this study, we employ two popular ML algorithms, artificial neural networks and random forest, to analyze a large data set of flood events across Germany with the goals to analyze their predictive accuracy and their usability to provide insights to hydrologic system functioning. The results of the ML algorithms are contrasted against a parametric approach based on multiple linear regression. For analysis, we employ a model-agnostic framework named Permuted Feature Importance to derive the influence of models' predictors. This allows us to compare the results of different algorithms for the first time in the context of hydrology. Our main findings are that (1) the ML models achieve higher prediction accuracy than linear regression, (2) the results reflect basic hydrological principles, but (3) further inference is hindered by the heterogeneity of results across algorithms. Thus, we conclude that the problem of equifinality as known from classical hydrological modeling also exists for ML and severely hampers its potential for inference. To account for the observed problems, we propose that when employing ML for inference, this should be made by using multiple algorithms and multiple methods, of which the latter should be embedded in a cross-validation routine.
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1193
KW - machine learning
KW - inference
KW - floods
Y1 - 2019
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-523843
SN - 1866-8372
IS - 5
ER -

TY - GEN
A1 - Hempel, Sabrina
A1 - Koseska, Aneta
A1 - Nikoloski, Zoran
A1 - Kurths, Jürgen
T1 - Unraveling gene regulatory networks from time-resolved gene expression data
BT - a measures comparison study
N2 - Background: Inferring regulatory interactions between genes from transcriptomics time-resolved data, yielding reverse engineered gene regulatory networks, is of paramount importance to systems biology and bioinformatics studies. Accurate methods to address this problem can ultimately provide a deeper insight into the complexity, behavior, and functions of the underlying biological systems. However, the large number of interacting genes coupled with short and often noisy time-resolved read-outs of the system renders the reverse engineering a challenging task. Therefore, the development and assessment of methods which are computationally efficient, robust against noise, applicable to short time series data, and preferably capable of reconstructing the directionality of the regulatory interactions remains a pressing research problem with valuable applications. Results: Here we perform the largest systematic analysis of a set of similarity measures and scoring schemes within the scope of the relevance network approach which are commonly used for gene regulatory network reconstruction from time series data. In addition, we define and analyze several novel measures and schemes which are particularly suitable for short transcriptomics time series. We also compare the considered 21 measures and 6 scoring schemes according to their ability to correctly reconstruct such networks from short time series data by calculating summary statistics based on the corresponding specificity and sensitivity.
Our results demonstrate that rank and symbol based measures have the highest performance in inferring regulatory interactions. In addition, the proposed scoring scheme by asymmetric weighting has shown to be valuable in reducing the number of false positive interactions. On the other hand, Granger causality as well as information-theoretic measures, frequently used in inference of regulatory networks, show low performance on the short time series analyzed in this study. Conclusions: Our study is intended to serve as a guide for choosing a particular combination of similarity measures and scoring schemes suitable for reconstruction of gene regulatory networks from short time series data. We show that further improvement of algorithms for reverse engineering can be obtained if one considers measures that are rooted in the study of symbolic dynamics or ranks, in contrast to the application of common similarity measures which do not consider the temporal character of the employed data. Moreover, we establish that the asymmetric weighting scoring scheme together with symbol based measures (for low noise level) and rank based measures (for high noise level) are the most suitable choices. 
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 371
KW - inferring cellular networks
KW - mutual information
KW - Escherichia-coli
KW - cluster-analysis
KW - series
KW - algorithms
KW - inference
KW - models
KW - recognition
KW - variables
Y1 - 2017
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-400924
ER -

TY - GEN
A1 - Arslan, Seçkin
A1 - Bastiaanse, Roelien
A1 - Felser, Claudia
T1 - Looking at the evidence in visual world
BT - eye-movements reveal how bilingual and monolingual Turkish speakers process grammatical evidentiality
T2 - Postprints der Universität Potsdam : Humanwissenschaftliche Reihe
N2 - This study presents pioneering data on how adult early bilinguals (heritage speakers) and late bilingual speakers of Turkish and German process grammatical evidentiality in a visual world setting in comparison to monolingual speakers of Turkish. Turkish marks evidentiality, the linguistic reference to information source, through inflectional affixes signaling either direct (-DI) or indirect (-mIş) evidentiality. We conducted an eyetracking-during-listening experiment where participants were given access to visual 'evidence' supporting the use of either a direct or indirect evidential form. The behavioral results indicate that the monolingual Turkish speakers comprehended direct and indirect evidential scenarios equally well. In contrast, both late and early bilinguals were less accurate and slower to respond to direct than to indirect evidentials. The behavioral results were also reflected in the proportions of looks data. That is, both late and early bilinguals fixated less frequently on the target picture in the direct than in the indirect evidential condition while the monolinguals showed no difference between these conditions.
Taken together, our results indicate reduced sensitivity to the semantic and pragmatic function of direct evidential forms in both late and early bilingual speakers, suggesting a simplification of the Turkish evidentiality system in Turkish heritage grammars. We discuss our findings with regard to theories of incomplete acquisition and first language attrition.
T3 - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 408
KW - evidentiality
KW - information source
KW - inference
KW - witnessing
KW - visual world paradigm
KW - eye-movements
KW - Turkish-German bilingualism
Y1 - 2018
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-406307
IS - 408
ER -