TY  - GEN
A1  - Sprenger, Heike
A1  - Erban, Alexander
A1  - Seddig, Sylvia
A1  - Rudack, Katharina
A1  - Thalhammer, Anja
A1  - Le, Mai Q.
A1  - Walther, Dirk
A1  - Zuther, Ellen
A1  - Köhl, Karin I.
A1  - Kopka, Joachim
A1  - Hincha, Dirk K.
T1  - Metabolite and transcript markers for the prediction of potato drought tolerance
T2  - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2  - Potato (Solanum tuberosum L.) is one of the most important food crops worldwide. Current potato varieties are highly susceptible to drought stress. In view of global climate change, selection of cultivars with improved drought tolerance and high yield potential is of paramount importance. Drought tolerance breeding of potato is currently based on direct selection according to yield and phenotypic traits and requires multiple trials under drought conditions. Marker‐assisted selection (MAS) is cheaper, faster and reduces classification errors caused by noncontrolled environmental effects. We analysed 31 potato cultivars grown under optimal and reduced water supply in six independent field trials. Drought tolerance was determined as tuber starch yield. Leaf samples from young plants were screened for preselected transcript and nontargeted metabolite abundance using qRT‐PCR and GC‐MS profiling, respectively. Transcript marker candidates were selected from a published RNA‐Seq data set. A Random Forest machine learning approach extracted metabolite and transcript markers for drought tolerance prediction with low error rates of 6% and 9%, respectively. Moreover, by combining transcript and metabolite markers, the prediction error was reduced to 4.3%. Feature selection from Random Forest models allowed model minimization, yielding a minimal combination of only 20 metabolite and transcript markers that were successfully tested for their reproducibility in 16 independent agronomic field trials. We demonstrate that a minimum combination of transcript and metabolite markers sampled at early cultivation stages predicts potato yield stability under drought largely independent of seasonal and regional agronomic conditions.
T3  - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 673 
KW  - drought tolerance
KW  - machine learning
KW  - metabolite markers
KW  - potato (Solanum tuberosum)
KW  - prediction models
KW  - transcript markers
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-424630
SN  - 1866-8372
IS  - 673
ER  - 
TY  - GEN
A1  - Seewann, Lena
A1  - Verwiebe, Roland
A1  - Buder, Claudia
A1  - Fritsch, Nina-Sophie
T1  - “Broadcast your gender.”
BT  - A comparison of four text-based classification methods of German YouTube channels
T2  - Zweitveröffentlichungen der Universität Potsdam : Wirtschafts- und Sozialwissenschaftliche Reihe
N2  - Social media platforms provide a large array of behavioral data relevant to social scientific research. However, key information such as sociodemographic characteristics of agents are often missing. This paper aims to compare four methods of classifying social attributes from text. Specifically, we are interested in estimating the gender of German social media creators. By using the example of a random sample of 200 YouTube channels, we compare several classification methods, namely (1) a survey among university staff, (2) a name dictionary method with the World Gender Name Dictionary as a reference list, (3) an algorithmic approach using the website gender-api.com, and (4) a Multinomial Naïve Bayes (MNB) machine learning technique. These different methods identify gender attributes based on YouTube channel names and descriptions in German but are adaptable to other languages. Our contribution will evaluate the share of identifiable channels, accuracy and meaningfulness of classification, as well as limits and benefits of each approach. We aim to address methodological challenges connected to classifying gender attributes for YouTube channels as well as related to reinforcing stereotypes and ethical implications.
T3  - Zweitveröffentlichungen der Universität Potsdam : Wirtschafts- und Sozialwissenschaftliche Reihe - 152 
KW  - text based classification methods
KW  - gender
KW  - YouTube
KW  - machine learning
KW  - authorship attribution
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-566287
SN  - 1867-5808
IS  - 152
ER  - 
TY  - GEN
A1  - Schmidt, Lennart
A1  - Heße, Falk
A1  - Attinger, Sabine
A1  - Kumar, Rohini
T1  - Challenges in applying machine learning models for hydrological inference: a case study for flooding events across Germany
T2  - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2  - Machine learning (ML) algorithms are being increasingly used in Earth and Environmental modeling studies owing to the ever-increasing availability of diverse data sets and computational resources as well as advancement in ML algorithms. Despite advances in their predictive accuracy, the usefulness of ML algorithms for inference remains elusive. In this study, we employ two popular ML algorithms, artificial neural networks and random forest, to analyze a large data set of flood events across Germany with the goals to analyze their predictive accuracy and their usability to provide insights to hydrologic system functioning. The results of the ML algorithms are contrasted against a parametric approach based on multiple linear regression. For analysis, we employ a model-agnostic framework named Permuted Feature Importance to derive the influence of models' predictors. This allows us to compare the results of different algorithms for the first time in the context of hydrology. Our main findings are that (1) the ML models achieve higher prediction accuracy than linear regression, (2) the results reflect basic hydrological principles, but (3) further inference is hindered by the heterogeneity of results across algorithms. Thus, we conclude that the problem of equifinality as known from classical hydrological modeling also exists for ML and severely hampers its potential for inference. To account for the observed problems, we propose that when employing ML for inference, this should be made by using multiple algorithms and multiple methods, of which the latter should be embedded in a cross-validation routine.
T3  - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1193 
KW  - machine learning
KW  - inference
KW  - floods
Y1  - 2019
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-523843
SN  - 1866-8372
IS  - 5
ER  - 
TY  - GEN
A1  - Ryo, Masahiro
A1  - Jeschke, Jonathan M.
A1  - Rillig, Matthias C.
A1  - Heger, Tina
T1  - Machine learning with the hierarchy-of-hypotheses (HoH) approach discovers novel pattern in studies on biological invasions
T2  - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2  - Research synthesis on simple yet general hypotheses and ideas is challenging in scientific disciplines studying highly context-dependent systems such as medical, social, and biological sciences. This study shows that machine learning, equation-free statistical modeling of artificial intelligence, is a promising synthesis tool for discovering novel patterns and the source of controversy in a general hypothesis. We apply a decision tree algorithm, assuming that evidence from various contexts can be adequately integrated in a hierarchically nested structure. As a case study, we analyzed 163 articles that studied a prominent hypothesis in invasion biology, the enemy release hypothesis. We explored if any of the nine attributes that classify each study can differentiate conclusions as classification problem. Results corroborated that machine learning can be useful for research synthesis, as the algorithm could detect patterns that had been already focused in previous narrative reviews. Compared with the previous synthesis study that assessed the same evidence collection based on experts' judgement, the algorithm has newly proposed that the studies focusing on Asian regions mostly supported the hypothesis, suggesting that more detailed investigations in these regions can enhance our understanding of the hypothesis. We suggest that machine learning algorithms can be a promising synthesis tool especially where studies (a) reformulate a general hypothesis from different perspectives, (b) use different methods or variables, or (c) report insufficient information for conducting meta-analyses.
T3  - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1171 
KW  - artificial intelligence
KW  - hierarchy-of-hypotheses approach
KW  - machine learning
KW  - meta-analysis
KW  - synthesis
KW  - systematic review
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-517643
SN  - 1866-8372
IS  - 1171
SP  - 66
EP  - 73
ER  - 
TY  - GEN
A1  - Panzer, Marcel
A1  - Bender, Benedict
A1  - Gronau, Norbert
T1  - Neural agent-based production planning and control
BT  - an architectural review
T2  - Zweitveröffentlichungen der Universität Potsdam : Wirtschafts- und Sozialwissenschaftliche Reihe
N2  - Nowadays, production planning and control must cope with mass customization, increased fluctuations in demand, and high competition pressures. Despite prevailing market risks, planning accuracy and increased adaptability in the event of disruptions or failures must be ensured, while simultaneously optimizing key process indicators. To manage that complex task, neural networks that can process large quantities of high-dimensional data in real time have been widely adopted in recent years. Although these are already extensively deployed in production systems, a systematic review of applications and implemented agent embeddings and architectures has not yet been conducted. The main contribution of this paper is to provide researchers and practitioners with an overview of applications and applied embeddings and to motivate further research in neural agent-based production. Findings indicate that neural agents are not only deployed in diverse applications, but are also increasingly implemented in multi-agent environments or in combination with conventional methods — leveraging performances compared to benchmarks and reducing dependence on human experience. This not only implies a more sophisticated focus on distributed production resources, but also broadening the perspective from a local to a global scale. Nevertheless, future research must further increase scalability and reproducibility to guarantee a simplified transfer of results to reality.
T3  - Zweitveröffentlichungen der Universität Potsdam : Wirtschafts- und Sozialwissenschaftliche Reihe - 172 
KW  - production planning and control
KW  - machine learning
KW  - neural networks
KW  - systematic literature review
KW  - taxonomy
Y1  - 2022
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-604777
SN  - 1867-5808
ER  - 
TY  - GEN
A1  - Panzer, Marcel
A1  - Bender, Benedict
A1  - Gronau, Norbert
T1  - Deep reinforcement learning in production planning and control
BT  - A systematic literature review
T2  - Zweitveröffentlichungen der Universität Potsdam : Wirtschafts- und Sozialwissenschaftliche Reihe
N2  - Increasingly fast development cycles and individualized products pose major challenges for today's smart production systems in times of industry 4.0. The systems must be flexible and continuously adapt to changing conditions while still guaranteeing high throughputs and robustness against external disruptions. Deep reinforcement learning (RL) algorithms, which already reached impressive success with Google DeepMind's AlphaGo, are increasingly transferred to production systems to meet related requirements. Unlike supervised and unsupervised machine learning techniques, deep RL algorithms learn based on recently collected sensorand process-data in direct interaction with the environment and are able to perform decisions in real-time. As such, deep RL algorithms seem promising given their potential to provide decision support in complex environments, as production systems, and simultaneously adapt to changing circumstances. While different use-cases for deep RL emerged, a structured overview and integration of findings on their application are missing. To address this gap, this contribution provides a systematic literature review of existing deep RL applications in the field of production planning and control as well as production logistics. From a performance perspective, it became evident that deep RL can beat heuristics significantly in their overall performance and provides superior solutions to various industrial use-cases. Nevertheless, safety and reliability concerns must be overcome before the widespread use of deep RL is possible which presumes more intensive testing of deep RL in real world applications besides the already ongoing intensive simulations.
T3  - Zweitveröffentlichungen der Universität Potsdam : Wirtschafts- und Sozialwissenschaftliche Reihe - 198 
KW  - deep reinforcement learning
KW  - machine learning
KW  - production planning
KW  - production control
KW  - systematic literature review
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-605722
SN  - 2701-6277
SN  - 1867-5808
ER  - 
TY  - GEN
A1  - Konak, Orhan
A1  - Wegner, Pit
A1  - Arnrich, Bert
T1  - IMU-Based Movement Trajectory Heatmaps for Human Activity Recognition
T2  - Postprints der Universität Potsdam : Reihe der Digital Engineering Fakultät
N2  - Recent trends in ubiquitous computing have led to a proliferation of studies that focus on human activity recognition (HAR) utilizing inertial sensor data that consist of acceleration, orientation and angular velocity. However, the performances of such approaches are limited by the amount of annotated training data, especially in fields where annotating data is highly time-consuming and requires specialized professionals, such as in healthcare. In image classification, this limitation has been mitigated by powerful oversampling techniques such as data augmentation. Using this technique, this work evaluates to what extent transforming inertial sensor data into movement trajectories and into 2D heatmap images can be advantageous for HAR when data are scarce. A convolutional long short-term memory (ConvLSTM) network that incorporates spatiotemporal correlations was used to classify the heatmap images. Evaluation was carried out on Deep Inertial Poser (DIP), a known dataset composed of inertial sensor data. The results obtained suggest that for datasets with large numbers of subjects, using state-of-the-art methods remains the best alternative. However, a performance advantage was achieved for small datasets, which is usually the case in healthcare. Moreover, movement trajectories provide a visual representation of human activities, which can help researchers to better interpret and analyze motion patterns.
T3  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät - 4 
KW  - human activity recognition
KW  - image processing
KW  - machine learning
KW  - sensor data
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-487799
IS  - 4
ER  - 
TY  - GEN
A1  - Kibrik, Andrej A.
A1  - Khudyakova, Mariya V.
A1  - Dobrov, Grigory B.
A1  - Linnik, Anastasia
A1  - Zalmanov, Dmitrij A.
T1  - Referential Choice
BT  - Predictability and Its Limits
N2  - We report a study of referential choice in discourse production, understood as the choice between various types of referential devices, such as pronouns and full noun phrases. Our goal is to predict referential choice, and to explore to what extent such prediction is possible. Our approach to referential choice includes a cognitively informed theoretical component, corpus analysis, machine learning methods and experimentation with human participants. Machine learning algorithms make use of 25 factors, including referent’s properties (such as animacy and protagonism), the distance between a referential expression and its antecedent, the antecedent’s syntactic role, and so on. Having found the predictions of our algorithm to coincide with the original almost 90% of the time, we hypothesized that fully accurate prediction is not possible because, in many situations, more than one referential option is available. This hypothesis was supported by an experimental study, in which participants answered questions about either the original text in the corpus, or about a text modified in accordance with the algorithm’s prediction. Proportions of correct answers to these questions, as well as participants’ rating of the questions’ difficulty, suggested that divergences between the algorithm’s prediction and the original referential device in the corpus occur overwhelmingly in situations where the referential choice is not categorical.
T3  - Zweitveröffentlichungen der Universität Potsdam : Humanwissenschaftliche Reihe - 306 
KW  - cross-methodological approach
KW  - discourse production
KW  - machine learning
KW  - non-categoricity
KW  - referential choice
Y1  - 2016
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-100313
ER  - 
TY  - GEN
A1  - Hollstein, André
A1  - Segl, Karl
A1  - Guanter, Luis
A1  - Brell, Maximilian
A1  - Enesco, Marta
T1  - Ready-to-Use methods for the detection of clouds, cirrus, snow, shadow, water and clear sky pixels in Sentinel-2 MSI images
T2  - remote sensing
N2  - Classification of clouds, cirrus, snow, shadows and clear sky areas is a crucial step in the pre-processing of optical remote sensing images and is a valuable input for their atmospheric correction. The Multi-Spectral Imager on board the Sentinel-2's of the Copernicus program offers optimized bands for this task and delivers unprecedented amounts of data regarding spatial sampling, global coverage, spectral coverage, and repetition rate. Efficient algorithms are needed to process, or possibly reprocess, those big amounts of data. Techniques based on top-of-atmosphere reflectance spectra for single-pixels without exploitation of external data or spatial context offer the largest potential for parallel data processing and highly optimized processing throughput. Such algorithms can be seen as a baseline for possible trade-offs in processing performance when the application of more sophisticated methods is discussed. We present several ready-to-use classification algorithms which are all based on a publicly available database of manually classified Sentinel-2A images. These algorithms are based on commonly used and newly developed machine learning techniques which drastically reduce the amount of time needed to update the algorithms when new images are added to the database. Several ready-to-use decision trees are presented which allow to correctly label about 91% of the spectra within a validation dataset. While decision trees are simple to implement and easy to understand, they offer only limited classification skill. It improves to 98% when the presented algorithm based on the classical Bayesian method is applied. This method has only recently been used for this task and shows excellent performance concerning classification skill and processing performance. A comparison of the presented algorithms with other commonly used techniques such as random forests, stochastic gradient descent, or support vector machines is also given. Especially random forests and support vector machines show similar classification skill as the classical Bayesian method.
T3  - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 455 
KW  - Sentinel-2 MSI
KW  - cloud detection
KW  - snow detection
KW  - cirrus detection
KW  - shadow detection
KW  - Bayesian classification
KW  - machine learning
KW  - decision trees
Y1  - 2018
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-407938
ER  - 
TY  - GEN
A1  - Hecker, Pascal
A1  - Steckhan, Nico
A1  - Eyben, Florian
A1  - Schuller, Björn Wolfgang
A1  - Arnrich, Bert
T1  - Voice Analysis for Neurological Disorder Recognition – A Systematic Review and Perspective on Emerging Trends
T2  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät
N2  - Quantifying neurological disorders from voice is a rapidly growing field of research and holds promise for unobtrusive and large-scale disorder monitoring. The data recording setup and data analysis pipelines are both crucial aspects to effectively obtain relevant information from participants. Therefore, we performed a systematic review to provide a high-level overview of practices across various neurological disorders and highlight emerging trends. PRISMA-based literature searches were conducted through PubMed, Web of Science, and IEEE Xplore to identify publications in which original (i.e., newly recorded) datasets were collected. Disorders of interest were psychiatric as well as neurodegenerative disorders, such as bipolar disorder, depression, and stress, as well as amyotrophic lateral sclerosis amyotrophic lateral sclerosis, Alzheimer's, and Parkinson's disease, and speech impairments (aphasia, dysarthria, and dysphonia). Of the 43 retrieved studies, Parkinson's disease is represented most prominently with 19 discovered datasets. Free speech and read speech tasks are most commonly used across disorders. Besides popular feature extraction toolkits, many studies utilise custom-built feature sets. Correlations of acoustic features with psychiatric and neurodegenerative disorders are presented. In terms of analysis, statistical analysis for significance of individual features is commonly used, as well as predictive modeling approaches, especially with support vector machines and a small number of artificial neural networks. An emerging trend and recommendation for future studies is to collect data in everyday life to facilitate longitudinal data collection and to capture the behavior of participants more naturally. Another emerging trend is to record additional modalities to voice, which can potentially increase analytical performance.
T3  - Zweitveröffentlichungen der Universität Potsdam : Reihe der Digital Engineering Fakultät - 13 
KW  - neurological disorders
KW  - voice
KW  - speech
KW  - everyday life
KW  - multiple modalities
KW  - machine learning
KW  - disorder recognition
Y1  - 2023
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-581019
IS  - 13
ER  - 
TY  - GEN
A1  - Ayzel, Georgy
A1  - Izhitskiy, Alexander
T1  - Climate change impact assessment on freshwater inflow into the Small Aral Sea
T2  - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2  - During the last few decades, the rapid separation of the Small Aral Sea from the isolated basin has changed its hydrological and ecological conditions tremendously. In the present study, we developed and validated the hybrid model for the Syr Darya River basin based on a combination of state-of-the-art hydrological and machine learning models. Climate change impact on freshwater inflow into the Small Aral Sea for the projection period 2007–2099 has been quantified based on the developed hybrid model and bias corrected and downscaled meteorological projections simulated by four General Circulation Models (GCM) for each of three Representative Concentration Pathway scenarios (RCP). The developed hybrid model reliably simulates freshwater inflow for the historical period with a Nash–Sutcliffe efficiency of 0.72 and a Kling–Gupta efficiency of 0.77. Results of the climate change impact assessment showed that the freshwater inflow projections produced by different GCMs are misleading by providing contradictory results for the projection period. However, we identified that the relative runoff changes are expected to be more pronounced in the case of more aggressive RCP scenarios. The simulated projections of freshwater inflow provide a basis for further assessment of climate change impacts on hydrological and ecological conditions of the Small Aral Sea in the 21st Century.
T3  - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1071 
KW  - Small Aral Sea
KW  - hydrology
KW  - climate change
KW  - modeling
KW  - machine learning
Y1  - 2021
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-472794
SN  - 1866-8372
IS  - 1071
ER  -