TY - JOUR
A1 - Bussas, Matthias
A1 - Sawade, Christoph
A1 - Kuhn, Nicolas
A1 - Scheffer, Tobias
A1 - Landwehr, Niels
T1 - Varying-coefficient models for geospatial transfer learning
JF - Machine learning
N2 - We study prediction problems in which the conditional distribution of the output given the input varies as a function of task variables which, in our applications, represent space and time. In varying-coefficient models, the coefficients of this conditional are allowed to change smoothly in space and time; the strength of the correlations between neighboring points is determined by the data. This is achieved by placing a Gaussian process (GP) prior on the coefficients. Bayesian inference in varying-coefficient models is generally intractable. We show that with an isotropic GP prior, inference in varying-coefficient models resolves to standard inference for a GP that can be solved efficiently. MAP inference in this model resolves to multitask learning using task and instance kernels. We clarify the relationship between varying-coefficient models and the hierarchical Bayesian multitask model and show that inference for hierarchical Bayesian multitask models can be carried out efficiently using graph-Laplacian kernels. We explore the model empirically for the problems of predicting rent and real-estate prices, and predicting the ground motion during seismic events. We find that varying-coefficient models with GP priors excel at predicting rents and real-estate prices. The ground-motion model predicts seismic hazards in the State of California more accurately than the previous state of the art.
KW - Transfer learning
KW - Varying-coefficient models
KW - Housing-price prediction
KW - Seismic-hazard models
Y1 - 2017
U6 - https://doi.org/10.1007/s10994-017-5639-3
SN - 0885-6125
SN - 1573-0565
VL - 106
SP - 1419
EP - 1440
PB - Springer
CY - Dordrecht
ER -
TY - GEN
A1 - Patil, Kaustubh R.
A1 - Haider, Peter
A1 - Pope, Phillip B.
A1 - Turnbaugh, Peter J.
A1 - Morrison, Mark
A1 - Scheffer, Tobias
A1 - McHardy, Alice C.
T1 - Taxonomic metagenome sequence assignment with structured output models
T2 - Nature methods : techniques for life scientists and chemists
Y1 - 2011
U6 - https://doi.org/10.1038/nmeth0311-191
SN - 1548-7091
VL - 8
IS - 3
SP - 191
EP - 192
PB - Nature Publ. Group
CY - London
ER -
TY - JOUR
A1 - Brückner, Michael
A1 - Kanzow, Christian
A1 - Scheffer, Tobias
T1 - Static prediction games for adversarial learning problems
JF - Journal of machine learning research
N2 - The standard assumption of identically distributed training and test data is violated when the test data are generated in response to the presence of a predictive model. This becomes apparent, for example, in the context of email spam filtering. Here, email service providers employ spam filters, and spam senders engineer campaign templates to achieve a high rate of successful deliveries despite the filters. We model the interaction between the learner and the data generator as a static game in which the cost functions of the learner and the data generator are not necessarily antagonistic. We identify conditions under which this prediction game has a unique Nash equilibrium and derive algorithms that find the equilibrial prediction model. We derive two instances, the Nash logistic regression and the Nash support vector machine, and empirically explore their properties in a case study on email spam filtering.
KW - static prediction games
KW - adversarial classification
KW - Nash equilibrium
Y1 - 2012
SN - 1532-4435
VL - 13
SP - 2617
EP - 2654
PB - Microtome Publishing
CY - Cambridge, Mass.
ER -
TY - GEN
A1 - Ayzel, Georgy
A1 - Scheffer, Tobias
A1 - Heistermann, Maik
T1 - RainNet v1.0
BT - a convolutional neural network for radar-based precipitation nowcasting
T2 - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2 - In this study, we present RainNet, a deep convolutional neural network for radar-based precipitation nowcasting. Its design was inspired by the U-Net and SegNet families of deep learning models, which were originally designed for binary segmentation tasks. RainNet was trained to predict continuous precipitation intensities at a lead time of 5 min, using several years of quality-controlled weather radar composites provided by the German Weather Service (DWD). That data set covers Germany with a spatial domain of 900 km × 900 km and has a resolution of 1 km in space and 5 min in time. Independent verification experiments were carried out on 11 summer precipitation events from 2016 to 2017. In order to achieve a lead time of 1 h, a recursive approach was implemented by using RainNet predictions at 5 min lead times as model inputs for longer lead times. In the verification experiments, trivial Eulerian persistence and a conventional model based on optical flow served as benchmarks. The latter is available in the rainymotion library and had previously been shown to outperform DWD's operational nowcasting model for the same set of verification events. RainNet significantly outperforms the benchmark models at all lead times up to 60 min for the routine verification metrics mean absolute error (MAE) and the critical success index (CSI) at intensity thresholds of 0.125, 1, and 5 mm h⁻¹. However, rainymotion turned out to be superior in predicting the exceedance of higher intensity thresholds (here 10 and 15 mm h⁻¹). The limited ability of RainNet to predict heavy rainfall intensities is an undesirable property which we attribute to a high level of spatial smoothing introduced by the model. At a lead time of 5 min, an analysis of power spectral density confirmed a significant loss of spectral power at length scales of 16 km and below. Obviously, RainNet had learned an optimal level of smoothing to produce a nowcast at 5 min lead time. In that sense, the loss of spectral power at small scales is informative, too, as it reflects the limits of predictability as a function of spatial scale. Beyond the lead time of 5 min, however, the increasing level of smoothing is a mere artifact – an analogue to numerical diffusion – that is not a property of RainNet itself but of its recursive application. In the context of early warning, the smoothing is particularly unfavorable since pronounced features of intense precipitation tend to get lost over longer lead times. Hence, we propose several options to address this issue in prospective research, including an adjustment of the loss function for model training, model training for longer lead times, and the prediction of threshold exceedance in terms of a binary segmentation task. Furthermore, we suggest additional input data that could help to better identify situations with imminent precipitation dynamics. The model code, pretrained weights, and training data are provided in open repositories as an input for such future studies.
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 964
KW - weather
KW - models
KW - skill
Y1 - 2020
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-472942
SN - 1866-8372
IS - 964
ER -
TY - JOUR
A1 - Ayzel, Georgy
A1 - Scheffer, Tobias
A1 - Heistermann, Maik
T1 - RainNet v1.0
BT - a convolutional neural network for radar-based precipitation nowcasting
JF - Geoscientific Model Development
N2 - In this study, we present RainNet, a deep convolutional neural network for radar-based precipitation nowcasting. Its design was inspired by the U-Net and SegNet families of deep learning models, which were originally designed for binary segmentation tasks. RainNet was trained to predict continuous precipitation intensities at a lead time of 5 min, using several years of quality-controlled weather radar composites provided by the German Weather Service (DWD). That data set covers Germany with a spatial domain of 900 km × 900 km and has a resolution of 1 km in space and 5 min in time. Independent verification experiments were carried out on 11 summer precipitation events from 2016 to 2017. In order to achieve a lead time of 1 h, a recursive approach was implemented by using RainNet predictions at 5 min lead times as model inputs for longer lead times. In the verification experiments, trivial Eulerian persistence and a conventional model based on optical flow served as benchmarks. The latter is available in the rainymotion library and had previously been shown to outperform DWD's operational nowcasting model for the same set of verification events. RainNet significantly outperforms the benchmark models at all lead times up to 60 min for the routine verification metrics mean absolute error (MAE) and the critical success index (CSI) at intensity thresholds of 0.125, 1, and 5 mm h⁻¹. However, rainymotion turned out to be superior in predicting the exceedance of higher intensity thresholds (here 10 and 15 mm h⁻¹). The limited ability of RainNet to predict heavy rainfall intensities is an undesirable property which we attribute to a high level of spatial smoothing introduced by the model. At a lead time of 5 min, an analysis of power spectral density confirmed a significant loss of spectral power at length scales of 16 km and below. Obviously, RainNet had learned an optimal level of smoothing to produce a nowcast at 5 min lead time. In that sense, the loss of spectral power at small scales is informative, too, as it reflects the limits of predictability as a function of spatial scale. Beyond the lead time of 5 min, however, the increasing level of smoothing is a mere artifact – an analogue to numerical diffusion – that is not a property of RainNet itself but of its recursive application. In the context of early warning, the smoothing is particularly unfavorable since pronounced features of intense precipitation tend to get lost over longer lead times. Hence, we propose several options to address this issue in prospective research, including an adjustment of the loss function for model training, model training for longer lead times, and the prediction of threshold exceedance in terms of a binary segmentation task. Furthermore, we suggest additional input data that could help to better identify situations with imminent precipitation dynamics. The model code, pretrained weights, and training data are provided in open repositories as an input for such future studies.
KW - weather
KW - models
KW - skill
Y1 - 2020
U6 - https://doi.org/10.5194/gmd-13-2631-2020
SN - 1991-959X
SN - 1991-9603
VL - 13
IS - 6
SP - 2631
EP - 2644
PB - Copernicus Publ.
CY - Göttingen
ER -
TY - JOUR
A1 - Prasse, Paul
A1 - Iversen, Pascal
A1 - Lienhard, Matthias
A1 - Thedinga, Kristina
A1 - Herwig, Ralf
A1 - Scheffer, Tobias
T1 - Pre-Training on In Vitro and Fine-Tuning on Patient-Derived Data Improves Deep Neural Networks for Anti-Cancer Drug-Sensitivity Prediction
JF - Cancers
N2 - Large-scale databases that report the inhibitory capacities of many combinations of candidate drug compounds and cultivated cancer cell lines have driven the development of preclinical drug-sensitivity models based on machine learning. However, cultivated cell lines have devolved from human cancer cells over years or even decades under selective pressure in culture conditions. Moreover, models that have been trained on in vitro data cannot account for interactions with other types of cells. Drug-response data that are based on patient-derived cell cultures, xenografts, and organoids, on the other hand, are not available in the quantities that are needed to train high-capacity machine-learning models. We found that pre-training deep neural network models of drug sensitivity on in vitro drug-sensitivity databases before fine-tuning the model parameters on patient-derived data improves the models’ accuracy and improves the biological plausibility of the features, compared to training only on patient-derived data. From our experiments, we can conclude that pre-trained models outperform models that have been trained on the target domains in the vast majority of cases.
KW - deep neural networks
KW - drug-sensitivity prediction
KW - anti-cancer drugs
Y1 - 2022
U6 - https://doi.org/10.3390/cancers14163950
SN - 2072-6694
VL - 14
IS - 16
SP - 1
EP - 14
PB - MDPI
CY - Basel, Switzerland
ER -
TY - GEN
A1 - Prasse, Paul
A1 - Iversen, Pascal
A1 - Lienhard, Matthias
A1 - Thedinga, Kristina
A1 - Herwig, Ralf
A1 - Scheffer, Tobias
T1 - Pre-Training on In Vitro and Fine-Tuning on Patient-Derived Data Improves Deep Neural Networks for Anti-Cancer Drug-Sensitivity Prediction
T2 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe
N2 - Large-scale databases that report the inhibitory capacities of many combinations of candidate drug compounds and cultivated cancer cell lines have driven the development of preclinical drug-sensitivity models based on machine learning. However, cultivated cell lines have devolved from human cancer cells over years or even decades under selective pressure in culture conditions. Moreover, models that have been trained on in vitro data cannot account for interactions with other types of cells. Drug-response data that are based on patient-derived cell cultures, xenografts, and organoids, on the other hand, are not available in the quantities that are needed to train high-capacity machine-learning models. We found that pre-training deep neural network models of drug sensitivity on in vitro drug-sensitivity databases before fine-tuning the model parameters on patient-derived data improves the models’ accuracy and improves the biological plausibility of the features, compared to training only on patient-derived data. From our experiments, we can conclude that pre-trained models outperform models that have been trained on the target domains in the vast majority of cases.
T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1300
KW - deep neural networks
KW - drug-sensitivity prediction
KW - anti-cancer drugs
Y1 - 2023
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-577341
SN - 1866-8372
SP - 1
EP - 14
PB - Universitätsverlag Potsdam
CY - Potsdam
ER -
TY - JOUR
A1 - Prasse, Paul
A1 - Iversen, Pascal
A1 - Lienhard, Matthias
A1 - Thedinga, Kristina
A1 - Bauer, Christopher
A1 - Herwig, Ralf
A1 - Scheffer, Tobias
T1 - Matching anticancer compounds and tumor cell lines by neural networks with ranking loss
JF - NAR: genomics and bioinformatics
N2 - Computational drug sensitivity models have the potential to improve therapeutic outcomes by identifying targeted drug components that are likely to achieve the highest efficacy for a cancer cell line at hand at a therapeutic dose. State-of-the-art drug sensitivity models use regression techniques to predict the inhibitory concentration of a drug for a tumor cell line. This regression objective is not directly aligned with the principal goal of drug sensitivity models: we argue that drug sensitivity modeling should be seen as a ranking problem with an optimization criterion that quantifies a drug's inhibitory capacity for the cancer cell line at hand relative to its toxicity for healthy cells. We derive an extension to the well-established drug sensitivity regression model PaccMann that employs a ranking loss and focuses on the ratio of inhibitory concentration and therapeutic dosage range. We find that the ranking extension significantly enhances the model's capability to identify the most effective anticancer drugs for unseen tumor cell profiles based on in vitro data.
Y1 - 2022
U6 - https://doi.org/10.1093/nargab/lqab128
SN - 2631-9268
VL - 4
IS - 1
PB - Oxford Univ. Press
CY - Oxford
ER -
TY - INPR
A1 - Prasse, Paul
A1 - Gruben, Gerrit
A1 - Machlika, Lukas
A1 - Pevny, Tomas
A1 - Sofka, Michal
A1 - Scheffer, Tobias
T1 - Malware Detection by HTTPS Traffic Analysis
N2 - In order to evade detection by network-traffic analysis, a growing proportion of malware uses the encrypted HTTPS protocol. We explore the problem of detecting malware on client computers based on HTTPS traffic analysis. In this setting, malware has to be detected based on the host IP address, ports, timestamp, and data volume information of TCP/IP packets that are sent and received by all the applications on the client. We develop a scalable protocol that allows us to collect network flows of known malicious and benign applications as training data and derive a malware-detection method based on neural networks and sequence classification. We study the method's ability to detect known and new, unknown malware in a large-scale empirical study.
KW - machine learning
KW - computer security
Y1 - 2017
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-100942
ER -
TY - JOUR
A1 - Dick, Uwe
A1 - Scheffer, Tobias
T1 - Learning to control a structured-prediction decoder for detection of HTTP-layer DDoS attackers
JF - Machine learning
N2 - We focus on the problem of detecting clients that attempt to exhaust server resources by flooding a service with protocol-compliant HTTP requests. Attacks are usually coordinated by an entity that controls many clients. Modeling the application as a structured-prediction problem allows the prediction model to jointly classify a multitude of clients based on their cohesion of otherwise inconspicuous features. Since the resulting output space is too vast to search exhaustively, we employ greedy search and techniques in which a parametric controller guides the search. We apply a known method that sequentially learns the controller and the structured-prediction model. We then derive an online policy-gradient method that finds the parameters of the controller and of the structured-prediction model in a joint optimization problem; we obtain a convergence guarantee for the latter method. We evaluate and compare the various methods based on a large collection of traffic data of a web-hosting service.
Y1 - 2016
U6 - https://doi.org/10.1007/s10994-016-5581-9
SN - 0885-6125
SN - 1573-0565
VL - 104
SP - 385
EP - 410
PB - Springer
CY - Dordrecht
ER -