TY - JOUR A1 - Bauer, Chris A1 - Herwig, Ralf A1 - Lienhard, Matthias A1 - Prasse, Paul A1 - Scheffer, Tobias A1 - Schuchhardt, Johannes T1 - Large-scale literature mining to assess the relation between anti-cancer drugs and cancer types JF - Journal of translational medicine N2 - Background: There is a huge body of scientific literature describing the relation between tumor types and anti-cancer drugs. The vast amount of scientific literature makes it impossible for researchers and physicians to extract all relevant information manually. Methods: In order to cope with the large amount of literature we applied an automated text mining approach to assess the relations between 30 most frequent cancer types and 270 anti-cancer drugs. We applied two different approaches, a classical text mining based on named entity recognition and an AI-based approach employing word embeddings. The consistency of literature mining results was validated with 3 independent methods: first, using data from FDA approvals, second, using experimentally measured IC-50 cell line data and third, using clinical patient survival data. Results: We demonstrated that the automated text mining was able to successfully assess the relation between cancer types and anti-cancer drugs. All validation methods showed a good correspondence between the results from literature mining and independent confirmatory approaches. The relation between most frequent cancer types and drugs employed for their treatment were visualized in a large heatmap. All results are accessible in an interactive web-based knowledge base using the following link: . Conclusions: Our approach is able to assess the relations between compounds and cancer types in an automated manner. Both, cancer types and compounds could be grouped into different clusters. Researchers can use the interactive knowledge base to inspect the presented results and follow their own research questions, for example the identification of novel indication areas for known drugs. KW - Literature mining KW - Anti-cancer drugs KW - Tumor types KW - Word embeddings KW - Database Y1 - 2021 U6 - https://doi.org/10.1186/s12967-021-02941-z SN - 1479-5876 VL - 19 IS - 1 PB - BioMed Central CY - London ER - TY - JOUR A1 - Prasse, Paul A1 - Iversen, Pascal A1 - Lienhard, Matthias A1 - Thedinga, Kristina A1 - Bauer, Christopher A1 - Herwig, Ralf A1 - Scheffer, Tobias T1 - Matching anticancer compounds and tumor cell lines by neural networks with ranking loss JF - NAR: genomics and bioinformatics N2 - Computational drug sensitivity models have the potential to improve therapeutic outcomes by identifying targeted drug components that are likely to achieve the highest efficacy for a cancer cell line at hand at a therapeutic dose. State of the art drug sensitivity models use regression techniques to predict the inhibitory concentration of a drug for a tumor cell line. This regression objective is not directly aligned with either of these principal goals of drug sensitivity models: We argue that drug sensitivity modeling should be seen as a ranking problem with an optimization criterion that quantifies a drug's inhibitory capacity for the cancer cell line at hand relative to its toxicity for healthy cells. We derive an extension to the well-established drug sensitivity regression model PaccMann that employs a ranking loss and focuses on the ratio of inhibitory concentration and therapeutic dosage range. We find that the ranking extension significantly enhances the model's capability to identify the most effective anticancer drugs for unseen tumor cell profiles based in on in-vitro data. Y1 - 2022 U6 - https://doi.org/10.1093/nargab/lqab128 SN - 2631-9268 VL - 4 IS - 1 PB - Oxford Univ. Press CY - Oxford ER - TY - GEN A1 - Prasse, Paul A1 - Iversen, Pascal A1 - Lienhard, Matthias A1 - Thedinga, Kristina A1 - Herwig, Ralf A1 - Scheffer, Tobias T1 - Pre-Training on In Vitro and Fine-Tuning on Patient-Derived Data Improves Deep Neural Networks for Anti-Cancer Drug-Sensitivity Prediction T2 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe N2 - Large-scale databases that report the inhibitory capacities of many combinations of candidate drug compounds and cultivated cancer cell lines have driven the development of preclinical drug-sensitivity models based on machine learning. However, cultivated cell lines have devolved from human cancer cells over years or even decades under selective pressure in culture conditions. Moreover, models that have been trained on in vitro data cannot account for interactions with other types of cells. Drug-response data that are based on patient-derived cell cultures, xenografts, and organoids, on the other hand, are not available in the quantities that are needed to train high-capacity machine-learning models. We found that pre-training deep neural network models of drug sensitivity on in vitro drug-sensitivity databases before fine-tuning the model parameters on patient-derived data improves the models’ accuracy and improves the biological plausibility of the features, compared to training only on patient-derived data. From our experiments, we can conclude that pre-trained models outperform models that have been trained on the target domains in the vast majority of cases. T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 1300 KW - deep neural networks KW - drug-sensitivity prediction KW - anti-cancer drugs Y1 - 2023 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-577341 SN - 1866-8372 SP - 1 EP - 14 PB - Universitätsverlag Potsdam CY - Potsdam ER - TY - JOUR A1 - Prasse, Paul A1 - Iversen, Pascal A1 - Lienhard, Matthias A1 - Thedinga, Kristina A1 - Herwig, Ralf A1 - Scheffer, Tobias T1 - Pre-Training on In Vitro and Fine-Tuning on Patient-Derived Data Improves Deep Neural Networks for Anti-Cancer Drug-Sensitivity Prediction JF - MDPI N2 - Large-scale databases that report the inhibitory capacities of many combinations of candidate drug compounds and cultivated cancer cell lines have driven the development of preclinical drug-sensitivity models based on machine learning. However, cultivated cell lines have devolved from human cancer cells over years or even decades under selective pressure in culture conditions. Moreover, models that have been trained on in vitro data cannot account for interactions with other types of cells. Drug-response data that are based on patient-derived cell cultures, xenografts, and organoids, on the other hand, are not available in the quantities that are needed to train high-capacity machine-learning models. We found that pre-training deep neural network models of drug sensitivity on in vitro drug-sensitivity databases before fine-tuning the model parameters on patient-derived data improves the models’ accuracy and improves the biological plausibility of the features, compared to training only on patient-derived data. From our experiments, we can conclude that pre-trained models outperform models that have been trained on the target domains in the vast majority of cases. KW - deep neural networks KW - drug-sensitivity prediction KW - anti-cancer drugs Y1 - 2022 U6 - https://doi.org/10.3390/cancers14163950 SN - 2072-6694 VL - 14 SP - 1 EP - 14 PB - MDPI CY - Basel, Schweiz ET - 16 ER - TY - JOUR A1 - Prasse, Paul A1 - Knaebel, Rene A1 - Machlica, Lukas A1 - Pevny, Tomas A1 - Scheffer, Tobias T1 - Joint detection of malicious domains and infected clients JF - Machine learning N2 - Detection of malware-infected computers and detection of malicious web domains based on their encrypted HTTPS traffic are challenging problems, because only addresses, timestamps, and data volumes are observable. The detection problems are coupled, because infected clients tend to interact with malicious domains. Traffic data can be collected at a large scale, and antivirus tools can be used to identify infected clients in retrospect. Domains, by contrast, have to be labeled individually after forensic analysis. We explore transfer learning based on sluice networks; this allows the detection models to bootstrap each other. In a large-scale experimental study, we find that the model outperforms known reference models and detects previously unknown malware, previously unknown malware families, and previously unknown malicious domains. KW - Machine learning KW - Neural networks KW - Computer security KW - Traffic data KW - Https traffic Y1 - 2019 U6 - https://doi.org/10.1007/s10994-019-05789-z SN - 0885-6125 SN - 1573-0565 VL - 108 IS - 8-9 SP - 1353 EP - 1368 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Ayzel, Georgy A1 - Scheffer, Tobias A1 - Heistermann, Maik T1 - RainNet v1.0 BT - a convolutional neural network for radar-based precipitation nowcasting JF - Geoscientific Model Development N2 - In this study, we present RainNet, a deep convolutional neural network for radar-based precipitation nowcasting. Its design was inspired by the U-Net and SegNet families of deep learning models, which were originally designed for binary segmentation tasks. RainNet was trained to predict continuous precipitation intensities at a lead time of 5min, using several years of quality-controlled weather radar composites provided by the German Weather Service (DWD). That data set covers Germany with a spatial domain of 900km × 900km and has a resolution of 1km in space and 5min in time. Independent verification experiments were carried out on 11 summer precipitation events from 2016 to 2017. In order to achieve a lead time of 1h, a recursive approach was implemented by using RainNet predictions at 5min lead times as model inputs for longer lead times. In the verification experiments, trivial Eulerian persistence and a conventional model based on optical flow served as benchmarks. The latter is available in the rainymotion library and had previously been shown to outperform DWD's operational nowcasting model for the same set of verification events. RainNet significantly outperforms the benchmark models at all lead times up to 60min for the routine verification metrics mean absolute error (MAE) and the critical success index (CSI) at intensity thresholds of 0.125, 1, and 5mm h⁻¹. However, rainymotion turned out to be superior in predicting the exceedance of higher intensity thresholds (here 10 and 15mm h⁻¹). The limited ability of RainNet to predict heavy rainfall intensities is an undesirable property which we attribute to a high level of spatial smoothing introduced by the model. At a lead time of 5min, an analysis of power spectral density confirmed a significant loss of spectral power at length scales of 16km and below. Obviously, RainNet had learned an optimal level of smoothing to produce a nowcast at 5min lead time. In that sense, the loss of spectral power at small scales is informative, too, as it reflects the limits of predictability as a function of spatial scale. Beyond the lead time of 5min, however, the increasing level of smoothing is a mere artifact – an analogue to numerical diffusion – that is not a property of RainNet itself but of its recursive application. In the context of early warning, the smoothing is particularly unfavorable since pronounced features of intense precipitation tend to get lost over longer lead times. Hence, we propose several options to address this issue in prospective research, including an adjustment of the loss function for model training, model training for longer lead times, and the prediction of threshold exceedance in terms of a binary segmentation task. Furthermore, we suggest additional input data that could help to better identify situations with imminent precipitation dynamics. The model code, pretrained weights, and training data are provided in open repositories as an input for such future studies. KW - weather KW - models KW - skill Y1 - 2020 U6 - https://doi.org/10.5194/gmd-13-2631-2020 SN - 1991-959X SN - 1991-9603 VL - 13 IS - 6 SP - 2631 EP - 2644 PB - Copernicus Publ. CY - Göttingen ER - TY - GEN A1 - Ayzel, Georgy A1 - Scheffer, Tobias A1 - Heistermann, Maik T1 - RainNet v1.0 BT - a convolutional neural network for radar-based precipitation nowcasting T2 - Postprints der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe N2 - In this study, we present RainNet, a deep convolutional neural network for radar-based precipitation nowcasting. Its design was inspired by the U-Net and SegNet families of deep learning models, which were originally designed for binary segmentation tasks. RainNet was trained to predict continuous precipitation intensities at a lead time of 5min, using several years of quality-controlled weather radar composites provided by the German Weather Service (DWD). That data set covers Germany with a spatial domain of 900km × 900km and has a resolution of 1km in space and 5min in time. Independent verification experiments were carried out on 11 summer precipitation events from 2016 to 2017. In order to achieve a lead time of 1h, a recursive approach was implemented by using RainNet predictions at 5min lead times as model inputs for longer lead times. In the verification experiments, trivial Eulerian persistence and a conventional model based on optical flow served as benchmarks. The latter is available in the rainymotion library and had previously been shown to outperform DWD's operational nowcasting model for the same set of verification events. RainNet significantly outperforms the benchmark models at all lead times up to 60min for the routine verification metrics mean absolute error (MAE) and the critical success index (CSI) at intensity thresholds of 0.125, 1, and 5mm h⁻¹. However, rainymotion turned out to be superior in predicting the exceedance of higher intensity thresholds (here 10 and 15mm h⁻¹). The limited ability of RainNet to predict heavy rainfall intensities is an undesirable property which we attribute to a high level of spatial smoothing introduced by the model. At a lead time of 5min, an analysis of power spectral density confirmed a significant loss of spectral power at length scales of 16km and below. Obviously, RainNet had learned an optimal level of smoothing to produce a nowcast at 5min lead time. In that sense, the loss of spectral power at small scales is informative, too, as it reflects the limits of predictability as a function of spatial scale. Beyond the lead time of 5min, however, the increasing level of smoothing is a mere artifact – an analogue to numerical diffusion – that is not a property of RainNet itself but of its recursive application. In the context of early warning, the smoothing is particularly unfavorable since pronounced features of intense precipitation tend to get lost over longer lead times. Hence, we propose several options to address this issue in prospective research, including an adjustment of the loss function for model training, model training for longer lead times, and the prediction of threshold exceedance in terms of a binary segmentation task. Furthermore, we suggest additional input data that could help to better identify situations with imminent precipitation dynamics. The model code, pretrained weights, and training data are provided in open repositories as an input for such future studies. T3 - Zweitveröffentlichungen der Universität Potsdam : Mathematisch-Naturwissenschaftliche Reihe - 964 KW - weather KW - models KW - skill Y1 - 2020 U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:kobv:517-opus4-472942 SN - 1866-8372 IS - 964 ER - TY - JOUR A1 - Bussas, Matthias A1 - Sawade, Christoph A1 - Kuhn, Nicolas A1 - Scheffer, Tobias A1 - Landwehr, Niels T1 - Varying-coefficient models for geospatial transfer learning JF - Machine learning N2 - We study prediction problems in which the conditional distribution of the output given the input varies as a function of task variables which, in our applications, represent space and time. In varying-coefficient models, the coefficients of this conditional are allowed to change smoothly in space and time; the strength of the correlations between neighboring points is determined by the data. This is achieved by placing a Gaussian process (GP) prior on the coefficients. Bayesian inference in varying-coefficient models is generally intractable. We show that with an isotropic GP prior, inference in varying-coefficient models resolves to standard inference for a GP that can be solved efficiently. MAP inference in this model resolves to multitask learning using task and instance kernels. We clarify the relationship between varying-coefficient models and the hierarchical Bayesian multitask model and show that inference for hierarchical Bayesian multitask models can be carried out efficiently using graph-Laplacian kernels. We explore the model empirically for the problems of predicting rent and real-estate prices, and predicting the ground motion during seismic events. We find that varying-coefficient models with GP priors excel at predicting rents and real-estate prices. The ground-motion model predicts seismic hazards in the State of California more accurately than the previous state of the art. KW - Transfer learning KW - Varying-coefficient models KW - Housing-price prediction KW - Seismic-hazard models Y1 - 2017 U6 - https://doi.org/10.1007/s10994-017-5639-3 SN - 0885-6125 SN - 1573-0565 VL - 106 SP - 1419 EP - 1440 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Dick, Uwe A1 - Scheffer, Tobias T1 - Learning to control a structured-prediction decoder for detection of HTTP-layer DDoS attackers JF - Machine learning N2 - We focus on the problem of detecting clients that attempt to exhaust server resources by flooding a service with protocol-compliant HTTP requests. Attacks are usually coordinated by an entity that controls many clients. Modeling the application as a structured-prediction problem allows the prediction model to jointly classify a multitude of clients based on their cohesion of otherwise inconspicuous features. Since the resulting output space is too vast to search exhaustively, we employ greedy search and techniques in which a parametric controller guides the search. We apply a known method that sequentially learns the controller and the structured-prediction model. We then derive an online policy-gradient method that finds the parameters of the controller and of the structured-prediction model in a joint optimization problem; we obtain a convergence guarantee for the latter method. We evaluate and compare the various methods based on a large collection of traffic data of a web-hosting service. Y1 - 2016 U6 - https://doi.org/10.1007/s10994-016-5581-9 SN - 0885-6125 SN - 1573-0565 VL - 104 SP - 385 EP - 410 PB - Springer CY - Dordrecht ER - TY - JOUR A1 - Landwehr, Niels A1 - Kuehn, Nicolas M. A1 - Scheffer, Tobias A1 - Abrahamson, Norman A. T1 - A Nonergodic Ground-Motion Model for California with Spatially Varying Coefficients JF - Bulletin of the Seismological Society of America N2 - Traditional probabilistic seismic-hazard analysis as well as the estimation of ground-motion models (GMMs) is based on the ergodic assumption, which means that the distribution of ground motions over time at a given site is the same as their spatial distribution over all sites for the same magnitude, distance, and site condition. With a large increase in the number of recorded ground-motion data, there are now repeated observations at given sites and from multiple earthquakes in small regions, so that assumption can be relaxed. We use a novel approach to develop a nonergodic GMM, which is cast as a varying-coefficient model (VCM). In this model, the coefficients are allowed to vary by geographical location, which makes it possible to incorporate effects of spatially varying source, path, and site conditions. Hence, a separate set of coefficients is estimated for each source and site coordinate in the data set. The coefficients are constrained to be similar for spatially nearby locations. This is achieved by placing a Gaussian process prior on the coefficients. The amount of correlation is determined by the data. The spatial correlation structure of the model allows one to extrapolate the varying coefficients to a new location and trace the corresponding uncertainties. The approach is illustrated with the Next Generation Attenuation-West2 data set, using only Californian records. The VCM outperforms a traditionally estimated GMM in terms of generalization error and leads to a reduction in the aleatory standard deviation by similar to 40%, which has important implications for seismic-hazard calculations. The scaling of the model with respect to its predictor variables such as magnitude and distance is physically plausible. The epistemic uncertainty associated with the predicted ground motions is small in places where events or stations are close and large where data are sparse. Y1 - 2016 U6 - https://doi.org/10.1785/0120160118 SN - 0037-1106 SN - 1943-3573 VL - 106 SP - 2574 EP - 2583 PB - Seismological Society of America CY - Albany ER -