TY - JOUR
A1 - Ryngajllo, Malgorzata
A1 - Childs, Liam H.
A1 - Lohse, Marc
A1 - Giorgi, Federico M.
A1 - Lude, Anja
A1 - Selbig, Joachim
A1 - Usadel, Björn
T1 - SLocX: predicting subcellular localization of Arabidopsis proteins leveraging gene expression data
JF - Frontiers in plant science
N2 - Despite the growing volume of experimentally validated knowledge about the subcellular localization of plant proteins, a well-performing in silico prediction tool is still a necessity. Existing tools, which employ information derived from protein sequence alone, offer limited accuracy and/or rely on full sequence availability. We explored whether gene expression profiling data can be harnessed to enhance prediction performance. To achieve this, we trained several support vector machines to predict the subcellular localization of Arabidopsis thaliana proteins using sequence-derived information, expression behavior, or a combination of these data, and compared their predictive performance through a cross-validation test. We show that gene expression carries information about subcellular localization not available in sequence information, yielding dramatic benefits for plastid localization prediction and some notable improvements for other compartments such as the mitochondrion, the Golgi, and the plasma membrane. Based on these results, we constructed a novel subcellular localization prediction engine, SLocX, combining gene expression profiling data with protein sequence-based information. We then validated the results of this engine using an independent test set of annotated proteins and a transient expression of GFP fusion proteins. Here, we present the prediction framework and a website of predicted localizations for Arabidopsis.
The relatively good accuracy of our prediction engine, even in cases where only a partial protein sequence is available (e.g., in sequences lacking the N-terminal region), offers a promising opportunity for similar application to non-sequenced or poorly annotated plant species. Although the prediction scope of our method is currently limited by the availability of expression information on the ATH1 array, we believe that advances in gene expression measurement technology will make our method applicable to all Arabidopsis proteins.
KW - subcellular localization
KW - support vector machine
KW - prediction
KW - gene expression
Y1 - 2011
U6 - https://doi.org/10.3389/fpls.2011.00043
SN - 1664-462X
VL - 2
PB - Frontiers Research Foundation
CY - Lausanne
ER -

TY - JOUR
A1 - Seleem, Omar
A1 - Ayzel, Georgy
A1 - Costa Tomaz de Souza, Arthur
A1 - Bronstert, Axel
A1 - Heistermann, Maik
T1 - Towards urban flood susceptibility mapping using data-driven models in Berlin, Germany
JF - Geomatics, natural hazards and risk
N2 - Identifying urban pluvial flood-prone areas is necessary, but the application of two-dimensional hydrodynamic models is limited to small areas. Data-driven models have shown their ability to map flood susceptibility, but their application to urban pluvial flooding is still rare. A flood inventory (4333 flooded locations) and 11 factors which potentially indicate an increased hazard for pluvial flooding were used to implement convolutional neural network (CNN), artificial neural network (ANN), random forest (RF) and support vector machine (SVM) models to: (1) map flood susceptibility in Berlin at 30, 10, 5, and 2 m spatial resolutions; (2) evaluate the trained models' transferability in space; and (3) estimate the most useful factors for flood susceptibility mapping. The models' performance was validated using Kappa and the area under the receiver operating characteristic curve (AUC).
The results indicated that all models perform very well (minimum AUC = 0.87 for the testing dataset). The RF models outperformed all other models at all spatial resolutions, and the RF model at 2 m spatial resolution was superior for the present flood inventory and predictor variables. The majority of the models had a moderate performance for predictions outside the training area based on Kappa evaluation (minimum AUC = 0.8). Aspect and altitude were the most influential factors for the image-based and point-based models, respectively. Data-driven models can be a reliable tool for urban pluvial flood susceptibility mapping wherever a reliable flood inventory is available.
KW - Urban pluvial flood susceptibility
KW - convolutional neural network
KW - deep learning
KW - random forest
KW - support vector machine
KW - spatial resolution
KW - flood predictors
Y1 - 2022
U6 - https://doi.org/10.1080/19475705.2022.2097131
SN - 1947-5705
SN - 1947-5713
VL - 13
IS - 1
SP - 1640
EP - 1662
PB - Taylor & Francis
CY - London
ER -

TY - JOUR
A1 - Webber, Heidi
A1 - Lischeid, Gunnar
A1 - Sommer, Michael
A1 - Finger, Robert
A1 - Nendel, Claas
A1 - Gaiser, Thomas
A1 - Ewert, Frank
T1 - No perfect storm for crop yield failure in Germany
JF - Environmental research letters
N2 - Large-scale crop yield failures are increasingly associated with food price spikes and food insecurity and are a large source of income risk for farmers. While the evidence linking extreme weather to yield failures is clear, consensus on the broader set of weather drivers and conditions responsible for recent yield failures is lacking. We investigate this for the case of four major crops in Germany over the past 20 years using a combination of machine learning and process-based modelling. Our results confirm that years associated with widespread yield failures across crops were generally associated with severe drought, such as in 2018 and to a lesser extent 2003.
However, for years with more localized yield failures and large differences in spatial patterns of yield failures between crops, no single driver or combination of drivers was identified. Relatively large residuals of unexplained variation likely indicate the importance of non-weather-related factors, such as management (pest, weed and nutrient management and possible interactions with weather), in explaining yield failures. Models to inform adaptation planning at farm, market or policy levels are here suggested to require consideration of cumulative resource capture and use, as well as effects of extreme events, the latter largely missing in process-based models. However, increasingly novel combinations of weather events under climate change may limit the extent to which data-driven methods can replace process-based models in risk assessments.
KW - crop yield failure
KW - extreme events
KW - support vector machine
KW - process-based crop model
KW - Germany
Y1 - 2020
U6 - https://doi.org/10.1088/1748-9326/aba2a4
SN - 1748-9326
VL - 15
IS - 10
PB - IOP Publ. Ltd.
CY - Bristol
ER -