TY - JOUR
A1 - Dormann, Carsten F.
A1 - Elith, Jane
A1 - Bacher, Sven
A1 - Buchmann, Carsten M.
A1 - Carl, Gudrun
A1 - Carre, Gabriel
A1 - Garcia Marquez, Jaime R.
A1 - Gruber, Bernd
A1 - Lafourcade, Bruno
A1 - Leitao, Pedro J.
A1 - Münkemüller, Tamara
A1 - McClean, Colin
A1 - Osborne, Patrick E.
A1 - Reineking, Bjoern
A1 - Schröder-Esselbach, Boris
A1 - Skidmore, Andrew K.
A1 - Zurell, Damaris
A1 - Lautenbach, Sven
T1 - Collinearity: a review of methods to deal with it and a simulation study evaluating their performance
JF - Ecography : pattern and diversity in ecology ; research papers forum
N2 - Collinearity refers to the non-independence of predictor variables, usually in a regression-type analysis. It is a common feature of any descriptive ecological data set and can be a problem for parameter estimation because it inflates the variance of regression parameters and hence potentially leads to the wrong identification of relevant predictors in a statistical model. Collinearity is a severe problem when a model is trained on data from one region or time, and predicted to another with a different or unknown structure of collinearity. To demonstrate the reach of the problem of collinearity in ecology, we show how relationships among predictors differ between biomes, change over spatial scales and through time. Across disciplines, different approaches to addressing collinearity problems have been developed, ranging from clustering of predictors, threshold-based pre-selection, through latent variable methods, to shrinkage and regularisation. Using simulated data with five predictor-response relationships of increasing complexity and eight levels of collinearity we compared ways to address collinearity with standard multiple regression and machine-learning approaches. We assessed the performance of each approach by testing its impact on prediction to new data.
In the extreme, we tested whether the methods were able to identify the true underlying relationship in a training dataset with strong collinearity by evaluating its performance on a test dataset without any collinearity. We found that methods specifically designed for collinearity, such as latent variable methods and tree-based models, did not outperform the traditional GLM and threshold-based pre-selection. Our results highlight the value of GLM in combination with penalised methods (particularly ridge) and threshold-based pre-selection when omitted variables are considered in the final interpretation. However, all approaches tested yielded degraded predictions under change in collinearity structure, and the 'folk lore' threshold of correlation coefficients between predictor variables of |r| > 0.7 was an appropriate indicator for when collinearity begins to severely distort model estimation and subsequent prediction. The use of ecological understanding of the system in pre-analysis variable selection and the choice of the least sensitive statistical approaches reduce the problems of collinearity, but cannot ultimately solve them.
Y1 - 2013
U6 - https://doi.org/10.1111/j.1600-0587.2012.07348.x
SN - 0906-7590
SN - 1600-0587
VL - 36
IS - 1
SP - 27
EP - 46
PB - Wiley-Blackwell
CY - Hoboken
ER -

TY - JOUR
A1 - Fournier, Bertrand
A1 - Steiner, Magdalena
A1 - Brochet, Xavier
A1 - Degrune, Florine
A1 - Mammeri, Jibril
A1 - Carvalho, Diogo Leite
A1 - Siliceo, Sara Leal
A1 - Bacher, Sven
A1 - Peña-Reyes, Carlos Andrés
A1 - Heger, Thierry Jean
T1 - Toward the use of protists as bioindicators of multiple stresses in agricultural soils
BT - a case study in vineyard ecosystems
JF - Ecological indicators : integrating monitoring, assessment and management
N2 - Management of agricultural soil quality requires fast and cost-efficient methods to identify multiple stressors that can affect soil organisms and associated ecological processes.
Here, we propose to use soil protists, which have a great yet poorly explored potential for bioindication. They are ubiquitous, highly diverse, and respond to various stresses to agricultural soils caused by frequent management or environmental changes. We test an approach that combines metabarcoding data and machine learning algorithms to identify potential stressors of soil protist community composition and diversity. We measured 17 key variables that reflect various potential stresses on soil protists across 132 plots in 28 Swiss vineyards over 2 years. We identified the taxa showing strong responses to the selected soil variables (potential bioindicator taxa) and tested for their predictive power. Changes in protist taxa occurrence and, to a lesser extent, diversity metrics exhibited great predictive power for the considered soil variables. Soil copper concentration, moisture, pH, and basal respiration were the best predicted soil variables, suggesting that protists are particularly responsive to stresses caused by these variables. The most responsive taxa were found within the clades Rhizaria and Alveolata. Our results also reveal that a majority of the potential bioindicators identified in this study can be used across years, in different regions and across different grape varieties. Altogether, soil protist metabarcoding data combined with machine learning can help identify specific abiotic stresses on microbial communities caused by agricultural management. Such an approach provides complementary information to existing soil monitoring tools that can help manage the impact of agricultural practices on soil biodiversity and quality.
KW - Biomonitoring
KW - Machine learning
KW - Predictive model
KW - Soil function
KW - Soil quality
KW - Microbial ecology
Y1 - 2022
U6 - https://doi.org/10.1016/j.ecolind.2022.108955
SN - 1470-160X
SN - 1872-7034
VL - 139
PB - Elsevier
CY - Amsterdam
ER -