
Potential use of data-driven models to estimate and predict Soybean yields at national scale in Brazil

  • Large-scale assessment of crop yields plays a fundamental role in agricultural planning and in achieving food security goals. In this study, we evaluated the robustness of data-driven models for estimating soybean yields at 120 days after sowing (DAS) in the main producing regions in Brazil, and evaluated the reliability of the "best" data-driven model as a tool for early prediction of soybean yields for an independent year. Our methodology explicitly describes a general approach for assembling publicly available databases and building data-driven models (multiple linear regression, MLR; random forests, RF; and support vector machines, SVM) to predict yields at large scales using gridded weather and soil data. We filtered out counties with missing or suspicious yield records, resulting in a crop yield database containing 3450 records (23 years x 150 "high-quality" counties). RF and SVM had similar results for the calibration and validation steps, whereas MLR showed the poorest performance. Our analysis revealed the potential of data-driven models for predicting soybean yields at large scales in Brazil around one month before harvest (i.e. 90 DAS). When a well-trained RF model was used to predict crop yield for a specific year at 90 DAS, the RMSE ranged from 303.9 to 1055.7 kg ha⁻¹, representing a relative error (rRMSE) between 9.2 and 41.5%. Although we demonstrated robust data-driven models for yield prediction at large scales in Brazil, there is still room for improving their accuracy. The inclusion of explanatory variables related to the crop (e.g. growing degree-days, flowering dates) and the environment (e.g. remotely sensed vegetation indices, number of dry and heat days during the cycle), as well as outputs from process-based crop simulation models (e.g. biomass, leaf area index and plant phenology), are potential strategies to improve model accuracy.

Metadata
Author details:Leonardo A. Monteiro, Rafael M. Ramos, Rafael Battisti, Johnny R. Soares, Julianne C. Oliveira, Gleyce K. D. A. Figueiredo, Rubens A. C. Lamparelli, Claas Nendel, Marcos Alberto Lana
DOI:https://doi.org/10.1007/s42106-022-00209-0
ISSN:1735-6814
ISSN:1735-8043
Title of parent work (English):International Journal of Plant Production
Publisher:Springer
Place of publishing:New York
Publication type:Article
Language:English
Date of first publication:2022/09/07
Publication year:2022
Release date:2024/10/30
Tag:climatic and soil variables; geospatial and temporal variability; large-scale analysis; machine learning approaches; public databases
Volume:16
Issue:4
Number of pages:13
First page:691
Last Page:703
Funding institution:Sao Paulo Research Foundation (FAPESP) [2014/26767-9, 2017/08970-0]
Organizational units:Mathematisch-Naturwissenschaftliche Fakultät / Institut für Biochemie und Biologie
DDC classification:5 Natural sciences and mathematics / 57 Life sciences; biology / 570 Life sciences; biology
Peer review:Refereed
License (German):No public license: under copyright protection

