TY - JOUR
A1 - Liemohn, Michael W.
A1 - McCollough, James P.
A1 - Jordanova, Vania K.
A1 - Ngwira, Chigomezyo M.
A1 - Morley, Steven K.
A1 - Cid, Consuelo
A1 - Tobiska, W. Kent
A1 - Wintoft, Peter
A1 - Ganushkina, Natalia Yu
A1 - Welling, Daniel T.
A1 - Bingham, Suzy
A1 - Balikhin, Michael A.
A1 - Opgenoorth, Hermann J.
A1 - Engel, Miles A.
A1 - Weigel, Robert S.
A1 - Singer, Howard J.
A1 - Buresova, Dalia
A1 - Bruinsma, Sean
A1 - Zhelavskaya, Irina S.
A1 - Shprits, Yuri Y.
A1 - Vasile, Ruggero
T1 - Model Evaluation Guidelines for Geomagnetic Index Predictions
JF - Space Weather: The International Journal of Research and Applications
N2 - Geomagnetic indices are convenient quantities that distill the complicated physics of some region or aspect of near-Earth space into a single parameter. Most of the best-known indices are calculated from ground-based magnetometer data sets, such as Dst, SYM-H, Kp, AE, AL, and PC. Many models have been created that predict the values of these indices, often using solar wind measurements upstream from Earth as the input variables to the calculation. This document reviews the current state of models that predict geomagnetic indices and the methods used to assess their ability to reproduce the target index time series. These existing methods are synthesized into a baseline collection of metrics for benchmarking a new or updated geomagnetic index prediction model. These methods fall into two categories: (1) fit performance metrics such as root-mean-square error and mean absolute error that are applied to a time series comparison of model output and observations and (2) event detection performance metrics such as Heidke Skill Score and probability of detection that are derived from a contingency table that compares model and observation values exceeding (or not) a threshold value. A few examples of codes being used with this set of metrics are presented, and other aspects of metrics assessment best practices, limitations, and uncertainties are discussed, including several caveats to consider when using geomagnetic indices. Plain Language Summary: One aspect of space weather is a magnetic signature across the surface of the Earth. The creation of this signal involves nonlinear interactions of electromagnetic forces on charged particles and can therefore be difficult to predict. The perturbations that space storms and other activity cause in some observation sets, however, are fairly regular in their pattern. Some of these measurements have been compiled together into a single value, a geomagnetic index. Several such indices exist, providing a global estimate of the activity in different parts of geospace. Models have been developed to predict the time series of these indices, and various statistical methods are used to assess their performance at reproducing the original index. Existing studies of geomagnetic indices, however, use different approaches to quantify the performance of the model. This document defines a standardized set of statistical analyses as a baseline set of comparison tools that are recommended to assess geomagnetic index prediction models. It also discusses best practices, limitations, uncertainties, and caveats to consider when conducting a model assessment.
Y1 - 2018
U6 - https://doi.org/10.1029/2018SW002067
SN - 1542-7390
VL - 16
IS - 12
SP - 2079
EP - 2102
PB - American Geophysical Union
CY - Washington
ER -