TY - JOUR
A1 - von Specht, Sebastian
A1 - Cotton, Fabrice
T1 - A link between machine learning and optimization in ground-motion model development
T2 - Bulletin of the Seismological Society of America
N2 - The steady increase of ground-motion data not only allows new possibilities but also comes with new challenges in the development of ground-motion models (GMMs). Data classification techniques (e.g., cluster analysis) produce not only deterministic classifications but also probabilistic classifications (e.g., probabilities for each datum to belong to a given class or cluster). One challenge is the integration of such continuous classification in regressions for GMM development such as the widely used mixed-effects model. We address this issue by introducing an extension of the mixed-effects model to incorporate data weighting. The parameter estimation of the mixed-effects model, that is, of the fixed-effects coefficients of the GMMs and the random-effects variances, is based on the weighted likelihood function, which also provides analytic uncertainty estimates. The data weighting permits earthquake classification beyond the classical, expert-driven, binary classification based, for example, on event depth, distance to trench, style of faulting, and fault dip angle. We apply Angular Classification with Expectation-maximization, an algorithm to identify clusters of nodal planes from focal mechanisms, to differentiate between, for example, interface- and intraslab-type events. Classification is continuous, that is, no event belongs completely to one class, which is taken into account in the ground-motion modeling. The theoretical framework described in this article allows for a fully automatic calibration of ground-motion models using large databases with automated classification and processing of earthquake and ground-motion data. As an example, we developed a GMM on the basis of the GMM by Montalva et al. (2017) with data from the strong-motion flat file of Bastias and Montalva (2016) with approximately 2400 records from 319 events in the Chilean subduction zone. Our GMM with the data-driven classification is comparable to the expert-classification-based model. Furthermore, the model shows temporal variations of the between-event residuals before and after large earthquakes in the region.
Y1 - 2020
UR - https://publishup.uni-potsdam.de/frontdoor/index/index/docId/60556
SN - 0037-1106
SN - 1943-3573
VL - 110
IS - 6
SP - 2777
EP - 2800
PB - Seismological Society of America
CY - Albany
ER -