We consider Bayesian inference for large-scale inverse problems, where computational challenges arise from the need for repeated evaluations of an expensive forward model.
This renders most Markov chain Monte Carlo approaches infeasible, since they typically require O(10^4) model runs or more.
Moreover, the forward model is often given as a black box or is impractical to differentiate.
Derivative-free algorithms are therefore highly desirable. We propose a framework built on Kalman methodology to perform Bayesian inference efficiently in such inverse problems.
The method is based on an approximation of the filtering distribution of a novel mean-field dynamical system, into which the inverse problem is embedded as an observation operator.
Theoretical properties are established for linear inverse problems, demonstrating that the desired Bayesian posterior is given by the steady state of the law of the filtering distribution of the mean-field dynamical system, and proving exponential convergence to it.
This suggests that, for nonlinear problems which are close to Gaussian, sequentially computing this law provides the basis for efficient iterative methods to approximate the Bayesian posterior.
Ensemble methods are applied to obtain interacting particle system approximations of the filtering distribution of the mean-field model; and practical strategies to further reduce the computational and memory cost of the methodology are presented, including low-rank approximation and a bi-fidelity approach.
The effectiveness of the framework is demonstrated in several numerical experiments, including proof-of-concept linear/nonlinear examples and two large-scale applications: learning of permeability parameters in subsurface flow; and learning subgrid-scale parameters in a global climate model.
Moreover, the stochastic ensemble Kalman filter and various ensemble square-root Kalman filters are all employed and are compared numerically.
The results demonstrate that the proposed method, based on exponential convergence to the filtering distribution of a mean-field dynamical system, is competitive with pre-existing Kalman-based methods for inverse problems.
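To make the Kalman-based iteration concrete, the following sketch shows classical ensemble Kalman inversion with perturbed observations on a linear toy problem. This is the standard particle approximation rather than the specific mean-field formulation of the abstract, and all names and parameter values (`eki_step`, ensemble size, noise level) are illustrative.

```python
import numpy as np

def eki_step(U, G, y, Gamma, rng):
    """One step of basic ensemble Kalman inversion (EKI).

    U: (J, d) parameter ensemble; G: forward map R^d -> R^k;
    y: (k,) data; Gamma: (k, k) observation noise covariance.
    """
    J = U.shape[0]
    GU = np.array([G(u) for u in U])            # forward evaluations, (J, k)
    mU, mG = U.mean(axis=0), GU.mean(axis=0)
    # empirical cross- and output covariances
    Cug = (U - mU).T @ (GU - mG) / (J - 1)      # (d, k)
    Cgg = (GU - mG).T @ (GU - mG) / (J - 1)     # (k, k)
    K = Cug @ np.linalg.inv(Cgg + Gamma)        # Kalman gain
    # perturbed observations drive each particle toward the data
    Y = y + rng.multivariate_normal(np.zeros(len(y)), Gamma, size=J)
    return U + (Y - GU) @ K.T

rng = np.random.default_rng(0)
A = np.array([[1.0, 2.0], [0.5, -1.0], [2.0, 0.3]])   # linear forward model
u_true = np.array([1.0, -0.5])
Gamma = 0.01 * np.eye(3)
y = A @ u_true + rng.multivariate_normal(np.zeros(3), Gamma)

U = rng.normal(size=(50, 2))                          # prior ensemble
for _ in range(20):
    U = eki_step(U, lambda u: A @ u, y, Gamma, rng)
```

In the linear case the ensemble mean contracts toward the least-squares fit of the data, consistent with the convergence results discussed above.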
Diffusion maps is a manifold learning algorithm widely used for dimensionality reduction. Using a sample from a distribution, it approximates the eigenvalues and eigenfunctions of associated Laplace-Beltrami operators. Theoretical bounds on the approximation error are, however, generally much weaker than the rates that are seen in practice. This paper uses new approaches to improve the error bounds in the model case where the distribution is supported on a hypertorus. For the data sampling (variance) component of the error we make spatially localized compact embedding estimates on certain Hardy spaces; we study the deterministic (bias) component as a perturbation of the Laplace-Beltrami operator's associated PDE and apply relevant spectral stability results. Using these approaches, we match long-standing pointwise error bounds for both the spectral data and the norm convergence of the operator discretization. We also introduce an alternative normalization for diffusion maps based on Sinkhorn weights. This normalization approximates a Langevin diffusion on the sample and yields a symmetric operator approximation. We prove that it has better convergence compared with the standard normalization on flat domains, and we present a highly efficient rigorous algorithm to compute the Sinkhorn weights.
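The Sinkhorn-normalized construction can be sketched as follows: build a Gaussian kernel on the sample, find symmetric scaling weights that make the kernel (nearly) doubly stochastic, and take eigenpairs of the resulting symmetric operator. This is a minimal illustration on circle data, not the paper's rigorous algorithm; the iteration count and bandwidth are illustrative choices.

```python
import numpy as np

def sinkhorn_diffusion_map(X, eps, n_evecs=4, n_iter=3000):
    """Diffusion-map sketch with a symmetric Sinkhorn normalization:
    find positive weights w so that S = diag(w) K diag(w) is (nearly)
    doubly stochastic, then take eigenpairs of the symmetric S."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-D2 / eps)
    w = 1.0 / np.sqrt(K.sum(axis=1))
    for _ in range(n_iter):
        w = np.sqrt(w / (K @ w))        # symmetric Sinkhorn iteration
    S = (w[:, None] * K) * w[None, :]
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1][:n_evecs]
    return vals[order], vecs[:, order], S

# sample from the uniform distribution on the unit circle (a 1-torus)
rng = np.random.default_rng(1)
theta = rng.uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)]
vals, vecs, S = sinkhorn_diffusion_map(X, eps=0.5)
```

Because S is symmetric and doubly stochastic, its leading eigenvalue is 1 with a constant eigenvector, and a symmetric eigensolver can be used directly.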
The spatio-temporal epidemic type aftershock sequence (ETAS) model is widely used to describe the self-exciting nature of earthquake occurrences. While traditional inference methods provide only point estimates of the model parameters, we aim at a fully Bayesian treatment of model inference, which naturally allows us to incorporate prior knowledge and to quantify the uncertainty of the resulting estimates. To this end, we introduce a highly flexible, non-parametric representation for the spatially varying ETAS background intensity through a Gaussian process (GP) prior. Combined with classical triggering functions, this results in a new model formulation, namely the GP-ETAS model. We enable tractable and efficient Gibbs sampling by deriving an augmented form of the GP-ETAS inference problem. This novel sampling approach allows us to assess the posterior model variables conditioned on observed earthquake catalogues, i.e., the spatial background intensity and the parameters of the triggering function. Empirical results on two synthetic data sets indicate that GP-ETAS outperforms standard models and demonstrate its predictive power for observed earthquake catalogues, including uncertainty quantification for the estimated parameters. Finally, a case study for the L'Aquila region, Italy, with the devastating event of 6 April 2009, is presented.
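For readers unfamiliar with ETAS, the conditional intensity of a generic spatio-temporal variant can be sketched as a background rate plus magnitude-weighted Omori-type and power-law spatial kernels over past events. Parameter names follow common ETAS conventions, but all values below are illustrative; the GP-ETAS model of the abstract would replace the constant background function with a Gaussian-process prior.

```python
import numpy as np

def etas_intensity(t, x, y, events, mu, K0=0.05, alpha=1.0,
                   c=0.01, p=1.1, d=0.01, q=1.5, m0=3.0):
    """Conditional intensity lambda(t, x, y) of a generic spatio-temporal
    ETAS model: background mu(x, y) plus, for each past event, a
    productivity term times a normalized Omori-type temporal kernel and a
    normalized power-law spatial kernel."""
    lam = mu(x, y)
    for (ti, xi, yi, mi) in events:
        if ti >= t:
            continue
        prod = K0 * np.exp(alpha * (mi - m0))               # productivity
        g_t = (p - 1) / c * (1 + (t - ti) / c) ** (-p)      # Omori law
        r2 = (x - xi) ** 2 + (y - yi) ** 2
        f_s = (q - 1) / (np.pi * d) * (1 + r2 / d) ** (-q)  # spatial kernel
        lam += prod * g_t * f_s
    return lam

# toy catalogue of (time, x, y, magnitude) with a constant background
events = [(0.0, 0.0, 0.0, 5.0), (0.5, 0.1, 0.0, 4.0)]
background = lambda x, y: 0.1
lam = etas_intensity(1.0, 0.0, 0.0, events, background)
```

Both kernels integrate to one over time and space respectively, so the triggering contribution of each event is controlled solely by its productivity term.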
Forecast verification
(2021)
The philosophy of forecast verification is rather different between deterministic and probabilistic verification metrics: generally speaking, deterministic metrics measure differences, whereas probabilistic metrics assess reliability and sharpness of predictive distributions. This article considers the root-mean-square error (RMSE), which can be seen as a deterministic metric, and the probabilistic metric Continuous Ranked Probability Score (CRPS), and demonstrates that under certain conditions, the CRPS can be mathematically expressed in terms of the RMSE when these metrics are aggregated. One of the required conditions is the normality of distributions. The other condition is that, while the forecast ensemble need not be calibrated, any bias or over/underdispersion cannot depend on the forecast distribution itself. Under these conditions, the CRPS is a fraction of the RMSE, and this fraction depends only on the heteroscedasticity of the ensemble spread and the measures of calibration. The derived CRPS-RMSE relationship for the case of perfect ensemble reliability is tested on simulations of idealised two-dimensional barotropic turbulence. Results suggest that the relationship holds approximately despite the normality condition not being met.
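The special case of a perfectly reliable, homoscedastic Gaussian ensemble can be checked numerically with the closed-form Gaussian CRPS. In this setting the aggregated CRPS/RMSE ratio tends to 1/sqrt(pi); the simulation below is a sketch with illustrative sample sizes, not the idealized turbulence experiment of the abstract.

```python
import numpy as np
from math import erf, exp, pi, sqrt

def crps_gaussian(mu, sigma, y):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) at
    verification value y."""
    z = (y - mu) / sigma
    Phi = 0.5 * (1.0 + erf(z / sqrt(2.0)))
    phi = exp(-0.5 * z * z) / sqrt(2.0 * pi)
    return sigma * (z * (2.0 * Phi - 1.0) + 2.0 * phi - 1.0 / sqrt(pi))

# perfectly reliable, homoscedastic Gaussian forecasts:
# observations are drawn from the forecast distribution itself
rng = np.random.default_rng(2)
sigma = 1.5
mu = rng.normal(size=50_000)
y = mu + sigma * rng.normal(size=mu.size)

rmse = np.sqrt(np.mean((y - mu) ** 2))
crps = np.mean([crps_gaussian(m, sigma, yi) for m, yi in zip(mu, y)])
ratio = crps / rmse          # tends to 1/sqrt(pi) in this setting
```

The constant ratio illustrates the aggregated CRPS-RMSE relationship under perfect reliability; heteroscedastic spread or miscalibration would change the fraction as described above.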
Bayesian inference can be embedded into an appropriately defined dynamics in the space of probability measures. In this paper, we take Brownian motion and its associated Fokker-Planck equation as a starting point for such embeddings and explore several interacting particle approximations. More specifically, we consider both deterministic and stochastic interacting particle systems and combine them with the idea of preconditioning by the empirical covariance matrix. In addition to leading to affine invariant formulations which asymptotically speed up convergence, preconditioning allows for gradient-free implementations in the spirit of the ensemble Kalman filter. While such gradient-free implementations have been demonstrated to work well for posterior measures that are nearly Gaussian, we extend their scope of applicability to multimodal measures by introducing localized gradient-free approximations. Numerical results demonstrate the effectiveness of the considered methodologies.
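The covariance-preconditioning idea can be sketched with a simple Euler-Maruyama discretization of an interacting-particle Langevin sampler on a Gaussian target. This is a minimal, gradient-based illustration of the affine-invariant flavor only; the O(1/J) finite-ensemble correction term and the gradient-free and localized variants of the abstract are omitted, and all parameter values are illustrative.

```python
import numpy as np

def precond_langevin_step(X, grad_V, dt, rng):
    """One Euler-Maruyama step of an interacting-particle Langevin
    sampler preconditioned by the ensemble covariance C(X).
    grad_V maps the (J, d) ensemble to its (J, d) array of gradients."""
    J, d = X.shape
    m = X.mean(axis=0)
    C = (X - m).T @ (X - m) / J + 1e-10 * np.eye(d)
    L = np.linalg.cholesky(2.0 * dt * C)          # noise matches drift
    return X - dt * grad_V(X) @ C + rng.normal(size=(J, d)) @ L.T

# target: Gaussian with covariance Sigma, i.e. V(x) = x^T Sigma^{-1} x / 2
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
Sinv = np.linalg.inv(Sigma)
grad_V = lambda X: X @ Sinv                       # Sinv is symmetric

rng = np.random.default_rng(6)
X = rng.normal(size=(256, 2))
for _ in range(3000):
    X = precond_langevin_step(X, grad_V, 0.01, rng)
emp_cov = np.cov(X.T)
```

Because the same state-dependent covariance preconditions both the drift and the noise, the target density is (up to finite-ensemble effects) preserved, while convergence speed becomes independent of affine reparametrizations of the target.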
Various particle filters have been proposed over the last couple of decades with the common feature that the update step is governed by a type of control law. This feature makes them an attractive alternative to traditional sequential Monte Carlo, which scales poorly with the state dimension due to weight degeneracy. This article proposes a unifying framework that allows us to systematically derive the McKean-Vlasov representations of these filters for the discrete-time and continuous-time observation cases, taking inspiration from the smooth approximation of the data considered in [D. Crisan and J. Xiong, Stochastics, 82 (2010), pp. 53-68; J. M. Clark and D. Crisan, Probab. Theory Related Fields, 133 (2005), pp. 43-56]. We consider three filters that have been proposed in the literature and use this framework to derive Itô representations of their limiting forms as the approximation parameter δ → 0. All filters require the solution of a Poisson equation defined on R^d, for which existence and uniqueness of solutions can be a nontrivial issue. We additionally establish conditions on the signal-observation system that ensure well-posedness of the weighted Poisson equation arising in one of the filters.
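A simple member of this family of controlled filters is the ensemble Kalman-Bucy filter, whose gain-times-innovation update avoids importance weights entirely. The sketch below applies its deterministic-gain form to a scalar linear model with continuous-time observations; all parameter choices are illustrative, and this is only a toy instance of the McKean-Vlasov structure, not one of the three filters analyzed in the article.

```python
import numpy as np

def enkbf(y_inc, dt, sig, R, N=100, seed=4):
    """Deterministic-gain ensemble Kalman-Bucy filter for the scalar
    model dx = -x dt + sig dW with continuous-time observations
    dy = x dt + sqrt(R) dV, using the symmetrized innovation."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=N)
    means = np.empty(len(y_inc))
    for k, dy in enumerate(y_inc):
        m, P = X.mean(), X.var()
        X = (X - X * dt                                  # signal drift
             + sig * np.sqrt(dt) * rng.normal(size=N)    # process noise
             + (P / R) * (dy - 0.5 * (X + m) * dt))      # gain x innovation
        means[k] = X.mean()
    return means

# simulate a true signal and its noisy continuous-time observations
rng = np.random.default_rng(5)
dt, T, sig, R = 0.01, 10.0, 0.5, 0.01
n = int(T / dt)
x = np.zeros(n)
for k in range(1, n):
    x[k] = x[k - 1] - x[k - 1] * dt + sig * np.sqrt(dt) * rng.normal()
y_inc = x * dt + np.sqrt(R * dt) * rng.normal(size=n)
m = enkbf(y_inc, dt, sig, R)
mse = np.mean((m[n // 2:] - x[n // 2:]) ** 2)
```

The symmetrized term 0.5 (X + m) plays the role of the control law in the innovation, keeping the ensemble spread consistent with the Riccati equation without perturbing the observations.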
Data assimilation algorithms are used to estimate the states of a dynamical system using partial and noisy observations. The ensemble Kalman filter has become a popular data assimilation scheme due to its simplicity and robustness for a wide range of application areas. Nevertheless, this filter also has limitations due to its inherent assumptions of Gaussianity and linearity, which can manifest themselves in the form of dynamically inconsistent state estimates. This issue is investigated here for balanced, slowly evolving solutions to highly oscillatory Hamiltonian systems which are prototypical for applications in numerical weather prediction. It is demonstrated that the standard ensemble Kalman filter can lead to state estimates that do not satisfy the pertinent balance relations and ultimately lead to filter divergence. Two remedies are proposed, one in terms of blended asymptotically consistent time-stepping schemes, and one in terms of minimization-based postprocessing methods. The effects of these modifications to the standard ensemble Kalman filter are discussed and demonstrated numerically for balanced motions of two prototypical Hamiltonian reference systems.
Data-driven prediction and physics-agnostic machine-learning methods have attracted increased interest in recent years, achieving forecast horizons going well beyond those to be expected for chaotic dynamical systems. In a separate strand of research, data assimilation has been successfully used to optimally combine forecast models and their inherent uncertainty with incoming noisy observations. The key idea of our work is to achieve increased forecast capabilities by judiciously combining machine-learning algorithms and data assimilation. We combine the physics-agnostic data-driven approach of random feature maps as a forecast model within an ensemble Kalman filter data assimilation procedure. The machine-learning model is learned sequentially by incorporating incoming noisy observations. We show that the obtained forecast model has remarkably good forecast skill while being computationally cheap once trained. Going beyond the task of forecasting, we show that our method can be used to generate reliable ensembles for probabilistic forecasting as well as to learn effective model closure in multi-scale systems.
Global numerical weather prediction (NWP) models have begun to resolve the mesoscale k(-5/3) range of the energy spectrum, which is known to impose an inherently finite range of deterministic predictability per se as errors develop more rapidly on these scales than on the larger scales. However, the dynamics of these errors under the influence of the synoptic-scale k(-3) range is little studied. Within a perfect-model context, the present work examines the error growth behavior under such a hybrid spectrum in Lorenz's original model of 1969, and in a series of identical-twin perturbation experiments using an idealized two-dimensional barotropic turbulence model at a range of resolutions. With the typical resolution of today's global NWP ensembles, error growth remains largely uniform across scales. The theoretically expected fast error growth characteristic of a k(-5/3) spectrum is seen to be largely suppressed in the first decade of the mesoscale range by the synoptic-scale k(-3) range. However, it emerges once models become fully able to resolve features on something like a 20-km scale, which corresponds to a grid resolution on the order of a few kilometers.
We present a supervised learning method to learn the propagator map of a dynamical system from partial and noisy observations. In our computationally cheap and easy-to-implement framework, a neural network consisting of random feature maps is trained sequentially by incoming observations within a data assimilation procedure. By employing Takens's embedding theorem, the network is trained on delay coordinates. We show that the combination of random feature maps and data assimilation, called RAFDA, outperforms standard random feature maps for which the dynamics is learned using batch data.
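The random-feature backbone shared by these two abstracts can be sketched as follows: fix a random feature layer and fit only the linear outer weights. For simplicity this sketch trains the weights in batch by ridge regression on a one-dimensional toy map; RAFDA instead learns the weights sequentially from noisy observations inside an ensemble Kalman filter, and all names and parameter values here are illustrative.

```python
import numpy as np

def train_random_feature_model(X, Y, D=300, reg=1e-6, seed=0):
    """Fit Y ~ tanh(X A^T + b) W with fixed random features (A, b) and a
    ridge-regressed outer weight matrix W."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(-2.0, 2.0, size=(D, X.shape[1]))   # fixed random weights
    b = rng.uniform(-1.0, 1.0, size=D)
    Phi = np.tanh(X @ A.T + b)                         # (N, D) feature matrix
    W = np.linalg.solve(Phi.T @ Phi + reg * np.eye(D), Phi.T @ Y)
    return lambda x: np.tanh(x @ A.T + b) @ W

# learn the one-step propagator of the chaotic logistic map x -> 4x(1-x)
rng = np.random.default_rng(3)
x = rng.uniform(0.01, 0.99, 2000)
X, Y = x[:, None], (4.0 * x * (1.0 - x))[:, None]
model = train_random_feature_model(X, Y)
pred = model(np.array([[0.3]]))[0, 0]    # true propagator gives 0.84
```

Because only the outer weights are learned, training reduces to a single linear solve (or, in the sequential setting, to linear Kalman updates), which is what makes the trained surrogate so cheap to run.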