Refine
Language
- English (17)
Is part of the Bibliography
- yes (17)
Keywords
- catchment (3)
- floods (3)
- inference (3)
- machine learning (3)
- Bayesianism (2)
- aquifer (2)
- climate change impacts (2)
- coupled surface (2)
- data science (2)
- dynamics (2)
Dimension reduction for integrating data series in Bayesian inversion of geostatistical models
(2019)
This study explores methods with which multidimensional data, e.g. time series, can be effectively incorporated into a Bayesian framework for inferring geostatistical parameters. Such series are difficult to use directly in the likelihood estimation procedure due to their high dimensionality; thus, a dimension reduction approach is taken to utilize these measurements in the inference. Two synthetic scenarios from hydrology are explored in which pumping drawdown and concentration breakthrough curves are used to infer the global mean of a log-normally distributed hydraulic conductivity field. Both cases pursue the use of a parametric model to represent the shape of the observed time series with physically-interpretable parameters (e.g. the time and magnitude of a concentration peak), which is compared to subsets of the observations with similar dimensionality. The results from both scenarios highlight the effectiveness for the shape-matching models to reduce dimensionality from 100+ dimensions down to less than five. The models outperform the alternative subset method, especially when the observations are noisy. This approach to incorporating time series observations in the Bayesian framework for inferring geostatistical parameters allows for high-dimensional observations to be faithfully represented in lower-dimensional space for the non-parametric likelihood estimation procedure, which increases the applicability of the framework to more observation types. Although the scenarios are both from hydrogeology, the methodology is general in that no assumptions are made about the subject domain. Any application that requires the inference of geostatistical parameters using series in either time of space can use the approach described in this paper.
High-performance numerical codes are an indispensable tool for hydrogeologists when modeling subsurface flow and transport systems. But as they are written in compiled languages, like C/C++ or Fortran, established software packages are rarely user-friendly, limiting a wider adoption of such tools. OpenGeoSys (OGS), an open-source, finite-element solver for thermo-hydro-mechanical-chemical processes in porous and fractured media, is no exception. Graphical user interfaces may increase usability, but do so at a dramatic reduction of flexibility and are difficult or impossible to integrate into a larger workflow. Python offers an optimal trade-off between these goals by providing a highly flexible, yet comparatively user-friendly environment for software applications. Hence, we introduceogs5py, a Python-API for the OpenGeoSys 5 scientific modeling package. It provides a fully Python-based representation of an OGS project, a large array of convenience functions for users to interact with OGS and connects OGS to the scientific and computational environment of Python.
Geostatistics as a subfield of statistics accounts for the spatial correlations encountered in many applications of, for example, earth sciences. Valuable information can be extracted from these correlations, also helping to address the often encountered burden of data scarcity. Despite the value of additional data, the use of geostatistics still falls short of its potential. This problem is often connected to the lack of user-friendly software hampering the use and application of geostatistics. We therefore present GSTools, a Python-based software suite for solving a wide range of geostatistical problems. We chose Python due to its unique balance between usability, flexibility, and efficiency and due to its adoption in the scientific community. GSTools provides methods for generating random fields; it can perform kriging, variogram estimation and much more. We demonstrate its abilities by virtue of a series of example applications detailing their use.
The subsurface is a temporally dynamic and spatially heterogeneous compartment of the Earth's critical zone, and biogeochemical transformations taking place in this compartment are crucial for the cycling of nutrients.
The impact of spatial heterogeneity on such microbially mediated nutrient cycling is not well known, which imposes a severe challenge in the prediction of in situ biogeochemical transformation rates and further of nutrient loading contributed by the groundwater to the surface water bodies.
Therefore, we used a numerical modelling approach to evaluate the sensitivity of groundwater microbial biomass distribution and nutrient cycling to spatial heterogeneity in different scenarios accounting for various residence times.
The model results gave us an insight into domain characteristics with respect to the presence of oxic niches in predominantly anoxic zones and vice versa depending on the extent of spatial heterogeneity and the flow regime.
The obtained results show that microbial abundance, distribution, and activity are sensitive to the applied flow regime and that the mobile (i.e. observable by groundwater sampling) fraction of microbial biomass is a varying, yet only a small, fraction of the total biomass in a domain. Furthermore, spatial heterogeneity resulted in anaerobic niches in the domain and shifts in microbial biomass between active and inactive states. The lack of consideration of spatial heterogeneity, thus, can result in inaccurate estimation of microbial activity. In most cases this leads to an overestimation of nutrient removal (up to twice the actual amount) along a flow path.
We conclude that the governing factors for evaluating this are the residence time of solutes and the Damkohler number (Da) of the biogeochemical reactions in the domain. We propose a relationship to scale the impact of spatial heterogeneity on nutrient removal governed by the logioDa.
This relationship may be applied in upscaled descriptions of microbially mediated nutrient cycling dynamics in the subsurface thereby resulting in more accurate predictions of, for example, carbon and nitrogen cycling in groundwater over long periods at the catchment scale.
The fluxes of water and solutes in the subsurface compartment of the Critical Zone are temporally dynamic and it is unclear how this impacts microbial mediated nutrient cycling in the spatially heterogeneous subsurface. To investigate this, we undertook numerical modeling, simulating the transport in a wide range of spatially heterogeneous domains, and the biogeochemical transformation of organic carbon and nitrogen compounds using a complex microbial community with four (4) distinct functional groups, in water saturated subsurface compartments. We performed a comprehensive uncertainty analysis accounting for varying residence times and spatial heterogeneity. While the aggregated removal of chemical species in the domains over the entire simulation period was approximately the same as that in steady state conditions, the sub-scale temporal variation of microbial biomass and chemical discharge from a domain depended strongly on the interplay of spatial heterogeneity and temporal dynamics of the forcing. We showed that the travel time and the Damkohler number (Da) can be used to predict the temporally varying chemical discharge from a spatially heterogeneous domain. In homogeneous domains, chemical discharge in temporally dynamic conditions could be double of that in the steady state conditions while microbial biomass varied up to 75% of that in steady state conditions. In heterogeneous domains, the interquartile range of uncertainty in chemical discharge in reaction dominated systems (log(10)Da > 0) was double of that in steady state conditions. However, high heterogeneous domains resulted in outliers where chemical discharge could be as high as 10-20 times of that in steady state conditions in high flow periods. And in transport dominated systems (log(10)Da < 0), the chemical discharge could be half of that in steady state conditions in unusually low flow conditions. In conclusion, ignoring spatio-temporal heterogeneities in a numerical modeling approach may exacerbate inaccurate estimation of nutrient export and microbial biomass. The results are relevant to long-term field monitoring studies, and for homogeneous soil column-scale experiments investigating the role of temporal dynamics on microbial redox dynamics.
Stochastic modeling is a common practice for modeling uncertainty in hydrogeology. In stochastic modeling, aquifer properties are characterized by their probability density functions (PDFs). The Bayesian approach for inverse modeling is often used to assimilate information from field measurements collected at a site into properties’ posterior PDFs. This necessitates the definition of a prior PDF, characterizing the knowledge of hydrological properties before undertaking any investigation at the site, and usually coming from previous studies at similar sites. In this paper, we introduce a Bayesian hierarchical algorithm capable of assimilating various information–like point measurements, bounds and moments–into a single, informative PDF that we call ex-situ prior. This informative PDF summarizes the ex-situ information available about a hydrogeological parameter at a site of interest, which can then be used as a prior PDF in future studies at the site. We demonstrate the behavior of the algorithm on several synthetic case studies, compare it to other methods described in the literature, and illustrate the approach by applying it to a public open-access hydrogeological dataset.
Most large-scale hydrologic models fall short in reproducing groundwater head dynamics and simulating transport process due to their oversimplified representation of groundwater flow. In this study, we aim to extend the applicability of the mesoscale Hydrologic Model (mHM v5.7) to subsurface hydrology by coupling it with the porous media simulator OpenGeoSys (OGS). The two models are one-way coupled through model interfaces GIS2FEM and RIV2FEM, by which the grid-based fluxes of groundwater recharge and the river-groundwater exchange generated by mHM are converted to fixed-flux boundary conditions of the groundwater model OGS. Specifically, the grid-based vertical reservoirs in mHM are completely preserved for the estimation of land-surface fluxes, while OGS acts as a plug-in to the original mHM modeling framework for groundwater flow and transport modeling. The applicability of the coupled model (mHM-OGS v1.0) is evaluated by a case study in the central European mesoscale river basin - Nagelstedt. Different time steps, i.e., daily in mHM and monthly in OGS, are used to account for fast surface flow and slow groundwater flow. Model calibration is conducted following a two-step procedure using discharge for mHM and long-term mean of groundwater head measurements for OGS. Based on the model summary statistics, namely the Nash-Sutcliffe model efficiency (NSE), the mean absolute error (MAE), and the interquartile range error (QRE), the coupled model is able to satisfactorily represent the dynamics of discharge and groundwater heads at several locations across the study basin. Our exemplary calculations show that the one-way coupled model can take advantage of the spatially explicit modeling capabilities of surface and groundwater hydrologic models and provide an adequate representation of the spatiotemporal behaviors of groundwater storage and heads, thus making it a valuable tool for addressing water resources and management problems.
Groundwater is the biggest single source of high-quality freshwater worldwide, which is also continuously threatened by the changing climate. In this paper, we investigate the response of the regional groundwater system to climate change under three global warming levels (1.5, 2, and 3 ∘C) in a central German basin (Nägelstedt). This investigation is conducted by deploying an integrated modeling workflow that consists of a mesoscale hydrologic model (mHM) and a fully distributed groundwater model, OpenGeoSys (OGS). mHM is forced with climate simulations of five general circulation models under three representative concentration pathways. The diffuse recharges estimated by mHM are used as boundary forcings to the OGS groundwater model to compute changes in groundwater levels and travel time distributions. Simulation results indicate that groundwater recharges and levels are expected to increase slightly under future climate scenarios. Meanwhile, the mean travel time is expected to decrease compared to the historical average. However, the ensemble simulations do not all agree on the sign of relative change. Changes in mean travel time exhibit a larger variability than those in groundwater levels. The ensemble simulations do not show a systematic relationship between the projected change (in both groundwater levels and travel times) and the warming level, but they indicate an increased variability in projected changes with adjusting the enhanced warming level from 1.5 to 3 ∘C. Correspondingly, it is highly recommended to restrain the trend of global warming.
Machine learning (ML) algorithms are being increasingly used in Earth and Environmental modeling studies owing to the ever-increasing availability of diverse data sets and computational resources as well as advancement in ML algorithms. Despite advances in their predictive accuracy, the usefulness of ML algorithms for inference remains elusive. In this study, we employ two popular ML algorithms, artificial neural networks and random forest, to analyze a large data set of flood events across Germany with the goals to analyze their predictive accuracy and their usability to provide insights to hydrologic system functioning. The results of the ML algorithms are contrasted against a parametric approach based on multiple linear regression. For analysis, we employ a model-agnostic framework named Permuted Feature Importance to derive the influence of models' predictors. This allows us to compare the results of different algorithms for the first time in the context of hydrology. Our main findings are that (1) the ML models achieve higher prediction accuracy than linear regression, (2) the results reflect basic hydrological principles, but (3) further inference is hindered by the heterogeneity of results across algorithms. Thus, we conclude that the problem of equifinality as known from classical hydrological modeling also exists for ML and severely hampers its potential for inference. To account for the observed problems, we propose that when employing ML for inference, this should be made by using multiple algorithms and multiple methods, of which the latter should be embedded in a cross-validation routine.