11.1. KOH and BOH discrepancy models
In the following we’ll use the abbreviations KOH and BOH:
- KOH = Kennedy and O'Hagan, Bayesian calibration of computer models
- BOH = Brynjarsdóttir and O'Hagan, Learning about physical parameters: the importance of model discrepancy. This content is particularly important if you are trying to extract the values of physical parameters from modeling data.
Comparisons between theoretical models and experimental data are at the heart of scientific inquiry. Theoretical models guide our understanding of complex systems by translating hypotheses into quantitative predictions that can be tested experimentally. Traditionally, a close fit between a model's predictions and measured data is interpreted as a sign of success, often implying that the model parameters capture the underlying physical processes. However, this paradigm assumes that the model fully represents the complexity of the actual system – an assumption that is rarely justified in practice. All models have inherent limitations and finite domains of validity, and using them beyond these regimes without accounting for theoretical uncertainties can lead to biased parameter estimates, reducing the parameters to mere "fitting variables" rather than meaningful physical quantities. Moreover, discrepancies between certain measurements and otherwise successful models sometimes lead researchers to assign lower weights to those data, thereby diminishing their utility and limiting the insights they can provide.
A model discrepancy framework employing Gaussian processes (GPs) was introduced in KOH, and variations have been explored in many studies. In these approaches, the discrepancy between experimental data and theoretical model predictions, stemming from missing physics or approximations, is modeled using a GP. However, one persistent challenge is the need to constrain the GP's covariance kernel. For example, in BOH, the authors emphasized the importance of incorporating knowledge of the theory's validity at specific points in the input space (i.e., the domain in which observables are measured) so that both the GP and its derivative could be accurately constrained. In practice, however, specifying such precise knowledge about the theory is often difficult.

In this chapter, we construct the GP covariance kernel based only on qualitative prior knowledge of the theory's domain of validity across the input space. This type of knowledge – for example, recognizing that "the theory is more reliable in this regime than in that one" – is typically easier to provide and often available. By leveraging this information, the framework prioritizes the accurate extraction of model parameters rather than simply optimizing the fit to the observables. We perform Bayesian parameter inference to simultaneously estimate both the model parameters and the GP hyperparameters, thereby quantifying uncertainties from both the experimental data and the theoretical model.
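In its basic form, the KOH framework decomposes each observation $y_i$ at input $x_i$ into the theoretical model prediction, a GP discrepancy term, and measurement noise:

$$
y_i = \eta(x_i; \boldsymbol{\theta}) + \delta(x_i) + \varepsilon_i,
\qquad
\delta(x) \sim \mathcal{GP}\bigl(0,\, k(x, x')\bigr),
\qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma_n^2),
$$

where $\eta$ is the theory with physical parameters $\boldsymbol{\theta}$, $\delta$ is the discrepancy GP with covariance kernel $k$, and $\sigma_n$ is the noise scale. Constraining $k$ amounts to stating where, and how strongly, the theory is expected to fail.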
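As a minimal sketch of how qualitative validity information can enter the kernel (illustrative assumptions throughout – the toy `theory`, the `sigma_delta` envelope, and its quadratic form are not the chapter's actual model), one can modulate a standard RBF kernel with an input-dependent amplitude and integrate the discrepancy GP out analytically, leaving a joint log posterior over the physical parameter and the kernel hyperparameters:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

def theory(x, theta):
    """Toy theoretical model with a single physical parameter theta."""
    return theta * x

def sigma_delta(x, c):
    """Validity envelope: discrepancy amplitude grows with x, encoding
    'the theory is more reliable at small x than at large x'."""
    return c * x**2

def kernel(x1, x2, c, ell):
    """RBF kernel modulated by the input-dependent amplitude sigma_delta."""
    sq = (x1[:, None] - x2[None, :]) ** 2
    amp = sigma_delta(x1, c)[:, None] * sigma_delta(x2, c)[None, :]
    return amp * np.exp(-0.5 * sq / ell**2)

def log_marginal_posterior(params, x, y, sigma_n):
    """Log posterior of (theta, log c, log ell) with the discrepancy GP
    integrated out analytically (flat priors omitted for brevity)."""
    theta, log_c, log_ell = params
    c, ell = np.exp(log_c), np.exp(log_ell)
    r = y - theory(x, theta)                    # residuals = discrepancy + noise
    K = kernel(x, x, c, ell) + sigma_n**2 * np.eye(len(x))
    L, low = cho_factor(K, lower=True)
    alpha = cho_solve((L, low), r)
    return (-0.5 * r @ alpha                    # Gaussian data-fit term
            - np.sum(np.log(np.diag(L)))        # -0.5 * log|K|
            - 0.5 * len(x) * np.log(2 * np.pi))

# Example usage with synthetic data whose "truth" contains missing physics:
rng = np.random.default_rng(0)
x = np.linspace(0.1, 1.0, 20)
y = 1.5 * x + 0.1 * x**2 + rng.normal(0, 0.01, x.size)
print(log_marginal_posterior(np.array([1.5, np.log(0.1), np.log(0.5)]), x, y, 0.01))
```

The returned log posterior can then be handed to any MCMC sampler (for example `emcee`) to obtain joint posteriors for $\theta$, $c$, and $\ell$, realizing the simultaneous inference over model parameters and GP hyperparameters described above.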