"Thinking inside the box": using derivatives to improve Bayesian black box emulation of computer simulators with application to compartmental models
Science increasingly relies on complex numerical models to aid understanding of physical phenomena. The equations in such models often contain a large number of poorly known parameters, so that the resulting output carries considerable uncertainty. A 'computer simulator', which comprises the model equations together with a solver routine, produces a solution for a given choice of these 'input' parameters. When the dimension of the input parameter space is high, we can only hope to obtain a thin coverage of the space by running the simulator. Building a representation of the simulator output as a function of the input is therefore a statistical problem: we observe output at a collection of input choices and, based on these observations, infer output values for unseen inputs about which we are uncertain. In a Bayesian context, this representation, termed the 'emulator', encodes our beliefs about the relationships between inputs and outputs. Our interest is in exploiting the structure of compartmental models to aid this process. Compartmental models are widely used to model systems for which there are no fundamental equations describing the processes of interest. We show that the structure of such models enables us to generate additional function information efficiently, in the form of derivatives of the output with respect to the inputs, each time we run the simulator, and we adapt the emulator methodology to allow for derivatives. We show that considering derivatives offers a range of natural ways to aid assessment of prior beliefs, and that updating based on derivatives can lead to a substantial reduction in emulator uncertainty. In addition, we show that the model structure allows us to derive estimates of the increased cost of generating derivatives, which we can compare against the corresponding reduction in uncertainty.
We are motivated throughout by the problem of calibrating a compartmental model of plankton cycles at multiple locations in the sea, and we show that a knock-on effect of the reduction in uncertainty due to derivatives is an improvement in our ability to perform this calibration. The search for a model that can accurately reproduce plankton cycles at various physical locations is thought, if successful, to have significant ramifications for understanding climate change.
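As an illustration of how derivatives with respect to the inputs can be generated alongside each simulator run, the following is a minimal sketch and not the method developed in this work. It assumes a hypothetical two-compartment linear model with rate parameters k1 and k2 (names and values chosen purely for illustration) and augments the model equations with the standard forward sensitivity equations, integrated here with a fixed-step fourth-order Runge-Kutta scheme.

```python
import numpy as np


def augmented_rhs(state, theta):
    """Right-hand side of the toy model together with its sensitivities.

    Hypothetical model (an assumption, not taken from the source):
        dx1/dt = -k1 * x1
        dx2/dt =  k1 * x1 - k2 * x2
    S[i, j] holds d x_i / d theta_j and obeys dS/dt = J S + dF/dtheta.
    """
    k1, k2 = theta
    x = state[:2]
    S = state[2:].reshape(2, 2)
    dx = np.array([-k1 * x[0], k1 * x[0] - k2 * x[1]])
    # Jacobian of the model equations with respect to the state.
    J = np.array([[-k1, 0.0], [k1, -k2]])
    # Partial derivatives of the model equations with respect to (k1, k2).
    dF_dtheta = np.array([[-x[0], 0.0], [x[0], -x[1]]])
    dS = J @ S + dF_dtheta
    return np.concatenate([dx, dS.ravel()])


def run_simulator(theta, x0=(1.0, 0.0), t_end=5.0, n_steps=500):
    """Return the output x(t_end) and its derivatives with respect to theta."""
    # Initial conditions do not depend on theta, so sensitivities start at zero.
    state = np.concatenate([np.asarray(x0, dtype=float), np.zeros(4)])
    h = t_end / n_steps
    for _ in range(n_steps):
        # Classical fourth-order Runge-Kutta step on the augmented system.
        s1 = augmented_rhs(state, theta)
        s2 = augmented_rhs(state + 0.5 * h * s1, theta)
        s3 = augmented_rhs(state + 0.5 * h * s2, theta)
        s4 = augmented_rhs(state + h * s3, theta)
        state = state + (h / 6.0) * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
    return state[:2], state[2:].reshape(2, 2)


# Illustrative input choice; one run yields two outputs and four derivatives.
output, derivatives = run_simulator((0.8, 0.3))
```

In this sketch a single run integrates six equations instead of two, which gives a rough sense of how the extra cost of derivatives might be weighed against the additional information they provide to an emulator.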