Title:

Using derivative information in the statistical analysis of computer models

Complex deterministic models are an important tool for studying a wide range of systems. Often, though, such models are too computationally expensive to perform the many runs required. In this case one option is to build a Gaussian process emulator, which acts as a surrogate and enables fast prediction of the model output at specified input configurations. Derivative information may be available, either by running an appropriate adjoint model or as a result of some previously performed analysis. An emulator would likely benefit from the inclusion of this derivative information. Whether further efficiency is achieved, however, depends on the relation between the computational cost of obtaining the derivatives and the value of the derivative information in the emulator. In our examples, derivatives are more valuable in models with shorter correlation lengths, and emulators without derivatives generally require twice as many model runs as emulators with derivatives to produce similar predictive performance. We conclude that an optimal solution is likely to be a hybrid design consisting of adjoint runs in some parts of the input space and standard model runs in others.

Knowledge of the derivatives of complex models can add greatly to their utility, for example in sensitivity analysis or data assimilation. One way of generating such derivatives, as suggested above, is by coding an adjoint model. Despite the availability of automatic differentiation software, this remains a complex task, and the adjoint model, once written, is computationally more demanding to run. We suggest an alternative method for generating partial derivatives of complex model output with respect to model inputs. We propose the use of a Gaussian process emulator which, as long as the model is suitable for emulation, can be used to estimate derivatives even when no derivative information is known a priori.
We present encouraging results which show how an emulator of derivatives could reduce the demand for writing and running adjoint models. This is done with the use of both toy models and the climate model CGOLDSTEIN.
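
As a rough illustration of the idea of estimating derivatives from an emulator built only on model outputs, the following minimal sketch fits a zero-mean Gaussian process to a few runs of a toy function and differentiates the posterior mean analytically. The squared-exponential covariance, the fixed correlation length, the small nugget, and the sine function standing in for the model are all assumptions made for this example; they are not taken from the work summarised above, and the names are hypothetical.

```python
import numpy as np

def sq_exp_kernel(x1, x2, length_scale):
    """Squared-exponential covariance between 1-D input arrays (assumed form)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def fit_emulator(x_train, y_train, length_scale, nugget=1e-8):
    """Return alpha = (K + nugget*I)^{-1} y for a zero-mean GP emulator."""
    K = sq_exp_kernel(x_train, x_train, length_scale)
    K += nugget * np.eye(len(x_train))  # small nugget for numerical stability
    return np.linalg.solve(K, y_train)

def predict_mean(x_new, x_train, alpha, length_scale):
    """Posterior mean of the model output at x_new."""
    return sq_exp_kernel(x_new, x_train, length_scale) @ alpha

def predict_derivative(x_new, x_train, alpha, length_scale):
    """Posterior mean of d(output)/dx, from differentiating the kernel in x_new."""
    diff = x_new[:, None] - x_train[None, :]
    dK = -(diff / length_scale**2) * sq_exp_kernel(x_new, x_train, length_scale)
    return dK @ alpha

# Toy "model": only function values are used for training, no derivatives.
x_train = np.linspace(0.0, 2.0 * np.pi, 12)
y_train = np.sin(x_train)
length_scale = 1.0  # assumed correlation length, fixed rather than estimated

alpha = fit_emulator(x_train, y_train, length_scale)

x_test = np.linspace(0.5, 5.5, 50)
deriv_est = predict_derivative(x_test, x_train, alpha, length_scale)

# Compare with the known derivative of the toy model.
max_err = np.max(np.abs(deriv_est - np.cos(x_test)))
print("max abs error in emulated derivative:", max_err)
```

In practice the correlation length would be estimated from the training runs rather than fixed, and the quality of the derivative estimates would depend on how well the emulator represents the model in the first place.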
