Title: P-spline additive modeling and partial derivative estimation for environmental data
Author: Vazanellis, George
ISNI: 0000 0004 8503 3396
Awarding Body: University of Glasgow
Current Institution: University of Glasgow
Date of Award: 2020
This thesis addresses the construction of complex additive mixed models for environmental data and the use of those models to estimate partial derivatives for the purpose of detecting impacts of known events. The methods developed are applied to a data set collected by the Scottish Environment Protection Agency in an effort to monitor the dissolved oxygen of the River Clyde. Many metrics are recorded along the river. Exploratory analysis is carried out to pinpoint some possible drivers of the dissolved oxygen. The River Clyde contains processes which are difficult to represent by conventional parametric models. P-splines offer a means of fitting a flexible model to this data set. Some explanatory covariates may also interact. Because of the sampling regime, a random effects component is appropriate. An additive mixed model with interactions allows all of the above-mentioned components to be included in a representative model for the River Run data. The methodology for fitting such a model is explained in this thesis, along with descriptions of four information criteria which are intended to aid in smoothing parameter selection. Two options for performing analysis of variance for additive models with interactions are considered: a simple F-test and a quadratic forms approach. The performance and computational expense of each are compared to a parametric bootstrap and to various other standard tests. A simple additive model with no interactions is initially fitted with varying degrees of freedom for each main effect. Scores for the four information criteria are calculated for every main effect across all degrees of freedom. The information criterion which performs best is then used to select the optimal smoothing parameter for every main effect in an additive model and an additive mixed model, both with no interactions.
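The smoothing parameter selection step described above can be illustrated with a minimal, univariate sketch. It is not the thesis's implementation: the record does not name the four information criteria, so AIC, BIC and GCV are used here purely as common stand-ins, and the function names (`bspline_basis`, `pspline_criteria`), the simulated data and the grid of smoothing parameters are all assumptions made for illustration.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg=20, deg=3):
    """Equally spaced B-spline basis on [xl, xr] (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    h = (xr - xl) / nseg
    knots = xl + h * np.arange(-deg, nseg + deg + 1)
    # Degree 0: indicator of each knot interval.
    B = ((x[:, None] >= knots[:-1]) & (x[:, None] < knots[1:])).astype(float)
    for k in range(1, deg + 1):
        left = (x[:, None] - knots[:-k - 1]) / (k * h)
        right = (knots[k + 1:] - x[:, None]) / (k * h)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B  # shape (len(x), nseg + deg)

def pspline_criteria(y, B, lam, pord=2):
    """Fit a P-spline for one smoothing parameter and score the fit.

    AIC, BIC and GCV are illustrative choices, not necessarily the
    criteria compared in the thesis.
    """
    n, m = B.shape
    D = np.diff(np.eye(m), n=pord, axis=0)       # difference penalty matrix
    A = B.T @ B + lam * D.T @ D
    coef = np.linalg.solve(A, B.T @ y)
    edf = np.trace(np.linalg.solve(A, B.T @ B))  # trace of the hat matrix
    rss = float(np.sum((y - B @ coef) ** 2))
    return {
        "edf": edf,
        "AIC": n * np.log(rss / n) + 2.0 * edf,
        "BIC": n * np.log(rss / n) + np.log(n) * edf,
        "GCV": n * rss / (n - edf) ** 2,
    }

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x) + rng.normal(scale=0.2, size=x.size)
B = bspline_basis(x, 0.0, 1.0)

# Score every candidate smoothing parameter and keep the best one.
lams = np.logspace(-4, 4, 17)
best_lam = min(lams, key=lambda lam: pspline_criteria(y, B, lam)["BIC"])
print(best_lam, pspline_criteria(y, B, best_lam)["edf"])
```

The effective degrees of freedom fall from the number of basis functions towards the penalty order as the smoothing parameter grows, which is why a criterion can be scored either across smoothing parameters or across degrees of freedom.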
Before an additive mixed model with interactions is fitted, a simulation study is conducted to see whether the order in which the main effect degrees of freedom are optimized is of any importance. An additive mixed model with interactions is subsequently fitted and interpreted. One aim of this thesis is to determine whether upgrades to two wastewater treatment facilities have had positive impacts on the levels of dissolved oxygen in the river. Partial derivatives with respect to time are discussed as a means of detecting subtle changes in a system which has shown gradual increases in dissolved oxygen over the past four decades. An argument is made for the use of P-splines with penalty orders other than 2 if the main goal is derivative estimation. A simulation study is conducted and the optimal penalty order is then used to construct a derivative additive mixed model with interactions for the River Run data. This model is used to see whether there is evidence that the wastewater facility upgrades had a positive impact. One positive result of this research is that the quadratic forms method of analysis of variance for additive models with interactions was found to outperform the simple F-test and was less computationally expensive than the parametric bootstrap. A second positive result was finding a preferred information criterion for smoothing parameter selection and using the optimal degrees of freedom to subsequently fit such a complex additive mixed model with interactions. A third positive result was finding that penalty order three outperformed penalty order two in estimating partial derivatives. Finally, the fourth positive result was constructing a derivative model and subsequently using it to provide evidence that the wastewater treatment facility upgrades had a positive impact on the dissolved oxygen.
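The idea of estimating a derivative from a P-spline fit, and of varying the penalty order when doing so, can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the derivative additive mixed model of the thesis: it uses the standard property that, for equally spaced knots with spacing h, the derivative of a B-spline curve of degree q is a B-spline curve of degree q - 1 whose coefficients are the first differences of the original coefficients divided by h. The function names, the default `pord=3` and the test curve are hypothetical.

```python
import numpy as np

def bspline_basis(x, xl, xr, nseg=20, deg=3):
    """Equally spaced B-spline basis on [xl, xr] (Cox-de Boor recursion)."""
    x = np.asarray(x, dtype=float)
    h = (xr - xl) / nseg
    knots = xl + h * np.arange(-deg, nseg + deg + 1)
    # Degree 0: indicator of each knot interval.
    B = ((x[:, None] >= knots[:-1]) & (x[:, None] < knots[1:])).astype(float)
    for k in range(1, deg + 1):
        left = (x[:, None] - knots[:-k - 1]) / (k * h)
        right = (knots[k + 1:] - x[:, None]) / (k * h)
        B = left * B[:, :-1] + right * B[:, 1:]
    return B  # shape (len(x), nseg + deg)

def fit_pspline(x, y, xl, xr, nseg=20, deg=3, pord=3, lam=1.0):
    """Penalized least squares with a difference penalty of order `pord`."""
    B = bspline_basis(x, xl, xr, nseg, deg)
    D = np.diff(np.eye(B.shape[1]), n=pord, axis=0)
    return np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)

def pspline_derivative(x, coef, xl, xr, nseg=20, deg=3):
    """First derivative of the fitted curve, evaluated at x."""
    h = (xr - xl) / nseg
    B_lower = bspline_basis(x, xl, xr, nseg, deg - 1)  # one degree lower
    return B_lower @ (np.diff(coef) / h)               # differenced coefficients

# Hypothetical example: a noise-free quadratic. A third-order penalty
# leaves quadratics unpenalized, so the derivative 2x should be recovered
# up to numerical error.
x = np.linspace(0.0, 1.0, 200)
y = x ** 2
coef = fit_pspline(x, y, 0.0, 1.0, pord=3, lam=100.0)
dydx = pspline_derivative(x, coef, 0.0, 1.0)
print(np.max(np.abs(dydx - 2.0 * x)))
```

Changing `pord` changes which polynomial trend is left unpenalized as the smoothing parameter grows: order 2 shrinks the fit towards a straight line, whose derivative is constant, while order 3 shrinks it towards a quadratic, whose derivative is linear.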
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
Keywords: HA Statistics ; QA Mathematics