Title: Optimal and sequential design for bridge regression with application in organic chemistry
Author: Carnaby, Sarah
ISNI: 0000 0004 2703 5596
Awarding Body: University of Southampton
Current Institution: University of Southampton
Date of Award: 2011
This thesis presents and applies methods for the design and analysis of experiments for a family of coefficient shrinkage methods, known collectively as bridge regression, with emphasis on the two special cases of ridge regression and the lasso. The application is the problem of understanding and predicting the melting point of small-molecule organic compounds using chemical descriptors. Such experiments typically have a large number of predictors relative to the number of observations, and high correlations between pairs of predictors. In this thesis, bridge regression is used to select linear models, which are then compared with models selected by more commonly used methods of variable selection, such as subset selection and stepwise selection. Models including two-way product, or interaction, terms are also considered. A general method is developed for the selection of an optimal design when accurate estimates of the model coefficients are required. The method exploits a relationship between bridge regression and Bayesian methods, which is used to develop a class of D-optimal designs. A necessary approximation to the variance-covariance matrix of the coefficient estimators is derived. Designs are found using algorithmic search for ridge regression and the lasso, for experiments with (a) two-level factors and (b) the motivating chemistry problem. Comparisons are made with alternative designs. A sequential design criterion is developed to enhance an existing design. The criterion selects, from a finite set of candidate points, the additional design points that exhibit the highest estimated prediction variance obtained from bootstrapping. The method is applied to the Bayesian D-optimal designs and is shown to be capable of improving design performance through the addition of only a small number of runs.
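For readers unfamiliar with the term, the bridge family named in the abstract is usually written as a penalized least-squares problem. The form below is the standard textbook statement, not a formula taken from this record; the symbols (y_i, x_ij, beta_j, lambda, gamma) are notation assumed here for illustration, and the thesis may use a different parameterization.

```latex
\hat{\beta}^{\mathrm{bridge}}
  = \arg\min_{\beta}
    \left\{ \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert^{\gamma} \right\},
  \qquad \lambda \ge 0,\ \gamma > 0 .
```

Under this notation, setting gamma = 2 gives ridge regression and gamma = 1 gives the lasso, the two special cases the abstract emphasizes.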
Supervisor: Woods, David
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: QA Mathematics