Title: Working with collinearity in epidemiology : development of collinearity diagnostics, identifying latent constructs in exploratory research and dealing with perfectly collinear variables in regression
Author: Woolston, Andrew Stephen
ISNI: 0000 0004 2723 0244
Awarding Body: University of Leeds
Current Institution: University of Leeds
Date of Award: 2012
Collinearity plays an integral role in regression studies involving epidemiological data. Variables often form part of a common biological mechanism or measure the same element of a latent structure. Because collinearity is a natural feature of most data, it is rarely possible to control for it physically during data collection, so the focus falls on the analytical assessment of the data. Departures from independence can severely distort the interpretation of a model and the role of each covariate. They lead to increased inaccuracy, as expressed through the regression coefficients, and to increased uncertainty, as expressed through the coefficient standard errors. This has the potential to affect the clinical conclusions drawn from regression studies. The work in this thesis first assesses the impact of collinearity on model parameters and on the conclusions formed. A new collinearity index is developed which incorporates the role of the response in moderating the impact of collinearity. The idea for the new index is developed using vector geometry and extended to a general measure. The work on collinearity is then extended to consider the formation of a dependency structure from a collection of collinear variables. A novel methodology, labelled the matroid approach, is coded and applied to a metabolic syndrome dataset to extract a latent structure that could represent this clinical construct. Comparisons are subsequently made with existing exploratory factor analysis and clustering methods in the literature. Finally, the unique problem of perfect collinearity is considered in a lifecourse and age-period-cohort setting. The justification of constraint and non-constraint regression methods is considered in an attempt to provide ‘solutions’ to the identification problem generated by collinearity.
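The two situations the abstract describes, near collinearity that inflates uncertainty and perfect collinearity that prevents identification, can be illustrated with a minimal sketch. It uses the standard variance inflation factor (VIF), not the new collinearity index developed in the thesis, and it runs on simulated data. The function name `vif`, the variable names, and the simulated values are all assumptions made for illustration only.

```python
import numpy as np

def vif(X):
    """Standard variance inflation factor for each column of X (n x p, no intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(0)
n = 500

# Near collinearity (simulated): x2 is x1 plus a little noise, x3 is unrelated.
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
x3 = rng.normal(size=n)
vifs = vif(np.column_stack([x1, x2, x3]))

# Perfect collinearity (simulated): cohort = period - age by definition,
# so the design matrix with all three terms loses one rank.
age = rng.integers(20, 80, size=n)
period = rng.integers(1950, 2010, size=n)
cohort = period - age
apc = np.column_stack([np.ones(n), age, period, cohort])
apc_rank = np.linalg.matrix_rank(apc)

print("VIFs:", vifs)
print("APC design columns:", apc.shape[1], "rank:", apc_rank)
```

In this sketch the VIFs for `x1` and `x2` are large while the VIF for `x3` stays close to 1, and the age-period-cohort design matrix has four columns but rank three, which is the identification problem in its simplest linear form.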
Supervisor: Gilthorpe, M.; Tu, Y. K.; Baxter, P.
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available