Title: Exploring nonlinear regression methods, with application to association studies
Author: Speed, Douglas Christopher
ISNI: 0000 0004 2712 2761
Awarding Body: University of Cambridge
Current Institution: University of Cambridge
Date of Award: 2011
Availability of Full Text: Full text unavailable from EThOS.
The field of nonlinear regression is a long way from reaching a consensus. Once a method decides to explore nonlinear combinations of predictors, a number of questions are raised, such as what nonlinear combinations to permit and how best to search the resulting model space. Genetic Association Studies comprise an area that stands to gain greatly from the development of more sophisticated regression methods. While these studies' ability to interrogate the genome has advanced rapidly over recent years, it is thought that a lack of suitable regression tools prevents them from achieving their full potential. I have tried to investigate the area of regression in a methodical manner. In Chapter 1, I explain the regression problem and outline existing methods. I observe that both linear and nonlinear methods can be categorised according to the restrictions enforced by their underlying model assumptions and speculate that a method with as few restrictions as possible might prove more powerful. In order to design such a method, I begin by assuming each predictor is tertiary (takes no more than three distinct values). In Chapters 2 and 3, I propose the method Sparse Partitioning. Its name derives from the way it searches for high scoring partitions of the predictor set, where each partition defines groups of predictors that jointly contribute towards the response. A sparsity assumption supposes most predictors belong in the 'null group' indicating they have no effect on the outcome. In Chapter 4, I compare the performance of Sparse Partitioning to existing methods using simulated and real data. The results highlight how greatly a method's power depends on the validity of its model assumptions. For this reason, Sparse Partitioning appears to offer a robust alternative to current methods, as its lack of restrictions allows it to maintain power in scenarios where other methods will fail. 
Sparse Partitioning relies on Markov chain Monte Carlo estimation, which limits the size of problem on which it can be used. Therefore, in Chapter 5, I propose a deterministic version of the method which, although less powerful, is not affected by convergence issues. In Chapter 6, I describe Bayesian Projection Pursuit, which adds spline fitting into the method to cope with non-tertiary predictors.
Supervisor: Tavaré, Simon
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
Keywords: Nonlinear regression ; Association studies ; Bayesian ; Statistical genetics