Title: Investigation of over-fitting and optimism in prognostic models
Author: Richardson, Matthew
ISNI: 0000 0004 0123 4255
Awarding Body: University of Birmingham
Current Institution: University of Birmingham
Date of Award: 2010
This work seeks to develop a high-quality prognostic model for the CARE-HF data (see Richardson et al. 2007). The CARE-HF trial was a major study into the effects of cardiac resynchronization, which has been shown to reduce mortality in patients suffering from heart failure due to electrical problems in the heart. The prognostic model presented in this work was motivated by the question of which patient characteristics may modify the effect of cardiac resynchronization, a question of great importance to clinicians. Efforts are made to produce a high-quality prognostic model, in part by applying methods that reduce the risk of over-fitting. One such method is the strategy proposed by Frank Harrell Jr. The various aspects of Harrell’s approach are discussed, and an attempt is made to extend the strategy to frailty models. Key issues such as missing data and imputation, specification of the functional form of the model, and validation are examined in relation to the prognostic model for the CARE-HF data. Material is presented covering survival analysis, maximum likelihood methods, model selection criteria (AIC, BIC), specification of functional form (cubic splines and fractional polynomials) and validation methods (cross-validation, bootstrap methods). The concepts of over-fitting and optimism are examined. The author concludes that, whilst Harrell’s strategy is valuable, it is still quite possible to produce models that are over-fitted. MDL (Minimum Description Length) is suggested as a potentially useful method for obtaining statistical models that have a built-in resistance to over-fitting. The author also recommends that concepts such as over-fitting, optimism and model validation be introduced earlier, in more elementary courses on statistical modelling.
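As a rough illustration of the optimism concept named in the abstract, the sketch below estimates bootstrap optimism for a toy straight-line regression. It is not taken from the thesis: the simulated data, the choice of mean squared error as the performance measure, and the function names `fit_line`, `mse` and `bootstrap_optimism` are all assumptions made for this example, and a real prognostic model for survival data would use a different model and measure.

```python
import random


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0  # guard against a degenerate resample
    return my - b * mx, b


def mse(model, xs, ys):
    """Mean squared prediction error of a fitted line on (xs, ys)."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def bootstrap_optimism(xs, ys, n_boot=500, seed=0):
    """Return (apparent error, estimated optimism, corrected error)."""
    rng = random.Random(seed)
    n = len(xs)
    # Apparent error: the model is evaluated on the data used to fit it.
    apparent = mse(fit_line(xs, ys), xs, ys)
    gaps = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        model = fit_line(bx, by)
        # Error on the original data minus error on the resample it was fit to.
        gaps.append(mse(model, xs, ys) - mse(model, bx, by))
    optimism = sum(gaps) / n_boot
    # For an error measure, the correction adds the optimism back on.
    return apparent, optimism, apparent + optimism


# Hypothetical simulated data: a weak linear signal plus noise.
data_rng = random.Random(1)
xs = [data_rng.gauss(0, 1) for _ in range(20)]
ys = [0.5 * x + data_rng.gauss(0, 1) for x in xs]

apparent, optimism, corrected = bootstrap_optimism(xs, ys)
print(f"apparent MSE:  {apparent:.3f}")
print(f"optimism:      {optimism:.3f}")
print(f"corrected MSE: {corrected:.3f}")
```

The apparent error flatters the model because the same observations are used for fitting and for evaluation; the bootstrap estimate of that gap is added back to give a more honest figure. With more candidate predictors or fewer observations the gap would typically grow, which is the over-fitting risk the abstract describes.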
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:  DOI: Not available
Keywords: R Medicine (General) ; RC Internal medicine