Title: Analysis of the impact of mutations and prediction of their pathogenicity
Author: Al-Numair, N. S.
ISNI: 0000 0004 5358 306X
Awarding Body: University College London (University of London)
Current Institution: University College London (University of London)
Date of Award: 2014
Availability of Full Text: Full text unavailable from EThOS.
Inherited diseases and cancer are often characterized by single DNA base mutations that can result in altered gene expression, altered mRNA splicing, or changes to the protein structure. The effects of the last category on protein function, and how these relate to disease, are the easiest of these to understand. Pathogenic deviations (PDs) are mutations reported to be disease-causing, while true single nucleotide polymorphisms (SNPs) are understood to have a negligible effect on phenotype. With recent developments in biotechnology, the most relevant being the increased reliability and speed of sequencing, a wealth of information regarding SNPs and PDs has been acquired. Quite apart from the challenge of analysing this information with a view to identifying novel therapies and targets for disease, the challenge of simply storing, mapping, and processing these data is significant in itself. This thesis builds on earlier work in the Martin group in which a database (SAAPdb) was developed to map mutation data to protein structure and allow the likely local protein structural effects of a mutation to be evaluated. In this thesis, a general introduction to the relevant biology (Chapter 1) and bioinformatics tools and resources (Chapter 2) is provided. In Chapter 3, the Single Amino Acid Polymorphism database (SAAPdb) is described and the work done to fix bugs and update the data is outlined. Despite this work, owing to continuous maintenance problems identified when updating the program, the Martin group has now switched to using a ‘pipeline’ version that no longer relies on any pre-calculated data stored in a database. Earlier work performed during a Master's project showed that some of the analyses were extremely sensitive to structural details. These analyses have been updated and extended, confirming earlier results. Consequently, some of the analyses were updated to replace Boolean True/False (Good/Bad) assignments with energy or pseudo-energy values.
A pseudo-energy potential was developed for evaluating the effects of mutations to-proline or from-glycine (Chapter 4) and a new full-energy method for assessing the effects of side-chain clashes was evaluated (Chapter 5). A method using the structural analysis data together with random forests to predict whether a mutation will be damaging was then developed (Chapter 6). This method was demonstrated to be better than all competing individual methods. A variation of this approach was used to distinguish between two phenotypes, hypertrophic cardiomyopathy (HCM) and dilated cardiomyopathy (DCM), caused by mutations in the cardiac beta-myosin gene (MYH7, Chapter 7). The thesis finishes with a general discussion and conclusions (Chapter 8).
Supervisor: Martin, A. C. R.
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available