Title: Modelling for selecting vaccines against antigenically variable viruses
Author: Rahman, Tameera
ISNI: 0000 0004 6061 4696
Awarding Body: University of Surrey
Current Institution: University of Surrey
Date of Award: 2017
In vitro and in vivo selection of vaccines is time-consuming and expensive, and the selected vaccines may not protect against a broad spectrum of viruses, owing to the complexities of emerging antigenically novel disease strains. A powerful computational model that can effectively predict antigenically variant strains could minimise the resources spent on exclusive serological testing of vaccines and make broad-spectrum vaccines possible for many diseases. However, in silico vaccine prediction remains a grand challenge. To address this challenge, we investigate the use of linear regression, non-linear regression and support vector machine (SVM) classification models to predict the antigenic similarity between foot-and-mouth disease virus (FMDV) strains. The parameters of the linear regression model are estimated using the least squares method, and the structure and parameters of the non-linear model are optimised using a hybrid evolutionary algorithm. Because only limited labelled data are available, we apply a semi-supervised classification method, the transductive SVM (TSVM), to improve our classification results. In addition, we examine two scoring methods for weighting the type of amino acid substitution in the classification and regression models, in two setups: one considering the entire external viral capsid protein, the other considering only the antigenically important areas of the capsid proteins. Statistical analysis of our data confirmed possible correlates of amino acid substitutions in antigenic areas in the capsid proteins of FMDV and influenza. Across all our prediction models, we achieved the best results when the scoring method based on the biochemical properties of amino acids was combined with regression or classification, and models based on substitutions in the antigenic areas performed better than those that took the entire exposed viral capsid protein.
In our regression analysis, the non-linear regression method optimised with the evolutionary algorithm performed consistently better, across the FMDV and influenza datasets, than the linear and non-linear models whose parameters are estimated using the least squares method. In addition, among the best models, the optimised non-linear regression models consist of more terms than their linear counterparts, implying that amino acid substitutions influence antigenicity non-linearly. For our classification models we also used Ebola data. Our TSVM models outperformed our SVM models across all datasets (FMDV, influenza and Ebola), which confirmed the benefits of using unlabelled data for boosting generalisation performance. However, including additional antigenic areas in our Ebola TSVM model had no effect on the prediction ability of the model. We think this is because the additional peptides were not biologically significant, in that they had no effect on the antigenic values that we use as our labels.
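As a rough illustration of the least squares step mentioned above, the following is a minimal sketch that regresses an antigenic similarity value on per-site substitution indicators. Everything in it is an assumption made for illustration: the toy sequences, the site positions, the 0/1 feature encoding, the similarity values and the function names are hypothetical, and the abstract does not specify the features, scoring methods or similarity measure actually used in the thesis.

```python
import numpy as np

def substitution_features(seq_a, seq_b, sites):
    """Return one 0/1 indicator per site: 1.0 if the aligned sequences differ there.

    `sites` is a hypothetical list of antigenically important positions.
    """
    return np.array([1.0 if seq_a[i] != seq_b[i] else 0.0 for i in sites])

def fit_least_squares(X, y):
    """Ordinary least squares with an intercept; returns [intercept, w_1, ..., w_k]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply fitted coefficients to a feature matrix (intercept column added here)."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef

# Made-up aligned fragments, NOT real viral sequences.
reference = "ACDE"
variants = ["ACDE", "GCDE", "AGDE", "ACGE", "GGDE", "GGGE"]
sites = [0, 1, 2]  # hypothetical antigenic positions

X = np.array([substitution_features(reference, v, sites) for v in variants])

# Synthetic, noise-free similarity values generated from known weights,
# so the fit can be checked against them. These are not measured data.
true_coef = np.array([1.0, -0.2, -0.3, -0.1])
y = predict(true_coef, X)

coef = fit_least_squares(X, y)
```

Because the toy targets are generated without noise from `true_coef`, the fitted `coef` should match it up to floating-point error. A weighted scoring scheme could replace the 0/1 indicators with a per-substitution score, but the form such a score would take is not given in this record.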
Supervisor: Jin, Yaochu ; Laing, Emma
Sponsor: University of Surrey
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available