Development of a suite of bioinformatics tools for the analysis and prediction of membrane protein structure
This thesis describes the development of a novel approach for predicting the three-dimensional structure of the transmembrane regions of membrane proteins directly from amino acid sequence and basic transmembrane region topology. The approach is knowledge-based. From determined membrane protein structures, 20x20 association matrices were generated to summarise the distance associations between amino acid side chains on different alpha-helical transmembrane regions. Using these association matrices, combined with a knowledge-based scale of propensity for residue orientation in transmembrane segments (kPROT) (Pilpel et al., 1999), the software predicts the optimal orientations and associations of transmembrane regions and generates a 3D structural model of a given membrane protein from the amino acid sequence composition of its transmembrane regions. During development, several structural and biostatistical analyses of determined membrane protein structures were undertaken to ensure a consistent and reliable association matrix on which to base the predictions. The model structures obtained for a dataset of 17 membrane proteins of determined structure were evaluated by leave-one-out cross-validation, which revealed generally high prediction accuracy, with over 80% of associations between transmembrane regions correctly predicted. These results provide a promising basis for future development and refinement of the algorithm, and to this end work is underway using evolutionary computing approaches. As it stands, the approach offers researchers a valuable starting point for predicting the structure of membrane proteins of hitherto unknown structure.
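
As a purely illustrative sketch, and not the implementation developed in this thesis, a 20x20 association matrix of the kind described above could be tallied by counting residue pairs on different helices that lie within some distance of each other. The distance cutoff of 8.0 angstroms, the input layout, the function name `association_matrix`, and the toy coordinates are all assumptions introduced for the example. The thesis abstract does not specify how distances were measured or binned.

```python
from itertools import combinations
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def association_matrix(helices, cutoff=8.0):
    """Count inter-helix residue pairs within `cutoff` angstroms.

    `cutoff` is a hypothetical value chosen for illustration.
    `helices` is a list of helices; each helix is a list of
    (one_letter_code, (x, y, z)) tuples, one per residue side chain.
    """
    counts = [[0] * 20 for _ in range(20)]
    # Only residue pairs on *different* helices are considered.
    for helix_a, helix_b in combinations(helices, 2):
        for aa_a, pos_a in helix_a:
            for aa_b, pos_b in helix_b:
                if dist(pos_a, pos_b) <= cutoff:
                    i, j = INDEX[aa_a], INDEX[aa_b]
                    counts[i][j] += 1
                    if i != j:
                        counts[j][i] += 1  # keep the matrix symmetric
    return counts


# Toy input: two short "helices" with made-up coordinates (not real data).
toy_helices = [
    [("L", (0.0, 0.0, 0.0)), ("A", (0.0, 0.0, 20.0))],
    [("F", (5.0, 0.0, 0.0)), ("L", (0.0, 6.0, 0.0))],
]
matrix = association_matrix(toy_helices)
```

In a real analysis the raw counts would presumably be normalised against residue frequencies before being used for scoring. That step is omitted here because the abstract gives no detail on it.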