Structural diversity of biological ligands and their binding sites in proteins
The phenomenon of molecular recognition, which underpins almost all biological processes, is dynamic, complex and subtle. Establishing an interaction between a pair of molecules involves mutual structural rearrangements guided by a highly convoluted energy landscape, the accurate mapping of which continues to elude us. The analysis of interactions between proteins and small molecules has been a focus of intense interest for many years, offering as it does the promise of increased insight into many areas of biology, and the potential for greatly improved drug design methodologies. Computational methods for predicting which types of ligand a given protein may bind, and what conformation two molecules will adopt once paired, are particularly sought after. The work presented in this thesis aims to quantify the amount of structural variability observed in the ways in which proteins interact with ligands. This diversity is considered from two perspectives: to what extent ligands bind to different proteins in distinct conformations, and the degree to which binding sites specific for the same ligand have different atomic structures. The first study could be of value to approaches which aim to predict the bound pose of a ligand, since by cataloguing the range of conformations previously observed, it may be possible to better judge the biological likelihood of a newly predicted molecular arrangement. The findings show that several common biological ligands exhibit considerable conformational diversity when bound to proteins. Although these ligands bind predominantly in extended conformations, the analysis presented here highlights several cases in which the biological requirements of a given protein force its ligand to adopt a highly compact form.
A comparison of the conformational diversity observed within several protein families generally upholds the hypothesis that homologous proteins tend to bind ligands in a similar arrangement, although several families are identified in which this is demonstrably not the case. Consideration of diversity in the binding site itself, on the other hand, may be useful in guiding methods which search for binding sites in uncharacterised protein structures: identifying those regions of known sites which are less variable could help to focus the search only on the most important features. Analysis of the diversity of a non-redundant dataset of adenine binding sites shows that a small number of key interactions are conserved, while the majority of the fragment environment is highly variable. Just as ligand conformation varies between protein families, so the degree of binding site diversity is observed to be significantly higher in some families than in others. Taken together, the results of this work suggest that the repertoire of strategies produced by nature for the purposes of molecular recognition is extensive. Moreover, the importance of a given ligand conformation or pattern of interaction appears to vary greatly depending on the function of the particular group of proteins studied. As such, it is proposed that diversity analysis may form a significant part of future large-scale studies of ligand-protein interactions.