Title: Data mining for lead optimisation
Author: Papadatos, George
ISNI: 0000 0004 2720 2999
Awarding Body: University of Sheffield
Current Institution: University of Sheffield
Date of Award: 2011
The recurring theme of this thesis is the application of diverse data mining and chemoinformatics techniques to structural and experimental property data, particularly data produced during the stage of drug discovery called lead optimisation. The work reported here seeks to provide more than one rational answer to the real-life issues routinely facing medicinal chemists. The thesis is divided into three parts. In the first part, several methodologies are described which facilitate the automatic mining of temporal, hierarchical lead optimisation data from the archives. These data are then used to produce informative visualisations of the exploration of chemical space, both locally (i.e. at the level of a chemical array) and globally (i.e. across the whole project). Finally, several ways of assessing the progress of a particular lead optimisation project are investigated. The second part of the thesis compares and assesses the relative merits of two computational methods that quantify the neighbourhood behaviour of a descriptor. The main conclusions of this part are two-fold: firstly, the optimality criterion method is demonstrated to be a suitable way to select descriptors for the systematic exploration of chemical space during array-based lead optimisation; secondly, regarding the actual neighbourhood behaviour performance exhibited by twelve types of fingerprints, it is shown that circular-based ones perform consistently better than the others and, notably, at a much lower similarity threshold. The third part focuses on explicit structural transformations between molecular pairs and their impact on properties such as hERG channel blocking, solubility and lipophilicity. More importantly, the study investigates the context of a transformation and its influence on the impact of a particular modification. Using substructural descriptors to represent the context of a transformation, and considering both the local and the global environment, several context-sensitive cases are identified and rationalised. Overall, it is demonstrated that the inclusion of contextual information can enhance the predictive power of matched molecular pair analysis. Several context-sensitive examples are also identified in publicly available data.
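The idea behind context-sensitive matched molecular pair analysis can be illustrated with a minimal sketch. It is not the method used in the thesis: the transformation labels, the two context categories and the property changes below are all hypothetical toy values, and the helper `mean_change` is introduced purely for illustration. The sketch only shows how averaging a property change per transformation can hide differences that appear once the pairs are split by context.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (transformation, context, change in property).
# All values are invented for illustration and are not taken from the thesis.
PAIRS = [
    ("H>>F", "aromatic", -0.25),
    ("H>>F", "aromatic", -0.75),
    ("H>>F", "aliphatic", 0.5),
    ("H>>F", "aliphatic", 1.0),
    ("H>>OH", "aromatic", -0.5),
]


def mean_change(pairs, use_context):
    """Average the property change per transformation, optionally split by context."""
    groups = defaultdict(list)
    for transformation, context, delta in pairs:
        # Group by (transformation, context) or by transformation alone.
        key = (transformation, context) if use_context else transformation
        groups[key].append(delta)
    return {key: mean(values) for key, values in groups.items()}


# Context-free view: one average per transformation.
global_means = mean_change(PAIRS, use_context=False)

# Context-aware view: one average per (transformation, context) pair.
context_means = mean_change(PAIRS, use_context=True)

for key, value in sorted(context_means.items()):
    print(key, round(value, 3))
```

In this toy data the context-free average for the hypothetical `H>>F` transformation is close to zero, while the two contexts show changes of opposite sign, which is the kind of effect a context-aware analysis is meant to expose.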
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available