Title: Evaluation of similarity measures for ligand-based virtual screening
Author: Mazalan, Lucyantie
ISNI: 0000 0004 7428 1562
Awarding Body: University of Sheffield
Current Institution: University of Sheffield
Date of Award: 2017
Nearest neighbour searching is fundamental to many ligand-based virtual screening applications. The system searches for the nearest molecules by quantifying their similarity using various molecular representations and similarity coefficients. These similarity measures are the key components of the system, and their variability and characteristics affect the effectiveness of the search. The first aim of this thesis was to investigate the effects of 2D fingerprint dimensionality on the effectiveness of chemoinformatics applications and to analyse the contributing factors. Two nearest neighbour search applications, similarity searching and molecular clustering, were carried out. Various types of coefficients were used to measure the similarities and distances between molecules in the chemical dataset. The effectiveness of the similarity search and clustering applications varied depending on the coefficient used to measure the degree of similarity or distance. The sparseness of the representations also affected the similarity measures. The second aim of the study was to quantify the relative importance of the components influencing 2D fingerprint similarity searching, using cross-classified modeling. Effectiveness values produced by different types of 2D fingerprints and similarity coefficients were used to model which component was more important. The bioactivity of the molecule was the most important factor identified, followed by the reference structure. Comparison of the fingerprint representation and the similarity coefficient revealed that the fingerprint had a greater role in determining the effectiveness of similarity searching than the similarity coefficient. This research contributes to knowledge of similarity measures in the chemoinformatics domain, specifically the impact of high-dimensional space and of the similarity search components. This contribution has practical implications for the effectiveness of similarity searching in particular and of ligand-based virtual screening applications in general.
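As an illustration of the kind of similarity search the abstract describes, the following minimal Python sketch ranks a small database of 2D fingerprints against a reference structure using interchangeable similarity coefficients. It is not taken from the thesis: the fingerprints are represented as sets of on-bit positions, the molecule names and bit positions are invented toy data, and the function names (`tanimoto`, `dice`, `cosine`, `rank_by_similarity`) are hypothetical. The Tanimoto, Dice, and cosine coefficients are shown only as common examples; the record does not state which coefficients the thesis evaluated.

```python
from math import sqrt


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared on-bits divided by on-bits in either fingerprint."""
    common = len(a & b)
    union = len(a) + len(b) - common
    return common / union if union else 0.0


def dice(a: set, b: set) -> float:
    """Dice coefficient: twice the shared on-bits divided by the total on-bit count."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a: set, b: set) -> float:
    """Cosine coefficient for binary fingerprints."""
    denom = sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


def rank_by_similarity(reference: set, database: dict, coefficient=tanimoto) -> list:
    """Return (name, score) pairs sorted from most to least similar to the reference."""
    scored = [(name, coefficient(reference, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data (assumption): each fingerprint is the set of bit positions set to 1.
reference = {1, 4, 7, 9}
database = {
    "mol_a": {1, 4, 7, 9, 12},
    "mol_b": {1, 4, 20},
    "mol_c": {30, 31},
}

ranking = rank_by_similarity(reference, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Passing `coefficient=dice` or `coefficient=cosine` to `rank_by_similarity` changes the scores, and on larger datasets possibly the ranking, which is the kind of coefficient dependence the abstract reports.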
Supervisor: Willett, Peter ; Holliday, John ; Sbaffi, Laura
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:  DOI: Not available