Title: Uncertainty estimation for QSAR models using machine learning methods
Author: Founti, Christina Maria
ISNI:       0000 0004 8506 5208
Awarding Body: University of Sheffield
Current Institution: University of Sheffield
Date of Award: 2019
Providing safe, timely and affordable treatments is a major challenge for the pharmaceutical industry. Quantitative Structure-Activity Relationship (QSAR) modelling is an important computational technique, established in risk assessment as an alternative to animal testing. In drug discovery, QSAR models are used to predict the properties of new compounds, reducing the number of tests required and the associated risk of side effects that lead to high costs and drug attrition. Yet their value is limited without information about the reliability of their predictions. The current research contributes to the understanding of the limitations of uncertainty estimation methods for QSAR models and their implications for the validation of Absorption, Distribution, Metabolism and Excretion (ADME) models. The aim of this thesis is to investigate the value of machine learning algorithms in the estimation of errors in QSAR models and to report on their performance for different ADME endpoints. The study focuses on the evaluation of error models as a method for identifying poorly predicted compounds and estimating the uncertainty of QSAR predictions. Assessment of the models takes into account the correlation of the error estimates with the actual prediction errors and the magnitude of the error estimates in relation to the experimental error. The error models are then integrated in the conformal prediction framework for the estimation of compound-specific prediction intervals. For this purpose, a new normalisation method that combines error models and applicability domain features is defined. The results of the assessment suggest that the performance of error models is influenced by the quality of the QSAR model and the presence of measurement bias in the modelled ADME data.
It is shown that considering different types of features in the error models provides a flexible approach not only for optimising the efficiency of prediction intervals but also for ensuring that they are correlated with the actual prediction error.
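As an illustration of how error estimates can feed into compound-specific prediction intervals, the following is a minimal sketch of standard normalised inductive conformal prediction for regression. It is not the normalisation method defined in the thesis. The function name, argument names and the smoothing term `beta` are assumptions made for this example, and the error estimates are assumed to come from an error model fitted separately.

```python
import numpy as np

def normalised_conformal_intervals(cal_abs_errors, cal_error_estimates,
                                   test_predictions, test_error_estimates,
                                   confidence=0.8, beta=0.0):
    """Sketch of normalised inductive conformal regression (hypothetical helper).

    cal_abs_errors:       |observed - predicted| on a held-out calibration set
    cal_error_estimates:  error-model output for the calibration compounds
    test_predictions:     QSAR model predictions for new compounds
    test_error_estimates: error-model output for the new compounds
    beta:                 assumed smoothing term added to every error estimate
    """
    cal_abs_errors = np.asarray(cal_abs_errors, dtype=float)
    cal_sigma = np.asarray(cal_error_estimates, dtype=float) + beta
    test_sigma = np.asarray(test_error_estimates, dtype=float) + beta
    test_predictions = np.asarray(test_predictions, dtype=float)

    # Nonconformity score: absolute error scaled by the estimated error.
    scores = np.sort(cal_abs_errors / cal_sigma)
    n = scores.size

    # Rank of the calibration score used as threshold, with the usual
    # finite-sample (n + 1) correction.
    k = int(np.ceil((n + 1) * confidence))
    if k > n:
        raise ValueError("too few calibration compounds for this confidence level")
    alpha = scores[k - 1]

    # Compounds with larger estimated error receive wider intervals.
    half_width = alpha * test_sigma
    return test_predictions - half_width, test_predictions + half_width
```

With this scheme the interval width scales with each compound's estimated error, so a compound that the error model flags as poorly predicted gets a wider interval than one it considers reliable.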
Supervisor: Gillet, Val ; Vessey, Jonathan D.
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available