Title: Optimisation-based methodologies for complex data analysis
Author: Silva, Jonathan Cardoso
ISNI: 0000 0004 7656 630X
Awarding Body: King's College London
Current Institution: King's College London (University of London)
Date of Award: 2018
Networks are a natural representation of data collected across many disciplines. The complex relationships between the entities studied in these fields, whether they are people, computers or molecules, cannot be fully characterised individually but are, instead, better described by computational models as a function of their overall interactions. This thesis focuses on the development of such models to detect communities and to predict outcome variables in real networks using mathematical programming, a transparent, flexible and customisable modelling paradigm. First, the temporal evolution of groups in complex social networks is explored. Various existing methods detect groups in dynamic networks either by aggregating all temporal contact information into a single network or by looking at snapshots of time independently. In this work, a more robust approach is employed where, at each time step, both the current and previous modular structures of a network are considered. A mixed integer non-linear programming (MINLP) model is proposed to capture more stable patterns of change and is shown to match the ground truth of networks. Next, the development of Quantitative Structure-Activity Relationship (QSAR) models is addressed. These regression models are widely used in drug discovery and aim to predict biological activity from the attributes of molecules. In this work, algorithms are proposed to divide the compounds into sub-groups, either by their molecular features or by the modules that naturally arise when this data is represented as a network. Suitable equations that predict biological activity are then identified for each group by a mathematical programming model. These algorithms create predictive, customisable and interpretable QSAR sub-models, which can later be used for virtual screening, SAR studies or lead optimisation of drug candidates. Overall, this thesis proposes computational models for optimisation problems in regression and network analysis.
The proposed methods produce transparent and interpretable solutions towards a better understanding of the dynamics of social systems and have the potential to assist in the endeavours of drug discovery.
Supervisor: Tsoka, Sophia ; Sastry, Nishanth Ramakrishna
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available