Title: Substructural analysis techniques for structure-property correlation within computerised chemical information systems
Author: Bawden, David
ISNI: 0000 0001 0863 3126
Awarding Body: University of Sheffield
Current Institution: University of Sheffield
Date of Award: 1978
The work described in this thesis involves a novel method of substructural analysis, with potential application to structure-property correlation and information retrieval within computerised chemical information systems. A review is given of the development of the concept of chemical structure and its representation, its application in computerised chemical information systems, and methods for correlating structure with molecular properties. A method is presented for deriving structural features, representing the whole structure, from Wiswesser Line Notation (WLN) by computer program. These features are then used as variables in statistical analysis procedures: in this work, multiple regression analysis and cluster analysis are used. This procedure allows a rapid, convenient and thorough analysis of large data sets. The type of structural feature used may be easily varied, allowing investigation of factors such as ring substitution patterns, group interactions, and three-dimensional structure. The method is applicable to sets of diverse or structurally related compounds. Statistical tests of the results enable quantitative testing of hypotheses. Multiple regression analysis allows a direct, quantitative correlation between structure and molecular property, and subsequent property prediction. It is applied to sets of aliphatic, alicyclic, aromatic, and heterocyclic compounds, including sets of highly diverse structures. Properties examined include biological effects, toxicity, pK, thermochemical properties, boiling point, solubility, and partition coefficient. Some of these properties are highly dependent upon electronic and steric effects, and hence upon the relative position of substituents and upon three-dimensional structure. Highly significant correlations are obtained in all cases, and the potential for property prediction is demonstrated. Cluster analysis is applied to several sets of structures. Intuitively sensible classifications are obtained, and the potential for both property prediction and information retrieval is discussed. Since these techniques involve the widely used WLN, relatively simple COBOL programs, and standard statistical packages, they should be applicable within operational environments.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available