A quantitative structure-activity relationship (QSAR) study of the Ames mutagenicity assay.
In vitro mutagenicity assays have traditionally been used for first-line identification of
potential genotoxic hazard, pertaining to chemical carcinogenesis and heritable genetic
damage. Recent advances in combinatorial chemistry and high-throughput
screening technologies have led to a massive explosion in the number of possible
therapeutic candidates being produced at the early stages of drug discovery. This rapid
increase in the number of chemicals to be classified results in a greater need for
alternative methods for the prediction of toxicity. Quantitative Structure-Activity
Relationships (QSAR) can fill this need for early hazard identification by
elucidating the physicochemical basis of biological activity. The assumption with
predictive QSARs for toxicity is that "biological activity may be described as a function
of chemical constitution".
This thesis focuses on Ames mutagenicity assay data for two compound sets: one of
90 compounds with limited structural flexibility, comprising a range of chemical
classes (a non-congeneric series), and a second of 30 flavonoid compounds. Three
physicochemical descriptor sets were generated: EVA, a theoretical molecular
descriptor based on the normal co-ordinate modes of vibration; WHIM, derived from
weighting functions applied to the 3D-structural molecular co-ordinates; and TSAR, a
series of hydrophobic, electronic and steric parameters traditionally associated with the
production of biological QSARs. Various "unsupervised" data pre-treatment methods
were adopted, to reduce the level of degeneracy within the individual descriptor sets,
prior to the calculation of stepwise linear discriminant classification functions.
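
The abstract does not name the specific pre-treatment methods used, so the following is only a minimal sketch of one common unsupervised approach to reducing descriptor degeneracy: dropping near-constant descriptors and one member of each highly correlated pair, without reference to the activity labels. The function names, the thresholds (`min_variance`, `max_abs_r`) and the example descriptor values are all illustrative assumptions, not taken from the thesis.

```python
from math import sqrt
from statistics import pvariance


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def filter_descriptors(columns, min_variance=1e-8, max_abs_r=0.95):
    """Return the names of descriptors kept after unsupervised filtering.

    columns: dict mapping descriptor name -> list of values (one per compound).
    The thresholds are hypothetical defaults chosen for illustration only.
    """
    kept = []
    for name, values in columns.items():
        # Drop descriptors that barely vary across the compound set.
        if pvariance(values) <= min_variance:
            continue
        # Drop descriptors nearly collinear with one already retained.
        if any(abs(pearson(values, columns[k])) > max_abs_r for k in kept):
            continue
        kept.append(name)
    return kept


# Made-up descriptor values for five hypothetical compounds.
descriptors = {
    "logP": [1.0, 2.0, 3.0, 4.0, 5.0],
    "logP_scaled": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with logP
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],      # carries no information
    "dipole": [0.5, 2.1, 0.3, 1.9, 1.0],
}
kept = filter_descriptors(descriptors)
```

Because the filter never looks at the mutagenicity class of a compound, it cannot bias the subsequent discriminant analysis toward the training labels, which is the usual reason for preferring unsupervised pre-treatment at this stage.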
Good predictive models for Ames mutagenicity, as determined by leave-one-out
(jackknife) cross-validation, were obtained with each of the three physicochemical
descriptor sets. Predictive ability increased when variables from the individual
descriptor sets were combined, implying that each descriptor set contains some unique
information associated with mutagenic activity.
The predictive stability of the models was assessed by predicting independent
compounds, which gave a poor overall success rate. This failure in
external prediction was investigated, and fundamental differences in physicochemical
data space occupancy were revealed. Conclusions on training set composition and general
model applicability are drawn in light of the physicochemical
data space coverage of the individual models.
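
To make the idea of data space coverage concrete, the sketch below checks whether external compounds fall inside the descriptor ranges spanned by a training set. This range (bounding-box) test is an assumed stand-in for illustration; the abstract does not state how data space occupancy was actually compared, and the function names and descriptor values are hypothetical.

```python
def descriptor_ranges(training):
    """Return a (min, max) pair for each descriptor across the training compounds."""
    return [(min(col), max(col)) for col in zip(*training)]


def outside_fraction(ranges, compound):
    """Fraction of a compound's descriptors that lie outside the training ranges."""
    outside = sum(1 for (lo, hi), v in zip(ranges, compound) if v < lo or v > hi)
    return outside / len(ranges)


def coverage(training, external, tolerance=0.0):
    """Fraction of external compounds lying within the training data space.

    A compound counts as covered when at most `tolerance` of its descriptors
    fall outside the training ranges (hypothetical criterion).
    """
    ranges = descriptor_ranges(training)
    covered = [outside_fraction(ranges, c) <= tolerance for c in external]
    return sum(covered) / len(external)


# Made-up two-descriptor vectors for illustration only.
training_set = [[1.0, 0.2], [2.0, 0.8], [3.0, 0.5]]
external_set = [[2.5, 0.4], [4.0, 0.3]]  # the second compound exceeds the first range
fraction_covered = coverage(training_set, external_set)
```

A low covered fraction would indicate that the external compounds occupy regions of descriptor space the training set never sampled, so predictions for them are extrapolations rather than interpolations.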