Title: Stochastic models and statistical inference in evolutionary genetics : using DNA sequence data to learn about population divergence and speciation
Author: Barrigana Ramos Da Costa, R. J.
ISNI: 0000 0004 7225 6507
Awarding Body: UCL (University College London)
Current Institution: University College London (University of London)
Date of Award: 2017
During speciation, the degree of clustering of a population in terms of genetic polymorphisms increases gradually until the exchange of genes between subpopulations is no longer possible. The isolation-with-migration (IM) model is used to estimate how long ago an ancestral population divided into two subpopulations, and to infer the level of gene flow between the subpopulations during genetic divergence. Its assumption of constant gene flow until the present is, however, particularly unrealistic in the context of two present-day species. In addition, traditional methods to fit the IM model are aimed at large numbers of DNA sequences from a small number of loci, and are computationally very expensive. To overcome these limitations, this thesis begins by focusing on an extension of the IM model in which the initial period of gene flow is followed by a period of isolation: the so-called isolation-with-initial-migration (IIM) model. For an IIM model with potentially asymmetric gene flow and unequal subpopulation sizes, the distribution of the number of nucleotide differences between two homologous DNA sequences is derived. Based on this distribution, we develop a maximum-likelihood estimation method that is appropriate for data sets containing observations from many independent loci, and is both very efficient and able to deal with mutation rate heterogeneity. Using a data set of Drosophila sequences from approximately 30,000 loci, we show how alternative models, representing different evolutionary scenarios, can be distinguished by means of likelihood ratio tests. To enable inference on both historical and contemporary rates of gene flow between two closely related species, our estimation method is extended to a generalised IM (GIM) model, in which gene flow rates and population sizes can change at some point in the past.
Finally, we show how the theory of statistical inference under model misspecification can be used to improve the accuracy of interval estimation and comparison of speciation models; and we develop a simulation method to estimate the limiting distribution of the likelihood ratio statistic when the true parameter vector lies on the boundary of the parameter space.
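Illustrative note (not part of the thesis record): the boundary problem mentioned at the end of the abstract can be seen in a toy setting. The sketch below is not the thesis's simulation method and does not involve any speciation model. It assumes unit-variance normal data and tests H0: mu = 0 against mu >= 0, so the null value lies on the boundary of the parameter space. The function name `simulate_lrt_boundary` and all parameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_lrt_boundary(n_obs=50, n_rep=20000, seed=0):
    """Simulate the likelihood ratio statistic under H0: mu = 0.

    Toy assumption: each data set holds n_obs draws from N(mu, 1) and the
    parameter space is mu >= 0, so the null value sits on the boundary.
    """
    rng = np.random.default_rng(seed)
    # One row per simulated data set, generated under the null (mu = 0).
    data = rng.normal(loc=0.0, scale=1.0, size=(n_rep, n_obs))
    xbar = data.mean(axis=1)
    # The constrained maximum-likelihood estimate cannot be negative.
    mu_hat = np.maximum(xbar, 0.0)
    # For this model, 2 * (log-lik at mu_hat - log-lik at 0) = n * mu_hat^2.
    return n_obs * mu_hat**2


stats = simulate_lrt_boundary()
frac_zero = np.mean(stats == 0.0)    # share of replicates with statistic exactly 0
crit_95 = np.quantile(stats, 0.95)   # simulated 5% critical value
print(f"fraction of zeros: {frac_zero:.3f}")
print(f"simulated 95% critical value: {crit_95:.3f}")
```

In this toy case the statistic is zero in roughly half of the replicates, so its null distribution is an equal mixture of a point mass at zero and a chi-square distribution with one degree of freedom. The simulated 5% critical value should therefore land near 2.71 rather than the usual 3.84, which is why a simulated reference distribution can matter for model comparison at a boundary.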
Supervisor: Wilkinson-Herbots, H.; Yang, Z.
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available