Bayesian molecular phylogenetics: estimation of divergence dates and hypothesis testing
With the advent of automated sequencing, sequence data are now available to help us understand the functioning of our genome, as well as its history. To date, powerful methods such as maximum likelihood have been used to estimate its mode and tempo of evolution and its branching pattern. However, these methods appear to have some limitations. The purpose of this thesis is to examine these limitations in light of Bayesian modelling, taking advantage of some recent advances in Bayesian computation. Firstly, Bayesian methods to estimate divergence dates when rates of evolution vary from lineage to lineage are extended and compared. The power of the technique is demonstrated by analysing twenty-two genes sampled across the metazoans to test the Cambrian explosion hypothesis. While the molecular clock gives divergence dates at least twice as old as those indicated by the fossil record, it is shown (i) that modelling rate change gives results consistent with the fossils, (ii) that this dramatically improves the fit to the data and (iii) that these results do not depend on the choice of a specific model of rate change. Results from this analysis support a molecular explosion of the metazoans about 600 million years (MY) ago, i.e. only some 50 MY before the morphological Cambrian explosion. Secondly, two new Bayesian tests of phylogenetic trees are developed. The first aims at selecting the correct tree, while the second constructs confidence sets of trees. Two other tests are also developed in the frequentist framework. Based on p-values adjusted for multiple comparisons, they are built to match their Bayesian counterparts. These four new tests are compared with previous tests. Their sensitivity to model misspecification and the problem of regions is discussed. Finally, the models examined are extended to estimate divergence dates from multi-gene data and to detect positive selection.