Title: Bayesian methods for gene expression analysis from high-throughput sequencing data
Author: Glaus, Peter
ISNI:       0000 0004 4692 1294
Awarding Body: University of Manchester
Current Institution: University of Manchester
Date of Award: 2014
We study the tasks of transcript expression quantification and differential expression analysis based on data from high-throughput sequencing of the transcriptome (RNA-seq). In an RNA-seq experiment, subsequences of nucleotides are sampled from a transcriptome specimen, producing millions of short reads. The reads can be mapped to a reference to determine the set of transcripts from which they were sequenced. We can measure the expression of transcripts in the specimen by determining the number of reads that were sequenced from individual transcripts. In this thesis we propose a new probabilistic method for inferring the expression of transcripts from RNA-seq data. We use a generative model of the data that can account for read errors, the fragment length distribution and the non-uniform distribution of reads along transcripts. We apply a Bayesian inference approach, using the Gibbs sampling algorithm to sample from the posterior distribution of transcript expression. Producing the full distribution enables assessment of the uncertainty of the estimated expression levels. We also investigate the use of alternative inference techniques for transcript expression quantification. We apply a collapsed Variational Bayes algorithm, which can provide accurate estimates of mean expression faster than the Gibbs sampling algorithm. Building on the results from transcript expression quantification, we present a new method for differential expression analysis. Our approach utilizes the full posterior distribution of expression from multiple replicates in order to detect significant changes in abundance between different conditions. The method can be applied to differential expression analysis of both genes and transcripts. We use the newly proposed methods to analyse real RNA-seq data and provide evaluation of their accuracy using synthetic datasets.
We demonstrate the advantages of our approach in comparisons with existing alternative approaches for expression quantification and differential expression analysis. The methods are implemented in the BitSeq package, which is freely distributed under an open-source license. Our methods can be accessed and used by other researchers for RNA-seq data analysis.
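The Gibbs sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the BitSeq implementation: it assumes a simplified mixture model in which each read has a precomputed likelihood under every transcript (the hypothetical `read_likelihoods` table), a symmetric Dirichlet prior on relative expression, and no modelling of read errors, fragment lengths or positional bias. All function and parameter names are illustrative.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def gibbs_expression(read_likelihoods, n_iter=2000, burn_in=500, prior=1.0, seed=0):
    """Toy Gibbs sampler for relative transcript expression (theta).

    read_likelihoods[n][m] is an assumed, precomputed P(read n | transcript m);
    a value of 0 means read n does not align to transcript m. Every read must
    have a non-zero likelihood for at least one transcript.
    Returns the theta samples kept after burn-in.
    """
    rng = random.Random(seed)
    n_transcripts = len(read_likelihoods[0])
    theta = [1.0 / n_transcripts] * n_transcripts
    samples = []
    for it in range(n_iter):
        # Step 1: assign each read to a transcript, given the current theta.
        counts = [0] * n_transcripts
        for lik in read_likelihoods:
            weights = [t * l for t, l in zip(theta, lik)]
            m = rng.choices(range(n_transcripts), weights=weights)[0]
            counts[m] += 1
        # Step 2: resample theta, given the read assignments.
        theta = sample_dirichlet([prior + c for c in counts], rng)
        if it >= burn_in:
            samples.append(theta)
    return samples


def posterior_mean_and_sd(samples, m):
    """Posterior mean and standard deviation of theta for transcript m."""
    values = [s[m] for s in samples]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var ** 0.5


# Synthetic example: 30 reads unique to transcript 0, 10 unique to
# transcript 1, and 20 reads that align equally well to both.
reads = [[1.0, 0.0]] * 30 + [[0.0, 1.0]] * 10 + [[0.5, 0.5]] * 20
samples = gibbs_expression(reads)
for m in range(2):
    mean, sd = posterior_mean_and_sd(samples, m)
    print(f"transcript {m}: mean={mean:.3f} sd={sd:.3f}")
```

Because the sampler returns the full set of posterior samples rather than a point estimate, the spread of each transcript's samples gives a direct measure of the uncertainty that ambiguously mapped reads introduce.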
Supervisor: Shapiro, Jonathan; Rattray, Magnus
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Bayesian inference ; gene expression ; transcript expression ; RNA-seq ; differential expression