Title: Statistical models for RNA-seq data analysis of cancer
Author: Stupnikov, Aleksei
ISNI: 0000 0004 6495 1454
Awarding Body: Queen's University Belfast
Current Institution: Queen's University Belfast
Date of Award: 2017
In this research we addressed several major points related to RNA-seq-based models for cancer. The first chapter reviews various genomics technologies from the pre-NGS era and the most commonly used NGS platforms, as well as recently developed methods. From there, the main concepts of differential expression for SAGE technology and RNA-seq are considered, followed by a discussion of several of the most widely used methods in the field. In the third chapter we formulated the biological problem, that is, the reproducibility and robustness of RNA-seq differential expression analysis, and made some general observations on the count distributions of cancer-related RNA-seq data and on the impact of sequencing depth alterations on the data. In chapter five we employed this robustness approach to rank the performance of existing differential gene expression (DGE) models and studied the effects of subsampling, in terms of library size and number of samples, on the outcome of a DGE analysis. In addition, in this chapter we introduced samExploreR, an R package that allows one to implement sequencing depth altering simulations quickly and efficiently. Building on this work, we applied the concept of subsampling to Quadratic, a candidate compound discovery framework based on connectivity mapping, and explored its robustness and reproducibility for various datasets. Finally, in chapter seven we explored how integrating information from different RNA-seq-based approaches may affect the outcome of the analysis and studied the robustness of those methods. The approaches adopted in this body of work allowed us to introduce the procedure of subsampling as a quality control measure that can allow an inference of quality when applied to datasets in research and clinical procedures.
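The subsampling idea in the abstract can be illustrated with a minimal sketch. It is not the samExploreR API and not the thesis's own procedure: it assumes reads are kept independently with a fixed probability (binomial thinning of a gene-by-sample count matrix), uses a toy fold-change ranking in place of a real DGE model, and measures robustness as the Jaccard overlap of top-ranked genes. All function names and the simulated data are hypothetical.

```python
import numpy as np


def subsample_counts(counts, fraction, rng):
    """Binomial thinning: keep each read independently with probability `fraction`.

    Assumption: this approximates lowering sequencing depth; it is not
    how any particular package implements subsampling.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    counts = np.asarray(counts, dtype=np.int64)
    return rng.binomial(counts, fraction)


def top_k_by_fold_change(counts, is_case, k):
    """Toy stand-in for a DGE method: rank genes by absolute log2 fold change."""
    counts = np.asarray(counts, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    # Normalise by library size (counts per million); guard against empty libraries.
    lib_size = np.maximum(counts.sum(axis=0, keepdims=True), 1.0)
    cpm = counts / lib_size * 1e6
    case_mean = cpm[:, is_case].mean(axis=1)
    ctrl_mean = cpm[:, ~is_case].mean(axis=1)
    lfc = np.log2(case_mean + 1.0) - np.log2(ctrl_mean + 1.0)
    return set(np.argsort(-np.abs(lfc))[:k].tolist())


def jaccard(a, b):
    """Overlap between two gene sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Simulated data (hypothetical): 200 genes, 3 case and 3 control samples.
rng = np.random.default_rng(0)
counts = rng.poisson(50, size=(200, 6))
is_case = [True, True, True, False, False, False]

reference = top_k_by_fold_change(counts, is_case, k=20)
for fraction in (0.5, 0.1):
    thinned = subsample_counts(counts, fraction, rng)
    overlap = jaccard(reference, top_k_by_fold_change(thinned, is_case, k=20))
    print(f"fraction={fraction}: overlap with full-depth top genes = {overlap:.2f}")
```

In this sketch, a low overlap at a modest reduction in depth would suggest that the ranking is sensitive to sequencing depth, which is the kind of quality signal the abstract attributes to subsampling.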
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available