Title: Efficient combinatorial algorithms for DNA microarray design
Author: Li, Ying
ISNI: 0000 0001 3609 5564
Awarding Body: University of Liverpool
Current Institution: University of Liverpool
Date of Award: 2008
The advent of efficient genome sequencing tools and high-throughput experimental biotechnology has led to enormous progress in the life sciences. The DNA microarray is among the most important innovations. It allows the expression of thousands of genes to be measured simultaneously by analysing hybridisation data. Such measurements have proved invaluable in understanding the development of diseases such as cancer. However, the analysis of the data is non-trivial, since the hybridisation data rely on the quality of the DNA microarray. A high-quality DNA microarray leads to more efficient hybridisation, a stronger signal, and greater reliability. The reliability of the data is essential. Thus, the development of novel algorithms and techniques for DNA microarray design is crucial. This thesis considers a number of combinatorial issues in selecting, placing, and synthesising probes during the DNA microarray design process. A probe is a specific sequence of single-stranded DNA or RNA, typically labelled with a radioactive or fluorescent tag, which is designed to bind to, and thereby identify, a particular segment of DNA (or RNA). The probe selection problem we study is to find, for each gene sequence, a unique probe such that every gene in the given dataset can be identified. However, due to homology, a gene sometimes does not have a unique probe; in that case we use a small number of non-unique probes to identify the gene. The challenge of the problem is that there are many candidate probes in a gene sequence and we have to find the right one (or a small subset) efficiently. A randomised probe selection algorithm for DNA microarray design is proposed. The algorithm overcomes a limitation of some existing algorithms, which demand optimal probes found by exhaustive search. We implement the randomised probe selection algorithm and develop probe selection software, RANDPS. Investigations using several real-life microarray datasets show that the algorithm is able to find high-quality probes.
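As a rough illustration of the idea of sampling candidate probes rather than searching exhaustively, the following Python sketch draws random fixed-length substrings of a gene and returns the first one that occurs in no other gene. It is not the RANDPS algorithm described in the thesis: the function name, the parameters (probe_len, trials) and the plain substring test for uniqueness are all assumptions made for this example, and real probe selection would also take hybridisation criteria into account.

```python
import random

def find_unique_probe(genes, target, probe_len, trials=200, rng=None):
    """Randomly sample substrings of genes[target]; return one found in no other gene.

    genes: dict mapping gene name -> sequence string (hypothetical input format).
    Returns None if no unique probe is found within the given number of trials.
    """
    rng = rng or random.Random(0)  # fixed seed by default, for repeatable runs
    seq = genes[target]
    if len(seq) < probe_len:
        return None
    others = [s for name, s in genes.items() if name != target]
    for _ in range(trials):
        start = rng.randrange(len(seq) - probe_len + 1)
        candidate = seq[start:start + probe_len]
        # Uniqueness here means "exact substring of no other gene" (an assumption).
        if all(candidate not in other for other in others):
            return candidate
    return None

if __name__ == "__main__":
    genes = {"g1": "ACGTACGTTT", "g2": "ACGTACGAAA", "g3": "CCCCCCCCCC"}
    for name in genes:
        print(name, find_unique_probe(genes, name, probe_len=4))
```

When the sampler returns None, for example because two genes are identical or highly homologous, the gene would have to be identified by a combination of non-unique probes instead.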
Nevertheless, the number of probes selected might be too large to place on a single microarray. Minimising the number of probes is therefore an important objective, since it is proportional to the cost of the microarray experiment. We investigate the string barcoding problem, in which a set of non-unique probes is given and the probes have to be chosen from that set. The objective is to use a combination of probes with minimum cardinality such that all genes in the dataset can be distinguished. An almost optimal O(n|S| log^3 n)-time approximation algorithm for the considered problem is presented. The approximation procedure is a modification of the algorithm due to Berman et al. [10], which obtains the best possible approximation ratio (1 + ln n). The improved time complexity is a direct consequence of more careful management of processed sets, the use of several specialised graph and string data structures, and a tighter time complexity analysis based on an amortised argument. After the probes are selected, they are synthesised on the microarray using a light-directed chemical process in which unintended illumination may compromise the quality of the microarray experiments. Border length is a measure of the amount of unwanted illumination, and the objective of the border minimisation problem (BMP) is to minimise the total border length during the probe synthesis process. This problem is believed to be NP-hard, and the approximation of BMP in asynchronous synthesis is studied. As far as we know, this is the first result with a proven performance guarantee. The main result is an O(√n log^2 n)-approximation, where n is the number of probes to be synthesised. In the case where the placement is given in advance, we show that the problem is O(log^2 n)-approximable. A related problem called the agreement maximisation problem (MAP) is also considered in this chapter.
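To make the string barcoding objective concrete, the sketch below applies the textbook greedy set-cover heuristic: each candidate probe "covers" the pairs of genes it tells apart (it occurs in exactly one gene of the pair), and the probe covering the most still-undistinguished pairs is picked repeatedly. This is only an illustration of the problem and of the logarithmic-ratio greedy idea; it is not the O(n|S| log^3 n)-time algorithm of the thesis, nor a faithful rendering of the procedure of Berman et al. The function name and the input format are assumptions made for this example.

```python
from itertools import combinations

def greedy_barcode(genes, candidates):
    """Greedily pick probes until every pair of genes is distinguished.

    genes: dict mapping gene name -> sequence string (hypothetical input format).
    candidates: list of candidate probe strings.
    Raises ValueError if the candidates cannot distinguish all genes.
    """
    names = sorted(genes)

    def separated_pairs(probe):
        # A probe separates a pair if it occurs in exactly one of the two genes.
        hit = {g for g in names if probe in genes[g]}
        return {(a, b) for a, b in combinations(names, 2)
                if (a in hit) != (b in hit)}

    cover = {p: separated_pairs(p) for p in candidates}
    remaining = set(combinations(names, 2))
    chosen = []
    while remaining:
        best = max(candidates, key=lambda p: len(cover[p] & remaining),
                   default=None)
        gained = cover[best] & remaining if best is not None else set()
        if not gained:
            raise ValueError("candidate probes cannot distinguish all genes")
        chosen.append(best)
        remaining -= gained
    return chosen

if __name__ == "__main__":
    genes = {"g1": "AACC", "g2": "AAGG", "g3": "TTGG"}
    print(greedy_barcode(genes, ["AA", "GG", "CC", "TT"]))
```

This naive version recomputes the coverage of every candidate in each round, so it is far slower than the bound quoted above; it is meant only to show what a valid barcode is.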
In contrast to BMP, we show that MAP admits a constant approximation even when placement is not given in advance.
Supplied by The British Library - 'The world's knowledge'
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral