Title: Fingerprinting of complex bioprocess data
Author: Mohamed Azmin, Nor Fadhillah
Awarding Body: University of Newcastle upon Tyne
Current Institution: University of Newcastle upon Tyne
Date of Award: 2013
This research focuses on the analysis of complex bioprocess datasets, with the ultimate goal of linking the data to their underlying biological patterns. The challenges of investigating complex bioprocess data include the high dimensionality of the underlying measurements, the limited number of “observations”, and the difficulty of selecting meaningful features to characterise the data. These data contain a wealth of information that can help infer process outcomes and provide insight into improving productivity and process efficiency. To address these challenges, techniques are needed to analyse the data and extract knowledge from them. This thesis investigates an integrated discrete wavelet transform (DWT) and multiway principal component analysis (MPCA) approach for extracting meaningful information from different types of bioprocess data. The integrated methodology is demonstrated on two types of bioprocess data: a near infrared (NIR) dataset collected from an industrial monoclonal antibody (MAb) process, and an electrospray ionisation mass spectrometry (ESI-MS) dataset generated during the development of recombinant mammalian cell lines. The objective of the thesis was to develop a methodology that enables the extraction of information from these two datasets. For the industrial NIR dataset, the genealogy, or parent-child relationship, of batches from MAb manufacturing was investigated, whilst for the ESI-MS dataset the goal was to identify characteristics that would enable differentiation between high- and low-producing cell lines. The main challenges of the NIR and ESI-MS datasets lie in the complexity of the spectra. NIR spectra typically have broad, overlapping peaks and baseline shifts. Furthermore, because the NIR spectra used in this thesis were collected from a batch process, the data have an extra dimension, that of batch.
On the one hand, the extra dimension provides additional information; on the other, it presents a further challenge, because the data are now three-dimensional and require additional pre-processing, including data matrix unfolding and batch alignment. Like the NIR spectra, the ESI-MS dataset also suffers from baseline shifts, along with other complexities including a low signal-to-noise ratio, shifts in the mass-to-charge ratio, and differences in signal intensities. These challenges make it difficult to extract relevant information about the feature of interest. The proposed methodology proved effective in extracting meaningful information from both datasets. In summary, the proposed method, which integrates the discrete wavelet transform with multiway principal component analysis, was able to distinguish the characteristic features of the spectra in the datasets, thereby providing an understanding of the relationships between the spectral data and the underlying behaviour of the process.
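As an illustration only, the general shape of a DWT followed by multiway PCA can be sketched as below. This is not the thesis's implementation: the Haar wavelet, the number of decomposition levels, the choice to keep only approximation coefficients, the batch-wise unfolding direction, and every function name and the synthetic data are assumptions made for the example. Batch alignment is skipped by assuming all batches already have the same number of time points.

```python
import numpy as np

def haar_dwt(x, levels=1):
    """Multi-level Haar DWT along the last axis (assumed wavelet choice).

    Only the approximation coefficients are kept, which compresses and
    smooths each spectrum; detail coefficients are discarded.
    """
    approx = np.asarray(x, dtype=float)
    for _ in range(levels):
        if approx.shape[-1] % 2:  # pad odd lengths by repeating the last sample
            approx = np.concatenate([approx, approx[..., -1:]], axis=-1)
        approx = (approx[..., 0::2] + approx[..., 1::2]) / np.sqrt(2.0)
    return approx

def unfold_batchwise(X):
    """Unfold (batches, time, variables) into (batches, time * variables)."""
    return X.reshape(X.shape[0], -1)

def pca_scores(X2d, n_components=2):
    """Mean-centre the columns, then compute PCA scores via SVD."""
    Xc = X2d - X2d.mean(axis=0, keepdims=True)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]

def dwt_mpca(X, levels=2, n_components=2):
    """DWT on each spectrum, batch-wise unfolding, then PCA."""
    compressed = haar_dwt(X, levels=levels)
    return pca_scores(unfold_batchwise(compressed), n_components)

# Synthetic stand-in for batch spectra: 6 batches, 10 time points,
# 64 wavelengths, with two groups differing in peak height.
rng = np.random.default_rng(0)
n_batches, n_times, n_wavelengths = 6, 10, 64
grid = np.linspace(0.0, 1.0, n_wavelengths)
peak = np.exp(-((grid - 0.5) ** 2) / 0.02)
X = rng.normal(scale=0.01, size=(n_batches, n_times, n_wavelengths))
X[:3] += peak        # hypothetical group A
X[3:] += 2.0 * peak  # hypothetical group B

scores, explained = dwt_mpca(X)
print(scores.shape, explained)
```

In this toy setting the first principal component separates the two synthetic groups, which mirrors the kind of batch-level discrimination the abstract describes. Real NIR or ESI-MS data would also need the baseline correction and alignment steps mentioned above.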
Supervisor: Not available Sponsor: International Islamic University Malaysia ; Ministry of Higher Education Malaysia
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available