Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.537503
Title: Confidence and Venn machines and their applications to proteomics
Author: Devetyarov, Dimitry
Awarding Body: Royal Holloway, University of London
Current Institution: Royal Holloway, University of London
Date of Award: 2010
Abstract:
When a prediction is made in a classification or regression problem, it is useful to have additional information on how reliable this individual prediction is. Such predictions, complemented with the additional information, are also expected to be valid, i.e., to have a guarantee on the outcome. The recently developed frameworks of confidence machines, category-based confidence machines and Venn machines address these problems: confidence machines complement each prediction with its confidence and output region predictions with a guaranteed asymptotic error rate; Venn machines output multiprobability predictions which are valid with respect to observed frequencies. Another advantage of these frameworks is that they are based on the i.i.d. assumption and do not depend on the probability distribution of examples. This thesis is devoted to further development of these frameworks.

Firstly, novel designs and implementations of confidence machines and Venn machines are proposed. These implementations are based on random forest and support vector machine classifiers and inherit their ability to predict with high accuracy on certain types of data. Experimental testing is carried out.

Secondly, several algorithms with online validity are designed for proteomic data analysis. These algorithms take into account the nature of mass spectrometry experiments and the special features of the data analysed. They also allow us to address medical problems: to diagnose diseases early and to identify potential biomarkers. An extensive experimental study is performed on the UK Collaborative Trial of Ovarian Cancer Screening data sets.

Finally, in theoretical research, we extend the class of algorithms which output valid predictions in the online mode: we develop a new method of constructing valid prediction intervals for a statistical model different from the standard i.i.d. assumption used in confidence and Venn machines.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.537503
DOI: Not available