Title: Statistical methods for Monte-Carlo based multiple hypothesis testing
Author: Hahn, Georg
ISNI: 0000 0004 7233 0148
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2015
Statistical hypothesis testing is a key technique for performing statistical inference. The main focus of this work is to investigate multiple testing under the assumption that the analytical p-values underlying the tests for all hypotheses are unknown. Instead, we assume that they can be approximated by drawing Monte Carlo samples under the null. The first part of this thesis focuses on the computation of test results with a guarantee on their correctness, that is, decisions on multiple hypotheses which are identical to the ones obtained with the unknown p-values. We present MMCTest, an algorithm to implement a multiple testing procedure which yields correct decisions on all hypotheses (up to a pre-specified error probability) based solely on Monte Carlo simulation. MMCTest offers novel ways to evaluate multiple hypotheses, as it makes it possible to obtain the (previously unknown) correct decision on hypotheses (for instance, genes) in real data studies (again up to an error probability pre-specified by the user). The ideas behind MMCTest are generalised in a framework for Monte Carlo based multiple testing, demonstrating that existing methods giving no guarantees on their test results can be modified to yield certain theoretical guarantees on the correctness of their outputs. The second part deals with multiple testing from a practical perspective. In practice, it may be preferable to forgo guaranteed decisions and to invest the additional computational effort they would require in computing a more accurate ad hoc test result instead. This is attempted by QuickMMCTest, an algorithm which adaptively allocates more samples to hypotheses whose decisions are more prone to random fluctuations, thereby achieving improved accuracy.
This work also derives the optimal allocation of a finite number of samples to finitely many hypotheses under a normal approximation, where the optimal allocation is understood as the one minimising the expected number of erroneously classified hypotheses (with respect to the classification based on the analytical p-values). An empirical comparison of the optimal allocation of samples with the one computed by QuickMMCTest indicates that the behaviour of QuickMMCTest may be close to optimal.
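The basic setting of the abstract, approximating unknown p-values by Monte Carlo samples drawn under the null and then applying a multiple testing procedure to the estimates, can be illustrated with a minimal sketch. This is not MMCTest or QuickMMCTest: it gives no guarantee on the correctness of its decisions and allocates the same number of samples to every hypothesis. The estimator (exceedances + 1) / (samples + 1), the choice of the Benjamini-Hochberg procedure, the standard normal null, the function names and the observed statistics are all assumptions made for illustration and are not taken from the thesis.

```python
import random


def mc_p_value(observed, simulate_null, n_samples, rng):
    """Estimate a one-sided p-value from n_samples draws under the null.

    Uses the common estimator (exceedances + 1) / (n_samples + 1),
    an illustrative choice that is never exactly zero.
    """
    exceed = sum(1 for _ in range(n_samples) if simulate_null(rng) >= observed)
    return (exceed + 1) / (n_samples + 1)


def benjamini_hochberg(p_values, alpha):
    """Return the sorted indices rejected by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value lies below its threshold
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k = rank
    return sorted(order[:k])


def simulate_null(rng):
    """Hypothetical null model: the test statistic is standard normal."""
    return rng.gauss(0.0, 1.0)


rng = random.Random(0)
observed_stats = [0.1, 3.5, 0.4, 4.2, -0.3]  # made-up test statistics

# Equal allocation: every hypothesis receives the same number of samples.
p_hat = [mc_p_value(t, simulate_null, 2000, rng) for t in observed_stats]
rejected = benjamini_hochberg(p_hat, alpha=0.1)
print(p_hat, rejected)
```

Because the decisions are computed from estimated p-values, a hypothesis whose estimate lies close to its rejection threshold may be classified differently from one run to the next. This is the kind of random fluctuation that the adaptive allocation described above is meant to reduce.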
Supervisor: Gandy, Axel
Sponsor: Engineering and Physical Sciences Research Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral