Use this URL to cite or link to this record in EThOS:  https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.699320 
Title:  Spectral methods and computational trade-offs in high-dimensional statistical inference  
Author:  Wang, Tengyao 
ORCID:  0000-0003-2426-4679
ISNI:  0000 0004 5989 0140


Awarding Body:  University of Cambridge  
Current Institution:  University of Cambridge  
Date of Award:  2016  
Abstract:  
Spectral methods have become increasingly popular in designing fast algorithms for modern high-dimensional datasets. This thesis looks at several problems in which spectral methods play a central role. In some cases, we also show that such procedures have essentially the best performance among all randomised polynomial-time algorithms by exhibiting statistical and computational trade-offs in those problems. In the first chapter, we prove a useful variant of the well-known Davis–Kahan theorem, which is a spectral perturbation result that allows us to bound the distance between population eigenspaces and their sample versions. We then propose a semidefinite programming algorithm for the sparse principal component analysis (PCA) problem, and analyse its theoretical performance using the perturbation bounds we derived earlier. It turns out that the parameter regime in which our estimator is consistent is strictly smaller than the consistency regime of a minimax optimal (yet computationally intractable) estimator. We show through reduction from a well-known hard problem in computational complexity theory that the difference in consistency regimes is unavoidable for any randomised polynomial-time estimator, hence revealing subtle statistical and computational trade-offs in this problem. Such computational trade-offs also exist in the problem of restricted isometry certification. Certifiers for restricted isometry properties can be used to construct design matrices for sparse linear regression problems. As in the sparse PCA problem, we show that there is an intrinsic gap between the class of matrices certifiable using unrestricted algorithms and the class certifiable using polynomial-time algorithms. Finally, we consider the problem of high-dimensional changepoint estimation, where we estimate the time of change in the mean of a high-dimensional time series with piecewise constant mean structure. Motivated by real-world applications, we assume that changes only occur in a sparse subset of all coordinates. We apply a variant of the semidefinite programming algorithm in sparse PCA to aggregate the signals across different coordinates in a near-optimal way so as to estimate the changepoint location as accurately as possible. Our statistical procedure shows superior performance compared to existing methods in this problem.


Supervisor:  Not available  Sponsor:  St John's College and Cambridge Overseas Trust  
Qualification Name:  Thesis (Ph.D.)  Qualification Level:  Doctoral  
EThOS ID:  uk.bl.ethos.699320  DOI:  
Keywords:  Mathematical statistics ; spectral methods ; Davis–Kahan theorem ; principal component analysis ; PCA ; restricted isometry ; high-dimensional changepoint estimation ; semidefinite programming  