Title: Statistical inference in high-dimensional matrix models
Author: Löffler, Matthias
ISNI: 0000 0004 8501 2245
Awarding Body: University of Cambridge
Current Institution: University of Cambridge
Date of Award: 2020
Availability of Full Text:
Full text unavailable from EThOS.
Matrix models are ubiquitous in modern statistics. For instance, they are used in finance to assess the interdependence of assets, in genomics to impute missing data and in movie recommender systems to model the relationship between users and movie ratings. Typically such models are either high-dimensional, meaning that the number of parameters may exceed the number of data points by many orders of magnitude, or nonparametric, in the sense that the quantity of interest is an infinite-dimensional operator. This leads to new algorithms and also to new theoretical phenomena that may occur when estimating a parameter of interest or functionals of it, or when constructing confidence sets. In this thesis, we consider three such matrix models as examples and develop statistical theory for them: matrix completion, Principal Component Analysis (PCA) with Gaussian data and transition operators of Markov chains. We start with matrix completion and investigate the existence of adaptive confidence sets in the 'Bernoulli' and 'trace-regression' models. In the 'Bernoulli' model we show that adaptive confidence sets do not exist when the variance of the errors is unknown, whereas we give an explicit construction in the 'trace-regression' model. In the known-variance case, we show by a testing argument that adaptive confidence sets also exist in the 'Bernoulli' model. Next, we consider PCA in a Gaussian observation model with complexity measured by the effective rank, the reciprocal of the proportion of variance explained by the first principal component. We investigate estimation of linear functionals of eigenvectors and prove Berry-Esseen type bounds. Due to the high dimensionality of the problem we discover a new phenomenon: the plug-in estimator based on the sample eigenvector can have non-negligible bias and hence may no longer be √n-consistent. We show how to de-bias this estimator, achieving √n-convergence rates, and prove exact matching minimax lower bounds. Finally, we consider nonparametric estimation of the transition operator of a Markov chain and its transition density. We assume that the singular values of the transition operator decay exponentially. For example, this assumption is fulfilled by discrete, low-frequency observations of periodised, reversible stochastic differential equations. Using penalization techniques from low-rank matrix estimation we develop a new algorithm and show improved convergence rates.
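The effective rank mentioned in the abstract can be illustrated with a small sketch. It assumes the eigenvalues of a covariance matrix are already available as a list of non-negative numbers; the function name `effective_rank` and the example spectra are hypothetical and are not taken from the thesis.

```python
def effective_rank(eigenvalues):
    """Total variance divided by the largest eigenvalue.

    This equals the reciprocal of the proportion of variance
    explained by the first principal component.
    """
    eigenvalues = list(eigenvalues)
    if not eigenvalues or min(eigenvalues) < 0:
        raise ValueError("expected a non-empty list of non-negative eigenvalues")
    top = max(eigenvalues)
    if top == 0:
        raise ValueError("the zero matrix has no well-defined effective rank")
    return sum(eigenvalues) / top


# Illustrative spectra (assumed, not from the thesis).
spiked = [4.0, 1.0, 1.0, 1.0, 1.0]   # first component explains 4/8 of the variance
isotropic = [1.0] * 5                # every direction carries equal variance

print(effective_rank(spiked))     # 2.0
print(effective_rank(isotropic))  # 5.0
```

A dominant first component gives a small effective rank, while a flat spectrum gives an effective rank equal to the dimension.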
Supervisor: Nickl, Richard
Sponsor: ERC ; EPSRC
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
Keywords: High-dimensional Statistics ; Low-rank inference ; PCA