Title: Maximum likelihood estimation of a multivariate log-concave density
Author: Cule, Madeleine
ISNI: 0000 0004 2708 3205
Awarding Body: University of Cambridge
Current Institution: University of Cambridge
Date of Award: 2010
Availability of Full Text: Full text unavailable from EThOS.
Abstract:
Density estimation is a fundamental statistical problem. Many methods are either sensitive to model misspecification (parametric models) or difficult to calibrate, especially for multivariate data (nonparametric smoothing methods). We propose an alternative approach using maximum likelihood under a qualitative assumption on the shape of the density, specifically log-concavity. The class of log-concave densities includes many common parametric families and has desirable properties. For univariate data, these estimators are relatively well understood, and are gaining in popularity in theory and practice. We discuss extensions for multivariate data, which require different techniques.

After establishing existence and uniqueness of the log-concave maximum likelihood estimator for multivariate data, we see that a reformulation allows us to compute it using standard convex optimization techniques. Unlike kernel density estimation, or other nonparametric smoothing methods, this is a fully automatic procedure, and no additional tuning parameters are required.

Since the assumption of log-concavity is non-trivial, we introduce a method for assessing the suitability of this shape constraint and apply it to several simulated datasets and one real dataset. Density estimation is often one stage in a more complicated statistical procedure. With this in mind, we show how the estimator may be used for plug-in estimation of statistical functionals. A second important extension is the use of log-concave components in mixture models. We illustrate how we may use an EM-style algorithm to fit mixture models where the number of components is known. Applications to visualization and classification are presented. In the latter case, improvement over a Gaussian mixture model is demonstrated.

Performance for density estimation is evaluated in two ways. Firstly, we consider Hellinger convergence (the usual metric of theoretical convergence results for nonparametric maximum likelihood estimators). We prove consistency with respect to this metric and heuristically discuss rates of convergence and model misspecification, supported by empirical investigation. Secondly, we use the mean integrated squared error to demonstrate favourable performance compared with kernel density estimates using a variety of bandwidth selectors, including sophisticated adaptive methods.

Throughout, we emphasise the development of stable numerical procedures able to handle the additional complexity of multivariate data.
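Illustrative note (not part of the thesis record): the two performance criteria named in the abstract can be sketched numerically for univariate densities. The code below is a minimal sketch under stated assumptions, not the thesis's implementation. It assumes the Hellinger convention h(f, g)^2 = (1/2) * integral of (sqrt(f) - sqrt(g))^2, which is one of two common normalisations. It uses two Gaussian densities (both log-concave) as hypothetical stand-ins for a true density and a fitted estimate. It also computes the integrated squared error for a single pair of densities, whereas the mean integrated squared error averages this quantity over repeated samples. All function names are illustrative.

```python
import math


def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of N(mean, sd^2); Gaussian densities are log-concave."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def integrate(func, lo, hi, n=20001):
    """Trapezoidal rule on an equally spaced grid of n points over [lo, hi]."""
    step = (hi - lo) / (n - 1)
    values = [func(lo + i * step) for i in range(n)]
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def hellinger(f, g, lo, hi):
    """Hellinger distance, assuming h^2 = 0.5 * integral (sqrt f - sqrt g)^2."""
    sq = integrate(lambda x: (math.sqrt(f(x)) - math.sqrt(g(x))) ** 2, lo, hi)
    return math.sqrt(0.5 * sq)


def integrated_squared_error(f, g, lo, hi):
    """Integrated squared error: integral of (f - g)^2 over [lo, hi]."""
    return integrate(lambda x: (f(x) - g(x)) ** 2, lo, hi)


# Hypothetical stand-ins: a "true" density and an "estimate" shifted by one unit.
def truth(x):
    return normal_pdf(x, 0.0, 1.0)


def estimate(x):
    return normal_pdf(x, 1.0, 1.0)


h = hellinger(truth, estimate, -10.0, 11.0)
ise = integrated_squared_error(truth, estimate, -10.0, 11.0)
print("Hellinger distance:", h)
print("Integrated squared error:", ise)
```

For two unit-variance Gaussians whose means differ by one, both quantities have closed forms under the convention assumed here, sqrt(1 - exp(-1/8)) for the Hellinger distance and (1 - exp(-1/4)) / sqrt(pi) for the integrated squared error, which gives a quick check on the numerical integration.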
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral