Contribution to the analysis of latent structures
What is a latent variable? Simply defined, a latent variable is a variable that cannot be directly measured or observed. A latent variable model, or latent structure model, is a model whose structure contains one or more latent variables. The subject of this thesis is the study of various topics that arise during the analysis and/or use of latent structure models. Two classical models, namely the factor analysis (FA) model and the finite mixture (FM) model, are first considered and examined extensively, after which the mixture of factor analysers (MFA) model, constructed using ingredients from both FA and FM, is introduced and studied at length. Several extensions of the MFA model are also presented, one of which incorporates fixed observed covariates into the model. Common to all the models considered are such topics as: (a) model selection, which consists of determining or estimating the dimensionality of the latent space; (b) parameter estimation, which consists of estimating the parameters of the postulated model in order to interpret and characterise the mechanism that produced the observed data; (c) prediction, which consists of estimating responses for future unseen observations. Other important topics such as identifiability (for uniqueness of the solution, interpretability and parameter meaningfulness), density estimation, and to a certain extent aspects of unsupervised learning and exploration of group structure (through clustering and data visualisation in 2D) are also covered. We approach parameter estimation and model selection from both the likelihood-based and the Bayesian perspectives, concentrating on maximum likelihood estimation via the EM algorithm and on Bayesian analysis via stochastic simulation (derivation of efficient Markov chain Monte Carlo algorithms).
The main emphasis of our work is on the derivation and construction of computationally efficient algorithms that perform well on both synthetic tasks and real-life problems, and that can be used as alternatives to other existing methods wherever appropriate.
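For orientation, the three models named above can be written in their common textbook form. This is only a sketch under assumed notation: the symbols $x$, $z$, $\mu$, $\Lambda$, $\Psi$, $\pi_k$, $q$ and $K$ are illustrative choices and need not match those used in the body of the thesis, and the Gaussian assumptions shown are the usual ones rather than a statement of the exact variants studied here.

```latex
% Factor analysis (FA): a p-dimensional observation x driven by q < p latent factors z
x = \mu + \Lambda z + \varepsilon, \qquad
z \sim \mathcal{N}(0, I_q), \quad
\varepsilon \sim \mathcal{N}(0, \Psi), \quad \Psi \text{ diagonal},
\qquad\text{so that}\qquad
x \sim \mathcal{N}\!\left(\mu,\; \Lambda\Lambda^{\top} + \Psi\right).

% Finite mixture (FM): K components with latent component membership
p(x) = \sum_{k=1}^{K} \pi_k \, f(x \mid \theta_k), \qquad
\pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1.

% Mixture of factor analysers (MFA): each mixture component is itself an FA model
p(x) = \sum_{k=1}^{K} \pi_k \,
\mathcal{N}\!\left(x \mid \mu_k,\; \Lambda_k\Lambda_k^{\top} + \Psi_k\right).
```

In this notation, the dimensionality of the latent space referred to under model selection corresponds to the number of factors $q$ and, for the mixture models, the number of components $K$. Some formulations of the MFA model use a single noise covariance $\Psi$ shared across components instead of component-specific $\Psi_k$.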