Predictive classification using mixtures of normal distributions
Classification using mixture distributions to model each class has received relatively little attention in the literature. The most important attempts use normal distributions as components in these mixtures. Recently developed methods have made these kinds of models a flexible approach to density estimation. Most of the methods developed so far use plug-in estimates for the parameters and assume that the number of components in the mixture is known. We obtain a predictive classifier for the classes by using Markov chain Monte Carlo techniques, which allow us to obtain a sampling chain for the parameters. This fully Bayesian approach to classification has the advantage that the number of components for each class is treated as another unknown parameter and integrated out of the classification. To achieve this we use a birth-and-death/Gibbs sampler algorithm developed by Stephens (1997).

We use five datasets: two simulated ones to test the methods on a single class and three real datasets to test the methods for classification. We compare different models to determine which gives greater flexibility in the modelling and better overall classification. We consider different types of priors for the means and dispersion matrices of the components: both joint conjugate priors and independent conjugate priors are used. We use a model with a common dispersion matrix for all the components and another with a reparametrisation of these dispersion matrices into size, shape and orientation (Banfield and Raftery, 1993). We allow the sizes to differ while keeping a common shape and orientation for the dispersion matrices of the components in a class.
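The predictive classifier described above can be sketched as follows: for each class, the class-conditional density at a new point is approximated by averaging the mixture density over the MCMC draws (each draw may have a different number of components, so that number is integrated out), and Bayes' rule then gives posterior class probabilities. This is a minimal illustration in Python/NumPy, not the thesis's implementation; the data structures (a list of `(weight, mean, cov)` tuples per draw) are assumptions for the sketch.

```python
import numpy as np

def mvn_pdf(x, mean, cov):
    # Multivariate normal density evaluated at a single point x.
    d = len(mean)
    diff = x - mean
    norm = 1.0 / np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)

def predictive_density(x, draws):
    # Average the mixture density over MCMC draws. Each draw is a list of
    # (weight, mean, cov) components; draws with different numbers of
    # components are handled naturally, so k is integrated out.
    return np.mean([
        sum(w * mvn_pdf(x, m, c) for w, m, c in draw)
        for draw in draws
    ])

def classify(x, class_draws, priors):
    # Bayes' rule: posterior class probability is proportional to the
    # class prior times the predictive class-conditional density.
    scores = {
        label: priors[label] * predictive_density(x, draws)
        for label, draws in class_draws.items()
    }
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}
```

With a single one-component draw per class, `classify` reduces to ordinary Bayesian discriminant analysis; with many draws from the birth-and-death/Gibbs chain it approximates the fully predictive classifier.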
We found that the model with independent conjugate priors for the means and dispersions, while allowing the sizes of the dispersions to vary, gave the best results for classification purposes, as it allowed great flexibility and good separation between the components of the classes.
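The size/shape/orientation reparametrisation of Banfield and Raftery (1993) writes each dispersion matrix as Sigma_j = lambda_j * D A D^T, where lambda_j is a scalar size, D an orthogonal orientation matrix, and A a diagonal shape matrix with unit determinant. The preferred model above shares D and A within a class while letting lambda_j vary. A small illustrative sketch (the function name and argument layout are assumptions, not the thesis's code):

```python
import numpy as np

def make_covariances(sizes, orientation, shape):
    # Banfield-Raftery style parametrisation: Sigma_j = lambda_j * D A D^T,
    # where lambda_j (size) varies across components while D (orientation,
    # orthogonal) and A (shape, diagonal with determinant 1) are shared
    # within a class.
    return [lam * orientation @ shape @ orientation.T for lam in sizes]

# Example: two components with common shape and orientation but sizes 1 and 4.
theta = np.pi / 4
D = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # orientation
A = np.diag([2.0, 0.5])                           # shape, det(A) = 1
covs = make_covariances([1.0, 4.0], D, A)
```

Because only the scalar size differs, the second covariance is an exact multiple of the first, which is what allows the components of a class to occupy regions of different volume while keeping identical shape and orientation.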