Title: Bayesian regression and discrimination with many variables
Author: Chang, Kai-Ming
ISNI:       0000 0001 3527 7841
Awarding Body: University of London
Current Institution: University College London (University of London)
Date of Award: 2001
Availability of Full Text: Full text unavailable from EThOS.
This thesis attempts to provide general procedures for Bayesian regression and discriminant analysis with many variables, and to explore potential problems in the analysis. For regression analysis, a normal random regression model is assumed, i.e. the joint distribution of the response variables and the regressors is multivariate normal given their mean vector and covariance matrix. For discriminant analysis, we consider the case where each observation comes from one of several multivariate normal populations. In classical statistics, the problem with fitting a multivariate model with more variables than observations is that the estimate of the covariance matrix of the multivariate normal distribution is singular and the fitted distribution is degenerate. In Bayesian statistics, this problem can be avoided by placing a proper prior on the covariance matrix. We assign an inverse-Wishart distribution (the conjugate prior in the non-hierarchical case) to the covariance matrix and assume that the prior expected covariance matrix has a simple structure, so that the model requires only a small number of hyperparameters. These hyperparameters are modelled hierarchically. Although our strong assumptions keep the model relatively simple, the posterior is still complicated. We found ARMS within Gibbs sampling, run with multiple chains, to be an appropriate MCMC strategy for fitting our models; convergence checking for multiple-chain MCMC is straightforward. Owing to the ill-conditioning of the sample covariance matrix and the large number of variables, the computational problems are significant, and appropriate matrix manipulation and rescaling techniques are required. Two practical cases are considered as examples, one for regression and the other for discrimination. Both involve NIR spectral data with many variables, and the high correlation between variables makes the examples more challenging.
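The point about the proper prior can be sketched numerically. The following is not the thesis's code; it is a minimal illustration, with an assumed zero-mean model, identity prior scale, and arbitrary prior degrees of freedom, of how a proper inverse-Wishart prior yields a positive-definite posterior mean for the covariance even when there are more variables than observations.

```python
import numpy as np

# Hedged sketch: with p variables and only n < p observations, the sample
# scatter matrix S is singular (rank <= n), but combining it with a proper
# inverse-Wishart prior IW(nu0, Psi0) gives a full-rank posterior mean.
rng = np.random.default_rng(0)
p, n = 50, 10                      # more variables than observations
X = rng.standard_normal((n, p))

S = X.T @ X                        # sample scatter: rank <= n, so singular
nu0 = p + 2                        # prior degrees of freedom (assumed value)
Psi0 = np.eye(p)                   # simple prior scale structure (identity)

# Conjugate update for a zero-mean multivariate normal: the posterior is
# IW(nu0 + n, Psi0 + S), whose mean is (Psi0 + S) / (nu0 + n - p - 1).
post_mean = (Psi0 + S) / (nu0 + n - p - 1)

print(np.linalg.matrix_rank(S))          # rank n < p: sample estimate degenerate
print(np.linalg.matrix_rank(post_mean))  # rank p: posterior mean non-singular
```

Adding the positive-definite prior scale `Psi0` is what lifts the rank: the posterior mean is a weighted blend of prior structure and data, so it remains invertible regardless of how few observations there are.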
We consider three correlation structures: the over-simplified identity structure and two autoregressive correlation functions, which are believed to be much closer to the real situation. However, we found that the autoregressive correlation functions do not guarantee better predictions in our examples.
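For ordered variables such as NIR wavelengths, one standard autoregressive correlation function is the AR(1) form, where correlation decays geometrically with the distance between channels. The thesis does not specify its exact functional forms or parameter values; the sketch below uses a hypothetical decay parameter `rho` purely to illustrate the structure (the identity structure corresponds to no off-diagonal correlation at all).

```python
import numpy as np

def ar1_correlation(p, rho):
    """AR(1)-style correlation matrix with C[i, j] = rho ** |i - j|.

    Neighbouring variables (adjacent wavelengths) are highly correlated,
    and correlation decays geometrically with channel distance.
    """
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])

C = ar1_correlation(5, 0.9)   # rho = 0.9 is a hypothetical value
print(C[0, 1])                # neighbouring channels: correlation 0.9
print(C[0, 4])                # distant channels: correlation 0.9 ** 4
```

A structure like this needs only one hyperparameter for the whole p-by-p correlation matrix, which is what keeps the number of hyperparameters small even with many variables.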
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Multivariate normal distribution