Title:

Smoothing and ordering in discriminant analysis

This thesis addresses the question of how to achieve reliable estimation of the posterior probability function in discriminant analysis, for both continuous and ordered discrete feature variables. In the latter instance we are also concerned with the estimation of a posterior which, regarded as a function of the feature variables, is ordered with respect to one or more independent variables. Chapter 1 introduces the discrimination problem, establishes notation and describes the possible approaches. Methods of density estimation for use in discriminant analysis are described, including the kernel method, as are some more direct approaches to discrimination and classification. Some comparative studies and their conclusions are reviewed. Means of assessing the performance of a discriminant rule are described, with emphasis on measures of reliability rather than separation. The final section mentions briefly the important problem of variable selection, although this is not addressed elsewhere in the thesis. Chapter 2 addresses the problem of choosing smoothing parameters in kernel density estimation with continuous variables when this is to be used in the discrimination context. It is natural to suspect that the optimal degree of smoothing for marginal density estimates may not be that which will produce an optimal density ratio or posterior probability function when two such estimates are combined. A simulation study confirms that some popular methods for choosing the smoothing parameter can produce an estimated density ratio which is poor in terms of mean square error. Some alternatives are proposed based on direct assessment measures of reliability, not of the marginal estimates but of the predicted probabilities. These are compared to the marginal approaches. To a more limited extent, the optimal (minimum mean square error) kernel method is compared to an optimal spline estimate of the density ratio.
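
As an illustration of the Chapter 2 idea of judging the smoothing parameter by the predicted probabilities rather than by the marginal density estimates, the following is a minimal sketch only. It assumes a Gaussian kernel, a single bandwidth h shared by both classes, and a leave-one-out logarithmic score as the direct criterion; the function names, the toy data and the choice of score are all hypothetical and are not taken from the thesis, whose assessment measures may differ.

```python
import math


def kde(x, sample, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    if not sample:
        return 0.0
    c = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)


def posterior1(x, s1, s2, h, prior1=0.5):
    """Estimated P(class 1 | x) formed from two kernel density estimates."""
    a = prior1 * kde(x, s1, h)
    b = (1.0 - prior1) * kde(x, s2, h)
    # If both estimates underflow to zero, fall back on the prior.
    return a / (a + b) if a + b > 0.0 else prior1


def loo_log_score(s1, s2, h, prior1=0.5, eps=1e-12):
    """Mean leave-one-out log predicted probability of the true class.

    Each point is removed from its own class sample before its posterior
    is computed; probabilities are clipped at eps to keep the log finite.
    """
    total = 0.0
    for i, x in enumerate(s1):
        p = posterior1(x, s1[:i] + s1[i + 1:], s2, h, prior1)
        total += math.log(min(max(p, eps), 1.0))
    for i, x in enumerate(s2):
        p = 1.0 - posterior1(x, s1, s2[:i] + s2[i + 1:], h, prior1)
        total += math.log(min(max(p, eps), 1.0))
    return total / (len(s1) + len(s2))


def choose_bandwidth(s1, s2, grid, prior1=0.5):
    """Pick the bandwidth in grid with the largest leave-one-out log score."""
    return max(grid, key=lambda h: loo_log_score(s1, s2, h, prior1))


# Toy samples (invented for illustration) and a small candidate grid.
s1 = [-1.2, -0.5, 0.0, 0.3, 0.9]
s2 = [1.1, 1.8, 2.0, 2.6, 3.1]
grid = [0.1, 0.2, 0.5, 1.0, 2.0]
h_best = choose_bandwidth(s1, s2, grid)
```

A marginal approach would instead choose a bandwidth for each class sample separately, for example by likelihood cross-validation on that sample alone, and only then form the ratio.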
Both the marginal and direct methods are then applied to a real data set and the resulting estimates compared with a spline estimate. Chapter 3 discusses ordered variables, from qualitative orderings to grouped continuous variables, ways in which ordering can affect a data set and suitable models in each case. Particular emphasis is given to discrete kernel estimators and isotonic regression techniques. Some problems in applying existing algorithms for the latter are described and suggestions made for overcoming these. Chapter 4 applies ordered kernels and isotonic regression to one- and two-dimensional problems using the data of Titterington et al. (1981), concluding that the kernel methods are unable to recover the type of ordering manifested by the data and that a diagnostic approach is required. The results are compared in the univariate case to those in Chapter 2, Section 2.6, which used continuous kernels. The use of isotonic regression is then compared with two logistic models and an independence model using the same data set but with three variables. Suggestions are made for further smoothing of the isotonic estimator, two of which are implemented. Finally, Chapter 5 draws some conclusions and makes suggestions for further work. In particular, isotonic splines may be worthy of investigation.
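
For readers unfamiliar with isotonic regression, the one-dimensional case can be sketched with the standard pool-adjacent-violators algorithm. This is a generic textbook illustration under stated assumptions, not the algorithm or the modifications discussed in Chapter 3: it assumes strictly positive weights (so categories with no observations would need separate handling), and the function name and the counts below are invented for the example.

```python
def pava(y, w=None):
    """Weighted least-squares non-decreasing fit to y.

    Adjacent values that violate the ordering are pooled into blocks and
    replaced by their weighted mean. Weights are assumed strictly positive.
    """
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block holds [weighted mean, total weight, length]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while the last two blocks are out of order.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    fit = []
    for m, _, n in blocks:
        fit.extend([m] * n)
    return fit


# Hypothetical counts at five ordered levels of one feature variable:
# class-1 cases and total cases at each level.
class1 = [1, 3, 2, 6, 5]
totals = [10, 10, 8, 10, 6]
raw = [a / n for a, n in zip(class1, totals)]
# Ordered estimate of P(class 1 | level), weighting each level by its total.
ordered = pava(raw, [float(n) for n in totals])
```

Here the raw proportions at the second and third levels are out of order, so they are pooled into a single weighted mean while the other levels are left unchanged.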
