Title:

Simple components, correlated components and an application of statistical shape analysis to consumer and other multivariate data

The interpretation of a principal component analysis can be complicated because the components are linear combinations of possibly many observed variables. A rotation of the principal components can improve the interpretation; however, there are usually still many small, noninformative loadings which, taken together, account for a significant proportion of the observed variation. A new, computationally efficient method is presented that finds simple components using criteria similar to those of principal components. Simple components are defined to have restricted weights that are proportional to the set of integers {-1, 0, 1}. This choice ensures that no subjective decision is required as to whether a weight is important, and an individual weight is interpreted in a similar way to a correlation of one, minus one or zero with the component. The algorithm can find solutions for large problems in tractable time and can easily accommodate alternative criteria. An application is proposed that provides a simple component summary of a large data set. When data are related to an orthogonal basis, the axes represent the maximum separation of information between axes. An approach is developed that finds orthogonal rotations of the principal components so that the sum of the covariances, or the sum of the squared covariances, between a set of components is maximized. This approach can find a group of correlated components that explain a latent trait and, in addition, explain different aspects of that trait. Another application is developed in which an arbitrary configuration of points, from multidimensional scaling or a similar method, can be displayed on a parallel coordinate plot so that the number of crossovers between the axes is minimized. This aids the identification of clusters and outliers. In consumer research a respondent's perception is often driven by tacit knowledge, for example when making product comparisons. However, the traditional variable analogue scale may not capture this.
A two-dimensional response is proposed for a multiple product comparison. Principal shape analysis is developed to extract latent shape responses from the questions answered by the respondents. The analysis framework is coordinate-free and uses a scaled Euclidean distance matrix to represent a configuration of products, which can be considered a shape. A Euclidean distance matrix representation does not suffer from the problems associated with the use of shape coordinate systems.
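
The coordinate-free property of a scaled Euclidean distance matrix can be illustrated with a minimal sketch. The abstract does not state how the matrix is scaled, so division by the Frobenius norm is an assumption made here for illustration only; the function names and the four-product configuration are likewise hypothetical. The sketch shows that translating, rotating and uniformly rescaling a configuration leaves its scaled distance matrix unchanged, so no shape coordinate system has to be chosen.

```python
import math


def scaled_distance_matrix(points):
    """Pairwise Euclidean distances divided by their Frobenius norm.

    The Frobenius-norm scaling is an assumption for this sketch; the
    source only says that the distance matrix is "scaled".
    """
    dist = [[math.dist(p, q) for q in points] for p in points]
    total = math.sqrt(sum(d * d for row in dist for d in row))
    if total == 0.0:
        raise ValueError("all points coincide, so the shape is undefined")
    return [[d / total for d in row] for row in dist]


def similarity_transform(points, scale, angle, shift):
    """Rotate by `angle` (radians), rescale by `scale`, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (scale * (c * x - s * y) + shift[0], scale * (s * x + c * y) + shift[1])
        for x, y in points
    ]


def matrices_close(a, b, tol=1e-9):
    """Element-wise comparison of two equally sized square matrices."""
    return all(
        math.isclose(x, y, abs_tol=tol)
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


# Hypothetical placement of four products on a two-dimensional response sheet.
config = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 1.0)]

# The same configuration after rotation, rescaling and translation.
moved = similarity_transform(config, scale=3.5, angle=0.7, shift=(10.0, -4.0))

# A genuinely different shape: only the first coordinate is stretched.
stretched = [(2.0 * x, y) for x, y in config]

same_shape = matrices_close(scaled_distance_matrix(config),
                            scaled_distance_matrix(moved))
different_shape = not matrices_close(scaled_distance_matrix(config),
                                     scaled_distance_matrix(stretched))
```

Under these assumptions, `same_shape` is true because all pairwise distances change by the same factor, which the scaling removes, while `different_shape` is true because a non-uniform stretch alters the ratios between distances.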
