Statistical classification of atmospheric regimes
Meteorologists have spent decades attempting to predict the weather over extended periods of time. Even complex models with up to several million variables produce reliable predictions only up to four days ahead. By representing the atmosphere in a multi-dimensional 'phase space', we hope to find preferred regions of this space in which the weather persists. We applied 9 clustering methods, some of them new, to data generated from a simple simulation model. These methods represent 3 different levels of interaction between the user and the method. While developing the new clustering methods, we also developed an outlier method which, on a real dataset, is shown to outperform 16 existing multivariate outlier methods. The results of the simulation studies indicate that the more interaction there is between the user and the method, the better the outcome. Next we adapted the usual Ward's clustering method and the Caussinus and Ruiz clustering method to take time into consideration. This created 6 new time-constrained clustering methods, which we applied to data from a new time-dependent simulation model. Consistent patterns were found, and the results also indicate that applying the usual Ward's clustering method to suspected time-dependent data would achieve the best outcome at most 35% of the time. Finally, we looked at ways of sieving transient observations out of cluster groups and highlighting significant transitions by applying several techniques to a meteorological dataset.