Title:
|
Model Coverage, Forward Searching, and Multiple Outlier Detection
|
Atkinson and Riani's forward search approach has been proposed as a robust procedure
for the detection of multiple outliers in fitting statistical models. However,
problems appear when it is applied to certain data sets. This thesis identifies the
probable reason to be their method of initial subset choice. The degree of aliasing,
inadequate representation of the range of values in the predictor set, and the
robust estimation can all lead to a poor choice of subset, which will lead to erroneous
conclusions. This motivates the concept of model coverage. Three coverage
measures are proposed to consider model coverage in the search. A starting set
with good coverage was found to improve the forward search procedure on a wide
range of data sets. These ideas are implemented in Lisp-Stat and illustrated using
many real-life and simulated data sets. Several methodologies, using these ideas
in the forward search, were investigated, and the observed improvements to the search
were recorded.
We called the best such methodology the DCA forward search algorithm. This
new algorithm was applied to many data sets, and substantial improvements in the
diagnostic plots were observed, providing better results when the outlier structure is
known. The DCA forward search algorithm is shown to give good improvement
over the Atkinson and Riani methodology, and produces accurate results with data
sets containing high-leverage outliers.
The DCA forward search algorithm is simple to use and produces results quickly
and efficiently.
|