Title: Novel computationally intelligent machine learning algorithms for data mining and knowledge discovery
Author: Gheyas, Iffat A.
ISNI: 0000 0004 2736 1308
Awarding Body: University of Stirling
Current Institution: University of Stirling
Date of Award: 2009
This thesis addresses three major issues in data mining: feature subset selection in high-dimensional domains, plausible reconstruction of incomplete data in cross-sectional applications, and forecasting of univariate time series. For the automated selection of an optimal subset of features in real time, we present an improved hybrid algorithm: SAGA. SAGA combines the ability of Simulated Annealing to avoid being trapped in local minima with the very high convergence rate of the crossover operator of Genetic Algorithms, the strong local search ability of greedy algorithms, and the high computational efficiency of Generalized Regression Neural Networks (GRNNs). For imputing missing values and forecasting univariate time series, we propose a homogeneous neural network ensemble. The proposed ensemble consists of a committee of GRNNs trained on different subsets of features generated by SAGA, and the predictions of the base classifiers are combined by a fusion rule. This approach makes it possible to discover all important interrelations between the values of the target variable and the input features. The proposed ensemble scheme has two innovative features which make it stand out amongst ensemble learning algorithms: (1) the ensemble makeup is optimized automatically by SAGA; and (2) GRNN is used for both the base classifiers and the top-level combiner classifier. Because it uses GRNN, the proposed ensemble is a dynamic weighting scheme. This is in contrast to existing ensemble approaches, which rely on simple voting or static weighting strategies. The basic idea of the dynamic weighting procedure is to give a higher reliability weight to those scenarios that are similar to the new ones. The simulation results demonstrate the validity of the proposed ensemble model.
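The dynamic weighting idea in the abstract can be illustrated with a minimal sketch. It assumes the standard GRNN formulation, in which a prediction is a Gaussian-kernel-weighted average of the training targets. The function names, the smoothing parameter `sigma`, the hard-coded feature subsets (standing in for SAGA's output), and the leave-one-out construction of the combiner's training data are all illustrative assumptions, not details taken from the thesis.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Standard GRNN output: kernel-weighted average of the training targets."""
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        # Training cases close to the query receive weights near 1.
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed: fall back to the unweighted mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def project(row, subset):
    """Keep only the features whose indices are listed in `subset`."""
    return [row[i] for i in subset]


def ensemble_predict(train_x, train_y, query, feature_subsets, sigma=0.5):
    """Hypothetical two-level ensemble: base GRNNs plus a GRNN combiner."""
    # Combiner training inputs: leave-one-out base predictions for each row
    # (an assumption made here so that a row never predicts itself).
    meta_x = []
    for i in range(len(train_x)):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        meta_x.append([
            grnn_predict([project(r, s) for r in rest_x], rest_y,
                         project(train_x[i], s), sigma)
            for s in feature_subsets
        ])
    # Base predictions for the new case, one per feature subset.
    meta_query = [
        grnn_predict([project(r, s) for r in train_x], train_y,
                     project(query, s), sigma)
        for s in feature_subsets
    ]
    # The combiner is itself a GRNN over the base predictions.
    return grnn_predict(meta_x, train_y, meta_query, sigma)


# Toy data; the subsets below are placeholders for SAGA-selected subsets.
train_x = [[0.0, 5.0], [1.0, 3.0], [2.0, 1.0], [3.0, 4.0]]
train_y = [0.0, 1.0, 2.0, 3.0]
subsets = [[0], [1], [0, 1]]
prediction = ensemble_predict(train_x, train_y, [1.5, 2.0], subsets)
```

Because the kernel weights are recomputed for every query, training cases that resemble the new case dominate the average, which is one way to read the "higher reliability weight" described above. A static weighting scheme would instead fix the combination weights once, independently of the query.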
Supervisor: Smith, Leslie S.
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:  DOI: Not available
Keywords: Feature Subset Selection ; Missing value imputation ; Single Imputation ; Multiple Imputation ; Dimensionality Reduction ; Time Series Forecasting ; Curse of Dimensionality ; Neural Networks ; Evolutionary Algorithm ; Data mining ; Internet searching ; Machine learning