Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.582997
Title: Partial least squares structural equation modelling with incomplete data : an investigation of the impact of imputation methods
Author: Mohd Jamil, J. B.
Awarding Body: University of Bradford
Current Institution: University of Bradford
Date of Award: 2012
Availability of Full Text:
Access from EThOS:
Access from Institution:
Abstract:
Despite considerable advances in missing data imputation methods over the last three decades, the problem of missing data remains largely unsolved. Many techniques have emerged in the literature as candidate solutions. These techniques can be categorised into two classes: statistical methods of data imputation and computational intelligence methods of data imputation. Due to the longstanding use of statistical methods in handling missing data problems, it takes quite some time for computational intelligence methods to gain profound attention even though these methods have analogous accuracy, in comparison to other approaches. The merits of both these classes have been discussed at length in the literature, but only limited studies make significant comparison to these classes. This thesis contributes to knowledge by firstly, conducting a comprehensive comparison of standard statistical methods of data imputation, namely, mean substitution (MS), regression imputation (RI), expectation maximization (EM), tree imputation (TI) and multiple imputation (MI) on missing completely at random (MCAR) data sets. Secondly, this study also compares the efficacy of these methods with a computational intelligence method of data imputation, ii namely, a neural network (NN) on missing not at random (MNAR) data sets. The significance difference in performance of the methods is presented. Thirdly, a novel procedure for handling missing data is presented. A hybrid combination of each of these statistical methods with a NN, known here as the post-processing procedure, was adopted to approximate MNAR data sets. Simulation studies for each of these imputation approaches have been conducted to assess the impact of missing values on partial least squares structural equation modelling (PLS-SEM) based on the estimated accuracy of both structural and measurement parameters. The best method to deal with particular missing data mechanisms is highly recognized. Several significant insights were deduced from the simulation results. It was figured that for the problem of MCAR by using statistical methods of data imputation, MI performs better than the other methods for all percentages of missing data. Another unique contribution is found when comparing the results before and after the NN post-processing procedure. This improvement in accuracy may be resulted from the neural network's ability to derive meaning from the imputed data set found by the statistical methods. Based on these results, the NN post-processing procedure is capable to assist MS in producing significant improvement in accuracy of the approximated values. This is a promising result, as MS is the weakest method in this study. This evidence is also informative as MS is often used as the default method available to users of PLS-SEM software.
Supervisor: Wallace, James Sponsor: Minister of Higher Education Malaysia ; University Utara Malaysia
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.582997  DOI: Not available
Keywords: Missing data ; Partial least squares ; Structural equation modelling ; Neural networks ; Imputation methods ; Incomplete data
Share: