Title: An intrusion detection scheme for identifying known and unknown web attacks (I-WEB)
Author: Kamarudin, Muhammad Hilmi
ISNI: 0000 0004 7425 6287
Awarding Body: University of Warwick
Current Institution: University of Warwick
Date of Award: 2018
A large number of features increases a system's computational effort when processing high volumes of network traffic. Using all available features is also counterproductive, since redundant or irrelevant features degrade detection performance. Meanwhile, statistical approaches are extensively practised in the Anomaly Based Detection System (ABDS) environment. These statistical techniques do not require any prior knowledge of attack traffic, an advantage that has attracted many researchers to the method. Nevertheless, their performance is still unsatisfactory because they produce high false detection rates. In recent years, the demand for data mining (DM) techniques in the field of anomaly detection has increased significantly. Even though this approach can distinguish normal and attack behaviour effectively, its performance (true positive, true negative, false positive and false negative) still falls short of the expected improvement. Moreover, the need to re-initiate the whole learning procedure, even when the attack traffic has previously been detected, appears to contribute to poor system performance. This study aims to improve the detection of normal and abnormal traffic by determining the prominent features and recognising outlier data points more precisely. To achieve this objective, the study proposes a novel Intrusion Detection Scheme for Identifying Known and Unknown Web Attacks (I-WEB), which combines various strategies and methods. The proposed I-WEB is divided into three phases: pre-processing, anomaly detection and post-processing. In the pre-processing phase, the strengths of both filter and wrapper procedures are combined to select the optimal set of features. Correlation-based Feature Selection (CFS) is proposed for the filter procedure, whereas the Random Forest (RF) classifier is chosen to evaluate feature subsets in the wrapper procedure.
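The filter step can be illustrated with a minimal sketch of the standard CFS merit heuristic, which rewards features that correlate with the class and penalises features that correlate with each other. The sketch assumes Pearson correlation as the correlation measure and a greedy forward search; the abstract specifies neither. The function names and the toy data are hypothetical, and the RF-based wrapper step is not shown.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no correlation
    return cov / math.sqrt(vx * vy)


def cfs_merit(columns, labels, subset):
    """CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation.
    r_cf = sum(abs(pearson(columns[f], labels)) for f in subset) / k
    # Mean absolute feature-feature correlation (0 for a single feature).
    pairs = list(combinations(subset, 2))
    r_ff = (
        sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_cfs(columns, labels):
    """Assumed search strategy: add features while the merit strictly improves."""
    selected, best = [], 0.0
    remaining = set(columns)
    while remaining:
        merit, feat = max(
            (cfs_merit(columns, labels, selected + [f]), f)
            for f in sorted(remaining)
        )
        if merit <= best:
            break
        selected.append(feat)
        best = merit
        remaining.discard(feat)
    return selected, best


# Hypothetical toy data: one informative feature, an exact duplicate of it,
# and a feature that is uncorrelated with the class label.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "f_a": [1, 2, 1, 2, 8, 9, 8, 9],
    "f_dup": [1, 2, 1, 2, 8, 9, 8, 9],
    "f_noise": [5, 1, 5, 1, 5, 1, 5, 1],
}
selected, best = greedy_cfs(columns, labels)
print(selected, round(best, 4))
```

On this toy data the search keeps only one of the two duplicated features, because adding the redundant copy or the uninformative feature does not raise the merit.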
In the anomaly detection phase, statistical analysis is used to formulate a normal profile and to calculate a traffic normality score for every traffic instance. The threshold is defined using Euclidean Distance (ED) alongside the Chebyshev Inequality Theorem (CIT), with the aim of improving the attack recognition rate by accurately eliminating the set of outlier data points. To improve attack identification and reduce the misclassification of traffic first detected by statistical analysis, ensemble learning, specifically a boosting classifier, is proposed. This method uses LogitBoost as the meta-classifier and RF as the base classifier. Furthermore, verified attack traffic detected by ensemble learning is extracted and converted into signatures, which are stored in the signature library for future identification. This helps to reduce detection time, since similar traffic behaviour will not have to be re-processed in future.
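One way to read the threshold step is sketched below, under stated assumptions: the normal profile is taken to be the per-feature mean of known-normal traffic, the normality score is the Euclidean distance to that profile, and the threshold follows from Chebyshev's inequality, P(|X - mu| >= k * sigma) <= 1 / k^2. The abstract does not give these details, so the function names, the `max_outlier_fraction` parameter and the toy data are hypothetical.

```python
import math
from statistics import mean, pstdev


def build_profile(normal_rows):
    """Assumed normal profile: the mean of each feature over normal traffic."""
    return [mean(col) for col in zip(*normal_rows)]


def normality_score(row, profile):
    """Euclidean distance between one traffic record and the normal profile."""
    return math.dist(row, profile)


def chebyshev_threshold(scores, max_outlier_fraction):
    """Distance cut-off from Chebyshev's inequality.

    Choosing k = 1 / sqrt(p) bounds the share of scores that lie at least
    k standard deviations from the mean by p, whatever the distribution.
    """
    k = 1.0 / math.sqrt(max_outlier_fraction)
    return mean(scores) + k * pstdev(scores)


# Hypothetical toy data: two numeric features per traffic record.
normal_rows = [
    [10, 1], [12, 1], [11, 2], [9, 2],
    [10, 1], [12, 2], [11, 1], [9, 2],
]
profile = build_profile(normal_rows)
scores = [normality_score(r, profile) for r in normal_rows]
threshold = chebyshev_threshold(scores, max_outlier_fraction=0.05)


def is_anomalous(row):
    """Flag a record whose distance to the profile exceeds the threshold."""
    return normality_score(row, profile) > threshold


print(is_anomalous([10, 2]), is_anomalous([40, 9]))
```

Chebyshev's inequality holds for any distribution with finite variance, so the cut-off needs no assumption of normally distributed traffic, at the cost of a conservative (loose) bound.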
Supervisor: Not available Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available
Keywords: QA76 Electronic computers. Computer science. Computer software