Title:
|
Detection of obfuscated malware
|
A cyber war exists between anti-malware researchers and malware writers. At the heart of this
war rages a weapons race that has existed for decades, originating in the 1980s with the arrival
of the first computer virus. Obfuscation is one of the latest strategies employed by malware writers
to camouflage the tell-tale signs of malware and thereby undermine anti-malware software, making
malware analysis difficult for anti-malware researchers. The motivation for this research is,
therefore, to find a malware detection strategy that is immune to the obfuscation methods used by
the malware writers. One approach is to use program run-time traces (dynamic analysis) to perform
N-gram analysis. N-gram analysis is the investigation of a program structure using bytes,
characters or text strings. The research presented in this thesis uses dynamic analysis to investigate
malware detection using a Support Vector Machine (SVM) approach based on N-gram analysis.
The key challenges addressed in this research are: configuration of a host environment that can trace
both benign and malicious software programs; SVM configuration using cross-validation to provide a
robust classifier; and feature selection and feature reduction, which are addressed by first applying
a feature filter and then presenting the reduced feature set to the SVM for feature selection.
Several filtering methods are investigated, and the findings identify a suitable filter based on
eigenvectors. The final challenge associated with dynamic analysis is the length of time a program
has to be run to ensure a correct classification. This is addressed in this research by investigating 14
different program run-lengths. The findings show that obfuscated (packed and polymorphic) malware
can be detected using a Support Vector Machine classifier with features extracted from program
run-length traces.
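
As an illustration of the N-gram step only, the following is a minimal sketch of how a run-time trace might be turned into a numeric feature vector of the kind an SVM could consume. It is not the thesis implementation: the trace contents (a list of opcode names), the helper names ngram_counts and feature_vector, and the choice of N = 2 are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def ngram_counts(trace: Sequence[str], n: int) -> Counter:
    """Count every contiguous run of n items in a run-time trace."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A trace shorter than n yields an empty range and so an empty Counter.
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))


def feature_vector(counts: Counter, vocabulary: Sequence[Tuple[str, ...]]) -> List[int]:
    """Project N-gram counts onto a fixed vocabulary, one feature per N-gram."""
    return [counts.get(gram, 0) for gram in vocabulary]


# Hypothetical trace of executed opcodes (illustrative data, not from the thesis).
trace = ["push", "mov", "call", "push", "mov", "call", "ret"]
counts = ngram_counts(trace, 2)

# In practice the vocabulary would be fixed across all traced programs;
# here it is simply the sorted set of 2-grams seen in this one trace.
vocabulary = sorted(counts)
print(feature_vector(counts, vocabulary))
```

Truncating the trace before counting (for example, trace[:k] for a chosen k) would be one simple way to model different program run-lengths under the same assumptions.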
|