An online and adaptive signature-based approach for intrusion detection using learning classifier systems

Download files
Access & Terms of Use
open access
Copyright: Shafi, Kamran
Altmetric
Abstract
This thesis presents the case of dynamically and adaptively learning signatures for network intrusion detection using genetic based machine learning techniques. The two major criticisms of the signature based intrusion detection systems are their i) reliance on domain experts to handcraft intrusion signatures and ii) inability to detect previously unknown attacks or the attacks for which no signatures are available at the time. In this thesis, we present a biologically-inspired computational approach to address these two issues. This is done by adaptively learning maximally general rules, which are referred to as signatures, from network traffic through a supervised learning classifier system, UCS. The rules are learnt dynamically (i.e., using machine intelligence and without the requirement of a domain expert), and adaptively (i.e., as the data arrives without the need to relearn the complete model after presenting each data instance to the current model). Our approach is hybrid in that signatures for both intrusive and normal behaviours are learnt. The rule based profiling of normal behaviour allows for anomaly detection in that the events not matching any of the rules are considered potentially harmful and could be escalated for an action. We study the effect of key UCS parameters and operators on its performance and identify areas of improvement through this analysis. Several new heuristics are proposed that improve the effectiveness of UCS for the prediction of unseen and extremely rare intrusive activities. A signature extraction system is developed that adaptively retrieves signatures as they are discovered by UCS. The signature extraction algorithm is augmented by introducing novel subsumption operators that minimise overlap between signatures. Mechanisms are provided to adapt the main algorithm parameters to deal with online noisy and imbalanced class data. The performance of UCS, its variants and the signature extraction system is measured through standard evaluation metrics on a publicly available intrusion detection dataset provided during the 1999 KDD Cup intrusion detection competition. We show that the extended UCS significantly improves test accuracy and hit rate while significantly reducing the rate of false alarms and cost per example scores than the standard UCS. The results are competitive to the best systems participated in the competition in addition to our systems being online and incremental rule learners. The signature extraction system built on top of the extended UCS retrieves a magnitude smaller rule set than the base UCS learner without any significant performance loss. We extend the evaluation of our systems to real time network traffic which is captured from a university departmental server. A methodology is developed to build fully labelled intrusion detection dataset by mixing real background traffic with attacks simulated in a controlled environment. Tools are developed to pre-process the raw network data into feature vector format suitable for UCS and other related machine learning systems. We show the effectiveness of our feature set in detecting payload based attacks.
Persistent link to this record
Link to Publisher Version
Link to Open Access Version
Additional Link
Author(s)
Shafi, Kamran
Supervisor(s)
Abbass, Hussein
Zhu, Weiping
Creator(s)
Editor(s)
Translator(s)
Curator(s)
Designer(s)
Arranger(s)
Composer(s)
Recordist(s)
Conference Proceedings Editor(s)
Other Contributor(s)
Corporate/Industry Contributor(s)
Publication Year
2008
Resource Type
Thesis
Degree Type
PhD Doctorate
UNSW Faculty
Files
download whole.pdf 3.29 MB Adobe Portable Document Format
Related dataset(s)