Semi-supervised co-training and active learning based approach for multi-view intrusion detection | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://doi.org/10.1145/1529282.1529735

Title:	Semi-supervised co-training and active learning based approach for multi-view intrusion detection
Authors:	Mao C.-H., Lee H.-M., Parikh D., Chen T., Huang S.-Y.
Keywords:	Active learning, Co-training, Intrusion detection, Multi-view, Semi-supervised learning
Issue Date:	2009
Citation:	Mao C.-H., Lee H.-M., Parikh D., Chen T., Huang S.-Y. (2009). Semi-supervised co-training and active learning based approach for multi-view intrusion detection. Proceedings of the ACM Symposium on Applied Computing : 2042-2048. ScholarBank@NUS Repository. https://doi.org/10.1145/1529282.1529735
Abstract:	Although there is immense data available from networks and hosts, a very small proportion of this data is labeled due to the cost of obtaining expert labels. This proves to be a significant bottleneck for developing supervised intrusion detection systems that rely solely on labeled data. Although the remaining unlabeled data is collected from real network environments and hence potentially holds valuable information for intrusion detection, such systems cannot exploit it. In this work, we intelligently leverage both labeled and unlabeled data. Also, intrusion detection tasks naturally lend themselves to a multi-view scenario, and can benefit significantly if these multiple views are combined meaningfully. In this paper, we propose a co-training framework for intrusion detection, a semi-supervised learning method that not only utilizes unlabeled data but also combines multi-view data. We also employ an active learning framework in which statistically ambiguous parts of the unlabeled data are identified and can then be labeled by an expert. This allows for minimal expert labeling while ensuring that the labels obtained from the expert are the most informative. In our experiments, we demonstrate that leveraging the unlabeled data using our proposed method significantly reduces the error rate compared to using the labeled data alone. In addition, our proposed multi-view method has a lower error rate than using a single view. Copyright 2009 ACM.
Source Title:	Proceedings of the ACM Symposium on Applied Computing
URI:	http://scholarbank.nus.edu.sg/handle/10635/146192
ISBN:	9781605581668
DOI:	10.1145/1529282.1529735
Appears in Collections:	Staff Publications
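
Illustrative sketch (not part of the repository record): the abstract describes combining co-training over multiple views with an active learning step that sends ambiguous unlabeled samples to an expert. The Python below is a minimal sketch of that general idea, not the authors' algorithm. It assumes a simplified agreement-based variant of co-training, a nearest-centroid classifier per view, and a distance margin as the confidence score. Every name in it (`co_train`, `margin_min`, `queries`, `oracle`, the toy data) is hypothetical.

```python
def centroid_fit(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def centroid_predict(model, x):
    """Return (label, margin); margin is the distance gap between the two nearest centroids."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, mu)) ** 0.5, c)
        for c, mu in model.items()
    )
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def co_train(view_a, view_b, labels, oracle, rounds=5, margin_min=1.0, queries=2):
    """Grow a label set from two views of the same samples.

    labels: dict mapping sample index -> label for the initially labeled samples.
    oracle: callable standing in for the human expert (assumption).
    Returns the final label dict and the number of expert queries made.
    """
    labels = dict(labels)
    asked = 0
    for _ in range(rounds):
        idx = sorted(labels)
        y = [labels[i] for i in idx]
        model_a = centroid_fit([view_a[i] for i in idx], y)
        model_b = centroid_fit([view_b[i] for i in idx], y)
        unlabeled = [i for i in range(len(view_a)) if i not in labels]
        if not unlabeled:
            break
        scored = []
        for i in unlabeled:
            label_a, margin_a = centroid_predict(model_a, view_a[i])
            label_b, margin_b = centroid_predict(model_b, view_b[i])
            agree = label_a == label_b
            # Disagreement between views counts as zero confidence.
            conf = min(margin_a, margin_b) if agree else 0.0
            scored.append((conf, i, label_a, agree))
        # Semi-supervised step: accept pseudo-labels both views agree on confidently.
        for conf, i, label_a, agree in scored:
            if agree and conf >= margin_min:
                labels[i] = label_a
        # Active learning step: ask the expert about the least confident leftovers.
        ambiguous = sorted(s for s in scored if s[1] not in labels)[:queries]
        for _, i, _, _ in ambiguous:
            labels[i] = oracle(i)
            asked += 1
    return labels, asked


# Toy data (made up): view_a could stand for network features, view_b for host features.
view_a = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [5.0, 5.0], [4.8, 5.1], [5.2, 4.9], [2.4, 2.5]]
view_b = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.2], [4.0, 4.0], [4.1, 3.8], [3.9, 4.2], [2.6, 2.5]]
truth = [0, 0, 0, 1, 1, 1, 1]  # 0 = normal, 1 = intrusion

final_labels, expert_queries = co_train(
    view_a, view_b, {0: 0, 3: 1}, oracle=lambda i: truth[i]
)
print(final_labels, expert_queries)
```

In this toy run only two samples start out labeled. The last sample sits between the two classes and the views disagree on it, so it is the one routed to the stand-in expert rather than pseudo-labeled.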
