Machine Learning-Based Intrusion Detection: Feature Selection versus Feature Extraction


arxiv


Source

https://arxiv.org/abs/2307.01570

PDF

https://arxiv.org/pdf/2307.01570

Paper Information

Author: Vu-Duc Ngo; Tuan-Cuong Vuong; Thien Van Luong; Hung Tran
Published: July 4, 2023
Affiliation: MobiFone Research and Development Center, MobiFone Corporation
Country: Vietnam
Conference: Cluster Computing (Clust. Comput.)

Labels Estimated by AI

Feature Selection Method; Feature Extraction Method; Computational Efficiency

These labels were automatically added by AI and may be inaccurate.

Abstract

The Internet of Things (IoT) plays an important role in many sectors, such as smart cities, smart agriculture, smart healthcare, and smart manufacturing. However, IoT devices are highly vulnerable to cyber-attacks, which may result in security breaches and data leakage. To prevent these attacks effectively, a variety of machine learning-based network intrusion detection methods for IoT networks have been developed. These methods often rely on either feature extraction or feature selection to reduce the dimension of the input data before it is fed into machine learning models. The aim is to keep the detection complexity low enough for real-time operation, which is vital in any intrusion detection system. This paper provides a comprehensive comparison between these two feature reduction methods for intrusion detection in terms of several performance metrics, namely precision, recall, detection accuracy, and runtime complexity, using the modern UNSW-NB15 dataset for both binary and multiclass classification. In general, the feature selection method not only provides better detection performance but also requires less training and inference time than its feature extraction counterpart, especially as the number of reduced features K increases. However, the feature extraction method is much more reliable than its selection counterpart, particularly when K is very small, such as K = 4. Additionally, feature extraction is less sensitive than feature selection to changes in the number of reduced features K, and this holds for both binary and multiclass classification. Based on this comparison, we provide a useful guideline for selecting a suitable intrusion detection type for each specific scenario, as detailed in Tab. 14 at the end of Section IV.
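The difference between the two feature reduction approaches can be illustrated with a small sketch. The abstract does not name the specific algorithms used, so everything below is an assumption made for illustration: feature selection is shown as a simple correlation filter that keeps K of the original columns, feature extraction is shown as a PCA-style projection computed by power iteration, and the data is synthetic rather than UNSW-NB15. The function names (make_toy_data, select_features, extract_features) are hypothetical.

```python
import math
import random


def make_toy_data(n=300, seed=0):
    """Synthetic stand-in for traffic records: 6 numeric features, binary label."""
    rng = random.Random(seed)
    scales = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    X, y = [], []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in scales]
        X.append([s * v for s, v in zip(scales, z)])
        # Assumed rule for this toy data: only features 0 and 3 inform the label.
        y.append(1 if z[0] + z[3] > 0 else 0)
    return X, y


def center(X):
    """Subtract the per-column mean from every row."""
    n, d = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - mu[j] for j in range(d)] for row in X]


def select_features(X, y, k):
    """Feature selection: keep the k original columns with the largest
    absolute Pearson correlation with the label."""
    Xc = center(X)
    y_mean = sum(y) / len(y)
    yc = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in yc))
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in Xc]
        x_norm = math.sqrt(sum(v * v for v in col))
        cov = sum(a * b for a, b in zip(col, yc))
        # A constant column (or a constant label) gets score 0.
        scores.append(abs(cov) / (x_norm * y_norm) if x_norm and y_norm else 0.0)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:k])
    return kept, [[row[j] for j in kept] for row in X]


def orthonormalize(v, basis):
    """Remove the parts of v along the unit vectors in basis, then scale to length 1."""
    for c in basis:
        dot = sum(a * b for a, b in zip(v, c))
        v = [a - dot * b for a, b in zip(v, c)]
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        raise ValueError("no variance left: k exceeds the rank of the data")
    return [a / norm for a in v]


def extract_features(X, k, iters=200, seed=0):
    """Feature extraction: project centered data onto k orthonormal directions
    found by power iteration on the covariance matrix (an approximate PCA)."""
    rng = random.Random(seed)
    Xc = center(X)
    n, d = len(Xc), len(Xc[0])
    cov = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(k):
        v = orthonormalize([rng.gauss(0.0, 1.0) for _ in range(d)], components)
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            v = orthonormalize(w, components)
        components.append(v)
    projected = [[sum(a * b for a, b in zip(row, c)) for c in components]
                 for row in Xc]
    return components, projected


X, y = make_toy_data()
K = 2  # number of reduced features
kept, X_sel = select_features(X, y, K)
components, X_ext = extract_features(X, K)
print("selected columns:", kept)
print("reduced shapes:", (len(X_sel), len(X_sel[0])), (len(X_ext), len(X_ext[0])))
```

Both functions return an n-by-K matrix that could be passed to a classifier. The selected columns are unchanged original features, whereas each extracted column is a linear combination of all original features and is computed without looking at the labels.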

External Datasets

UNSW-NB15

KDD99

NSL-KDD

Kyoto 2006+