Research Group on Intelligent Engineering and Computing for Advanced Innovation and Development (GECAD), School of Engineering, Polytechnic of Porto (ISEP/IPP)
Abstract
The use of Machine Learning (ML) models in cybersecurity solutions requires
high-quality data that is free of redundant, missing, and noisy
information. Selecting the most relevant features can significantly improve
data integrity and model efficiency. This work evaluates the feature sets
provided by a combination of different feature selection methods, namely
Information Gain, Chi-Squared Test, Recursive Feature Elimination, Mean
Absolute Deviation, and Dispersion Ratio, in multiple IoT network datasets. The
influence of the smaller feature sets on both the classification performance
and the training time of ML models is compared, with the aim of increasing the
computational efficiency of IoT intrusion detection. Overall, the most
impactful features of each dataset were identified, and the ML models obtained
higher computational efficiency while preserving good generalization, showing
little to no difference between the feature sets.
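
As an illustration of how two of the named filter methods could be combined, the following is a minimal sketch that scores each feature by Mean Absolute Deviation and by Dispersion Ratio (arithmetic mean divided by geometric mean), then keeps the features ranked in the top k by both. The intersection rule, the function names, and the toy feature table are assumptions made for this example; the abstract does not state how the methods are combined, and the values are not taken from any dataset in the study.

```python
import math
from statistics import fmean


def mean_absolute_deviation(column):
    """Average absolute distance of the values from their mean."""
    mu = fmean(column)
    return fmean([abs(x - mu) for x in column])


def dispersion_ratio(column):
    """Arithmetic mean over geometric mean; requires strictly positive values."""
    arithmetic = fmean(column)
    geometric = math.exp(fmean([math.log(x) for x in column]))
    return arithmetic / geometric


def top_k(scores, k):
    """Names of the k features with the highest scores."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def select_features(table, k):
    """Assumed combination rule: keep features in the top k of both rankings."""
    mad = {name: mean_absolute_deviation(col) for name, col in table.items()}
    ratio = {name: dispersion_ratio(col) for name, col in table.items()}
    return top_k(mad, k) & top_k(ratio, k)


# Hypothetical toy table: feature name -> column of positive values.
table = {
    "pkt_size": [40.0, 1500.0, 60.0, 1200.0],
    "ttl": [64.0, 64.0, 64.0, 64.0],
    "duration": [1.0, 9.0, 2.0, 8.0],
}

selected = select_features(table, k=2)
print(sorted(selected))
```

In this toy table the constant `ttl` column has zero deviation and a dispersion ratio of about 1, so it ranks last under both criteria. Mean Absolute Deviation depends on the scale of each feature, so in practice the columns would likely be normalized before the scores are compared.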