Externally validating the IoTDevID device identification methodology using the CIC IoT 2022 Dataset

Source

https://arxiv.org/abs/2307.08679

PDF

https://arxiv.org/pdf/2307.08679

Paper Information

Author: Kahraman Kostas;Mike Just;Michael A. Lones
Published: 2023-07-03
Affiliation: Department of Computer Science, Heriot-Watt University
Country: United Kingdom
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Machine Learning Method; Dataset Generation; Data Integrity Constraints

These labels were automatically added by AI and may be inaccurate.

Abstract

In the era of rapid IoT device proliferation, recognizing, diagnosing, and securing these devices are crucial tasks. The IoTDevID method (IEEE Internet of Things 2022) proposes a machine learning approach for device identification using network packet features. In this article we present a validation study of the IoTDevID method by testing core components, namely its feature set and its aggregation algorithm, on a new dataset. The new dataset (CIC-IoT-2022) offers several advantages over earlier datasets, including a larger number of devices, multiple instances of the same device, both IP and non-IP device data, normal (benign) usage data, and diverse usage profiles, such as active and idle states. Using this independent dataset, we explore the validity of IoTDevID's core components, and also examine the impacts of the new data on model performance. Our results indicate that data diversity is important to model performance. For example, models trained with active usage data outperformed those trained with idle usage data, and multiple usage data similarly improved performance. Results for IoTDevID were strong with a 92.50 F1 score for 31 IP-only device classes, similar to our results on previous datasets. In all cases, the IoTDevID aggregation algorithm improved model performance. For non-IP devices we obtained a 78.80 F1 score for 40 device classes, though with much less data, confirming that data quantity is also important to model performance.

External Datasets

CIC IoT 2022

Aalto

UNSW