A Crowdsensing Intrusion Detection Dataset For Decentralized Federated Learning Models

Authors: Chao Feng, Alberto Huertas Celdran, Jing Han, Heqing Ren, Xi Cheng, Zien Zeng, Lucas Krauter, Gerome Bovet, Burkhard Stiller | Published: 2025-07-17

2025.07.172025.07.19

Authors: Chao Feng, Alberto Huertas Celdran, Jing Han, Heqing Ren, Xi Cheng, Zien Zeng, Lucas Krauter, Gerome Bovet, Burkhard Stiller
Published: 2025-07-17

Source: https://arxiv.org/abs/2507.13313

PDF: https://arxiv.org/pdf/2507.13313

Labels Predicted by AI

Feature Extraction IoT Data Analysis Task Performance Evaluation Method

Please note that these labels were automatically added by AI. Therefore, they may not be entirely accurate.
For more details, please see the About the Literature Database page.

Abstract

This paper introduces a dataset and experimental study for decentralized federated learning (DFL) applied to IoT crowdsensing malware detection. The dataset comprises behavioral records from benign and eight malware families. A total of 21,582,484 original records were collected from system calls, file system activities, resource usage, kernel events, input/output events, and network records. These records were aggregated into 30-second windows, resulting in 342,106 features used for model training and evaluation. Experiments on the DFL platform compare traditional machine learning (ML), centralized federated learning (CFL), and DFL across different node counts, topologies, and data distributions. Results show that DFL maintains competitive performance while preserving data locality, outperforming CFL in most settings. This dataset provides a solid foundation for studying the security of IoT crowdsensing environments.