Despite their good performance and wide adoption, machine-learning-based
security detection models (e.g., malware classifiers) are subject to concept
drift and the evasive evolution of attackers, which makes up-to-date threat data
a necessity. However, due to the enforcement of various privacy protection
regulations (e.g., GDPR), it is becoming increasingly challenging or even
prohibitive for security vendors to collect individual-relevant and
privacy-sensitive threat datasets, e.g., SMS spam/non-spam messages from mobile
devices. To address such obstacles, this study systematically profiles the
(in)feasibility of federated learning (FL) for privacy-preserving cyber threat
detection in terms of effectiveness, Byzantine resilience, and efficiency. This
is made possible by building multiple threat datasets and threat detection
models and, more importantly, by designing realistic, security-specific
experiments.
We evaluate FL on two representative threat detection tasks, namely SMS spam
detection and Android malware detection. The results show that FL-trained
detection models can achieve performance comparable to that of their centrally
trained counterparts. Also, most non-IID data distributions have either minor or
negligible impact on model performance, while a highly skewed label-based
non-IID distribution can incur non-negligible fluctuation and delay in FL
training. Furthermore, under a realistic threat model, FL turns out to be
resistant to both data poisoning and model poisoning attacks. In particular, a
practical data poisoning attack causes no more than a 0.14% loss in model
accuracy. Regarding FL efficiency, a bootstrapping strategy turns out to be
effective in mitigating the training delay observed in label-based non-IID
scenarios.