Do You Trust Your Model? Emerging Malware Threats in the Deep Learning Ecosystem

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2403.03593

PDF

https://arxiv.org/pdf/2403.03593

Paper Information

Authors: Dorjan Hitaj, Giulio Pagnotta, Fabio De Gaspari, Sediola Ruko, Briland Hitaj, Luigi V. Mancini, Fernando Perez-Cruz
Published: 2024-03-06
Updated: 2025-05-13
Affiliation: Department of Computer Science, Sapienza University of Rome
Country: Italy
Venue: IEEE Transactions on Dependable and Secure Computing

Labels Estimated by AI

Prompt Injection, Federated Learning, Malware Classification

These labels were automatically added by AI and may be inaccurate.

Abstract

Training high-quality deep learning models is a challenging task due to computational and technical requirements. A growing number of individuals, institutions, and companies rely on pre-trained, third-party models made available in public repositories. These models are often used directly or integrated into product pipelines with no particular precautions, because they are effectively just data in tensor form and are therefore considered safe. In this paper, we raise awareness of a new machine learning supply chain threat targeting neural networks. We introduce MaleficNet 2.0, a novel technique to embed self-extracting, self-executing malware in neural networks. MaleficNet 2.0 uses spread-spectrum channel coding combined with error correction techniques to inject malicious payloads into the parameters of deep neural networks. The MaleficNet 2.0 injection technique is stealthy, does not degrade the performance of the model, and is robust against removal techniques. We design our approach to work both in traditional and distributed learning settings such as Federated Learning, and demonstrate that it is effective even when a reduced number of bits is used for the model parameters. Finally, we implement a proof-of-concept self-extracting neural network malware using MaleficNet 2.0, demonstrating the practicality of the attack against a widely adopted machine learning framework. Our aim with this work is to raise awareness of these new, dangerous attacks in both the research community and industry, and we hope to encourage further research into mitigation techniques against such threats.
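
To make the spread-spectrum idea mentioned in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It adds a short string of harmless random bits to a vector standing in for flattened model parameters, using one pseudo-random ±1 spreading code per bit, and recovers the bits by correlation. The function names, the gain `gamma`, the seed, and the vector sizes are all illustrative assumptions, and the error-correction stage described in the abstract is omitted.

```python
import numpy as np

def spreading_codes(n_bits, n_weights, seed):
    # One pseudo-random +/-1 code per bit; the seed acts as a shared secret.
    rng = np.random.default_rng(seed)
    return rng.choice(np.array([-1.0, 1.0]), size=(n_bits, n_weights))

def embed(weights, bits, seed, gamma):
    # Map bits {0, 1} to symbols {-1, +1} and add the superposition of all
    # spread symbols to the weights, scaled by a small gain (assumed value).
    codes = spreading_codes(len(bits), weights.size, seed)
    symbols = 2.0 * np.asarray(bits, dtype=np.float64) - 1.0
    return weights + gamma * (symbols @ codes)

def extract(weights, n_bits, seed):
    # Correlate the weights with each code; the sign of the result is the bit.
    codes = spreading_codes(n_bits, weights.size, seed)
    return (codes @ weights > 0).astype(int)

# Toy demonstration on synthetic data (no real model involved).
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.05, size=50_000)  # stand-in for flattened parameters
bits = rng.integers(0, 2, size=32)            # harmless random bit string
stego = embed(weights, bits, seed=1234, gamma=0.002)
recovered = extract(stego, n_bits=len(bits), seed=1234)
print("bits recovered:", np.array_equal(recovered, bits))
print("max weight change:", float(np.max(np.abs(stego - weights))))
```

Because every bit is spread across all parameters, each individual weight changes only slightly, which is consistent with the abstract's claim that the injection does not degrade model performance. This toy does not measure accuracy, robustness to fine-tuning, or behavior under reduced-precision parameters.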

External Datasets

MNIST

FashionMNIST

CIFAR10

CIFAR100

WikiText-2

ESC-50

ImageNet

Cats vs. Dogs

MMLU

RockYou