Abstract
In a backdoor attack, an adversary injects corrupted data into a model's
training dataset in order to gain control over its predictions on images with a
specific attacker-defined trigger. A typical corrupted training example
requires altering both the image, by applying the trigger, and the label.
Models trained on clean images, therefore, were considered safe from backdoor
attacks. However, in some common machine learning scenarios, the training
labels are provided by potentially malicious third parties. This includes
crowd-sourced annotation and knowledge distillation. We therefore investigate a
fundamental question: can we launch a successful backdoor attack by only
corrupting labels? We introduce a novel approach to design label-only backdoor
attacks, which we call FLIP, and demonstrate its strengths on three datasets
(CIFAR-10, CIFAR-100, and Tiny-ImageNet) and four architectures (ResNet-32,
ResNet-18, VGG-19, and Vision Transformer). With only 2% of CIFAR-10 labels
corrupted, FLIP achieves a near-perfect attack success rate of 99.4% while
suffering only a 1.8% drop in clean test accuracy. Our approach builds upon
the recent advances in trajectory matching, originally introduced for dataset
distillation.
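To make the threat model concrete, the following is a minimal sketch (assuming a standard classification dataset of (image, label) pairs) of label-only corruption: the adversary leaves every image untouched and flips a small fraction of labels to a chosen target class. The random selection below is only a placeholder; FLIP instead selects which labels to flip via trajectory matching. All names in the snippet (e.g. corrupt_labels) are illustrative, not from the paper.

```python
# Minimal sketch of the label-only threat model (not the FLIP algorithm):
# the annotator may change only labels, never pixels.
import random
from typing import List, Tuple

Image = List[float]  # placeholder for pixel data the attacker cannot touch


def corrupt_labels(dataset: List[Tuple[Image, int]],
                   target_class: int,
                   budget: float = 0.02,
                   seed: int = 0) -> List[Tuple[Image, int]]:
    """Return a copy of `dataset` with roughly `budget` of its labels flipped
    to `target_class`. Images are left unchanged, matching the label-only setting."""
    rng = random.Random(seed)
    n_flips = int(budget * len(dataset))
    # Placeholder selection: FLIP would pick these indices via trajectory matching.
    flip_idx = set(rng.sample(range(len(dataset)), n_flips))
    return [(img, target_class if i in flip_idx else label)
            for i, (img, label) in enumerate(dataset)]
```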