Demystifying Poisoning Backdoor Attacks from a Statistical Perspective


International Conference on Learning Representations (ICLR)

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2310.10780

PDF

https://arxiv.org/pdf/2310.10780

Paper Information

Authors: Ganghua Wang, Xun Xian, Jayanth Srinivasa, Ashish Kundu, Xuan Bi, Mingyi Hong, Jie Ding
Published: 10-17-2023
Updated: 10-18-2023
Affiliation: School of Statistics, University of Minnesota
Country: United States of America
Conference: International Conference on Learning Representations (ICLR)

Labels Estimated by AI

Poisoning; Model Performance Evaluation; Convergence Property

These labels were automatically added by AI and may be inaccurate.

Abstract

The growing dependence on machine learning in real-world applications emphasizes the importance of understanding and ensuring its safety. Backdoor attacks pose a significant security risk due to their stealthy nature and potentially serious consequences. Such attacks involve embedding triggers within a learning model with the intention of causing malicious behavior when an active trigger is present while maintaining regular functionality without it. This paper evaluates the effectiveness of any backdoor attack incorporating a constant trigger, by establishing tight lower and upper boundaries for the performance of the compromised model on both clean and backdoor test data. The developed theory answers a series of fundamental but previously underexplored problems, including (1) what are the determining factors for a backdoor attack's success, (2) what is the direction of the most effective backdoor attack, and (3) when will a human-imperceptible trigger succeed. Our derived understanding applies to both discriminative and generative models. We also demonstrate the theory by conducting experiments using benchmark datasets and state-of-the-art backdoor attack scenarios.
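
Illustrative sketch (not from the paper)

The abstract describes poisoning backdoor attacks that use a constant trigger. The sketch below shows one way such training-data poisoning could look, assuming the trigger is a fixed vector added to a randomly chosen fraction of the inputs, which are then relabeled to an attacker-chosen target. The function name `poison_dataset` and the parameters `trigger`, `target_label`, and `poison_rate` are hypothetical names chosen for this illustration; the paper's own formulation and notation may differ.

```python
import random


def poison_dataset(samples, labels, trigger, target_label, poison_rate, seed=0):
    """Return poisoned copies of (samples, labels) plus the poisoned indices.

    Assumption for illustration: the backdoor is an additive constant
    trigger, so a poisoned input is x + trigger and its label becomes
    target_label. The input lists are not modified.
    """
    rng = random.Random(seed)
    n = len(samples)
    n_poison = int(round(poison_rate * n))
    poisoned_idx = set(rng.sample(range(n), n_poison))

    new_samples, new_labels = [], []
    for i, (x, y) in enumerate(zip(samples, labels)):
        if i in poisoned_idx:
            # Add the same constant trigger to every poisoned sample.
            new_samples.append([xi + ti for xi, ti in zip(x, trigger)])
            new_labels.append(target_label)
        else:
            # Clean samples are copied unchanged.
            new_samples.append(list(x))
            new_labels.append(y)
    return new_samples, new_labels, sorted(poisoned_idx)
```

A model trained on the returned data would then be evaluated twice, once on clean test inputs and once on test inputs with the same trigger added, which matches the two performance measures the abstract mentions (clean and backdoor test data).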