Unlearning Backdoor Attacks through Gradient-Based Model Pruning
Abstract
Amid growing concern over cybersecurity threats, defending against backdoor attacks is paramount to ensuring the integrity and reliability of machine learning models. However, many existing defenses require substantial amounts of data for effective mitigation, posing significant challenges in practical deployment. To address this, we propose a novel approach that counters backdoor attacks by treating their mitigation as an unlearning task. We tackle this challenge through a targeted model-pruning strategy that leverages unlearning loss gradients to identify and eliminate backdoor elements within the model. Built on solid theoretical insights, our approach is simple and effective, making it well suited to scenarios with limited data availability. Our methodology comprises formulating a suitable unlearning loss and devising a model-pruning technique tailored to convolutional neural networks. Comprehensive evaluations demonstrate the efficacy of the proposed approach against state-of-the-art methods, particularly in realistic data settings.
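The abstract describes the pipeline only at a high level: compute an unlearning loss on a small amount of data, use its gradients to flag backdoor-related elements, and prune those elements from the CNN. The sketch below is one minimal PyTorch reading of that idea, not the paper's actual method: the negated cross-entropy unlearning loss, the per-channel gradient-norm score, and the helper names (`unlearning_gradients`, `prune_top_channels`, `ratio`) are all illustrative assumptions.

```python
# Hypothetical sketch of gradient-guided channel pruning driven by an
# unlearning loss. Assumptions (not from the paper text): the defender
# holds a few suspected-poisoned samples; the unlearning loss is the
# negated cross-entropy on those samples; conv filters whose weights
# receive the largest unlearning-loss gradients are zeroed out.
import torch
import torch.nn as nn
import torch.nn.functional as F

def unlearning_gradients(model, poisoned_loader, device="cpu"):
    """Accumulate per-filter gradient magnitudes of the unlearning loss."""
    model.zero_grad()
    for x, y in poisoned_loader:
        x, y = x.to(device), y.to(device)
        # Illustrative unlearning loss: negating the cross-entropy means
        # descending this loss *removes* the behavior fit to these samples.
        loss = -F.cross_entropy(model(x), y)
        loss.backward()
    scores = {}
    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d) and module.weight.grad is not None:
            # One score per output channel: L2 norm of its weight gradient.
            g = module.weight.grad.detach()
            scores[name] = g.flatten(1).norm(dim=1)
    return scores

def prune_top_channels(model, scores, ratio=0.05):
    """Zero the conv filters with the largest unlearning-gradient scores."""
    with torch.no_grad():
        for name, module in model.named_modules():
            if name in scores:
                k = max(1, int(ratio * scores[name].numel()))
                idx = torch.topk(scores[name], k).indices
                module.weight[idx] = 0.0
                if module.bias is not None:
                    module.bias[idx] = 0.0
```

In practice a defense along these lines would follow the pruning step with a brief fine-tune on whatever small clean set is available to recover clean accuracy; the pruning fraction `ratio` is a hypothetical knob, not a value taken from the paper.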