Purifying Adversarial Perturbation with Adversarially Trained Auto-encoders

arXiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/1905.10729

PDF

https://arxiv.org/pdf/1905.10729

Paper Information

Authors: Hebi Li, Qi Xiao, Shixin Tian, Jin Tian
Published: 5-26-2019
Affiliation: Iowa State University
Country: United States of America
Conference: Computing Research Repository (CoRR)

Labels Estimated by AI

Vulnerability of Adversarial Examples, Machine Learning Method, Attack Type

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Machine learning models are vulnerable to adversarial examples. Iterative adversarial training has shown promising results against strong white-box attacks. However, adversarial training is very expensive, and this expensive training scheme must be repeated every time a model needs to be protected. In this paper, we propose applying the iterative adversarial training scheme to an external auto-encoder, which, once trained, can be used to protect other models directly. We empirically show that our model outperforms other purification-based methods against white-box attacks and transfers well to directly protect other base models with different architectures.
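
The idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a linear auto-encoder (matrices `E` and `D`), a frozen linear base classifier `W`, an L-infinity PGD attack as the iterative adversary, and cross-entropy on the purified input as the training loss. The abstract does not specify the objective, so these choices, like the names `pgd_attack`, `train_purifier`, and `protect`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)          # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def pgd_attack(x, y_onehot, E, D, W, eps=0.1, step=0.02, iters=10):
    """White-box L-inf PGD against the whole pipeline classifier(auto-encoder(x))."""
    x_adv = x.copy()
    for _ in range(iters):
        p = softmax(x_adv @ E @ D @ W)
        grad_x = (p - y_onehot) @ (E @ D @ W).T   # d(cross-entropy)/dx, up to a constant
        x_adv = x_adv + step * np.sign(grad_x)    # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the eps-ball
    return x_adv

def train_purifier(x, y_onehot, W, hidden=8, epochs=200, lr=0.1, seed=0):
    """Adversarially train only the auto-encoder; the base classifier W stays frozen."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    E = rng.normal(scale=0.3, size=(d, hidden))   # encoder weights
    D = rng.normal(scale=0.3, size=(hidden, d))   # decoder weights
    for _ in range(epochs):
        x_adv = pgd_attack(x, y_onehot, E, D, W)  # inner maximisation
        h = x_adv @ E
        p = softmax(h @ D @ W)
        g_r = ((p - y_onehot) / n) @ W.T          # gradient w.r.t. the reconstruction
        g_D = h.T @ g_r
        g_E = x_adv.T @ (g_r @ D.T)
        E -= lr * g_E                             # outer minimisation
        D -= lr * g_D
    return E, D

def protect(classifier_logits, E, D):
    """Wrap any classifier so that its inputs are purified first."""
    return lambda x: classifier_logits(x @ E @ D)

# Toy data: two Gaussian blobs in 4 dimensions (assumed, for illustration only).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=64)
x = rng.normal(size=(64, 4)) + 2.0 * (2 * y[:, None] - 1)
Y = np.eye(2)[y]
W = np.stack([-np.ones(4), np.ones(4)], axis=1)   # hypothetical frozen base classifier

E, D = train_purifier(x, Y, W)
x_adv = pgd_attack(x, Y, E, D, W)
protected = protect(lambda z: z @ W, E, D)
preds = protected(x_adv).argmax(axis=1)
print("accuracy on purified adversarial inputs:", (preds == y).mean())
```

Because `protect` only composes the trained auto-encoder with a classifier function, the same `E` and `D` can be placed in front of a different base model without retraining that model. This is the transfer setting the abstract describes. How well it works in practice depends on the architectures and data, and this toy example does not measure that.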