Assessing Robustness via Score-Based Adversarial Image Generation

Authors: Marcel Kollovieh, Lukas Gosch, Yan Scholten, Marten Lienen, Stephan Günnemann | Published: 2023-10-06

2023.10.062025.04.03

Authors: Marcel Kollovieh, Lukas Gosch, Yan Scholten, Marten Lienen, Stephan Günnemann
Published: 2023-10-06

Source: https://arxiv.org/abs/2310.04285

PDF: https://arxiv.org/pdf/2310.04285

AIにより推定されたラベル

データ生成実験的検証防御手法

※ こちらのラベルはAIによって自動的に追加されました。そのため、正確でないことがあります。
詳細は文献データベースについてをご覧ください。

Abstract

Most adversarial attacks and defenses focus on perturbations within small ℓ_p-norm constraints. However, ℓ_p threat models cannot capture all relevant semantic-preserving perturbations, and hence, the scope of robustness evaluations is limited. In this work, we introduce Score-Based Adversarial Generation (ScoreAG), a novel framework that leverages the advancements in score-based generative models to generate adversarial examples beyond ℓ_p-norm constraints, so-called unrestricted adversarial examples, overcoming their limitations. Unlike traditional methods, ScoreAG maintains the core semantics of images while generating realistic adversarial examples, either by transforming existing images or synthesizing new ones entirely from scratch. We further exploit the generative capability of ScoreAG to purify images, empirically enhancing the robustness of classifiers. Our extensive empirical evaluation demonstrates that ScoreAG matches the performance of state-of-the-art attacks and defenses across multiple benchmarks. This work highlights the importance of investigating adversarial examples bounded by semantics rather than ℓ_p-norm constraints. ScoreAG represents an important step towards more encompassing robustness assessments.