DUMB and DUMBer: Is Adversarial Training Worth It in the Real World? | AI Security Portal

arxiv

DUMB and DUMBer: Is Adversarial Training Worth It in the Real World?

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2506.18516

PDF

https://arxiv.org/pdf/2506.18516

Paper Information

Authors: Francesco Marchiori, Marco Alecci, Luca Pajola, Mauro Conti
Published: 2025-06-23
Affiliation: University of Padova
Country: Italy
Conference: European Symposium on Research in Computer Security (ESORICS)

Labels Estimated by AI

Adversarial Attack Analysis, Certified Robustness, Model Architecture

These labels were automatically added by AI and may be inaccurate.
For details, see About Literature Database.

Abstract

Adversarial examples are small and often imperceptible perturbations crafted to fool machine learning models. These attacks seriously threaten the reliability of deep neural networks, especially in security-sensitive domains. Evasion attacks, a form of adversarial attack where input is modified at test time to cause misclassification, are particularly insidious due to their transferability: adversarial examples crafted against one model often fool other models as well. This property, known as adversarial transferability, complicates defense strategies since it enables black-box attacks to succeed without direct access to the victim model. While adversarial training is one of the most widely adopted defense mechanisms, its effectiveness is typically evaluated on a narrow and homogeneous population of models. This limitation hinders the generalizability of empirical findings and restricts practical adoption. In this work, we introduce DUMBer, an attack framework built on the foundation of the DUMB (Dataset soUrces, Model architecture, and Balance) methodology, to systematically evaluate the resilience of adversarially trained models. Our testbed spans multiple adversarial training techniques evaluated across three diverse computer vision tasks, using a heterogeneous population of uniquely trained models to reflect real-world deployment variability. Our experimental pipeline comprises over 130k evaluations spanning 13 state-of-the-art attack algorithms, allowing us to capture nuanced behaviors of adversarial training under varying threat models and dataset conditions. Our findings offer practical, actionable insights for AI practitioners, identifying which defenses are most effective based on the model, dataset, and attacker setup.
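The transferability property described in the abstract can be illustrated with a toy sketch. The code below is not the DUMBer framework and does not reproduce any experiment from the paper. It assumes two small logistic-regression models (a "surrogate" the attacker controls and a "victim" the attacker never queries) trained on different samples of the same synthetic distribution, and it uses the fast gradient sign method (FGSM) as a stand-in evasion attack. All function names, the data generator, and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def make_data(rng, n, d=20, shift=0.5):
    """Synthetic binary task (an assumption): two Gaussian blobs at -shift and +shift."""
    y = rng.integers(0, 2, size=n)
    x = rng.normal(size=(n, d)) + shift * (2 * y[:, None] - 1)
    return x, y.astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(x, y, steps=300, lr=0.1):
    """Plain gradient descent on the logistic loss; returns (weights, bias)."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(x @ w + b)
        w -= lr * x.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def accuracy(model, x, y):
    w, b = model
    return float(np.mean((sigmoid(x @ w + b) > 0.5) == (y > 0.5)))

def fgsm(model, x, y, eps):
    """FGSM: step of size eps (L-infinity) along the sign of the input gradient.

    For logistic regression the gradient of the loss w.r.t. x is (p - y) * w.
    """
    w, b = model
    p = sigmoid(x @ w + b)
    grad = (p - y)[:, None] * w[None, :]
    return x + eps * np.sign(grad)

def transfer_demo(seed=0, eps=0.5):
    """Craft examples on the surrogate only, then score them on the victim."""
    rng = np.random.default_rng(seed)
    surrogate = train_logreg(*make_data(rng, 500))  # attacker's own model
    victim = train_logreg(*make_data(rng, 500))     # trained on different samples
    x_test, y_test = make_data(rng, 1000)
    x_adv = fgsm(surrogate, x_test, y_test, eps)    # no access to the victim
    return accuracy(victim, x_test, y_test), accuracy(victim, x_adv, y_test)

if __name__ == "__main__":
    clean_acc, adv_acc = transfer_demo()
    print(f"victim accuracy, clean inputs:       {clean_acc:.3f}")
    print(f"victim accuracy, transferred inputs: {adv_acc:.3f}")
```

In this sketch the victim's accuracy drops on inputs crafted against the surrogate alone, which is the black-box transfer setting the abstract refers to. Real evaluations of the kind the paper describes vary the dataset source, architecture, and class balance between attacker and victim; this toy setup varies only the training sample.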
