Abstract
Adversarial examples are inputs carrying small, often imperceptible perturbations
crafted to fool machine learning models. These attacks seriously threaten the
reliability of deep neural networks, especially in security-sensitive domains.
Evasion attacks, a form of adversarial attack in which inputs are modified at test
time to cause misclassification, are particularly insidious because they
transfer: adversarial examples crafted against one model often fool
other models as well. This property, known as adversarial transferability,
complicates defense strategies because it enables black-box attacks to succeed
without direct access to the victim model. While adversarial training is one of
the most widely adopted defense mechanisms, its effectiveness is typically
evaluated on a narrow and homogeneous population of models. This limitation
hinders the generalizability of empirical findings and restricts practical
adoption.
In this work, we introduce DUMBer, an attack framework built on the DUMB
(Dataset soUrces, Model architecture, and Balance)
methodology, to systematically evaluate the resilience of adversarially trained
models. Our testbed spans multiple adversarial training techniques evaluated
across three diverse computer vision tasks, using a heterogeneous population of
uniquely trained models to reflect real-world deployment variability. Our
experimental pipeline comprises over 130k evaluations spanning 13
state-of-the-art attack algorithms, allowing us to capture nuanced behaviors of
adversarial training under varying threat models and dataset conditions. Our
findings offer practical, actionable insights for AI practitioners, identifying
which defenses are most effective based on the model, dataset, and attacker
setup.