The bulk of existing research in defending against adversarial examples
focuses on defending against a single (typically bounded Lp-norm) attack, but
for a practical setting, machine learning (ML) models should be robust to a
wide variety of attacks. In this paper, we present the first unified framework
for considering multiple attacks against ML models. Our framework is able to
model different levels of learner's knowledge about the test-time adversary,
allowing us to model robustness against unforeseen attacks and robustness
against unions of attacks. Using our framework, we present the first
leaderboard, MultiRobustBench, for benchmarking multiattack evaluation which
captures performance across attack types and attack strengths. We evaluate the
performance of 16 defended models for robustness against a set of 9 different
attack types, including Lp-based threat models, spatial transformations, and
color changes, at 20 different attack strengths (180 attacks total).
Additionally, we analyze the state of current defenses against multiple
attacks. Our analysis shows that while existing defenses have made progress in
terms of average robustness across the set of attacks used, robustness against
the worst-case attack is still a big open problem as all existing models
perform worse than random guessing.