These labels were automatically added by AI and may be inaccurate. For details, see About Literature Database.
Abstract
Security-critical machine-learning (ML) systems, such as face-recognition
systems, are susceptible to adversarial examples, including real-world
physically realizable attacks. Various means to boost ML's adversarial
robustness have been proposed; however, they typically induce unfair
robustness: It is often easier to attack from certain classes or groups than
from others. Several techniques have been developed to improve adversarial
robustness while seeking perfect fairness between classes. Yet, prior work has
focused on settings where security and fairness are less critical. Our insight
is that achieving perfect parity in realistic fairness-critical tasks, such as
face recognition, is often infeasible -- some classes may be highly similar,
leading to more misclassifications between them. Instead, we suggest that
seeking symmetry -- i.e., attacks from class $i$ to $j$ would be as successful
as from $j$ to $i$ -- is more tractable. Intuitively, symmetry is a desirable
because class resemblance is a symmetric relation in most domains.
Additionally, as we prove theoretically, symmetry between individuals induces
symmetry between any set of sub-groups, in contrast to other fairness notions
where group-fairness is often elusive. We develop Sy-FAR, a technique to
encourage symmetry while also optimizing adversarial robustness and extensively
evaluate it using five datasets, with three model architectures, including
against targeted and untargeted realistic attacks. The results show Sy-FAR
significantly improves fair adversarial robustness compared to state-of-the-art
methods. Moreover, we find that Sy-FAR is faster and more consistent across
runs. Notably, Sy-FAR also ameliorates another type of unfairness we discover
in this work -- target classes that adversarial examples are likely to be
classified into become significantly less vulnerable after inducing symmetry.