Abstract
Despite numerous attempts to defend deep-learning-based image classifiers,
they remain susceptible to adversarial attacks. This paper proposes a
technique to identify susceptible classes, those classes that are more easily
subverted. To identify the susceptible classes, we apply distance-based
measures to a trained model. Based on the distances between the original
classes, we create a mapping between original classes and adversarial classes
that significantly reduces the randomness of a model in an
adversarial setting. We analyze the high-dimensional geometry among the feature
classes and identify the k most susceptible target classes in an adversarial
attack. We conduct experiments on the MNIST, Fashion MNIST, and CIFAR-10
(ImageNet and ResNet-32) datasets. Finally, we evaluate our techniques to
determine which distance-based measure works best and how the randomness of a
model changes with perturbation.
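
As a rough illustration of the distance-based idea, the sketch below ranks the k classes closest to a given source class in feature space. It is not the paper's implementation: the abstract does not name the distance measures, so Euclidean distance between class centroids is an assumption here, and the helper names `class_centroids` and `k_nearest_classes` as well as the toy 2-D features are hypothetical.

```python
import math
from collections import defaultdict


def class_centroids(features, labels):
    """Return the mean feature vector of each class."""
    groups = defaultdict(list)
    for vec, label in zip(features, labels):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def k_nearest_classes(centroids, source, k):
    """Rank the other classes by centroid distance from `source`.

    Assumption: Euclidean distance stands in for the paper's
    unspecified distance-based measures.
    """
    dists = [
        (math.dist(centroids[source], centroid), label)
        for label, centroid in centroids.items()
        if label != source
    ]
    return [label for _, label in sorted(dists)[:k]]


# Toy 2-D "features" standing in for a trained model's feature vectors.
features = [[0.0, 0.0], [0.0, 2.0], [1.0, 1.0], [3.0, 1.0],
            [10.0, 10.0], [12.0, 10.0]]
labels = [0, 0, 1, 1, 2, 2]

centroids = class_centroids(features, labels)
candidates = k_nearest_classes(centroids, source=0, k=2)
print(candidates)
```

Under this assumption, the classes nearest to the source class are treated as the most likely targets of an adversarial perturbation. Swapping in another measure only requires changing the `math.dist` call.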