Abstract
Recent work has investigated the vulnerability of local surrogate methods to
adversarial perturbations of a machine learning (ML) model's inputs, where the
explanation is manipulated while the meaning and structure of the original
input remain similar under the complex model. Although weaknesses have been
demonstrated across many methods, the reasons behind them remain little
explored. Central to the concept of adversarial attacks on explainable AI (XAI)
is the similarity measure used to calculate how one explanation differs from
another. A poor choice of similarity measure can lead to erroneous conclusions
about the efficacy of an XAI method. An overly sensitive measure exaggerates a
method's vulnerability, while an overly coarse one understates it. We
investigate a variety of similarity measures designed for text-based ranked
lists, including Kendall's Tau, Spearman's Footrule, and Rank-biased Overlap,
to determine how changes in the choice of measure or in the threshold of
success affect the conclusions generated from common adversarial attack
processes. Certain measures are found to be overly sensitive, resulting in
erroneous estimates of stability.
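
To make the comparison concrete, the sketch below (not the authors' implementation) computes the three measures named above on two hypothetical ranked feature-importance lists, such as those produced by a local surrogate explainer. The feature names, the RBO persistence parameter p = 0.9, and the truncation of RBO at list length are illustrative assumptions.

```python
# Minimal sketch: comparing two ranked explanation lists with Kendall's Tau,
# Spearman's Footrule, and (truncated) Rank-biased Overlap. The feature names
# and parameter values below are hypothetical, chosen only for illustration.
from scipy.stats import kendalltau


def rank_vectors(list_a, list_b):
    """Map each shared item to its 0-based rank in each list."""
    items = sorted(set(list_a) & set(list_b))
    pos_a = {item: i for i, item in enumerate(list_a)}
    pos_b = {item: i for i, item in enumerate(list_b)}
    return [pos_a[x] for x in items], [pos_b[x] for x in items]


def spearman_footrule(list_a, list_b):
    """Normalized footrule distance: 0 = identical order, 1 = maximal displacement."""
    ra, rb = rank_vectors(list_a, list_b)
    n = len(ra)
    max_disp = (n * n) // 2  # maximum possible total displacement for a permutation
    return sum(abs(a - b) for a, b in zip(ra, rb)) / max_disp


def rank_biased_overlap(list_a, list_b, p=0.9):
    """Truncated RBO (Webber et al., 2010): top-weighted overlap, higher = more similar."""
    depth = min(len(list_a), len(list_b))
    score, seen_a, seen_b = 0.0, set(), set()
    for d in range(1, depth + 1):
        seen_a.add(list_a[d - 1])
        seen_b.add(list_b[d - 1])
        score += p ** (d - 1) * len(seen_a & seen_b) / d
    return (1 - p) * score


if __name__ == "__main__":
    # Hypothetical explanations before and after an adversarial perturbation.
    original = ["price", "rooms", "age", "area", "tax"]
    perturbed = ["rooms", "price", "age", "tax", "area"]
    ra, rb = rank_vectors(original, perturbed)
    tau, _ = kendalltau(ra, rb)
    print(f"Kendall's Tau:       {tau:.3f}")
    print(f"Footrule distance:   {spearman_footrule(original, perturbed):.3f}")
    print(f"Rank-biased Overlap: {rank_biased_overlap(original, perturbed):.3f}")
```

Because RBO weights the top of the list most heavily while Kendall's Tau counts all pairwise inversions equally, the same perturbation can appear benign under one measure and severe under another, which is precisely why the choice of measure shapes any conclusion about an explainer's stability.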