Abstract
Explainable AI (XAI) and interpretable machine learning methods help to build
trust in model predictions and derived insights, yet also present a perverse
incentive for analysts to manipulate XAI metrics to support pre-specified
conclusions. This paper introduces the concept of X-hacking, a form of
p-hacking applied to XAI metrics such as SHAP values. We show how an automated
machine learning pipeline can be used to search for 'defensible' models that
produce a desired explanation while maintaining superior predictive performance
to a common baseline. We formulate the trade-off between explanation and
accuracy as a multi-objective optimization problem and illustrate the
feasibility and severity of X-hacking empirically on familiar real-world
datasets. Finally, we suggest possible methods for detection and prevention,
and discuss ethical implications for the credibility and reproducibility of XAI
research.
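
The selection step behind such a search can be illustrated with a minimal sketch. It is not the paper's pipeline: the `Candidate` type, the function names, and all numbers are hypothetical, and the explanation metric is assumed to be a single score per model (for example, the mean absolute SHAP value of the feature of interest) that the analyst wants to drive down. Given a pool of already fitted models, the sketch keeps those that beat the baseline ("defensible" models), computes the Pareto front over accuracy and explanation, and picks the defensible model that best fits the desired story.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    """One fitted model from a hypothetical AutoML search."""
    name: str
    accuracy: float    # predictive performance, higher is better
    importance: float  # assumed explanation score for the target feature


def defensible(candidates: List[Candidate], baseline: float) -> List[Candidate]:
    """Keep only the models that outperform the baseline accuracy."""
    return [c for c in candidates if c.accuracy > baseline]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Models not dominated when maximising accuracy and minimising importance."""
    front = []
    for c in candidates:
        dominated = any(
            o.accuracy >= c.accuracy
            and o.importance <= c.importance
            and (o.accuracy > c.accuracy or o.importance < c.importance)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


def x_hack(candidates: List[Candidate], baseline: float) -> Optional[Candidate]:
    """Return the defensible model with the lowest target-feature importance."""
    pool = defensible(candidates, baseline)
    return min(pool, key=lambda c: c.importance) if pool else None


# Illustrative, made-up values; they are not results from the paper.
CANDIDATES = [
    Candidate("model_a", 0.81, 0.30),
    Candidate("model_b", 0.84, 0.22),
    Candidate("model_c", 0.79, 0.05),
    Candidate("model_d", 0.83, 0.08),
    Candidate("model_e", 0.82, 0.15),
]
BASELINE_ACCURACY = 0.80

if __name__ == "__main__":
    pool = defensible(CANDIDATES, BASELINE_ACCURACY)
    print("Pareto front:", [c.name for c in pareto_front(pool)])
    print("Reported model:", x_hack(CANDIDATES, BASELINE_ACCURACY).name)
```

In this toy pool, the model with the lowest importance score overall falls below the baseline and is discarded, so the search reports the next-best model that still looks defensible on accuracy. Comparing the reported model against the whole Pareto front, rather than against the baseline alone, is one simple way to see how much the explanation was steered.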