X Hacking: The Threat of Misguided AutoML

arxiv

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2401.08513

PDF

https://arxiv.org/pdf/2401.08513

Paper Information

Authors: Rahul Sharma; Sergey Redyuk; Sumantrak Mukherjee; Andrea Sipka; Sebastian Vollmer; David Selby
Published: 2024-01-17
Updated: 2024-02-12
Affiliation: Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI)
Country: Germany
Conference: International Conference on Machine Learning (ICML)

Labels Estimated by AI

XAI (Explainable AI), Model Interpretability, Bias

These labels were automatically added by AI and may be inaccurate.

Abstract

Explainable AI (XAI) and interpretable machine learning methods help to build trust in model predictions and derived insights, yet also present a perverse incentive for analysts to manipulate XAI metrics to support pre-specified conclusions. This paper introduces the concept of X-hacking, a form of p-hacking applied to XAI metrics such as SHAP values. We show how an automated machine learning pipeline can be used to search for 'defensible' models that produce a desired explanation while maintaining superior predictive performance to a common baseline. We formulate the trade-off between explanation and accuracy as a multi-objective optimization problem and illustrate the feasibility and severity of X-hacking empirically on familiar real-world datasets. Finally, we suggest possible methods for detection and prevention, and discuss ethical implications for the credibility and reproducibility of XAI research.
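The trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: it assumes the analyst's desired explanation is a low importance score for one feature of interest, that each candidate pipeline has already been scored, and that a 'defensible' model is one whose accuracy exceeds the baseline. The names `Candidate`, `defensible`, and `pareto_front` are hypothetical, and all numbers are synthetic values chosen for illustration only.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    accuracy: float           # predictive performance, higher is better
    target_importance: float  # importance assigned to the feature of interest
                              # (e.g. mean absolute SHAP value), lower is "desired" here


def defensible(candidates: List[Candidate], baseline_accuracy: float) -> List[Candidate]:
    """Keep only candidates that outperform the baseline (assumed defensibility rule)."""
    return [c for c in candidates if c.accuracy > baseline_accuracy]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates not dominated on (accuracy up, target_importance down)."""
    front = []
    for c in candidates:
        dominated = any(
            o.accuracy >= c.accuracy
            and o.target_importance <= c.target_importance
            and (o.accuracy > c.accuracy or o.target_importance < c.target_importance)
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


# Synthetic scores for five hypothetical pipelines (not results from the paper).
candidates = [
    Candidate("pipeline_a", 0.81, 0.30),
    Candidate("pipeline_b", 0.79, 0.05),
    Candidate("pipeline_c", 0.74, 0.01),
    Candidate("pipeline_d", 0.80, 0.12),
    Candidate("pipeline_e", 0.78, 0.20),
]
baseline_accuracy = 0.77  # synthetic baseline

front = pareto_front(defensible(candidates, baseline_accuracy))

# An X-hacking analyst would report the defensible model with the most
# convenient explanation, here the lowest importance for the target feature.
hacked = min(front, key=lambda c: c.target_importance)

for c in front:
    print(f"{c.name}: accuracy={c.accuracy:.2f}, importance={c.target_importance:.2f}")
print("selected:", hacked.name)
```

In this toy setting the model with the most extreme explanation overall (`pipeline_c`) is excluded because it falls below the baseline, so the selection is made among models that would each look reasonable if reported alone.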

External Datasets

breast-w

credit-approval

credit-g

diabetes

pc4

pc3

jm1

kc2

kc1

pc1

bank-marketing

blood-transfusion

ilpd

madelon

qsar-biodeg

wdbc

adult

Bioresponse

numerai28.6

churn

wilt

climate-model

cardiac-disease