X Hacking: The Threat of Misguided AutoML

arXiv

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2401.08513

PDF

https://arxiv.org/pdf/2401.08513

Bibliographic information

Authors: Rahul Sharma; Sergey Redyuk; Sumantrak Mukherjee; Andrea Sipka; Sebastian Vollmer; David Selby
Published: 2024-01-17
Updated: 2024-02-12
Affiliation: Deutsches Forschungszentrum für Künstliche Intelligenz GmbH (DFKI)
Country of affiliation: Germany
Conference: International Conference on Machine Learning (ICML)

Labels estimated by AI

XAI (explainable AI); model interpretability; bias

Note: these labels were added automatically by AI and may therefore be inaccurate.
For details, see "About the literature database".

Abstract

Explainable AI (XAI) and interpretable machine learning methods help to build trust in model predictions and derived insights, yet also present a perverse incentive for analysts to manipulate XAI metrics to support pre-specified conclusions. This paper introduces the concept of X-hacking, a form of p-hacking applied to XAI metrics such as Shap values. We show how an automated machine learning pipeline can be used to search for 'defensible' models that produce a desired explanation while maintaining superior predictive performance to a common baseline. We formulate the trade-off between explanation and accuracy as a multi-objective optimization problem and illustrate the feasibility and severity of X-hacking empirically on familiar real-world datasets. Finally, we suggest possible methods for detection and prevention, and discuss ethical implications for the credibility and reproducibility of XAI research.
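
The search described in the abstract can be illustrated with a minimal sketch. It is not the paper's pipeline: the `Candidate` class, the helper functions, and all numbers below are hypothetical and made up for illustration. It assumes each candidate model has already been scored on two objectives, predictive accuracy (higher is better) and the importance attributed to a feature the analyst would like to downplay (lower is better). A model counts as 'defensible' here if it beats the baseline accuracy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one model found by an AutoML search."""
    name: str
    accuracy: float           # predictive performance, higher is better
    target_importance: float  # explanation metric for one feature, lower is "desired"


def defensible(candidates, baseline_accuracy):
    """Keep only candidates that strictly beat the baseline accuracy."""
    return [c for c in candidates if c.accuracy > baseline_accuracy]


def dominates(a, b):
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.accuracy >= b.accuracy and a.target_importance <= b.target_importance
    better = a.accuracy > b.accuracy or a.target_importance < b.target_importance
    return no_worse and better


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative, made-up scores; not results from the paper.
candidates = [
    Candidate("baseline_like", 0.80, 0.30),
    Candidate("A", 0.85, 0.25),
    Candidate("B", 0.82, 0.05),
    Candidate("C", 0.78, 0.01),
    Candidate("D", 0.81, 0.10),
]

front = pareto_front(defensible(candidates, baseline_accuracy=0.80))

# An X-hacking analyst would report only the front member with the desired explanation.
chosen = min(front, key=lambda c: c.target_importance)

print([c.name for c in front], chosen.name)
```

In this toy setup, reporting every model on the front rather than only `chosen` would make the trade-off between accuracy and explanation visible to readers.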

External datasets

breast-w

credit-approval

credit-g

diabetes

pc4

pc3

jm1

kc2

kc1

pc1

bank-marketing

blood-transfusion

ilpd

madelon

qsar-biodeg

wdbc

adult

Bioresponse

numerai28.6

churn

wilt

climate-model

cardiac-disease