Attacks on Explainability

Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods

Authors: Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, Himabindu Lakkaraju | Published: 2019-11-06 | Updated: 2020-02-03
XAI (Explainable AI)
Adversarial Learning
Attacks on Explainability

Explanations can be manipulated and geometry is to blame

Authors: Ann-Kathrin Dombrowski, Maximilian Alber, Christopher J. Anders, Marcel Ackermann, Klaus-Robert Müller, Pan Kessel | Published: 2019-06-19 | Updated: 2019-09-25
Model Interpretability
Robustness Evaluation
Attacks on Explainability

Interpretation of Neural Networks is Fragile

Authors: Amirata Ghorbani, Abubakar Abid, James Zou | Published: 2017-10-29 | Updated: 2018-11-06
Relationship between Robustness and Privacy
Adversarial Attack Analysis
Attacks on Explainability