Fooling Partial Dependence via Data Poisoning


ECML/PKDD

AI Security Portal bot

Information in the literature database is collected automatically.

Source

https://arxiv.org/abs/2105.12837

PDF

https://arxiv.org/pdf/2105.12837

Paper Information

Authors: Hubert Baniecki; Wojciech Kretowicz; Przemyslaw Biecek
Published: 5-27-2021
Updated: 7-11-2022
Affiliation: Warsaw University of Technology
Country: Poland
Conference: ECML/PKDD

Labels Estimated by AI

Vulnerability Assessment Method; Poisoning; Data Contamination Detection

These labels were automatically added by AI and may be inaccurate.

Abstract

Many methods have been developed to understand complex predictive models, and high expectations are placed on post-hoc model explainability. It turns out that such explanations are neither robust nor trustworthy, and they can be fooled. This paper presents techniques for attacking Partial Dependence (plots, profiles, PDP), which is among the most popular methods of explaining any predictive model trained on tabular data. We showcase that PD can be manipulated in an adversarial manner, which is alarming, especially in financial or medical applications where auditability has become a must-have trait supporting black-box machine learning. The fooling is performed by poisoning the data to bend and shift explanations in the desired direction using genetic and gradient algorithms. We believe this to be the first work using a genetic algorithm for manipulating explanations, which is transferable as it generalizes both ways: in a model-agnostic and an explanation-agnostic manner.
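
The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy model, the function names (model, partial_dependence, poison), the flat target curve, and all hyperparameters are assumptions made for illustration. In place of the paper's genetic algorithm, it uses a much simpler mutation-and-selection search with no crossover. The model stays fixed; only the dataset used to compute the explanation is perturbed, and the explained feature's column is left untouched (an assumption of this sketch).

```python
import numpy as np

def model(X):
    # Hypothetical fixed black-box model with an interaction between features 0 and 1.
    return X[:, 0] * X[:, 1]

def partial_dependence(predict, X, feature, grid):
    # For each grid value z, set the feature to z in every row and average the predictions.
    pd = np.empty(len(grid))
    for i, z in enumerate(grid):
        Xz = X.copy()
        Xz[:, feature] = z
        pd[i] = predict(Xz).mean()
    return pd

def poison(predict, X, feature, grid, target, n_iter=50, pop_size=10, sigma=0.1, seed=0):
    # Mutation-and-selection search (a simplification, not a full genetic algorithm):
    # perturb the non-explained features and keep the dataset whose PD is closest to target.
    rng = np.random.default_rng(seed)
    other = [j for j in range(X.shape[1]) if j != feature]

    def loss(Xc):
        return np.mean((partial_dependence(predict, Xc, feature, grid) - target) ** 2)

    best, best_loss = X.copy(), loss(X)
    for _ in range(n_iter):
        for _ in range(pop_size):
            candidate = best.copy()
            candidate[:, other] += rng.normal(0.0, sigma, size=(X.shape[0], len(other)))
            candidate_loss = loss(candidate)
            if candidate_loss < best_loss:
                best, best_loss = candidate, candidate_loss
    return best, best_loss

rng = np.random.default_rng(42)
X = rng.normal(loc=1.0, scale=1.0, size=(100, 2))   # synthetic tabular data
grid = np.linspace(-2.0, 2.0, 5)                    # values at which PD is evaluated
target = np.zeros_like(grid)                        # attacker's goal: a flat PD curve

pd_before = partial_dependence(model, X, 0, grid)
initial_loss = np.mean((pd_before - target) ** 2)
X_poisoned, final_loss = poison(model, X, 0, grid, target)
pd_after = partial_dependence(model, X_poisoned, 0, grid)

print("PD before:", np.round(pd_before, 3))
print("PD after: ", np.round(pd_after, 3))
print("loss before:", initial_loss, "loss after:", final_loss)
```

For this toy model, the PD of feature 0 at value z equals z times the mean of feature 1, so the search flattens the curve by pushing that mean toward zero while the model itself is never modified. This mirrors the concern raised in the abstract: the explanation changes even though the model does not.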