Both fair machine learning and adversarial learning have been extensively
studied. However, attacking fair machine learning models has received less
attention. In this paper, we present a framework for effectively generating
poisoning samples that attack both model accuracy and algorithmic fairness.
Our attack framework can target fair machine learning models trained with a
variety of group-based fairness notions, such as demographic parity and
equalized odds. We develop three online attacks: adversarial sampling,
adversarial labeling, and adversarial feature modification. All
three attacks effectively and efficiently produce poisoning samples by
sampling, labeling, or modifying a fraction of the training data so as to
reduce test accuracy. Our framework enables attackers to flexibly adjust the
attack's focus on prediction accuracy or fairness, and to accurately quantify
the impact of each candidate point on both accuracy loss and fairness
violation, thus producing effective poisoning samples. Experiments on two
real-world datasets
demonstrate the effectiveness and efficiency of our framework.
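For reference, the two group-based fairness notions named above can be stated
in their standard form (a sketch using common notation, not necessarily this
paper's; $\hat{y}$ denotes the predicted label, $y$ the true label, and $a$ a
binary sensitive attribute):
\[
\text{Demographic parity:}\quad \Pr(\hat{y}=1 \mid a=0) = \Pr(\hat{y}=1 \mid a=1)
\]
\[
\text{Equalized odds:}\quad \Pr(\hat{y}=1 \mid a=0,\, y=c) = \Pr(\hat{y}=1 \mid a=1,\, y=c), \qquad c \in \{0,1\}.
\]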