Semi-supervised learning (SSL) leverages both labeled and unlabeled data to
train machine learning (ML) models. State-of-the-art SSL methods can achieve
performance comparable to that of supervised learning while using far less
labeled data. However, most existing work focuses on improving the performance
of SSL.
In this work, we take a different angle by studying the training data privacy
of SSL. Specifically, we propose the first data augmentation-based membership
inference attacks against ML models trained by SSL. Given a data sample and
black-box access to a model, the goal of a membership inference attack is to
determine whether the sample belongs to the model's training dataset.
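To make the setting concrete, the following is a minimal sketch of a data
augmentation-based membership inference attack, assuming the black-box model
returns posterior probability vectors; the helper names (query_model, augment)
and the simple threshold rule are illustrative assumptions, not the exact
algorithm proposed in this work.

import numpy as np

def augmentation_features(query_model, sample, augment, n_views=8):
    # Query the black-box model on several augmented views of the sample
    # and summarize the returned posteriors into an attack feature vector.
    posteriors = np.stack([query_model(augment(sample)) for _ in range(n_views)])
    top1 = posteriors.max(axis=1)  # per-view top-1 confidence
    entropy = -(posteriors * np.log(posteriors + 1e-12)).sum(axis=1)
    return np.array([top1.mean(), top1.std(), entropy.mean()])

def infer_membership(features, threshold=0.9):
    # Members tend to receive more confident (lower-entropy) predictions;
    # a learned binary attack classifier can replace this simple threshold.
    return features[0] > threshold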
Our evaluation shows that the proposed attack consistently outperforms
existing membership inference attacks and achieves the best performance
against models trained by SSL. Moreover, we uncover that the cause of
membership leakage in SSL differs from the commonly believed cause in
supervised learning, i.e., overfitting (the gap between training and testing
accuracy). We observe that the SSL model generalizes well to the testing data
(with almost zero overfitting) but "memorizes" the training data by producing
more confident predictions on training samples, regardless of their
correctness.
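This distinction can be quantified by comparing the accuracy gap with a
confidence gap between training and testing samples, as in the sketch below
(a hedged illustration under the assumption that predicted posteriors and
labels are available as NumPy arrays; variable names are hypothetical):

import numpy as np

def generalization_vs_memorization(probs_train, y_train, probs_test, y_test):
    # Accuracy gap (~0 for a well-generalized SSL model) versus
    # confidence gap (positive when the model "memorizes" members).
    acc_gap = (probs_train.argmax(1) == y_train).mean() \
              - (probs_test.argmax(1) == y_test).mean()
    conf_gap = probs_train.max(1).mean() - probs_test.max(1).mean()
    return acc_gap, conf_gap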
We also explore early stopping as a countermeasure against membership
inference attacks on SSL models. The results show that early stopping can
mitigate the attack, but at the cost of the model's utility.
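As a rough illustration of this countermeasure, early stopping can be
implemented by halting training once a held-out metric stops improving; this
is a generic sketch with assumed callables (train_one_epoch, evaluate), not
the exact stopping criterion used in this work.

def train_with_early_stopping(train_one_epoch, evaluate,
                              max_epochs=200, patience=10):
    # Stop once held-out accuracy fails to improve for `patience` epochs,
    # trading some utility for reduced membership leakage.
    best_acc, best_epoch = 0.0, 0
    for epoch in range(max_epochs):
        train_one_epoch()
        acc = evaluate()
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_acc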