Abstract
The vulnerability of machine learning models to Membership Inference Attacks
(MIAs) has garnered considerable attention in recent years. These attacks
determine whether a given data sample belongs to the model's training set.
Recent research has focused on reference-based attacks, which calibrate the
difficulty of each sample using independently trained reference models. While
empirical studies have demonstrated the effectiveness of difficulty
calibration, there is a notable gap in our understanding of the circumstances
under which it succeeds or fails. In
this paper, we take a further step towards a deeper understanding of the role
of difficulty calibration. Our observations reveal inherent limitations in
calibration methods, leading to the misclassification of non-members and
suboptimal performance, particularly on high-loss samples. We further identify
that these errors stem from imperfect sampling of the underlying data
distribution and a strong dependence of membership scores on the model parameters. By
shedding light on these issues, we propose RAPID: a query-efficient and
computation-efficient MIA that directly \textbf{R}e-lever\textbf{A}ges the
original membershi\textbf{P} scores to m\textbf{I}tigate the errors in
\textbf{D}ifficulty calibration. Our experimental results, spanning 9 datasets
and 5 model architectures, demonstrate that RAPID outperforms previous
state-of-the-art attacks (e.g., LiRA and Canary offline) across different
metrics while remaining computationally efficient. Our observations and
analysis challenge the current de facto paradigm of difficulty calibration in
high-precision inference, encouraging greater attention to the persistent risks
posed by MIAs in more practical scenarios.
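To make the paradigm under discussion concrete, the following is a minimal sketch of difficulty calibration as used in reference-based MIAs, not of RAPID itself. The function name and the simple mean-difference calibration form are illustrative assumptions; practical attacks such as LiRA use more elaborate per-sample likelihood-ratio tests.

```python
import numpy as np

def calibrated_score(target_losses, reference_losses):
    """Difficulty-calibrated membership score (illustrative sketch).

    target_losses: shape (n_samples,), per-sample loss under the target model.
    reference_losses: shape (n_refs, n_samples), per-sample losses under
        reference models trained without the target's training data.
    A sample's "difficulty" is its average loss across reference models;
    a higher score (difficulty minus target loss) suggests the target model
    fits the sample unusually well, i.e., it is more likely a member.
    """
    difficulty = reference_losses.mean(axis=0)
    return difficulty - target_losses

# Toy example: sample 0 is easy for every model (low loss everywhere),
# sample 1 is hard in general but easy for the target -- a member signal.
target = np.array([0.1, 0.2])
refs = np.array([[0.1, 2.0],
                 [0.2, 1.8]])
scores = calibrated_score(target, refs)  # sample 1 scores higher
```

The abstract's critique applies to exactly this subtraction: when the reference models sample the data distribution imperfectly, the difficulty estimate is biased, so high-loss non-members can receive inflated scores and be misclassified.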