The vulnerability of machine learning models to Membership Inference Attacks
(MIAs) has garnered considerable attention in recent years. These attacks aim
to determine whether a given data sample was part of the model's training set.
Recent research has focused on reference-based attacks, which leverage
difficulty calibration with independently trained reference models. While
empirical studies have demonstrated the effectiveness of difficulty
calibration, there remains a notable gap in our understanding of when it
succeeds and when it fails. In this paper, we take a step towards a deeper
understanding of the role of difficulty calibration. Our observations reveal
inherent limitations in
calibration methods, leading to the misclassification of non-members and
suboptimal performance, particularly on high-loss samples. We further identify
that these errors stem from an imperfect sampling of the potential distribution
and a strong dependence of membership scores on the model parameters. Building
on these findings, we propose RAPID, a query-efficient and
computation-efficient MIA that directly \textbf{R}e-lever\textbf{A}ges the
original membershi\textbf{P} scores to m\textbf{I}tigate the errors in
\textbf{D}ifficulty calibration. Our experimental results, spanning 9 datasets
and 5 model architectures, demonstrate that RAPID outperforms previous
state-of-the-art attacks (e.g., LiRA and Canary offline) across different
metrics while remaining computationally efficient. Our observations and
analysis challenge the current de facto paradigm of difficulty calibration in
high-precision inference, encouraging greater attention to the persistent risks
posed by MIAs in more practical scenarios.
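
For concreteness, difficulty calibration is commonly formulated as offsetting
the target model's membership score by the average score that reference models
assign to the same sample. The following is a minimal sketch of that common
formulation, not necessarily the exact definition used in this work; the
symbols $s$, $\theta$, $\phi_i$, $n$, and $\tau$ are illustrative assumptions
introduced here. Let $s(\theta, x)$ denote a membership score of sample $x$
under the target model $\theta$ (for example, the negative loss), and let
$\phi_1, \dots, \phi_n$ be independently trained reference models:
\[
  s_{\mathrm{cal}}(x) \;=\; s(\theta, x) \;-\; \frac{1}{n} \sum_{i=1}^{n} s(\phi_i, x),
  \qquad
  \text{predict ``member'' if } s_{\mathrm{cal}}(x) > \tau .
\]
Under this formulation, the average over a finite number $n$ of reference
models only approximates the expected score over models trained without $x$,
and the original score $s(\theta, x)$ enters the decision only through the
difference.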