The Tail Tells All: Estimating Model-Level Membership Inference Vulnerability Without Reference Models

Source

https://arxiv.org/abs/2510.19773

PDF

https://arxiv.org/pdf/2510.19773

Paper Information

Authors: Euodia Dodd, Nataša Krčo, Igor Shilov, Yves-Alexandre de Montjoye
Published: 10-23-2025
Affiliation: Imperial College London
Country: United Kingdom

Labels Estimated by AI

Low-Cost Membership Inference Method, Privacy-Preserving Machine Learning, Model Robustness

These labels were automatically added by AI and may be inaccurate.

Abstract

Membership inference attacks (MIAs) have emerged as the standard tool for evaluating the privacy risks of AI models. However, state-of-the-art attacks require training numerous, often computationally expensive, reference models, which limits their practicality. We present a novel approach for estimating model-level vulnerability to membership inference attacks, measured as the TPR at low FPR, without requiring reference models. Empirical analysis shows loss distributions to be asymmetric and heavy-tailed, and suggests that most points at risk from MIAs have moved from the tail (high-loss region) to the head (low-loss region) of the distribution after training. We leverage this insight to propose a method that estimates model-level vulnerability from the training and testing loss distributions alone, using the absence of outliers from the high-loss region as a predictor of the risk. We evaluate our method, the TNR of a simple loss attack, across a wide range of architectures and datasets and show that it accurately estimates model-level vulnerability to the state-of-the-art MIA (LiRA). We also show that our method outperforms both low-cost attacks (those using few reference models), such as RMIA, and other measures of distribution difference. Finally, we evaluate the use of non-linear functions to evaluate risk and show that the approach is promising for evaluating risk in large language models.
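
The quantity the abstract names, the TNR of a simple loss attack, can be illustrated with a minimal sketch. The abstract does not say how the attack threshold is chosen, so the rule used here (set the threshold at a high quantile of the training losses, so that only a fraction `fnr` of members fall above it) is an assumption, as are the function name `loss_attack_tnr` and its parameters. This is not the authors' implementation.

```python
def loss_attack_tnr(member_losses, nonmember_losses, fnr=0.01):
    """TNR of a threshold loss attack (illustrative sketch, not the paper's code).

    The attack predicts "member" when loss <= threshold and "non-member"
    otherwise. Assumption of this sketch: the threshold is the (1 - fnr)
    empirical quantile of the member (training) losses.
    """
    ranked = sorted(member_losses)
    # Index of the (1 - fnr) empirical quantile, clamped to the last element.
    k = min(len(ranked) - 1, int((1.0 - fnr) * len(ranked)))
    threshold = ranked[k]
    # True negatives: non-members (test points) whose loss exceeds the threshold.
    true_negatives = sum(1 for loss in nonmember_losses if loss > threshold)
    return true_negatives / len(nonmember_losses)


# Toy data: training losses spread evenly over [0.00, 0.99].
train = [0.01 * i for i in range(100)]

# Test losses identical to training losses: nothing lies beyond the threshold.
print(loss_attack_tnr(train, train))                         # 0.0

# Test losses all above the training range: the tail is absent from training.
print(loss_attack_tnr(train, [1.0 + 0.01 * i for i in range(100)]))  # 1.0
```

Under this reading, a high TNR means the high-loss tail seen on test data is largely missing from the training losses, which the abstract treats as a predictor of higher model-level risk. How the TNR maps to the TPR at low FPR of LiRA is an empirical result of the paper and is not reproduced here.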