Large Language Models (LLMs) are increasingly used in a variety of
applications, but concerns around membership inference have grown in parallel.
Previous efforts focus on black- to grey-box settings, thus neglecting the
potential benefit of internal LLM information. To address this, we propose
the use of Linear Probes (LPs) as a method to detect Membership Inference
Attacks (MIAs) by examining internal activations of LLMs. Our approach, dubbed
LUMIA, applies LPs layer by layer to obtain fine-grained information on the
model's inner workings. We evaluate this method across several model architectures, sizes, and
datasets, including unimodal and multimodal tasks. In unimodal MIA, LUMIA
achieves an average gain of 15.71% in Area Under the Curve (AUC) over previous
techniques. Remarkably, LUMIA reaches AUC>60% in 65.33% of cases -- an
increase of 46.80% over the state of the art. Furthermore, our approach
reveals key insights, such as the model layers where MIAs are most detectable.
In multimodal models, LPs indicate that visual inputs can contribute
significantly to MIA detection -- AUC>60% is reached in 85.90% of experiments.
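
To make the layer-wise probing idea concrete, the following is a minimal, hypothetical sketch (placeholder model, probe type, and toy data; not the paper's exact setup): mean-pooled hidden states are extracted for every layer, and a separate linear probe is trained per layer to separate member from non-member samples, reporting a per-layer AUC.

    # Hypothetical sketch of per-layer linear probes for membership detection.
    # Model name, pooling, probe, and data below are illustrative assumptions.
    import numpy as np
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModel.from_pretrained("gpt2", output_hidden_states=True).eval()

    def layer_activations(texts):
        """Return mean-pooled hidden states, shape (num_layers, num_texts, hidden_dim)."""
        feats = []
        with torch.no_grad():
            for t in texts:
                ids = tokenizer(t, return_tensors="pt", truncation=True)
                out = model(**ids)
                # out.hidden_states: one (1, seq_len, hidden_dim) tensor per layer
                feats.append([h.mean(dim=1).squeeze(0).numpy() for h in out.hidden_states])
        return np.array(feats).transpose(1, 0, 2)

    # Toy member / non-member samples standing in for real train vs. held-out data.
    members = ["example training sentence one", "example training sentence two"]
    non_members = ["unseen sentence alpha", "unseen sentence beta"]
    X = layer_activations(members + non_members)
    y = np.array([1] * len(members) + [0] * len(non_members))

    # Fit one linear probe per layer and report its AUC (a real evaluation
    # would score a held-out split rather than the probe's own training data).
    for layer_idx in range(X.shape[0]):
        probe = LogisticRegression(max_iter=1000).fit(X[layer_idx], y)
        scores = probe.predict_proba(X[layer_idx])[:, 1]
        print(f"layer {layer_idx}: AUC = {roc_auc_score(y, scores):.2f}")

Comparing the per-layer AUCs in such a sketch is what reveals at which depth membership signals become most detectable.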