EL-MIA: Quantifying Membership Inference Risks of Sensitive Entities in LLMs

Membership inference attacks (MIA) aim to infer whether a particular data point is part of the training dataset of a model. In this paper, we propose a new task in the context of LLM privacy: entity-level discovery of membership risk focused on sensitive information (PII, credit card numbers, etc.). Existing methods for MIA can detect the presence of entire prompts or documents in the LLM training data, but they fail to capture risks at a finer granularity. We propose the EL-MIA framework for auditing entity-level membership risks in LLMs. We construct a benchmark dataset for the evaluation of MIA methods on this task. Using this benchmark, we conduct a systematic comparison of existing MIA techniques as well as two newly proposed methods. We provide a comprehensive analysis of the results, aiming to explain the relationship between entity-level MIA susceptibility and model scale, training epochs, and other surface-level factors. Our findings reveal that existing MIA methods are limited when it comes to entity-level membership inference of sensitive attributes, while this susceptibility can be revealed by relatively straightforward methods, highlighting the need for stronger adversaries to stress-test the provided threat model.

Key Contributions

EL-MIA framework for auditing membership inference risk at the entity level (PII, credit card numbers, etc.) rather than full documents/prompts
Benchmark dataset for evaluating entity-level MIA methods against LLMs, enabling systematic comparison of existing and novel techniques
Empirical analysis of entity-level MIA susceptibility as a function of model scale, training epochs, and other factors, showing existing methods are insufficient while simple approaches reveal non-trivial risks

🛡️ Threat Analysis

Membership Inference Attack

Core contribution is a new membership inference framework (EL-MIA) that operates at entity/PII granularity in LLMs — the adversary asks 'was this sensitive entity in the training data?', the canonical ML04 threat model. Paper also proposes two novel MIA methods and a benchmark dataset for evaluating this task.
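To make the difference between document-level and entity-level membership inference concrete, the following is a minimal sketch of a generic loss-threshold baseline restricted to the tokens of a sensitive entity. It is not one of the methods proposed in the paper. The function names, the toy per-token log-probabilities, the entity span, and the threshold are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def mean_nll(token_logprobs: Sequence[float]) -> float:
    """Average negative log-likelihood over a non-empty token sequence."""
    if not token_logprobs:
        raise ValueError("token_logprobs must be non-empty")
    return -sum(token_logprobs) / len(token_logprobs)


def entity_level_score(token_logprobs: Sequence[float],
                       entity_span: Tuple[int, int]) -> float:
    """Mean NLL computed only over the entity tokens [start, end)."""
    start, end = entity_span
    if not 0 <= start < end <= len(token_logprobs):
        raise ValueError("entity_span must lie inside the token sequence")
    return mean_nll(token_logprobs[start:end])


def predict_member(score: float, threshold: float) -> bool:
    """Loss-threshold rule: a lower loss is taken as evidence of membership."""
    return score < threshold


# Hypothetical per-token log-probabilities that a target model assigns to a
# five-token text; tokens 2 and 3 are assumed to form the sensitive entity.
logprobs = [-2.0, -1.5, -0.1, -0.2, -3.0]
span = (2, 4)
threshold = 1.0  # illustrative value; in practice calibrated on held-out data

doc_score = mean_nll(logprobs)
ent_score = entity_level_score(logprobs, span)

print("document-level member:", predict_member(doc_score, threshold))
print("entity-level member:", predict_member(ent_score, threshold))
```

In this toy input the document-level score is dominated by the high-loss context tokens, so the whole text is not flagged, while the score over the entity tokens alone falls below the threshold. This is only meant to show how a document-level signal can miss risk that is concentrated in a short span; it says nothing about how well such a baseline performs on the benchmark.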