
Fast-MIA: Efficient and Scalable Membership Inference for LLMs

Hiromu Takahashi, Shotaro Ishihara

0 citations · 25 references · arXiv


Published on arXiv · 2510.23074

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

Achieves roughly 5x faster MIA evaluation than standard implementations, with near-identical prediction results, by combining vLLM batch inference with caching

Fast-MIA

Novel technique introduced


We propose Fast-MIA (https://github.com/Nikkei/fast-mia), a Python library for efficiently evaluating membership inference attacks (MIA) against Large Language Models (LLMs). MIA against LLMs has emerged as a crucial challenge due to growing concerns over copyright, security, and data privacy, and has attracted increasing research attention. However, the progress of this research is significantly hindered by two main obstacles: (1) the high computational cost of inference in LLMs, and (2) the lack of standardized and maintained implementations of MIA methods, which makes large-scale empirical comparison difficult. To address these challenges, our library provides fast batch inference and includes implementations of representative MIA methods under a unified evaluation framework. This library supports easy implementation of reproducible benchmarks with simple configuration and extensibility. We release Fast-MIA as an open-source (Apache License 2.0) tool to support scalable and transparent research on LLMs.


Key Contributions

  • Open-source Python library implementing multiple MIA methods (LOSS, PPL/zlib, Min-K% Prob, Min-K%++) under a unified, extensible evaluation framework
  • Fast batch inference leveraging vLLM and caching, yielding a ~5x speedup over standard implementations with negligible change in results
  • YAML-based configuration enabling reproducible, large-scale MIA benchmarks across models, datasets, and languages beyond English
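A YAML-driven benchmark run might look something like the sketch below. The key names and values here are illustrative assumptions, not Fast-MIA's documented schema; consult the repository for the actual configuration format.

```yaml
# Hypothetical configuration sketch -- keys are assumptions,
# not Fast-MIA's real schema.
model:
  name: meta-llama/Llama-3.1-8B
  backend: vllm          # fast batch inference
  batch_size: 64
dataset:
  name: wikimia
  split: test
methods:                 # MIA methods evaluated side by side
  - loss
  - ppl_zlib
  - min_k_prob:
      k: 0.2
  - min_k_plus_plus
output:
  metrics: [auc_roc, tpr_at_low_fpr]
  cache_dir: ./cache     # reuse inference results across methods
```

Declaring the whole experiment in one file is what makes runs reproducible: the same config on the same model and dataset should yield the same benchmark numbers.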

🛡️ Threat Analysis

Membership Inference Attack

The paper's entire purpose is evaluating membership inference attacks against LLMs — determining whether specific data points were in a model's pre-training dataset. Fast-MIA implements representative MIA methods (LOSS, PPL/zlib, Min-K% Prob, Min-K%++) under a unified framework explicitly targeting this threat.
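The scoring rules behind these methods are simple once per-token log-probabilities are available from the model. The sketch below illustrates LOSS, PPL/zlib, and Min-K% Prob on plain lists of log-probs; the function names and toy values are illustrative, not Fast-MIA's API.

```python
import zlib

def loss_score(token_logprobs):
    """LOSS attack: the negative mean log-likelihood (the LM's loss).
    Lower values suggest the text was in the training data."""
    return -sum(token_logprobs) / len(token_logprobs)

def zlib_score(text, token_logprobs):
    """PPL/zlib: calibrate the loss by the zlib-compressed byte length,
    a cheap proxy for how intrinsically repetitive the text is."""
    return loss_score(token_logprobs) / len(zlib.compress(text.encode("utf-8")))

def min_k_prob_score(token_logprobs, k=0.2):
    """Min-K% Prob: mean log-probability of the k% least likely tokens.
    Training members tend to have few surprisingly unlikely tokens."""
    n = max(1, int(len(token_logprobs) * k))
    return sum(sorted(token_logprobs)[:n]) / n

# Toy example: a "member" text has uniformly high log-probs, so even its
# least likely tokens score well; sweeping a threshold over such scores
# is what produces the AUC-ROC curves reported in MIA benchmarks.
member = [-0.2, -0.1, -0.3, -0.15, -0.25]
non_member = [-0.2, -4.0, -0.3, -5.5, -0.25]
assert min_k_prob_score(member, k=0.4) > min_k_prob_score(non_member, k=0.4)
assert loss_score(member) < loss_score(non_member)
```

All three scores are black-box in the sense used by the threat tags below: they need only inference-time log-probabilities, not model weights or gradients.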


Details

Domains
nlp
Model Types
llm
Threat Tags
black_box, inference_time
Applications
llm pre-training data detection, copyright verification, data contamination assessment, llm privacy auditing