OpenLVLM-MIA: A Controlled Benchmark Revealing the Limits of Membership Inference Attacks on Large Vision-Language Models
Ryoto Miyamoto 1, Xin Fan 1, Fuyuko Kido 1, Tsuneo Matsumoto 2, Hayato Yamana 1
Published on arXiv (arXiv:2510.16295)
Membership Inference Attack
OWASP ML Top 10 — ML04
Key Finding
State-of-the-art MIA methods approach chance-level accuracy on the controlled benchmark, revealing that previously reported high attack success rates were artifacts of distributional bias rather than true membership signal.
OpenLVLM-MIA
Novel technique introduced
OpenLVLM-MIA is a new benchmark that highlights fundamental challenges in evaluating membership inference attacks (MIA) against large vision-language models (LVLMs). While prior work has reported high attack success rates, our analysis suggests that these results often arise from detecting distributional bias introduced during dataset construction rather than from identifying true membership status. To address this issue, we introduce a controlled benchmark of 6,000 images in which the distributions of member and non-member samples are carefully balanced, and ground-truth membership labels are provided across three distinct training stages. Experiments using OpenLVLM-MIA demonstrated that the performance of state-of-the-art MIA methods approached chance level. OpenLVLM-MIA, designed to be a transparent and unbiased benchmark, clarifies fundamental limitations of MIA research on LVLMs and provides a solid foundation for developing stronger privacy-preserving techniques.
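The core argument can be illustrated with a minimal sketch (my own, not from the paper): a loss-threshold attack scores each sample by the model's per-sample loss ("lower loss ⇒ member"). When member and non-member samples come from the same distribution, as in the controlled benchmark, the attack's AUC collapses toward 0.5; when non-members come from a shifted distribution, the same attack appears to succeed while really detecting the shift. The loss distributions below are synthetic placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def mia_auc(member_scores, nonmember_scores):
    """AUC of the rule 'lower score => member', via pairwise comparison."""
    m = np.asarray(member_scores)[:, None]
    n = np.asarray(nonmember_scores)[None, :]
    return float(np.mean((m < n) + 0.5 * (m == n)))

# Balanced case: member and non-member losses drawn from the SAME
# distribution (controlled benchmark) -> chance-level attack.
member_loss = rng.normal(loc=2.0, scale=0.5, size=3000)
balanced_nonmember_loss = rng.normal(loc=2.0, scale=0.5, size=3000)

# Biased case: non-members from a shifted distribution (e.g. newer or
# stylistically different images). The "attack" now looks strong, but it
# detects the distribution shift, not membership.
biased_nonmember_loss = rng.normal(loc=2.6, scale=0.5, size=3000)

print(f"balanced AUC: {mia_auc(member_loss, balanced_nonmember_loss):.3f}")  # ~0.50
print(f"biased   AUC: {mia_auc(member_loss, biased_nonmember_loss):.3f}")   # well above 0.5
```

This is exactly the confound the benchmark is built to remove: the second AUC says nothing about what the model memorized, only about how the evaluation set was constructed.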
Key Contributions
- Controlled benchmark of 6,000 images with carefully balanced member/non-member distributions and verified ground-truth labels across three LVLM training stages
- Demonstrates that high MIA success rates reported in prior work arise from distributional bias in dataset construction rather than genuine membership detection
- Shows state-of-the-art MIA methods collapse to near-chance performance when distributional confounds are removed
🛡️ Threat Analysis
The paper's contribution centers on evaluating membership inference attacks against large vision-language models: it builds a controlled benchmark with balanced member/non-member distributions and verified ground-truth labels specifically to stress-test MIA methods, and finds that their apparent success does not survive the removal of distributional confounds.