attack 2026

Membership Inference on LLMs in the Wild

Jiatong Yi, Yanyang Li

0 citations · 33 references · arXiv


Published on arXiv

2601.11314

Membership Inference Attack

OWASP ML Top 10 — ML04

Key Finding

SimMIA outperforms prior black-box MIA baselines by 15.7 AUC points on average and rivals logit-based methods that require access to model internals.

SimMIA

Novel technique introduced


Membership Inference Attacks (MIAs) act as a crucial auditing tool for the opaque training data of Large Language Models (LLMs). However, existing techniques predominantly rely on inaccessible model internals (e.g., logits) or suffer from poor generalization across domains in strict black-box settings where only generated text is available. In this work, we propose SimMIA, a robust MIA framework tailored for this text-only regime by leveraging an advanced sampling strategy and scoring mechanism. Furthermore, we present WikiMIA-25, a new benchmark curated to evaluate MIA performance on modern proprietary LLMs. Experiments demonstrate that SimMIA achieves state-of-the-art results in the black-box setting, rivaling baselines that exploit internal model information.


Key Contributions

  • SimMIA: a black-box MIA framework using word-by-word sampling to eliminate error propagation (distribution drift) and a semantic scoring mechanism to capture soft membership signals beyond exact token matching
  • WikiMIA-25: a curated benchmark derived from 2025 Wikipedia dumps for evaluating MIA performance against modern proprietary LLMs
  • Outperforms existing black-box baselines by 15.7 AUC points on average, rivaling logit-based (gray-box) methods that require model internals
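
The first contribution can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a hypothetical text-only callable `sample_next_word(prefix)` standing in for a black-box LLM, and it swaps in a character-level `difflib` similarity as a crude stand-in for the semantic scoring mechanism. The names `soft_match`, `membership_score`, and `make_toy_model` are illustrative. The point is that each word is sampled from the true prefix, so an early wrong guess cannot drift into later positions, and a near miss still earns partial credit.

```python
import difflib
from typing import Callable


def soft_match(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] (assumption: the paper's scorer is semantic)."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def membership_score(
    text: str,
    sample_next_word: Callable[[str], str],  # hypothetical black-box model call
    num_samples: int = 4,
    min_prefix: int = 1,
) -> float:
    """Average soft agreement between sampled and true next words."""
    words = text.split()
    if len(words) <= min_prefix:
        raise ValueError("text is too short to score")
    per_position = []
    for i in range(min_prefix, len(words)):
        # Always condition on the TRUE prefix, never on earlier samples.
        prefix = " ".join(words[:i])
        sims = [
            soft_match(sample_next_word(prefix), words[i])
            for _ in range(num_samples)
        ]
        per_position.append(sum(sims) / num_samples)
    return sum(per_position) / len(per_position)


def make_toy_model(memorized: str) -> Callable[[str], str]:
    """Toy 'model' that has memorized exactly one sentence (illustration only)."""
    words = memorized.split()
    table = {" ".join(words[:i]): words[i] for i in range(1, len(words))}

    def sample_next_word(prefix: str) -> str:
        return table.get(prefix, "the")  # generic fallback for unseen prefixes

    return sample_next_word


toy = make_toy_model("the quick brown fox jumps")
seen = membership_score("the quick brown fox jumps", toy)
unseen = membership_score("a slow green turtle sleeps", toy)
print(f"seen={seen:.3f} unseen={unseen:.3f}")
```

With the toy model, the memorized sentence scores higher than the unseen one; thresholding such scores is what turns them into a membership decision.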

🛡️ Threat Analysis

Membership Inference Attack

Paper directly proposes SimMIA, a membership inference attack framework designed to determine whether specific texts were in an LLM's training set — the canonical ML04 threat.
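
Attacks of this kind are usually compared by AUC, the metric behind the 15.7-point figure reported above. The sketch below is an assumption-labeled illustration, not the paper's evaluation code: it computes AUC as the probability that a randomly chosen member text receives a higher attack score than a randomly chosen non-member, counting ties as half. The function name `auc` and the example scores are made up for illustration.

```python
from typing import Sequence


def auc(member_scores: Sequence[float], non_member_scores: Sequence[float]) -> float:
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic)."""
    if not member_scores or not non_member_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in non_member_scores:
            if m > n:
                wins += 1.0   # member ranked above non-member
            elif m == n:
                wins += 0.5   # ties count as half
    return wins / (len(member_scores) * len(non_member_scores))


# Illustrative scores only; these are not results from the paper.
example = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])
print(f"AUC = {example:.3f}")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation of members from non-members.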


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
black_box, inference_time
Datasets
WikiMIA, MIMIR, WikiMIA-25
Applications
llm training data auditing, copyright infringement detection, data contamination detection